5.4 How do I…Detect unbalanced indexes?Problem
I have indexes on tables that are heavily inserted to, updated, or deleted from. I know that it is important to periodically examine all of my indexes to determine if they have become unbalanced. How do I detect unbalanced indexes?
Technique
An unbalanced index has many records clumped close together on the index tree due to their similar indexed values. When an index is unbalanced, parts of an index are accessed more frequently than others, from which disk contention may occur, creating a bottleneck in performance. Statistics can be gathered on an index by issuing ANALYZE INDEX INDEX_NAME COMPUTE|ESTIMATE STATISTICS. The statistics appear in the USER_INDEXES, ALL_INDEXES_, and DBA_INDEXES data dictionary views. Also, additional statistics can be placed into the INDEX_STATS virtual table by using the VALIDATE INDEX INDEX_NAME statement.
Steps
1. Connect to SQL*Plus as the WAITE user account. If you have not created the EMPLOYEE05 table, run the CHP5_1.SQL script to create it. Also, if you have not created the SSN_INDEX on the EMPLOYEE05 table, run the CHP5_2.SQL script to create it.
Initially, the EMPLOYEE05 table will have no records, and neither will the index have index entries. To determine statistics, you must either validate the index or generate statistics. This must be done manually, unless a program is written to automatically gather statistics. This How-To will demonstrate both methods.
2. Gather statistics on the SSN_INDEX. Run the CHP5_6.SQL script to generate statistics. When prompted for an owner, enter WAITE; and when prompted for an index, enter SSN_INDEX. This script may be reused for any index in the database. For large indexes (over one hundred thousand records in the underlying table), use ESTIMATE instead of COMPUTE for faster response time.
SQL> GET CHP5_6
1 ANALYZE INDEX &owner..&index_name
2 COMPUTE STATISTICS
SQL> /
Enter value for owner: WAITE
Enter value for index_name: SSN_INDEX
Index Analyzed.
SQL>
3. At this point, statistics will indicate that the index is well-balanced, because there are no records in the underlying EMPLOYEE05 table. You can determine statistics by querying the DBA_INDEXES data dictionary view. Later in this How-To is the description and explanation of the DBA_INDEXES view. Run the CHP5_7.SQL script to gather statistics on how balanced the index is. Enter SSN_INDEX for the INDEX_NAME, and WAITE for the owner.
SQL> GET CHP5_7.sql
1 SELECT BLEVEL, DECODE(BLEVEL,0,'OK BLEVEL',1,'OK BLEVEL',
2 2,'OK BLEVEL',3,'OK BLEVEL',
3 4,'OK BLEVEL','BLEVEL IS TOO HIGH') OK
4 FROM DBA_INDEXES
5 WHERE INDEX_NAME='&index_name' AND
6 OWNER='&owner'
SQL> /
Enter name for index_name: SSN_INDEX
Enter name from owner: WAITE
HEIGHT OK
------ ---------
0 OK BLEVEL
1 row selected.
You can see that the BLEVEL 0 is okay. The BLEVEL (or branch level) is part of the B-Trieve index format and relates to the number of times Oracle has to narrow its search on the index while searching for a particular record. In some cases, a separate disk hit is requested for each BLEVEL. If the BLEVEL were to be more than four, it is recommended to rebuild the index, as shown in How-To 5.5.
4. At this point, only the BLEVEL has been investigated. To further determine if an index is unbalanced, further statistics should be gathered. To do this, use the VALIDATE INDEX command by running CHP5_8.SQL. Enter SSN_INDEX for the INDEX_NAME and WAITE for the OWNER.
SQL> GET CHP5_8.SQL
1 VALIDATE INDEX &owner..&index_name
SQL> /
Enter value for owner: WAITE
Enter value for index_name: SSN_INDEX
Index analyzed.
SQL>
5. At this point, the INDEX_STATS virtual table is populated with current statistics for the SSN_INDEX index. Query information on the deleted leaf rows, as compared to the total rows by running the CHP5_9.SQL script. Again, enter SSN_INDEX for index_name. The INDEX_STATS table does not contain an OWNER column and assumes you are looking for statistics for indexes created by your active session only.
SQL> GET CHP5_9.sql
The PCT_DELETED column shows what percent of leaf entries (index entries) have been deleted and remain unfilled. The more deleted entries that exist on an index, the more unbalanced the index becomes. If the PCT_DELETED is 20% or higher, the index is a candidate for rebuilding. If you are more stringent and have time to rebuild indexes more frequently, then do so if the value is higher than 10%.1 SELECT DEL_LF_ROWS*100 /
2 DECODE((DEL_LF_ROWS+LF_ROWS),0,
3 1,(DEL_LF_ROWS+LF_ROWS)) PCT_DELETED,
4 (LF_ROWS-DISTINCT_KEYS)*100 /
5 DECODE(LF_ROWS,0,1,LF_ROWS) DISTINCTIVENESS
6 FROM INDEX_STATS
7 WHERE NAME = '&index_name'
SQL> /
Enter value for index_name: SSN_INDEX
PCT DELETED DISTINCTIVENESS
-------------- -----------------
0 0
SQL>
The distinctiveness column shows how often a value for the column(s) of the index is repeated on average. For example, if a table has 10000 records and 9000 distinct SSN values, the formula would result in (10000 – 9000) ¥ 100 / 10000 = 10. This shows a good distribution of values. If, however, the table has 10000 records and only two distinct SSN values, the formula would result in (10000 – 2) ¥ 100 / 10000 = 99.98. This shows that there are very few distinct values as a percentage of total records in the column. Such columns are not candidates for a rebuild but are good candidates for bitmapped indexes. A full discussion of bitmapped indexes is detailed in How-To 5.8.
How It Works
Step 1 creates the sample table and index used in this How-To. Step 2 gathers statistics on the SSN_INDEX index and places them in the DBA_INDEXES data dictionary view. Step 3 analyzes the height of the index to determine if the index is unbalanced. Step 4 validates the index, putting additional statistics into the INDEX_STATS virtual table. Step 5 reviews the statistics and determines if the index is unbalanced in a different way from Step 3. It also determines if the index is a candidate for a bitmapped index.
Table 5.1 gives the description of the INDEX_STATS data dictionary view.
Table 5.1 Explanation of the INDEX_STATS view
Column Name Description HEIGHT Height of B-Tree BLOCKS Blocks allocated to the index NAME Index name PARTITION_NAME Partition name, if the index is partitioned LF_ROWS Number of leaf rows LF_BLKS Number of leaf blocks LF_ROWS_LEN Total length of all leaf rows LF_BLK_LEN Total usable space in leaf block BR_ROWS Number of branch rows BR_BLKS Number of branch blocks BR_ROWS_LEN Total length of all branch rows BR_BLK_LEN Total length of all branch blocks DEL_LF_ROWS Number of deleted leaf rows left unfilled DEL_LF_ROWS_LEN Total length of all unfilled deleted leaf rows DISTINCT_KEYS Number of distinct keys MOST_REPEATED_KEY Number of times the most-used key is repeated BTREE_SPACE Total B-Tree space USED_SPACE Total allocated space PCT_USED Percent of total allocated space that is being used ROWS_PER_KEY Average number of rows for keys BLKS_GETS_PER_ACCESS Average blocks read when searching for an index entry Table 5.2 shows the description of the DBA_INDEXES data dictionary view.
It is important to note that statistics are not automatically generated. You must either manually refresh table and index statistics, or set up a process that will do this automatically. It is recommended to refresh statistics after objects change more than about five percent of their data.
Column Name Description OWNER User account that created the index. INDEX_NAME Name of the index. INDEX_TYPE Type of index (NORMAL, LOB, CLUSTER, and so on). TABLE_OWNER Owner of the index’s table. TABLE_NAME Name of the index’s table. TABLE_TYPE Type of the index’s table (TABLE, CLUSTER, and so on). UNIQUENESS Uniqueness status (UNIQUE, NONUNIQUE). TABLESPACE_NAME Tablespace in which the index resides. INI_TRANS Initial number of transactions within each block. MAX_TRANS Maximum number of concurrent transactions that can use the block. INITIAL_EXTENT Size, in bytes, allocated for the first extent of the table. NEXT_EXTENT Size, in bytes, allocated for subsequent extents. MIN_EXTENTS Minimum number of extents to allocate when creating or rebuilding the index. MAX_EXTENTS Maximum number of extents. PCT_INCREASE Percent that each extent past the INITIAL and NEXT will increase in size by. PCT_THRESHOLD Threshold percentage of block space defined for each index entry. INCLUDE_COLUMN (Index organized table only) The last indexed column to be included in the top index portion. FREELISTS Number of freelist buffers allocated for the table. Set to the maximum number of concurrent insert processes you predict for the index. FREELIST GROUPS Parallel Server option to determine number of freelist groups for all instances inserting into the index. PCT_FREE Percentage of space to keep free within each block of the index. LOGGING YES or NO value to determine if logging is used. If not, the underlying table can be loaded in direct mode, created AS SELECT … FROM table_b, and other options quicker. LOGGING YES or NO value to determine if logging is used. If not, the underlying table can be loaded in direct mode, created AS SELECT … FROM table_b, and other options quicker. Column Name Description BLEVEL B-Tree level. This shows the depth of the index from branch to leaf. LEAF_BLOCKS Number of leaf blocks in the index. DISTINCT_KEYS Number of distinct keys in the index. This is important for determining if bitmapped indexes should be used. AVG_LEAF_BLOCKS_PER_KEY Average leaf blocks per key. AVG_DATA_BLOCKS_PER_KEY Average data blocks per key. CLUSTERING_FACTOR Amount of “order” in the underlying table. STATUS Index status (VALID, or INVALID, DIRECT_LOAD). NUM_ROWS Number of rows used. This does not include NULL values from the underlying table. SAMPLE_SIZE Shows what percent estimate was used in the ESTIMATE clause of the ANALYZE command (0 if COMPUTE was used). LAST_ANALYZED Date and time the table was last analyzed. Important to see if statistics need to be refreshed. DEGREE (Partitioned indexes only) Number of threads per instance. INSTANCES (Partitioned, Parallel Server indexes only) Number of instances across which the indexes are to be scanned. PARTITIONED Determines if the index is partitioned (YES or ). TEMPORARY Determines if the index is temporary (Y or N). GENERATED Did Oracle generate the index name (Y) or the user (N). BUFFER_POOL DEFAULT or other pool name: displays the default buffer pool in which the index will be placed. Comments
You can also gather index statistics by estimating or computing statistics on the table of which the index is based upon. Oracle will automatically generate statistics on the indexes of the table. Once you log out of the database, all data from the INDEX_STATS virtual table is removed.