5.8 How do I…Create a bitmapped index?Problem
I want to create an index on some columns that do not have many distinct values. Using a regular index will slow my performance in querying data. I know that bitmapped indexes are much more effective, both from a performance perspective and a storage perspective. How do I create a bitmapped index?
Technique
A bitmap index is ideal for when indexed columns have only a few distinct values, such as TRUE/FALSE. Regular indexes will worsen performance in such cases, both on selecting data and on manipulating data. The bitmapped indexes will be stored in a much more compact format than regular indexes.
Use the BITMAP clause of the CREATE INDEX command to create bitmap indexes. A bitmap index will greatly improve query performance on columns with low cardinality—columns with few distinct values as compared to the number of records. This How-To will show how to determine if it is appropriate to create a bitmap index and then will describe the steps used to create one.
Steps
1. Before creating a bitmap index, it important to know what a bitmap index is and if using one will improve performance. Bitmap indexes are ideal for columns that have few distinct values relative to the total number of records in a table. Regular indexes are the opposite; they perform better when there are more distinct values.
Due to the different internal structure of the index, it is stored in a much more compressed format. This dramatically reduces the space required to store the index when compared to traditional B-Tree indexes. Bitmap indexes usually require much less space than the data itself for storage. Bitmap indexes are, however, poor candidates for tables that frequently insert, update, or delete data in the indexed columns, as there will be more overhead in maintaining the bitmap indexes than traditional indexes.
A bitmap index is comprised of one bitmap for each distinct value the column(s) have. Each bitmap is comprised of 0 or 1, depending if the corresponding column contains the value associated with the bitmap. For example, assume a table has a column that contains the values Oracle6, Oracle7, and Oracle8. There are three records of each value. The bitmap will look something like that shown in Figure 5.18.
The first record can be seen by looking vertically at the first 0/1 bit of each line. In the previous example, the first three records have 1 by Oracle6, and 0 by Oracle7 and Oracle8. This means that the first three records contain the value Oracle6. Oracle stores the data in binary format and performs Boolean math operations on the bitmaps. This provides for dramatic performance and storage improvements. You can now also see why it takes much processing to update bitmap data when altering data. Just by adding a new distinct value, Oracle would have to create a new bitmap for all records.
2. Connect to SQL*Plus as the WAITE user account and create the PLAYER05 table by running CHP5_16.SQL, as shown in Figure 5.19.
3. The PLAYER05 table is used to keep track of major and minor league baseball players. Some of the columns have high cardinality, such as the LAST_NAME column. They are good candidates for regular indexes. Other columns have low cardinality, such as POSITION, which in baseball can be limited to: 1B, 2B, 3B, SS, RF, CF, LF, RP, LP, DH, and UT. Another candidate would be THROWING_ARM, which would be limited to L or R. The best bitmap index would be on both the POSITION and THROWING_ARM columns. By running the CHP5_17.SQL script, this index is created, as shown in Figure 5.20.
4. The bitmap PLAYER_BITMAP_INDEX is successfully created on the PLAYER05 table. It is best to create an index after data is already in the table. Queries on the players' position and throwing arm will benefit greatly from this bitmap index. For instance, to find out how many left-handed shortstops, you would issue the CHP5_18.SQL query shown in Figure 5.21. There is no reason to run the query in the How-To as there are no records in the table; if there were many records in the database, you would notice a performance improvement with the query for the bitmap index.
Queries like the one shown in Figure 5.21 are used often in data warehouse applications. Typically, bitmap indexes are used more often in query-intensive applications such as data warehouses and data mining, and are used less in OLTP applications where data is updated often.
How It Works
Step 1 describes when it is appropriate to use bitmap indexes, and explains their difference from regular indexes. Step 2 creates the PLAYER05 table, which is then used to show good and bad column candidates for bitmap indexes. Step 3 creates a bitmap index, PLAYER_BITMAP_INDEX, on the concatenation of POSITION and THROWING_ARM. Step 4 shows a sample query on the PLAYER05 table that will utilize the bitmap index.
The CHP5_21.SQL query shown in Figure 5.21 will perform faster than a regular index because Oracle will search for records with the SS position bit set to 1, and with the L throwing arm bit set to 1. This is a quick Boolean math function handled internally by the Oracle engine.\
Comments
You can create local bitmap indexes on partitioned tables, just as you can with other types of indexes.
There is no clearly defined cardinality of which bitmap indexes should be used. This will depend on how the data is spread among the possible values, the number of distinct keys, and the number of records in the table. Testing is the best method to determine if query speed is improved. A rule of thumb is if there are over one hundred records for each possible value, then bitmap indexes will improve query performance.