Page 739
PQ can benefit any user, while the latter parallel operations tend to mostly help the DBA, because he or she is the one who does (or should do) these tasks the most.
Oracle's Clustering With Oracle, you can use an index cluster or a hash cluster to bring two or more tables together that are frequently joined or otherwise accessed together. Clustering essentially incorporates child tables into their parent table at the physical level. In other words, the primary key of a parent table not only stores its single, unique row data (as it usually would) for a particular primary key value, but it also stores the repeating groups made up of the rows of its child tables, which referenced that value through their foreign keys. Hence, the data remains normalized at the logical level but may be thought of as being denormalized at the physical, or storage, level. This is entirely acceptable and doesn't bypass or break any Relational Model principles or normalization theory. Clusters can be indexed (B*-trees) or hashed. We'll discuss this further in Chapter 32.
Oracle's Bitmapped Indexes Oracle also offers bitmapped indexes, which are normally associated with OLAP systems but are often used with other systems, DSS in particular. Bitmapped indexes are good alternatives to the conventional indexes (B*-trees) for those that are to be based on columns with a small subset of distinct values. They take up less space than regular indexes and are efficient with complex where clauses. Therefore, they are popular choices for DSS and OLAP systems, with their large amounts of data and high analytical demands. If a system is moderately or highly updated, however, bitmaps are not a good choice.
The way these indexes work is that all the distinct values down a column are inverted across an index column, with a bit (a 0 or 1) for each row to represent whether or not that column equals a given distinct value. Table 29.1 shows a simple table, STUDENT, with two columns, STUDENT_ID and SEX, and three rows. This table shows you how a bitmapped index works. We'll discuss the implementation and usage of bitmaps in Chapter 30.
Table 29.1 Bitmapped Index on the STUDENT Table
STUDENT_ID | SEX | SEX=`M' | SEX=`F' |
719250751 | M | 1 | 0 |
298674071 | F | 0 | 1 |
347691027 | M | 1 | 0 |
Massively Parallel Processor (MPP) Machines MPP machines were quickly covered in Chapter 3, as were SMP machines. Both machine types are multiprocessors and both offer vast amounts of computational power. Aside from architectural differences, the difference in power between the two is one of degree rather than of kind. Basically, MPP machines offer more processors, hundreds or thousands, whereas SMP offers a hundred or fewer. Mid-range and low-end SMPs have become commodities, while high-end SMPs and MPPs are not as prevalent.
Modern UNIX and UNIX-variant multiprocessors can safely surpass mainframe power. What this means to most database systems is added parallel computational power. This can work in
Page 740
unison with such things as PQ. Unfortunately, most database systems cannot take advantage of all this power, simply because they might be I/O bound instead of processor bound. More likely, the actual software architecture that makes up the RDBMS kernel, such as multithreaded scalability, cannot scale enough past 20 processors or so to leverage the vast power beyond that count. Currently, this appears to be true of Oracle, Sybase, and Informix.
Varying hardware solutions exist, some of which are based on relatively old technology and some of which are based on newer advances. The older camp includes RAM disks. If an OS permits this type of arrangement, what you have is a section of core (and virtual) memory that acts as a separate disk. Of course, if the power goes out, the disk is lost, unless the contents are copied to a more permanent store (namely, disk). Some DOS versions permitted this, and it worked well for smaller xBase databases, such as dBase. UNIX still allows for this sort of thing. Sun Solaris, in particular, allows for the configuration of temporary file systems that are, in effect, UNIX RAM disks.
In any case, the use of these disks is problematic for database systems, not counting just hardware crashes or power loss. You also might have a case where, for whatever reason, the RAM disk is unavailable and the RDBMS must timestamp the data located on that disk. Result: a database in need of recovery! Bad news. However, if the system proves highly reliable, in practice, I have personally seen an increase in performance by a factor of ten due to an appropriate use of such a disk. In Oracle, one possibility is to store TEMP on a RAM disk.
Solid State Disks Solid-state disks are relatively new, even though the idea and prototypes behind them have been around for 15 years or so. They just haven't been commercially available until the last five years or so. Hardware SMP vendors, such as Sun, and third-party suppliers offer such disks now.
A solid-state disk is essentially a specialized, permanent memory card. The obvious benefit is that the electromechanical disadvantage is all but eliminated because there's no disk access time to factor in. Access time is near the core memory speeds. The disadvantages are the high price, proprietary/compatibility issues, and the lack of a long production cycle behind these products. This latter issue almost always means the technology is still relatively new and unstable.
Hardware Accelerators Hardware accelerators are specialized I/O cards that supply abundant cache space, specialized firmware, and sometimes specialized microprocessors to accelerate the I/O channels they support (hence the name). They have been commercially available for a few years longer than solid-state disks but have not found much success. Their prices have been reasonable, but again there are proprietary/compatibility issues. Because they don't replace disks but only augment them, the stability of their technology is not quite as important as solid-state disks. In any case, they have proven fairly reliable when properly configured. Database Execrator (DBE) is one example.
Queuing Systems Queuing software, such as IBM's Message Queuing Middleware (MQM) series, offers asynchronous message-passing capabilities. This breed of software is also sometimes referred to as Message Oriented Middleware or MOM. What this software does is basically what a queue should do: buffer things in a First-In, First-Out (FIFO) manner. This is