
That said, what to do? Suppose you have 30-odd tables making up the logical design of your database, all of them 3NF or higher (not unrealistic). However, three of these tables are virtually always joined together when they are used. Join operations, with or without indexes, are resource-intensive and can consume a small machine in no time. In the interest of performance, with no other tuning tricks at hand, you might want to pre-join these three tables into one table for permanent use, thereby avoiding the join on future accesses. By collapsing two or more normalized tables back into one, you are virtually guaranteed to reduce the normalization level, or worse, to end up fully unnormalized. Of course, if this must be done, more procedural and application-level data integrity controls must be written to take the place of the DBMS's automatic features.
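The pre-join idea can be sketched with a small, self-contained example. The schema here (customer, orders, line_item) and all names are hypothetical, and SQLite stands in for whatever DBMS you actually run; the point is only the shape of the technique, not a production implementation.

```python
import sqlite3

# Hypothetical schema: three normalized tables that are almost always
# joined together at query time.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer  (cust_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders    (order_id INTEGER PRIMARY KEY,
                            cust_id  INTEGER REFERENCES customer,
                            placed   TEXT);
    CREATE TABLE line_item (order_id INTEGER REFERENCES orders,
                            product  TEXT, qty INTEGER);
    INSERT INTO customer  VALUES (1, 'Acme');
    INSERT INTO orders    VALUES (10, 1, '1999-01-05');
    INSERT INTO line_item VALUES (10, 'widget', 3);
""")

# Pre-join the three tables into one denormalized table so that
# future reads avoid the join entirely.
con.execute("""
    CREATE TABLE order_flat AS
    SELECT c.cust_id, c.name, o.order_id, o.placed, l.product, l.qty
    FROM customer c
    JOIN orders    o ON o.cust_id  = c.cust_id
    JOIN line_item l ON l.order_id = o.order_id
""")
row = con.execute("SELECT name, product, qty FROM order_flat").fetchone()
print(row)  # ('Acme', 'widget', 3)
```

Note that `order_flat` now repeats customer data on every line item, which is exactly the loss of normalization the text describes.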

If, for example, you opt to keep both the original tables and the newly joined one, you are faced with the unenviable task of keeping them synchronized. Periodic batch refreshes of the joined table may suffice when the data is relatively static. However, if updates are very frequent (every hour or less) and currency of the joined data is required, other methods will have to be used.

Another method is to create a denormalized view on top of the normalized base tables. However, the performance of this option is the same as or worse than simply having the original three tables. Its only advantage is user interface simplicity. So, while denormalization may help in some static data cases, it usually cannot solve many dynamic data problems. A method like Oracle's clustering might suffice in this type of situation if denormalization doesn't work. However, we'll have more to say about that later.
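A minimal sketch of the view approach, again using SQLite and invented table names, shows why it only buys simplicity: the join still runs on every read of the view.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE satellite (sat_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE feed      (sat_id INTEGER REFERENCES satellite,
                            ts TEXT, value REAL);
    INSERT INTO satellite VALUES (7, 'BIRD-A');
    INSERT INTO feed      VALUES (7, '00:00', 1.5);
""")

# The view hides the join from users, but every SELECT against it
# still performs the join on the normalized base tables.
con.execute("""
    CREATE VIEW feed_v AS
    SELECT s.sat_id, s.name, f.ts, f.value
    FROM satellite s JOIN feed f ON f.sat_id = s.sat_id
""")
result = con.execute("SELECT name, value FROM feed_v").fetchall()
print(result)  # [('BIRD-A', 1.5)]
```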

A final, fairly common, example of denormalization arises when DBAs must deal with time series. Time series often must be denormalized because time must almost always be part of the primary key of a table; most stored data is time-dependent.

A good example is a database storing satellite feed data. The data is timestamped, and the timestamp is part of the primary key on most of the tables in the database. For any given table that has the timestamp as a key component, all the other columns making up the primary key, such as the satellite ID, are repeated for every distinct timestamp. In other words, if the database downloads 100 rows of data from a single satellite every hour, then in 8 hours we have 800 rows stored, with all the other satellite information unnecessarily repeated 800 times. The wasted storage can be tremendous, especially considering you may have multiple satellites or higher sample rates.

The typical solution is to denormalize by inverting the data from row-wise to column-wise. We now have 100 timestamp columns, and we have reduced the number of rows from 800 to 8, a reduction by a factor of 100, the number of samples per period. Storage reduction, especially row reduction, almost always helps search performance, since there are fewer rows to scan through, either in a table or an index. However, this type of denormalization, though often necessary, results in a table that is not normal, by virtue of the timestamps being stored as repeating-group columns. Hence, you can no longer use a foreign key constraint on this former primary key component. Instead, you must resort to check constraints, procedures, triggers, and possibly supplemental application integrity handling. If anything, the lesson to take from this example is that if you must denormalize for performance, you will pay the price in integrity management.
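The row-wise-to-column-wise inversion can be illustrated in a few lines of Python. The satellite name and the 8-hour, 100-samples-per-hour figures are taken from the example above; everything else is an illustrative stand-in for the real tables.

```python
# Row-wise storage: one row per (satellite, hour, sample), so the
# satellite ID is repeated on every one of the 800 rows.
rows = [(sat, hour, sample)
        for sat in ("SAT-1",)          # hypothetical satellite ID
        for hour in range(8)           # 8 hours of downloads
        for sample in range(100)]      # 100 samples per hour
assert len(rows) == 800

# Invert: one row per satellite per hour, with the 100 samples
# stored "across" as repeating-group columns.
pivoted = {}
for sat, hour, sample in rows:
    pivoted.setdefault((sat, hour), []).append(sample)

print(len(pivoted))                    # 800 rows collapse to 8
print(len(pivoted[("SAT-1", 0)]))      # 100 sample columns per row
```

The dictionary keys play the role of the shrunken primary key; the lists are the repeating groups that break normalization.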


Understanding the Storage Hierarchy and RAID

One of the most important things a DBA can learn about is the storage hierarchy and its associated tradeoffs. This, more than anything, helps explain a lot about physical design and performance tuning. One of the key ideas behind the storage hierarchy is the electromechanical disadvantage—anything that has a motor or moving parts is inherently slower than something that is solely electronic.

Figure 3.1 shows a modern interpretation of the storage hierarchy. Clearly, memory is faster than disk. As you go up the pyramid, speed increases (access times shrink) and cost per unit increases, but capacity decreases.

FIG. 3.1
The storage hierarchy.

The more we want to store, the more we have to pay, especially if we want it fast. However, we can make some tradeoffs. With a historical system, we can put the older data nearline on optical disks, or at least on slower disks, and even older data on tape, either nearline or offline. If your particular storage pyramid has everything robotically nearline, down through tape, and you have software to access it on demand, that's a Hierarchical Storage Management (HSM) system. There are vendors who specialize in this sort of hardware and software. Historical systems can afford to gravitate toward the bottom of the pyramid, because speed is not of the essence relative to the currency of the data. Real-time systems, however, such as aircraft navigation, should place most of their storage toward the faster elements at the top. A reasonable middle ground for many businesses is to place important, current data on RAID or fast disk, and everything else on slower disk.


Understanding RAID

RAID, or Redundant Array of Inexpensive Disks, is perhaps a misnomer. Since its inception, the disks that make up RAID have never really cost any less than regular SCSI (Small Computer Systems Interface) disks. In addition, RAID requires special hardware, software, or both to work, at added cost. So the I in RAID is not quite correct.

RAID is a set of disks that can work in parallel, reducing I/O time by up to a factor of the number of disks in the set, an important property for databases. RAID works in parallel through a technique known as striping, as opposed to ordinary disk storage, which works serially. Striping simply means writing a single file in stripes, or chunks of the file, across multiple disks in parallel. The stripe size is some multiple of the physical data block size, the smallest unit of disk I/O. Typically, a data block is 512 bytes; this is true for most UNIX systems, VMS, and DOS/NT.
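A toy simulation makes the round-robin nature of striping concrete. Real striping happens in the controller or volume manager, of course; here bytearrays stand in for disks, and the 512-byte stripe size echoes the block size mentioned above.

```python
def stripe(data: bytes, ndisks: int, stripe_size: int = 512):
    """Cut the byte stream into stripe_size chunks and deal them
    round-robin across ndisks simulated disks."""
    disks = [bytearray() for _ in range(ndisks)]
    for i in range(0, len(data), stripe_size):
        disks[(i // stripe_size) % ndisks].extend(data[i:i + stripe_size])
    return disks

# A 4 KB file across 4 disks: each disk holds two 512-byte stripes,
# and all four disks could be read or written concurrently.
disks = stripe(b"x" * 4096, ndisks=4, stripe_size=512)
print([len(d) for d in disks])  # [1024, 1024, 1024, 1024]
```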

RAID also offers real-time disk failure recovery, another important thing for databases. RAID can offer this level of availability through parity information, which is also written out to a disk or disks. Parity, checksums, and error correction are handled through certain mathematical formulae and algorithms that can reconstruct the missing data on a lost disk. And RAID can offer either small or large disks in a set, which can help different database application types. For example, DSS often performs better with larger disks, while OLTP does better with smaller ones. RAID comes in many flavors, known as levels.

The RAID levels most used in industry today are 0, 1, 3, and 5. All other levels are either unused or variations of these four. Vendors offer RAID levels numbered 7 and higher, but these are rarely seen in practice. RAID 0 is basic striping with no parity: you get the performance advantage of striping but no redundancy. This is good when pure speed is the goal and availability is less important.

RAID 1 is known as mirroring, or sometimes duplexing, again with no parity. In mirroring, you have an even number of disks, half of which contain the real, or primary, data and half of which contain copies of that data, disk for disk. Both disks of a pair are written to at the same time, which can mean a performance loss on writes. Reads, on the other hand, can actually be sped up if the software is designed to read from the multiple disks and merge the streams.
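In miniature, mirroring looks like the following sketch, with two bytearrays standing in for a primary/mirror disk pair (the function name and data are invented for illustration):

```python
# Two simulated disks: one primary, one mirror.
primary, mirror = bytearray(), bytearray()

def mirrored_write(chunk: bytes):
    """Every write goes to both disks, which is where the write
    penalty comes from; reads could alternate between the two."""
    primary.extend(chunk)
    mirror.extend(chunk)

mirrored_write(b"abc")
print(bytes(primary) == bytes(mirror))  # True: the copies agree
```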

RAID 3 is striping again, except now with a single, dedicated parity disk. You can afford to lose one data disk and still recover its contents using the parity disk. However, the parity disk itself is a single point of failure; if it is lost, the result could be fatal, depending on how the software or hardware is written to handle this event.

RAID 5 is also striping with parity, except that rather than using a single, dedicated parity disk, it stripes the parity along with the data across all the disks. The parity information is as well protected as the data, and there is no single point of failure. RAID 5, like RAID 3, can tolerate and recover from the loss of a single disk. Neither level, however, can lose two disks and recover; too much parity information is lost.
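The parity mathematics behind RAID 3 and 5 is, at its simplest, XOR: the parity stripe is the XOR of all the data stripes, and XORing the survivors with the parity rebuilds any one lost stripe. The tiny byte strings below are arbitrary sample data.

```python
def parity(stripes):
    """XOR a list of equal-length stripes together byte by byte."""
    out = bytearray(len(stripes[0]))
    for s in stripes:
        for i, b in enumerate(s):
            out[i] ^= b
    return bytes(out)

data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]  # three data "disks"
p = parity(data)

# Lose disk 1; XOR the surviving data stripes with the parity
# stripe to reconstruct the missing one.
rebuilt = parity([data[0], data[2], p])
print(rebuilt == data[1])  # True
```

Losing two stripes leaves the XOR equation with two unknowns, which is exactly why neither RAID 3 nor RAID 5 can survive a double disk failure.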
