various ways, just as a network can. Under heavy load (many concurrent users and/or huge amounts of data being moved), an OS's CPU, I/O, and/or memory might be saturated.
Another way of putting this is that when a component is saturated, the system is bottlenecked on that component. Hence, if the CPU of a UNIX machine is saturated, such as with a utilization of >= 85%, the UNIX system is said to be bottlenecked on the CPU, or simply CPU bound.
Lastly, an OS might just need some re-tuning. If, for example, a DSS requires only very large files to be stored on disks, a system administrator (SA) can tune the UNIX file system so that it optimally stores very large files in contiguous chunks.
As you'll see in Chapters 31 to 33 and Appendixes A to C, tuning a database system is heavily intertwined with tuning the OS. It is a good thing to be both an SA and a DBA, because you don't have to ask permission to do this or that. You just do it. However, in practice, DBAs often have to fight battles with SAs and management over resource allocation and tuning issues. My recommendation is to gather quantifiable proof through Oracle diagnostic tools, discussed later in this chapter (see the section "Using Oracle Diagnostic Tools") and in Chapters 31 to 33, and use those figures to justify your resource needs.
Now is a good time to enumerate the major performance tuning principles you will find in almost any performance tuning reference.
What are the major performance tuning principles? They are as follows:
1. Divide and conquer.
2. Preallocate, prefetch, and precompile.
3. Triage.
4. Bulk, block, and batch.
If these look familiar, they should. They are the physical design principles we discussed in Chapter 3, with a slight twist on numbers 2 and 3. Let's look at a few scenarios regarding how these principles apply to performance tuning per se, and not just physical database design.
Divide and conquer is the principle. In essence, you want to use parallel techniques as a remedy for a bottleneck.
Follow along with this scenario: If performance monitoring shows that the database system has a heavy concentration of physical reads and writes to one particular disk, that disk is a bottleneck. The physical and logical structures of Oracle, for example, need to be separated to allow parallel access. As is often the case, what is happening here is that an application has several users or processes trying to access the same tables or tablespaces on that disk.
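How would you spot such a disk? One illustrative way, sketched here against Oracle's dynamic performance views (the query is an example only; the formal diagnostic tools are covered later in this chapter), is to rank data files by physical reads and writes:

    SELECT d.name      AS datafile_name,
           f.phyrds    AS physical_reads,
           f.phywrts   AS physical_writes
      FROM v$filestat f, v$datafile d
     WHERE f.file# = d.file#
     ORDER BY (f.phyrds + f.phywrts) DESC;
    -- Data files at the top of the listing are the busiest; if several of them
    -- sit on the same physical disk, that disk is a candidate bottleneck.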
Frequently, tables that must be joined together exist on the same disk. This makes for poor performance.
The OS must access the disk on Oracle's behalf for some blocks of the table's data, perform this cycle again for the second table, and repeat the whole process for all the blocks of both tables until Oracle has joined the two. What happens at the hardware level is that the read/write head of the disk must read the requested sectors (physical data blocks) of the first table, reposition itself to read the requested sectors of the second table until a match is found, and then repeat. Access time (in milliseconds) of a disk consists of seek time and latency (rotational delay). Seek time is the repositioning of which we spoke, when the read/write head must move inward and outward along the radius of the disk to find the track containing the requested sectors. Seek time dominates the latency, or the time it takes for the disk to spin around to access the sectors.
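To put rough numbers on this (the figures here are purely illustrative, not taken from any particular drive), consider a disk with an average seek time of 9 milliseconds spinning at 7,200 RPM. One revolution takes about 8.3 milliseconds, so the average rotational latency is roughly half of that:

    access time  =  seek time + rotational latency
                 ≈  9 ms + 4.2 ms
                 ≈  13 ms per repositioning

Every extra seek forced by bouncing between co-located tables therefore costs on the order of 13 milliseconds, and over thousands of block reads in a join, those milliseconds add up quickly.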
The important thing is that joining two or more tables on the same disk requires a high proportion of seeks to reads. As a DBA, your goal is to reduce the number of seeks. You want to minimize contention and get rid of this bottleneck. To do this, you separate the tables that are being joined onto different disks. This same methodology holds true for tables that coexist on the same disk and can be simultaneously accessed even though they aren't being joined together. The result is the same: bottleneck. The solution is the same: separate them.
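As a minimal sketch of what "separate them" means in Oracle terms (the tablespace names, table definitions, mount points, and sizes below are hypothetical), you place each member of a frequently joined pair in a tablespace whose data file lives on a different physical disk:

    -- /u02 and /u03 are assumed to be mount points on separate physical disks
    CREATE TABLESPACE orders_ts
        DATAFILE '/u02/oradata/prod/orders01.dbf' SIZE 500M;

    CREATE TABLESPACE customers_ts
        DATAFILE '/u03/oradata/prod/customers01.dbf' SIZE 500M;

    -- Each table of the frequently joined pair gets its own tablespace,
    -- and therefore its own disk
    CREATE TABLE orders (
        order_id    NUMBER,
        customer_id NUMBER,
        order_date  DATE
    ) TABLESPACE orders_ts;

    CREATE TABLE customers (
        customer_id NUMBER,
        cust_name   VARCHAR2(80)
    ) TABLESPACE customers_ts;

With the data files on separate spindles, the read/write heads no longer fight each other over the same tracks.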
By the way, RAID is just as susceptible to this scenario as a standard disk. At first, it might not seem possible that RAID, especially levels 1, 3, and 5 (which we discussed in Chapter 3), could suffer this same type of problem. Clearly, striping a single table across several disks (a RAID volume) will increase performance. However, if this table is stored on the same RAID volume with tables that must be joined with it, or at least are accessed concurrently with it, we have the same problem: bottleneck. Again, the solution is the same: separate them.
The reason you still have the bottleneck is not so clear at first glance, but think about it. Chunks of all the tables (stored in the data files of their respective tablespaces) are striped across all the disks of that RAID volume. Although no single table exists in its entirety on any one disk, chunks of all tables coexist on all of the disks. Hence, we have the same problem on a smaller scale. Rather than separating joined or otherwise simultaneously accessed tables onto different disks as with standard disk setups, we separate the tables onto different RAID volumes.
Preallocate, prefetch, and precompile is the principle. Do work ahead of time whenever possible.
Follow along with this scenario: Suppose you have a database system that is particularly volatile. It grows and shrinks considerably and frequently. In fact, as an application type, it can be called a VCDB. When you access the database during the non-growth period, performance is reasonable. However, during its growth phase, performance is very poor, especially early on.
What is likely happening here is that dynamic growth is slowing down the online system. In Oracle terms, this is known as dynamic extension. Rows are being inserted or updated such that Oracle must allocate the next extent for a table, sized according to the storage clause specified when the table was created or altered, or according to the tablespace default. In either case, the table had enough space at creation time up until this most recent growth period and then had to extend to accommodate it. Of course, simply having a table extend in Oracle is not the end of the world. Far from it. It is, in fact, a normal course of events.
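To confirm that dynamic extension is the culprit, you can watch how many extents the volatile segments accumulate over time. A minimal sketch (the tablespace name is hypothetical):

    SELECT segment_name, segment_type, extents, next_extent
      FROM dba_segments
     WHERE tablespace_name = 'APP_DATA'    -- hypothetical tablespace
     ORDER BY extents DESC;

A table whose extent count climbs sharply during each growth phase is extending dynamically, right in the middle of your online workload.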
However, when a table is frequently extending, especially for large amounts of data, and causing online access to suffer, it's a major problem. Concurrent users shouldn't have to suffer. For database systems of this type that require frequent or very large data extents, preallocation is best. You want to create your table with sufficient storage to begin with; that is, to handle the maximum expected peak size. In the storage clause of your table (or tablespace) create statement, set your INITIAL extent size and your NEXT extent size, and then set MINEXTENTS so that INITIAL + ((MINEXTENTS - 1) x NEXT) equals the maximum expected peak size of the table.
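As a hedged illustration (the table name, tablespace, and sizes are made up for the example), suppose the table is expected to peak at roughly 500MB. With an INITIAL of 100M and a NEXT of 50M, setting MINEXTENTS to 9 preallocates 100M + ((9 - 1) x 50M) = 500M at creation time, before any growth phase hits:

    CREATE TABLE sales_history (
        sale_id   NUMBER,
        sale_date DATE,
        amount    NUMBER
    )
    TABLESPACE app_data                 -- hypothetical tablespace
    STORAGE (INITIAL     100M
             NEXT        50M
             MINEXTENTS  9              -- 100M + (8 x 50M) = 500M preallocated
             PCTINCREASE 0);            -- keeps every NEXT extent a uniform 50M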
Triage is the principle. Attack the most important problems first to get the greatest return on investment.
Follow along with this scenario: You have a slow batch system. When you do production runs, the system crawls. Upon reviewing the SQL code that makes up the application, you find several very inefficient programs that could be rewritten. To do so might take about two months. You then analyze performance monitoring statistics and find high contention for rollback segments, apparently because you are running several concurrent programs at once to help speed up the runs. You relieve the rollback contention by adding more segments, but your elapsed times barely improve. What to do?
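(For the record, the rollback segment fix you already applied would look something like the following sketch; the segment name, tablespace, and storage sizes are hypothetical.)

    CREATE ROLLBACK SEGMENT rbs05
        TABLESPACE rbs                  -- hypothetical rollback tablespace
        STORAGE (INITIAL 1M NEXT 1M MINEXTENTS 20 OPTIMAL 20M);

    ALTER ROLLBACK SEGMENT rbs05 ONLINE;

Each new private segment must typically also be named in the ROLLBACK_SEGMENTS initialization parameter so that it comes online automatically at instance startup. Yet, as your elapsed times showed, this bought you very little here.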
No matter how well you tune the back end, the front end (the application code) dominates the total elapsed time of a client/server system, networking and other issues aside. You must tune the application first and then tune the database. The investment of two months will be well worth it when the production runs finish faster than ever.
If this can't be done for some reason, other database efforts such as logical or physical restructuring can serve as a temporary measure. But ideally this shouldn't be done at all: you'd be changing a good database design to compensate for a poor application design, and you wouldn't gain much with that approach.
Bulk, block, and batch is the principle. When appropriate, group things together that are processed together and don't compete with one another for resources.
Follow along with this scenario: Your DSS provides real-time, read-only information to external analysts via the Internet. You have a Web page as a front end to your database system. Concurrency is sometimes medium high (up to 100 users) but does not seem to cause problems. What you have had complaints about is the throughput of selects against large tables.