Previous | Table of Contents | Next

Page 766

breaking them apart into many sequential subqueries. Also, consider abandoning the views or possibly materializing them.

Finally, don't forget to index appropriately (see previous steps) and don't index for the sake of indexing. In my experience, I have seen databases with tables having indexes on nothing, and I have seen databases with tables having indexes on everything. Indexes on nothing are typically never a good thing because you should always have a primary key index. Indexing on everything is typically not beneficial either because quite often this is symptomatic of either a poor requirements analysis, poor database design, or poor application design.

Now that the major indexing guidelines have been covered, let's go over some of the more typical problems found in application tuning.

Dealing with Typical Problems in Application Tuning

Almost all the typical problems in application tuning deal with some form of inefficient SQL coding or inappropriate use of indexes. This section discusses several of these problems.

One problem that can appear is known as the index coverage anomaly. An interesting thing about index coverage can occur. If, as it turns out, the number of rows returned by a query is some large number, as might be with a low selectivity index, then a full-table scan would be more efficient. However, if ANALYZE has not been run, the optimizer resorts to rule-based optimization and chooses the index access path over the full-table scan. Unless the DBA keeps statistics up to date, the optimizer may choose poorer access paths through no fault of its own.

Another problem that can occur is known as the unnecessary multipass or multiloop. Look at the following queries:

SQL> SELECT A FROM T WHERE A > 9;

SQL> SELECT B/A FROM T WHERE B < 3 AND A >9;

Clearly, this could done in one pass through table T:

SQL> SELECT A , B/A FROM T WHERE B < 3 AND A >9;

Using unnecessary functions or NULL comparisons is another possible difficulty. As already discussed regarding the proper use of indexes, the optimizer will not use the index in these cases, so don't do something like the following:

SQL> SELECT A FROM T WHERE X < MAX(X);

As covered earlier, you want to use literals and bound this range search to something like:

SQL> SELECT A FROM T WHERE X >= 0 AND X <999999;

Also, if possible, convert NULLs-allowed columns to a default basis:

Use

SQL> SELECT A FROM T WHERE X > 0;

Page 767

instead of

SQL> SELECT A FROM T WHERE X != 0;

or

SQL> SELECT A FROM T WHERE X IS NOT NULL;

because neither of the two latter statements use an index on the X column, if that was intended.

Don't forget to index foreign keys. This causes the child table to lock the parent table when the child table is being inserted, updated, or deleted, and vice versa. If you do index all your foreign keys, an update to the child does not lock the parent and an update to the parent only acquires a row-level lock as necessary.

Don't select information from a table when it could be derived some other way. One cardinal application-programming rule in dealing with databases is: Trading off computation for storage is almost always a good thing in a database application.

This is true because most database applications tend to be I/O bound. Compression is an example of such a benefit. As another example, suppose we are computing sales taxes based on a single, flat tax rate (or even a few levels of tax rates). The taxes table will likely be a mostly static lookup table, the rate being usually constant within a given tax period (for example, one year). The following shows how to use a function rather than access a stored, derived column unnecessarily:

Don't use:

SQL> SELECT (P.COST + (P.COST * T.RATE)) "COST WITHTAX"
       2> FROM PRICES P, TAXES T
       3>  WHERE P.PART_ID = T.PART_ID

Instead, use something like:

SQL> SELECT (COST * 1.06) "COST WITHTAX"
       2> FROM PRICES;

This says something about the original design in that you might not have needed the TAXES table. However, putting that issue aside, you certainly don't need to access the tax rate column (RATE) to be able to apply that multiplier. This is especially true because you are relying on a constant (a simple hard-coded, literal replacement). This only works well if the developers have sufficient domain (business) knowledge and if the amount of knowledge is relatively small and well known by all developers. In any case, this example shows how a simple function can replace a table access. Please be aware, however, that this technique of hard-coding is considered poor programming design, but that our aim is better application performance. Often, these two are at cross purposes, and this last example is no exception. You must decide which of all your goals is most important for you to achieve.

Don't trigger unnecessary full-table scans and cause the cumulative percentage problem. Wh y scan a table when you don't need to? Easier said than done. You must always be on the lookout for queries that might launch a full-table scan—in other words, queries that don't use an index

Page 768

as intended, or somehow cause a full-table scan in another way. Also, don't forget about the 5 percent rule. If the optimizer expects more than 5 percent (version >=7.x) to be returned from a query, it resorts to a full-table scan. This can occur when multiple WHERE conditions combine to sum up past that amount with something like:

SQL> SELECT A FROM T WHERE X = 1 AND Y = 2 AND Z =3;

If statistics show X = 1 having 3 percent, Y = 2 having 2 percent, and Z = 3 having 2 percent, the optimizer will see the cumulative percentage being 7 percent and opt for the full-table scan. If desired, you can break these two queries apart, especially for large tables, and get better performance with the same results:

SQL> SELECT A FROM T WHERE X = 1;
SQL> SELECT A FROM T WHERE Y = 2 AND Z = 3;

The optimizer uses the indexes on X, Y, and Z because the first query shows only 3 percent and the second only 4 percent.

Use DISTINCT, ORDER BY, and UNION carefully. These operations cause the creation of temporary tables and add the additional overhead of sorting on-the-fly. As a DBA, you can tolerate few of these concurrently, unless you have infinite resources. Think of many of these occurring at once as having the same performance drain incurred from issuing many CREATE INDEX statements at once. If you must use DISTINCT, use it infrequently and store the output for future use. If you must use ORDER BY, try to have an index on that column. If you must use UNION, use UNION ALL (which does not eliminate duplicates). This latter solution works in most cases, as long as you have all your primary keys defined and enabled.

NOTE
In any application tuning, consider translated SQL statements, because the application front end may be written in other SQL. ODBC driver translations do not always produce an optimized query for many reasons. Just watch for such occurrences and set the ODBC trace on so you see the translated SQL that may be preventing the use of an index.n

Rewriting Queries

Well, you've already rewritten numerous queries in previous sections. Now, let's concentrate on rewriting queries with a few examples of how you can really tune SQL to high efficiency. Aside from the many techniques discussed already, two of the best are:

The field of rewriting queries is highly mathematical, yet eminently practical. When we speak of rewriting queries, we mean rewriting them to speed them up, rather than to fix them. The latter is more along the lines of getting it right the first time and maintenance programming. What we're interested in here is raw speed, often at the expense of readability and maintain-ability. Hence, some organizational choices are to be made before venturing into this area.

Previous | Table of Contents | Next