Chapter 22
Understanding the Query Optimizer



by Orryn Sledge

The key to extracting maximum query performance is to understand SQL Server's optimizer. If you understand the optimizer, you can write queries that run faster, build better indexes, and resolve performance problems.


NOTE: SQL Server uses an intelligent cost-based optimizer. Don't be misled by the word intelligent. I have yet to meet a query optimizer that is more intelligent than a good DBA! Although SQL Server has an excellent query optimizer, there is no way it can ever understand all the nuances and intricacies of your data. Therefore, do not put blind faith in the query optimizer. Instead, try to understand how the optimizer works and how you can finesse it into delivering better performance. SQL Server's optimizer has been significantly improved in version 6.x. Following is a summary of notable enhancements:


Why SQL Server Uses a Query Optimizer

SQL Server uses a cost-based query optimizer to generate the optimal execution path for an INSERT, UPDATE, DELETE, or SELECT SQL statement. The "optimal" execution path is the path that offers the best performance. Before the query is run, the optimizer assigns a cost based on CPU and disk I/O usage for different execution paths. The optimizer then uses the least expensive execution path to process the query. See Figure 22.1 for examples of execution paths.

Figure 22.1.
Examples of execution paths.


The advantage of the query optimizer is that it relieves users from the tedious process of having to decide how their SQL statements should be constructed to use indexes and in what order the data should be accessed. The query optimizer allows users to build SQL statements that automatically take advantage of indexes and automatically determine the optimal order to process table joins.

Update Statistics

Whenever you create an index, SQL Server creates a set of statistics about the data contained within the index. The query optimizer uses these statistics to determine whether it should use the index to help process the query. Over time, you will find that your statistics will become less representative of your data in tables that are frequently modified. In turn, this will cause the optimizer to ignore useful indexes.

To keep statistics up to date, run the UPDATE STATISTICS command whenever a large percentage of the table's index keys have changed. Operations such as BCP and batch inserts, deletes, and updates can cause an index's statistics to become outdated.

UPDATE STATISTICS [[database.]owner.]table_name [index_name]

If you do not include the [index_name] parameter, all indexes attached to the table are automatically updated.


TIP: In a transaction-oriented environment, it can be advantageous to automate UPDATE STATISTICS (see Chapter 26, "Automating Database Administration Tasks," for more information on automating UPDATE STATISTICS). Doing so can help keep your index statistics current. Use the Task Scheduler, included with the Enterprise Manager, to schedule UPDATE STATISTICS. To determine when an index's statistics were last updated, use STATS_DATE() or DBCC SHOW_STATISTICS, as in the following example:

STATS_DATE (table_id, index_id)
DBCC SHOW_STATISTICS (
table_name, index_name)

Basic Query Optimization Suggestions

The following list of suggestions concentrates on the basics of query optimization. I always recommend starting with the basics when trying to improve query performance. Quite often, a minor modification to a query yields a substantial gain in performance.

Tools to Help Optimize a Query

The following optimizer tools can be used to help optimize a query:

The Showplan Tool

A showplan provides insight about how SQL Server is going to process a SQL statement. The showplan can be one of the most confusing aspects of SQL Server. Its output is cryptic and based on technical jargon. Yet, if you know how to interpret its cryptic output, it can be useful for tuning queries.

To generate a showplan, click the Query Options button within the Query dialog box from the Enterprise Manager. Select the Show Query Plan option from the Query Flags page of the Query Options dialog box (see Figure 22.2).

Figure 22.2.
Setting the Show Query Plan option.


After you execute your query, the output from the showplan appears in the Results window (see Figure 22.3).


NOTE: Gone, but not forgotten... For some strange reason, Microsoft removed the graphical showplan tool from SQL Server 6.5. I always thought the tool was pretty useful, but I guess the folks at Microsoft didn't think anyone was using it!

Figure 22.3.
A showplan.

The Statistics I/O Tool

Statistics I/O is useful in determining the amount of I/O that will occur to process a query. The less I/O you have, the faster your query will run. When tuning queries, try to minimize the amount of I/O used by the query. SQL Server uses I/O statistics to help determine the optimal query execution path.

When you generate statistics I/O, you see three types of I/O measurements: scan count, logical reads, and physical reads. The following list explains the three types of I/O measurements:


NOTE: Each time a query is run, the data used to process the query may become loaded into the data cache. This can reduce the number of physical reads required to process the query when it is run again. You can detect whether the data is loaded into the data cache by monitoring the logical reads and physical reads for a query. If physical reads is less than logical reads, some or all of the data was in the data cache.

SQL Server provides two facilities to generate I/O statistics: graphical and text-based.

To generate graphical statistics I/O, click the Display Statistics I/O button in the Enterprise Manager Query dialog box (see Figure 22.4).

Figure 22.4.
Generating graphical statistics I/O.


After you execute your query, click the Statistics I/O tab to view I/O statistics output (see Figure 22.5).

Figure 22.5.
Graphical statistics I/O.


To generate text-based statistics I/O, click the Query Options button in the Enterprise Manager Query dialog box. Select the Show Stats I/O option from the Query Flags page of the Query Options dialog box (see Figure 22.6).

Figure 22.6.
Setting the Show Stats I/O query option.


After you execute your query, the output from the statistics I/O appears in the results window (see Figure 22.7).

Figure 22.7.
Text-based statistics I/O.

The No Execute Tool

No Execute is an excellent tool for optimizing long-running queries. By using this option, you can determine the showplan for a query without having to actually run the query. When you use this option, the syntax of the query is validated, a showplan can be generated, and any error messages are returned.

To use the No Execute option, click the Query Options button in the Enterprise Manager Query dialog box. Select the No Execute option from the Query Flags page of the Query Option dialog box (see Figure 22.8).

Figure 22.8.
Setting the No Execute query option.

The Stats Time Tool

The Stats Time tool displays the time required by SQL Server to parse, compile, and execute a query.

To use the Stats Time option, click the Query Options button in the Enterprise Manager Query dialog box. Select the Show Stats Time option from the Query Flags page of the Query Option dialog box (see Figure 22.9).

Figure 22.9.
Setting the Show Stats Time query option.

Reading the Showplan

The showplan provides insight about how SQL Server is going to process a SQL statement. If you know what to look for in a showplan, the information it provides can be useful for tuning queries.


NOTE: Those new to SQL Server should not be dismayed by the jargon used in the showplan. For example, the showplan uses words such as SCALAR AGGREGATE. This is just a fancy way of saying that the query contains an aggregate function, such as AVG(), COUNT(), MAX(), MIN(), or SUM(). After you get past the lingo used by the showplan, you will find it a useful tool for optimizing queries.

A large amount of irrelevant information can be generated by a showplan. When you read a showplan, it is important to know what to look for. Use Table 22.1 to help weed out the irrelevant information.

Items in Table 22.1 that are considered irrelevant are those you cannot control. For example, if you use an aggregate function, such as COUNT(), the words SCALAR AGGREGATE or VECTOR AGGREGATE always appear in the showplan. Therefore, I consider SCALAR AGGREGATE and VECTOR AGGREGATE to be irrelevant to query tuning because no matter how you modify your indexes or revamp your query, the showplan includes these words.
Table 22.1. Relevant showplan output.
Text-Based Showplan Output Relevant? New to 6.5?
CONSTRAINT: nested iteration Yes Yes
EXISTS TABLE : nested iteration No
FROM TABLE Yes
FULL OUTER JOIN: nested iteration No
GROUP BY WITH CUBE No Yes
GROUP BY WITH ROLLUP No Yes
GROUP BY No
Index : <index name> Yes
LEFT OUTER JOIN: nested iteration No Yes
Nested iteration No
SCALAR AGGREGATE No
STEP n Yes
Table Scan Yes
The type of query is DELETE No
The type of query is INSERT No
The type of query is SELECT (into a worktable) Yes
The type of query is SELECT No
The type of query is UPDATE No
The update mode is deferred Yes
The update mode is direct Yes
This step involves sorting Yes
TO TABLE No
UNION ALL No Yes
Using Clustered Index Yes
Using Dynamic Index Yes
Using GETSORTED Yes
VECTOR AGGREGATE No
Worktable No
Worktable created for DISTINCT No
Worktable created for ORDER BY Yes
Worktable created for REFORMATTING Yes
Worktable created for SELECT_INTO No


The following sections explain showplan output.

CONSTRAINT: nested iteration

What it means: The query modified data in a table that contains a constraint.

Tip: Use this output to view constraints that are being executed behind the scenes. For example, if you use constraints to enforce declarative referential integrity (DRI), you will see this output when you use INSERT to insert a new record into a table that contains a constraint. If your query is slow to execute and you see this output, you may want to review the indexes used to support the DRI.

Relevant: Yes

EXISTS TABLE : nested iteration

What it means: The query contains an EXISTS, IN, or ANY clause.

Tip: Ignore this output.

Relevant: No

FROM TABLE

What it means: The source of the data for the query.

Tips: Use this output to determine the order in which the optimizer is joining the tables. On complex table joins (usually more than four tables), you can sometimes improve performance by rearranging the order of the tables in the FROM clause and the WHERE clause. To force the optimizer to follow the table order in the FROM clause, you must use the SET FORCEPLAN command.

Relevant: Yes

FULL OUTER JOIN: nested iteration

What it means: The query contains a FULL JOIN clause.

Tip: Ignore this output.

Relevant: No

GROUP BY WITH CUBE

What it means: The query contains a GROUP BY and a WITH CUBE clause.

Tip: Ignore this output.

Relevant: No

GROUP BY WITH ROLLUP

What it means: The query contains a GROUP BY and a ROLLUP clause.

Tip: Ignore this output.

Relevant: No

GROUP BY

What it means: The query contains a GROUP BY clause.

Tip: Ignore this output.

Relevant: No

Index: <index name>

What it means: The optimizer found a useful nonclustered index to retrieve the rows.

Tips: Generally speaking, a query that uses an index runs faster than a query that does not use an index. An exception to this rule is a table with a small number of rows. In this scenario, performing a table scan may be faster than using an index; however, the optimal access plan always depends on the number of rows and columns in the table being accessed.

Relevant: Yes


A Discussion about Compound Indexes
A compound index is an index made up of more than one column. The rules regarding compound index optimization sometimes cause confusion. The source of the confusion stems from when SQL Server takes advantage of the index and when it cannot use the index. Following is the structure of a table that will be used for this discussion:
Table:

CREATE TABLE table1
(col1 int not null,
col2 int not null,
col3 int not null,
description char(50) null)
Primary Key:
col1 + col2 + col3
Index:
CREATE UNIQUE CLUSTERED INDEX table1_idx ON table1(col1,col2,col3)
Number of rows:
1000

When working with a large table, the optimizer takes advantage of the compound index when one of the following is true:

For example, the following queries can take advantage of the compound index:

SELECT *
FROM table1
WHERE col1 = 100
and col2 = 250
and col3 = 179
SELECT *
FROM table1
WHERE col1 = 100
and col2 = 250
SELECT *
FROM table1
WHERE col1 = 100
SELECT *
FROM table1
WHERE col1 = 100
and col3 = 250

The following queries cannot take advantage of the compound index:

SELECT *
FROM table1
WHERE col2 = 100
and col3 = 250
SELECT *
FROM table1
WHERE col2 = 100
SELECT *
FROM table1
WHERE col3 = 100


LEFT OUTER JOIN: nested iteration

What it means: The query contains a LEFT JOIN clause.

Tip: Ignore this output.

Relevant: No

Nested Iteration

What it means: The default approach for queries with WHERE criteria or table joins.

Tip: Ignore this output.

Relevant: No

SCALAR AGGREGATE

What it means: The query contains an aggregate function--AVG(), COUNT(), MAX(), MIN(), or SUM()--and does not contain a GROUP BY clause.

Tip: Ignore this output.

Relevant: No

STEP n

What it means: Specifies the number of steps required to process the query. Every query has at least one step.

Tips: Fewer steps means better performance. The GROUP BY clause always requires at least two steps.

Relevant: Yes

Table Scan

What it means: Each row in the table is processed.

Tips: Look out for this plan on large tables. It may slow down your query because each row in the table is processed, which can lead to a lot of I/O. To avoid a table scan, try to build a useful index that matches the WHERE clause.

On small tables, the optimizer may choose to ignore an index and perform a table scan. For small tables, a table scan may process faster than using an index to retrieve the data. On very large tables, you want to avoid table scans.

Relevant: Yes


A Discussion about Table Scan
The table scan is dreaded when you are working with large tables in an OLTP (online transaction processing) environment. It can lead to poor performance and result in table blocking. The following example shows the difference in showplans for a retrieval based on a table scan and a retrieval that can use an index. The first listing shows the showplan for a table without an index:

Table:
CREATE TABLE sales
(sales_id int not null,
descr char(50) null)
Primary Key: sales_id
Indexes: None
Row Count: 1,000,000
Query:
SELECT * FROM sales
WHERE sales_id = 450
Showplan:
STEP 1
The type of query is SELECT
FROM TABLE
sales
Nested iteration
Table Scan

Now consider the inefficiencies involved with a table scan. The user wants only one row returned from the table, but the server had to process every row in the table (see Figure 22.10).


Figure 22.10.
A table scan on a 1,000,000-row table.


To prevent the table scan in this example, create a clustered index on the column sales_id. By creating the index, the optimizer can generate a showplan that directly accesses the data without having to look at each row of data (see Figure 22.11). This will significantly improve performance.

Figure 22.11.
Using a clustered index to find data on a 1,000,000-row table.


Index:
CREATE UNIQUE CLUSTERED INDEX sales_idx ON sales(sales_id)
Showplan:
The type of query is SELECT
FROM TABLE
sales
Nested iteration
Using Clustered Index

The type of query is SELECT

What it means: The query contains a SELECT clause.

Tip: Ignore this output.

Relevant: No

The type of query is INSERT

What it means: The query contains an INSERT clause; alternatively, a worktable must be used to process the query.

Tip: Ignore this output.

Relevant: No

The type of query is UPDATE

What it means: The query contains an UPDATE clause; alternatively, a worktable must be used to process the query.

Tip: Ignore this output.

Relevant: No

The type of query is DELETE

What it means: The query contains a DELETE clause; alternatively, a worktable must be used to process the query.

Tip: Ignore this output.

Relevant: No

The type of query is SELECT (into a worktable)

What it means: The optimizer decided that a temporary worktable should be built to efficiently process the query. A worktable is always created when a GROUP BY clause is used; a worktable is sometimes generated when an ORDER BY clause is used.

Tips: A worktable is an actual table created in the tempdb database. Worktables can degrade performance because they involve additional disk I/O. When the process is complete, the worktable is automatically deleted. A worktable is unavoidable when using a GROUP BY clause.

Relevant: Yes

The update mode is deferred

What it means: Two passes are required to update the data. The first pass generates a log of the changes and the second pass applies the changes. UPDATE, DELETE, and INSERT statements can generate this plan.

Tip: Update deferred is slower than update direct (see the following sidebar for more information).

Relevant: Yes


Update mode is deferred versus Update mode is direct
The update mode is often overlooked when people try to tune queries. By tweaking your table definition, indexes, and SQL statements, you can improve the performance of your UPDATE, INSERT, and DELETE statements. It is important to determine the type of update mode being used because a deferred update is always slower than a direct update. When a deferred update is used, SQL Server takes two passes to update the data. The first pass generates a log of the changes and the second pass applies the changes. A direct update does not generate a log; instead, it directly applies the changes. Following is a list of requirements for SQL Server to run a direct update:

The following rules apply to single-row updates:

The following rules apply to multiple-row updates:


The update mode is direct

What it means: The data can be directly updated. UPDATE, DELETE, INSERT, and SELECT INTO statements can generate this plan.

Tip: Update direct is always faster than update deferred (see the preceding sidebar, "Update mode is deferred versus Update mode is direct," for more information).

Relevant: Yes

This step involves sorting

What it means: The query contains a DISTINCT or ORDER BY clause. To process the query, a worktable is created to sort the data.

Tips: This step is unavoidable if the query contains the DISTINCT clause. If the query contains the ORDER BY clause, you may be able to eliminate this step by creating a useful index.

Relevant: Yes

TO TABLE

What it means: The target table for data modifications. UPDATE, DELETE, INSERT, and SELECT INTO statements can generate this plan.

Tip: Ignore this output.

Relevant: No

UNION ALL

What it means: The query references a view that contains a UNION ALL clause.

Tips: Ignore this output.

Relevant: No

Using Clustered Index

What it means: The optimizer decided to use a clustered index to retrieve the record.

Tips: When working with a large table and retrieving a single record, the use of a clustered index is usually the fastest and easiest strategy to implement for data retrieval.

Relevant: Yes

Using Dynamic Index

What it means: The optimizer decided to build a temporary index to help process the query. This strategy is chosen when the query contains an OR clause or an IN clause.

Tips: A dynamic index is usually faster than a table scan on a large table, but slower than using an existing index. You may be able to avoid this step by creating a permanent index. Use the OR and IN clauses judiciously on large tables--more I/O is required to process the query.

Relevant: Yes

Using GETSORTED

What it means: The query contains a DISTINCT or ORDER BY clause. To handle the sort, a worktable will be created.

Tips: This step is unavoidable if the query contains the DISTINCT clause. If the query contains the ORDER BY clause, you may be able to eliminate this step by creating a useful index.

Relevant: Yes

VECTOR AGGREGATE

What it means: The query contains an aggregate function and a GROUP BY clause. Aggregate functions are AVG(), COUNT(), MAX(), MIN(), and SUM().

Tip: Ignore this output.

Relevant: No

Worktable

What it means: The optimizer decided that a worktable must be created to process the query.

Tips: The use of a worktable requires additional overhead, which can decrease performance. A GROUP BY clause always generates a worktable, so don't spend any time trying to get rid of the worktable if your query has a GROUP BY clause.

Relevant: Yes

Worktable created for DISTINCT

What it means: The query contains the DISTINCT clause. A worktable is always used when the query contains the DISTINCT clause.

Tip: Ignore this output.

Relevant: No

Worktable created for ORDER BY

What it means: The query contains an ORDER BY clause. The optimizer could not find a suitable index to handle the sort.

Tip: Evaluate your indexing strategy. An index may help eliminate this step.

Relevant: Yes


A Discussion about Worktable Created for ORDER BY
If the ORDER BY clause is generating a worktable, you can use the following strategies to help eliminate the need for a worktable:


Worktable created for REFORMATTING

What it means: This strategy is used when large tables are joined on columns that do not have useful indexes. The table with the fewest number of rows is inserted into a worktable. Then the worktable is used to join back to the other tables in the query. This reduces the amount of I/O required to process the query.

Tips: This is an easy one to fix! Whenever the optimizer chooses this strategy, look at your indexes. Chances are good that indexes do not exist or that the statistics are out of date. Add indexes to the columns you are joining on or issue an UPDATE STATISTICS command. The optimizer uses this strategy only as a last resort. Try to avoid this strategy.

Relevant: Yes

Worktable created for SELECT_INTO

What it means: The query contains the SELECT..INTO clause. A worktable is always created when SELECT..INTO is used.

Tip: Ignore this output.

Relevant: No

Overriding the Optimizer

Use the following features to override the optimizer:

Index Hints

New to SQL Server 6.x is the capability to override the optimizer. Now you can force the optimizer into using an index or force it to not choose an index.

You usually want to let the optimizer determine how to process the query. However, you may find it beneficial to override the optimizer if you find that it is not taking advantage of useful indexes. Following is the syntax used to override the optimizer:

SELECT ...
FROM [table_name] (optimizer_hint)

In this syntax, optimizer_hint has the following format:

INDEX={index_name | index_id}

In this format, index_name is any valid name of an existing index on the table. index_id is 0 or 1; 0 forces the optimizer to perform a table scan and 1 forces the optimizer to use a clustered index.

To force the optimizer to use a clustered index, use the following command:

select *
from authors (1)
where au_id = `213-46-8915'

To force the optimizer to perform a table scan, use the following command:

select *
from authors (0)
where au_id = `213-46-8915'

To force the optimizer to use the au_fname_idx nonclustered index, use the following command:

select *
from authors (INDEX = au_fname_idx)
where au_fname = `Marjorie'


TIP: Use the capability to override the optimizer with prudence. Only in cases where SQL Server is choosing a less-than-optimal execution plan should the optimizer be overridden.

The SET FORCEPLAN ON Command

The SET FORCEPLAN ON command forces the optimizer to join tables based on the order specified in the FROM clause. Normally, you want to let the optimizer determine the order in which to join tables; however, if you think that the optimizer is selecting an inefficient join order, you can use SET FORCEPLAN ON to force the join order.

When forcing SQL Server to use a predefined join order, you usually want the table with the fewest number of qualifying rows to come first in the FROM clause (or the table with the least amount of I/O, if you are dealing with a very wide or a very narrow table). The table with the second lowest number of qualifying rows should be next in the FROM clause, and so on.

The following example shows how SET FORCEPLAN ON can impact query optimization:

SET FORCEPLAN ON
select *
from titleauthor , authors
where titleauthor.au_id = authors.au_id
SET FORCEPLAN OFF

The following is showplan output with SET FORCEPLAN ON:

STEP 1
The type of query is SELECT
FROM TABLE
titleauthor
Nested iteration
Table Scan
FROM TABLE
authors
JOINS WITH
titleauthor
Nested iteration
Table Scan

Notice that, with SET FORCEPLAN ON, the optimizer processes the titleauthor table before processing the authors table.

Following is the query:

select *
from titleauthor , authors
where titleauthor.au_id = authors.au_id

Here is the showplan output:

STEP 1
The type of query is SELECT
FROM TABLE
authors
Nested iteration
Table Scan
FROM TABLE
titleauthor
JOINS WITH
authors
Nested iteration
Table Scan

Notice that, without using SET FORCEPLAN ON, the optimizer processes the authors table before processing the titleauthor table.


TIP: Whenever you use SET FORCEPLAN ON, be sure that you turn it off by issuing SET FORCEPLAN OFF. The feature remains in effect for your current connection until the connection is broken or until it is explicitly turned off.


CAUTION: Use the SET FORCEPLAN ON option as a last resort. You usually want to let the optimizer determine the order in which to process tables.

Other Tuning Tricks

Whenever you try to optimize a query, you should be on the lookout for obstructions that can lead to poor performance. The following sections discuss common causes of poor query performance.

Are You Trying to Tune an UPDATE, DELETE, or INSERT Query?

If you are trying to tune an UPDATE, DELETE, or INSERT query, does the table have a trigger? The query may be okay, but the trigger may need improvement. An easy way to determine whether the trigger is the bottleneck is to drop the trigger and rerun the query. If query performance improves, you should tune the trigger.

Does the Query Reference a View?

If the query references a view, you should test the view to determine whether it is optimized. An easy way to test whether the view is optimized is to run a showplan on the view.

Are the Datatypes Mismatched?

If you are joining on columns of different datatypes, the optimizer may not be able to use useful indexes. Instead, it may have to choose a table scan to process the query, as in the following example:

Table:
CREATE TABLE table1
(col1 char(10) not null)
Index: CREATE INDEX col1_idx ON table1(col1)
Row Count: 1000
Table:
CREATE TABLE table2
(col1 integer not null)
Index: CREATE INDEX col1_idx ON table2(col1)
Row Count: 1000
Query:
SELECT *
FROM table1, table2
WHERE table1.col1 = convert(char(10),table2.col1)
and table1.col1 = `100'

This query results in a table scan on table2 because you are joining a char(10) column to an integer column with the convert() function. Internally, SQL Server must convert these values to process the query, which results in a table scan. To avoid this problem, maintain consistency within your database design.

Mismatched datatypes can also cause an UPDATE to be deferred instead of being direct.

Does the Query Use a Nonsearch Argument?

Nonsearch arguments force the optimizer to process the query with a table scan. This is because the search value is unknown until runtime.

Following are some common examples of queries that use nonsearch arguments and how to convert them to search arguments that can take advantage of an index:

Table:
CREATE TABLE table1
(col1 int not null)
Index: CREATE UNIQUE CLUSTERED INDEX col1_idx ON table1(col1)
Row Count: 1000 rows

Following is a nonsearch argument query:

select *
from table1
where col1 * 10 = 100

Following is a search argument query:

select *
from table1
where col1 = 100/10

Following is a nonsearch argument query:

select *
from table1
where convert(char(8),col1) = `10'

Following is a search argument query:

select *
from table1
where col1 = convert(int,'10')


TIP: One way to help reduce the use of a nonsearch argument is to keep the table column on the left side of the equation and to keep the search criteria on the right side of the equation.

Does the Query Use More than One Aggregate Function?

If your query has more than one aggregate function in the SELECT clause, it may be forced to perform a table scan regardless of available indexes. The following query uses the two aggregate functions, MIN() and MAX():

Table:
CREATE TABLE table1
(id int not null)
Index: CREATE UNIQUE CLUSTERED INDEX id_idx ON table1(id)
Query:
SELECT MIN(id),MAX(id)
FROM table1

If you find that a table scan is being performed, you can rewrite the query by using a subquery in the SELECT statement, as in the next example. This query avoids a table scan by searching the index for both the MIN and MAX aggregate requests:

SELECT MIN(ID), (SELECT MAX(ID) FROM TABLE1) FROM TABLE1

Are You Trying to Optimize a Stored Procedure?

If you are trying to optimize a stored procedure, you must keep in mind the following rules:


NOTE: A stored procedure automatically recompiles its query plan whenever you drop an index used by a table within a stored procedure.

CREATE PROCEDURE usp_example @search_name char(50) AS
SELECT au_lname
FROM authors
WHERE au_lname like @search_name + `%'
Index: CREATE INDEX au_lname_idx ON authors(au_lname)

Consider what happens when a user executes the query with `B' as a parameter:

EXEC usp_example `B'

If the table contains numerous records that have a last name beginning with the letter B, the optimizer is likely to perform a table scan.

Now consider what happens when a user executes the query with `BREAULT' as a parameter:

EXEC usp_example `BREAULT'

If the table contains only a few records that have a last name equal to BREAULT, the optimizer is likely to use an index.

As you can see, the optimizer chooses different query plans based on the value of the parameter passed into the stored procedure. The problem arises when the first query plan is stored in memory. By default, different users receive the same query plan based on the first execution of the stored procedure. If the first user of the stored procedure passed `B' as a parameter, the query plan would be very inefficient for other users who specify `BREAULT' as a parameter.

To avoid this problem, use the WITH RECOMPILE option when you execute the procedure. This forces the procedure to regenerate its query plan, as in the following:

EXEC ... WITH RECOMPILE

The revised query plan is available only for the current execution of the procedure. Subsequent executions of the procedure without the recompile option revert back to the old query plan.

Additionally, you can use the WITH RECOMPILE option when you create the procedure, as in the following:

CREATE PROCEDURE ... WITH RECOMPILE

This option forces the optimizer to recompile the query plan each time the stored procedure is executed.

Between the Lines

Following are some important notes to remember when working with the query optimizer:

Summary

A good DBA knows how the SQL Server optimizer works. This knowledge enables the DBA to turn an agonizingly slow query into a fast query. Consequently, knowledge of the optimizer can keep the DBA from creating needless indexes that are never used by the system. The next chapter discusses multi-user considerations.


DISCLAIMER


To order books from QUE, call us at 800-716-0044 or 317-361-5400.

For comments or technical support for our books and software, select
Talk to Us.

© 1997, QUE Corporation, an imprint of
Macmillan Publishing USA, a Simon and Schuster Company.