Performance tuning in the client/server world is something of a magical art. So many factors combine to determine how well an application performs that knowing where to focus your time is the most important skill to develop.
The most critical part of optimizing performance is good documentation. Document statistically how the system works or performs before starting any performance tuning. As the tuning cycle begins, monitor and document the effects of every change so that it's easy to determine which changes were positive and which were negative. Never assume that changes made for one application automatically apply to another. Remember, you're ultimately tuning a product that a user is using, not just a database that is being accessed by some unknown client.
As you'll see in the balance of this chapter, you need to first determine the database and transaction log sizes, then you can create the devices that will hold them. In creating the device, you'll be presented with a fundamental choice. Will you create a very large device and then place all databases on the single device, or will you create several smaller devices to support the individual databases?
At first glance, it seems easier to create a single, very large device and place all your databases on it. That may be the easiest approach to implement, but it's not in your best interest in the long run: if the device develops a problem and must be restored, you'll be faced with taking down every production system on it, rather than just the affected database and its respective device.
If you can, create one device per database and transaction log combination. This enables you to work with each database separately and perform any tuning, updates, backup, and restore work that's needed, in many cases without impacting the other systems.
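For example, under SQL Server 6.5 you could dedicate a data device and a log device to a single database. This is only a sketch; the device names, file paths, virtual device numbers, and sizes are hypothetical and should be adjusted to your environment:

DISK INIT
    NAME = 'sales_data',
    PHYSNAME = 'C:\MSSQL\DATA\SALESDAT.DAT',
    VDEVNO = 10,
    SIZE = 25600        -- DISK INIT sizes are in 2K pages; 25,600 pages = 50M
go
DISK INIT
    NAME = 'sales_log',
    PHYSNAME = 'C:\MSSQL\DATA\SALESLOG.DAT',
    VDEVNO = 11,
    SIZE = 5120         -- 10M, within the 10- to 25-percent log guideline
go
CREATE DATABASE sales
    ON sales_data = 50  -- CREATE DATABASE sizes are in megabytes
    LOG ON sales_log = 10
go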
Estimating the size of an SQL Server database is relatively straightforward and can be done with a good level of accuracy. The principle of space calculation is that all the bytes of data per table should be added together along with the associated overhead per row and page of data.
After you have this information calculated, you use it to determine the different sizing aspects of your database, including page fill percentages, table sizes, and ultimately, the size of the devices that you need to have on your system.
You use these values as a divisor into the page size (2K) to determine how many rows of data fit on a page. The actual available space on a page is 2,016 bytes because 32 bytes are reserved for fixed overhead to manage the rows on the page. In general terms, these calculations are affected by the placement and use of FILL FACTOR on indexes, and by whether the table has a clustered index. FILL FACTOR is discussed later in the chapter in the section "Effects of FILL FACTOR."
TIP: When it comes to sizing your device to support the database, don't forget to add an allotment for the transaction log and space to grow. Your transaction log should be from 10 percent to 25 percent of the size of your resulting production database. If you leave an extra 10 percent of space available on the device after your projected maximum database and transaction log totals, you'll have the space to extend the databases should the need arise, giving you time to create new devices onto which to extend the database.
After you've created your initial database, you can use the stored procedure sp_spaceused to monitor the size of your database and tables. The syntax of sp_spaceused is:
sp_spaceused [object_name] [, @updateusage = 'TRUE' | 'FALSE']
For example, if you wanted to check both the entire pubs database and the authors table, you could submit the following query:
sp_spaceused
go
sp_spaceused authors
NOTE: The go is required to separate the two queries when they are submitted to SQL Server.
When you submit this query, you get the results shown in Listing 21.1, which indicate the space used and the space remaining for each of the respective objects.
database_name                  database_size      unallocated space
------------------------------ ------------------ ------------------
pubs                           3.00 MB            1.77 MB

reserved           data               index_size         unused
------------------ ------------------ ------------------ ------------------
1262 KB            192 KB             108 KB             962 KB

name                 rows       reserved       data       index_size   unused
-------------------- ---------- -------------- ---------- ------------ ----------
authors              23         64 KB          2 KB       8 KB         54 KB
Using this information, you can determine where you stand on utilization of existing database space.
You can also use sp_helpdb to get information about the size, owner, ID, date of creation, and current status for databases in the system. Listing 21.2 shows sample output from calling sp_helpdb.
name                     db_size       owner   dbid  created      status
------------------------ ------------- ------- ----- ------------ ----------------------
feedback                 10.00 MB      sa      9     Sep 17 1996  no options set
ipms                     5.00 MB       sa      7     Aug  2 1996  no options set
logging                  50.00 MB      sa      8     Aug  2 1996  trunc. log on chkpt.
marketing                15.00 MB      sa      6     Aug  2 1996  no options set
master                   17.00 MB      sa      1     Apr  3 1996  trunc. log on chkpt.
model                    1.00 MB       sa      3     Apr  3 1996  no options set
msdb                     8.00 MB       sa      5     Aug  2 1996  trunc. log on chkpt.
pubs                     3.00 MB      sa      4     Apr  3 1996  trunc. log on chkpt.
tempdb                   2.00 MB      sa      2     Dec 11 1996  select into/bulkcopy
In the sections that follow, you'll see how to project how much space your tables will require so that you can head off problems before the database grows beyond its current confines.
Each SQL Server datatype consumes a certain number of bytes based on the storage of the data. Table 21.1 defines the amount of storage that each datatype uses.
Datatype | Size |
Char/Binary | The size indicated in the definition |
VarChar/VarBinary | The actual data size; use an average estimate |
Int | 4 bytes |
SmallInt | 2 bytes |
TinyInt | 1 byte |
Float | 8 bytes |
Float(b) | 4 bytes (numbers with precision of 1-7 digits) |
Float(b) | 8 bytes (numbers with precision of 8-15 digits) |
Double Precision | 8 bytes |
Real | 4 bytes |
Money | 8 bytes |
SmallMoney | 4 bytes |
Datetime | 8 bytes |
SmallDatetime | 4 bytes |
Bit | 1 byte |
Decimal/Numeric | 2-17 bytes depending on the precision |
Text/Image | 16 bytes per table row plus at least one 2K page per NOT NULL column |
Timestamp | 8 bytes |
SQL Server internally defines any NULLABLE column as a VAR datatype. So a CHAR(12) NULL column is actually a VARCHAR(12) column. Therefore, for any columns that permit NULL values, the average expected column size should be used.
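To illustrate, consider the following hypothetical table: the two columns are declared identically except for nullability, but they're stored differently:

CREATE TABLE size_demo                    -- hypothetical table for illustration
(
    fixed_code    CHAR(12) NOT NULL,      -- stored as CHAR(12): 12 bytes in every row
    nullable_code CHAR(12) NULL           -- stored as VARCHAR(12): size it with an average estimate
)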
Decimal and numeric precision affects the amount of storage required for these datatypes. Table 21.2 indicates the number of bytes required for each range of precision.
Numeric Precision | Size |
0-2 | 2 bytes |
3-4 | 3 bytes |
5-7 | 4 bytes |
8-9 | 5 bytes |
10-12 | 6 bytes |
13-14 | 7 bytes |
15-16 | 8 bytes |
17-19 | 9 bytes |
20-21 | 10 bytes |
22-24 | 11 bytes |
25-26 | 12 bytes |
27-28 | 13 bytes |
29-31 | 14 bytes |
32-33 | 15 bytes |
34-36 | 16 bytes |
37-38 | 17 bytes |
The method of calculating a table's space requirements differs based on whether the table has a clustered index or not. Both calculation methods are shown here, and examples will be drawn from the Pubs database to illustrate their use.
There are several things to be aware of when calculating table and index sizes; the examples in the following sections point them out as they arise.
Tables with Clustered Indexes The Publishers table has a clustered index. This example estimates the space required for 5,000,000 rows, and assumes that the average length of the VarChar columns is 60 percent of the defined length:
If the table contains only fixed-length columns, the row size is simply:

2 + (Sum of column sizes in bytes) = Row Size

If the table contains variable-length columns, as Publishers does, compute a subtotal first and then add the overhead for the variable-length data:

2 + (Sum of fixed-length column sizes in bytes) + (Sum of average sizes of variable-length columns in bytes) = Subtotal

Subtotal * ((Subtotal / 256) + 1) + (Number of variable-length columns + 1) + 2 = Row Size
Publishers has one fixed-length column (pub_id, 4 bytes) and four variable-length columns totaling 92 bytes as defined (remember that the nullable CHAR(2) state column is treated as variable-length):

2 + 4 + (60% of 92) = 61.2

61.2 * ((61.2 / 256) + 1) + (4 + 1) + 2 = 82.8, or 83 bytes per row when rounded up
2016 / (Row Size) = Number of rows per page
2016 / 83 = 24
TIP: For more accurate calculations, round down any calculations for number of rows per page.
5,000,000 / 24 = 208,334
TIP: For more accurate calculations, round up any calculations for number of pages required.
If the clustered index key contains only fixed-length columns:

5 + (Sum of key column sizes in bytes) = Clustered index size

If the key contains variable-length columns:

5 + (Sum of fixed-length key sizes in bytes) + (Sum of average sizes of variable-length key columns) = Subtotal

Subtotal * ((Subtotal / 256) + 1) + (Number of variable-length key columns + 1) + 2 = Clustered index size
5 + 4 = 9
(2016 / (Clustered index size)) - 2 = Number of rows per page
(2016 / 9) - 2 = 222
(Number of data pages) / (Number of clustered index rows per page) = Number of index pages at index level N
208,334 / 222 = 939

939 / 222 = 5

5 / 222 = 1

Index pages (level 2, the root): 1

Index pages (level 1): 5

Index pages (level 0): 939

Total number of 2K pages: 208,334 + 939 + 5 + 1 = 209,279 (or about 409M)
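If you'd rather not run these numbers through a calculator by hand, the arithmetic scripts easily. The following batch is a minimal sketch of the clustered-index calculation above; the variable names are invented for illustration, and the input values (4 fixed bytes, 92 variable bytes, 4 variable columns, 222 index rows per page) come straight from the Publishers example:

DECLARE @subtotal float, @row_size int, @rows_per_page int
DECLARE @data_pages int, @index_pages int, @total_pages int

-- Row size: 2 + fixed bytes + average variable bytes, plus variable-length overhead
SELECT @subtotal = 2 + 4 + (0.60 * 92)
SELECT @row_size = CEILING(@subtotal * ((@subtotal / 256) + 1) + (4 + 1) + 2)

-- Rows per page round down; data pages round up
SELECT @rows_per_page = FLOOR(2016.0 / @row_size)
SELECT @data_pages = CEILING(5000000.0 / @rows_per_page)

-- Add index pages level by level (222 clustered-index rows per page) until the root
SELECT @total_pages = @data_pages, @index_pages = @data_pages
WHILE @index_pages > 1
BEGIN
    SELECT @index_pages = CEILING(@index_pages / 222.0)
    SELECT @total_pages = @total_pages + @index_pages
END

SELECT 'total 2K pages' = @total_pages,
       'approximate MB' = (@total_pages * 2.0) / 1024

Run against the numbers above, this batch reports 209,279 pages, matching the hand calculation.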
Tables with Non-Clustered Indexes Tables with non-clustered indexes are sized the same way as tables with clustered indexes, except for the sizing of the index itself. In this example, assume that a non-clustered index has been added to the Roysched table on the title_id column and that the table holds 7,000,000 rows. The following calculations size the non-clustered index:
If the index key contains only fixed-length columns:

7 + (Sum of fixed-length key sizes in bytes) = Size of leaf index row

If the key contains variable-length columns:

9 + (Sum of lengths of fixed-length keys) + (Sum of lengths of variable-length keys) + (Number of variable-length keys) + 1 = Subtotal

Subtotal + ((Subtotal / 256) + 1) = Size of leaf index row
7 + 6 = 13
2016 / (Size of leaf index row) = Number of leaf rows per page
2016 / 13 = 155

(Number of rows in table) / (Number of leaf rows per page) = Number of leaf pages

7,000,000 / 155 = 45,162
(Size of leaf index row) + 4 = Size of nonleaf row

13 + 4 = 17

(2016 / Size of nonleaf row) - 2 = Number of nonleaf index rows per page

(2016 / 17) - 2 = 116

(Number of leaf pages) / (Number of nonleaf index rows per page) = Number of index pages at level N

45,162 / 116 = 390 (level 1), 390 / 116 = 4 (level 2), 4 / 116 = 1 (level 3, the root)

Total number of 2K pages: 45,162 + 390 + 4 + 1 = 45,557 (or about 89M)
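The same looping approach sizes the non-clustered index; this sketch simply plugs in the Roysched numbers (again, the variable names are invented for illustration):

DECLARE @leaf_pages int, @level_pages int, @total_pages int

-- 7,000,000 rows at 155 leaf rows per page (2016 / 13)
SELECT @leaf_pages = CEILING(7000000.0 / 155)

-- Walk up the nonleaf levels at 116 rows per page ((2016 / 17) - 2)
SELECT @total_pages = @leaf_pages, @level_pages = @leaf_pages
WHILE @level_pages > 1
BEGIN
    SELECT @level_pages = CEILING(@level_pages / 116.0)
    SELECT @total_pages = @total_pages + @level_pages
END

SELECT 'index 2K pages' = @total_pages   -- reports 45,557 for this example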
FILL FACTOR alters the number of rows that SQL Server places on a page. The most likely configuration of FILL FACTOR is to assume that the table never changes its dataset and, therefore, you set FILL FACTOR to 100% to maximize the use of data pages. This affects the calculations by increasing the number of rows that can fit on a page by two (2).
If you're sizing an index with a FILL FACTOR of 100%, don't subtract two (2) from the result of the number of rows per page because SQL Server doesn't pre-allocate these rows for page growth but, instead, puts user data there.
Any other value of FILL FACTOR alters the usable size of the page itself. For example, a FILL FACTOR of 70 percent reduces the available space on each page from 2,016 bytes to roughly 1,411 bytes.
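FILL FACTOR is specified when you build an index. Here's a brief example against the pubs authors table; the index name is made up for illustration:

CREATE NONCLUSTERED INDEX au_name_idx        -- hypothetical index name
    ON authors (au_lname, au_fname)
    WITH FILLFACTOR = 70                     -- fill each page only 70 percent full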
Sizing the Procedure Cache in SQL Server is basically a case of trial and error. Microsoft documents an approximation based on the following formula:
Procedure Cache = (Maximum Concurrent Users) * (Size of Largest Plan) * 1.25
To determine the size of a plan in memory, issue the DBCC MEMUSAGE command from ISQL. Listing 21.3 illustrates the output from DBCC MEMUSAGE.
/*----------------------------- dbcc memusage -----------------------------*/
Memory Usage:
...
Buffer Cache, Top 20:
...
Procedure Cache, Top 12:
...
Procedure Name: sp_help
Database Id: 1
Object Id: 1888009757
Version: 1
Uid: 1
Type: stored procedure
Number of trees: 0
Size of trees: 0.000000 Mb, 0.000000 bytes, 0 pages
Number of plans: 1
Size of plans: 0.051249 Mb, 53738.000000 bytes, 27 pages
Assuming that sp_help was the largest procedure to be run on a server and that there were to be 150 concurrent users, the following formula should be used:
150 users * 27 pages * 2K per page * 1.25 = 10,125K (about 10M)
NOTE: Memory in the procedure cache is managed as a set of 2K pages. The number of pages reported by DBCC MEMUSAGE, therefore, is multiplied by 2K to derive the amount of memory that the plan actually consumes.
An alternative sizing can be estimated from the need to stop SQL Server from recompiling procedures that fall out of the cache frequently. The procedure cache, like the data cache, works on a Least Recently Used (LRU) algorithm: infrequently used procedures are pushed out of the cache when there's no more room to compile a procedure requested by a user process.
Compile a list of your critical or most frequently accessed procedures. Execute each one and note the memory it uses, as reported by DBCC MEMUSAGE. From the total memory calculated, you can determine the size of the procedure cache.
Ultimately, the only true judge of an accurate size of procedure cache is to test, test, and test an application and monitor the effects of altering the amount of cache available.
It's possible to run out of procedure cache if the number of active procedures in use and their combined plan sizes are greater than the cache available. In this case, SQL Server returns error 701, and the calling process that was trying to execute a procedure is rolled back. If you receive error 701, resize the procedure cache to a higher percentage of available memory.
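The procedure cache is sized with sp_configure as a percentage of the memory left over after SQL Server's fixed allocations, and the change takes effect when the server is restarted. A minimal sketch; the value 30 is illustrative, not a recommendation:

sp_configure 'procedure cache', 30    -- percentage of remaining memory
go
RECONFIGURE
go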
Windows NT's Performance Monitor is an advanced tool that provides statistics about the operation of the NT environment. One of the unique properties of Performance Monitor is its capability to load performance counters provided by other executables in the Windows NT system and report their statistics.
SQL Server's Performance Monitor is just a set of hooks that the core Windows NT Performance Monitor calls. SQL Server groups the statistics that can be displayed into objects, each of which gathers logically similar statistics.
SQL Server gathers statistics in one of two different modes:

Direct Response mode. Statistics are gathered independently of the request for them, so the values you see are from the previous collection period. Responses are immediate, and the overhead on the server is minimal. This is the default.

On Demand mode. Statistics are gathered at the moment they are requested, so the values are always current, at the cost of extra work each time Performance Monitor asks for them.
You configure either Direct Response or On Demand mode from the SQL Enterprise Manager's Server Configuration/Options dialog box, shown in Figures 21.1 and 21.2.
FIG. 21.1
The first thing you need to do is select the server you want to work with.
FIG. 21.2
The SQL Enterprise Manager's Server Configuration/Options dialog box lets you set up the modes you want to use.
A list of the objects that SQL Server Performance Monitor can report on appears in the following sections. Some of the counters within the objects can be applied to particular instances of activity on the server. For example, the Users object's Physical I/O counter can be applied to each user connected to the system. Applying instances to a particular counter in this way enables you to customize the statistics that you want to monitor.
SQL Server Object The SQL Server object is the master object and provides a wide variety of statistics. The items monitored here are the high-level statistics that are most important to everyday management of your server. Normal monitoring typically watches the I/O and cache statistics to make sure that the physical I/O and memory subsystems aren't being flooded with requests; the object's counters cover areas such as cache performance, physical I/O, network traffic, and user connections.
SQL Server Replication-Published DB Object The Replication-Published DB object enables you to monitor the publication of transaction log information from a source, or publishing, database. If you use replication in your environment, you should monitor this object closely; its counters show how efficiently your server is publishing data.
SQL Server Replication-Subscriber Object The Replication-Subscriber object enables monitoring of the replication that is updating tables on the subscribing server and database. Like the Published DB object described in the preceding section, this object deserves careful monitoring if replication is to be managed effectively.
SQL Server Locks Object The Locks object is provided to enable the central management of locks in the database. The counters provided in this object detail the totals of the different types of locks: Extent, Intent, Page, and Table. Additional counters are provided to indicate the number of blocking locks.
See "Understanding Transactions and Locking," for more information about locks Chapter 12
SQL Server Log Object The Log object is provided so that alerts can be placed on the size of the transaction log associated with a database, in addition to the available free space on it. This enables you to dump the logs at the required intervals.
SQL Server Users Object The Users object provides counters that track statistics about user activity on the server, such as the physical I/O attributable to each connection.
Chart views are often the easiest statistical viewing mechanism for a database administrator. With a chart you can track periodic performance in a number of criteria and see prior history in the same view.
To chart statistics with SQL Performance Monitor, start Performance Monitor with the SQL Server control file and use the Add to Chart dialog box to select the counters you want, as shown in Figures 21.3 through 21.5.
FIG. 21.3
Windows NT's Performance Monitor running the SQL Server control file (SQLCTRS.PMC) with the default objects being monitored.
FIG. 21.4
You use the Add to Chart dialog box to select the counters you want to have displayed in Performance Monitor.
TIP: Click the Explain button to get a short explanation of each counter as you select it.
FIG. 21.5
The Add to Chart dialog box is ready to add SQL Server-Log counters for the Master and Pubs databases.
Performance Monitor can create basic reports of data that are being gathered. These reports show the current, or most recent, values gathered from the statistics of the selected counters.
To create a report of system statistics, switch Performance Monitor to its Report view and use the Add to Report dialog box to select the objects and counters to include, as shown in Figure 21.6.
FIG. 21.6
The Add to Report dialog box has the SQL Server Replication-Published DB object selected, as well as all the counters to be added to the report.
Alerts are one of the most useful features of Performance Monitor. Performance Monitor can not only gather statistics from various system objects, but it can also monitor the values of those statistics. If they reach predetermined levels, Performance Monitor can execute a program that can, for example, dial a pager or alert the DBA in some other fashion.
To create a statistical alert, switch Performance Monitor to its Alert view and use the Add to Alert dialog box to define the counter to watch, the threshold value, and the action to take, as shown in Figure 21.7.
FIG. 21.7
The Add to Alert dialog box is ready to add an alert that will send a mail message to the DBA if the used log space on the Pubs database exceeds 75%.
Sizing your database, and understanding how you come up with the different table sizes, is one of the most important ongoing aspects of managing your server. You'll want to manage this aspect of your system closely because if you don't, and the database or transaction logs fill up, your system will come to a grinding halt.
Be sure to dump the transaction log regularly; this backs it up and frees the space used by completed transactions. You can also use the truncate log on checkpoint option to manage the transaction log, but it's even more imperative that you make regular backups if you use this option.
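Under SQL Server 6.5, a transaction log backup is a DUMP TRANSACTION; the backup path below is hypothetical:

-- back up the log to a file, then truncate its inactive portion
DUMP TRANSACTION pubs TO DISK = 'C:\MSSQL\BACKUP\PUBSLOG.DAT'
go

-- or discard the log at every checkpoint (regular log dumps are then no longer possible)
sp_dboption pubs, 'trunc. log on chkpt.', true
go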
On the CD accompanying this book you'll find an Excel workbook that will help in your sizing efforts. This workbook lets you indicate the columns, their associated data type, and the indexes that you're defining against a table. If you complete one worksheet for each table in your database, you can easily add together the resulting sizes and have a good idea of the sizing requirements for your database.
If you have a dedicated server, as it's likely you will in a production environment, you should consider running the performance monitor at all times, though with an infrequent timing interval. A good interval might be in the five to ten minute range. In actual use, consider also monitoring the counter for swap file usage. Using this counter, along with the other counters outlined here that are specific to SQL Server, you can see trends in system usage that might indicate a change in usage patterns that would adversely impact your system.
In this chapter you learned how to identify and manage statistics that will help you determine the performance characteristics of your server. You also learned how to size a database and how to size the procedure cache.
Consider looking at related chapters, such as Chapter 12, "Understanding Transactions and Locking," to further develop your SQL Server knowledge.
© Copyright, Macmillan Computer Publishing. All rights reserved.