Page 61
Also of interest are the User and Server processes, which handle end-user transactions against the database, and the Parallel Query (Pnnn) Processes, which perform parallel query operations for the database. Although these are not classified as Oracle Background processes, it is important to understand the role they play in the Oracle environment. A discussion of each of these processes follows.
SMON and PMON For one reason or another, connections into the Oracle database may crash, hang, or otherwise abnormally terminate. End-users might shut down their client machine without logging out of the database application, or a network or system failure unrelated to the database may cause an automated database job to fail. The Oracle server must be able to transparently resolve the problems resulting from these kinds of failures.
Together, SMON and PMON are the background processes responsible for automatically resolving database system problems. PMON, the Process Monitor, performs automatic cleanup of terminated or failed processes, including clearing the orphaned sessions left from an abnormally terminated process, rolling back uncommitted transactions, releasing the locks held by disconnected processes, and freeing SGA resources held by failed processes. It also monitors the server and dispatcher processes, automatically restarting them if they fail.
SMON plays a smaller but nonetheless very important role. Upon database startup, SMON is the process that performs automatic instance recovery. If the last database shutdown was not clean, SMON automatically rolls forward the operations that were in-progress, and rolls back the uncommitted transactions. SMON is also the process that manages certain database segments, reclaiming temporary segment space no longer in use, and automatically coalescing contiguous blocks of free space in data files.
NOTE |
SMON only coalesces free space in tablespaces where the default storage parameter (used when creating tablespaces or tables) pctincrease is not 0. Set the pctincrease to at least 1 if you want SMON to automatically handle this operation. |
SMON and PMON are two of the required background processes. (The database does not start if either of these two processes fail on startup.)
DBWR DBWR, or the Database Writer process, is responsible for writing the dirty blocks from the database buffer cache to the data files. Rather than write out each block as it is modified, DBWR waits until certain criteria are met, and in batch reads the dirty list and flushes all the blocks found in it to the data files. This provides a high level of performance, and minimizes the extent to which the database is I/O bound.
Page 62
DBWR flushes the dirty blocks when
Configuring the DBWR background process is fairly straightforward, and the default values for settings are in many cases sufficient for a small-or medium-sized database. Larger, more active, or specialized databases often have special needs, however, that force the manual configuration of some of the DBWR parameters.
In most installations, there is one DBWR process to handle all the write activity of the database. You can, however, start more than one DBWR process if you find DBWR is unable to keep up with the demands of the database. The init.ora parameter DB_WRITERS, which defaults to one, sets the number of DBWR processes that are created at startup. In most cases, the decision to use more than one DBWR process is only made when the OS on which the Oracle server is being run does not support asynchronous I/O. If this is the case, multiple DBWR processes should be created. Some suggest that you should use as many DBWR processes as physical disks used to store data files on; others suggest setting the number to be equal to the number of data files in the database. Experiment with adding and subtracting DBWR processes until your best performer is reached.
The database buffer cache also uses latches to control access to the memory structure. The LRU latch controls the replacement of buffers in the buffer cache. In very active servers with multiple CPUs, there may be contention for these latches. If this happens, set the parameter DB_BLOCK_LRU_LATCHES to a number equal to the number of latches to create for the buffer cache. This number cannot be greater than twice the number of CPUs, and is automatically set to the number of CPUs in the system.
Another init.ora parameter that affects DBWR behavior is DB_BLOCK_CHECKPOINT_BATCH. This parameter sets the maximum number of blocks DBWR writes with each checkpoint (see the later section on the checkpoint process, CKPT, for more information). By increasing this number, you can decrease the number of times DBWR must flush the buffer cache. Increasing this number too much, however, may cause an unacceptable delay when DBWR finally does flush the buffer.
A third parameter to keep in mind is DB_BLOCK_CHECKSUM. This is a Boolean parameter that, when enabled, causes each database block to be written with a checksum value attached. When the block is subsequently read, the checksum is computed and compared with that stored in the database. If the values are different, an error is raised. This is a valuable parameter when troubleshooting data corruption problems, but should not be enabled all the time, because of the performance hit taken from calculating and storing the checksum for each I/O operation. (See Chapter 32, "Tuning I/O," for more information on tuning the DBWR process and database I/O.)
Page 63
LGWR LGWR, or Log Writer, is the fourth and final mandatory background process. Log Writer is the process that writes redo log entries from the redo log buffer in the SGA to the online redo log files. LGWR performs this write when a commit occurs, the inactivity timeout for LGWR is reached, the redo log buffer becomes one-third full, or DBWR completes a flush of the data buffer blocks at a checkpoint. LGWR also handles multiple user commits simultaneously, if one or more users issue a commit before LGWR has completed flushing the buffer on behalf of another user's commit.
It is important to note that Oracle does not regard a transaction as being complete until LGWR has flushed the redo information from the redo buffer to the online redo logs. It is LGWR's successful writing of the redo log entries into the online redo logs and not the changing of the data in the data files, that returns a success code to the server process.
The LGWR process is rarely a source of performance problems for the database. In addition, few options are available for custom configuration of LGWR. Most of the configuration necessary involves the redo log buffer and memory structures supporting that buffer, rather than the LGWR process itself.
The exception to this, however, deals with the secondary task of the LGWR process, performing the operations necessary to conduct a database checkpoint. Remember that LGWR performs this task unless the CKPT process is activated. A checkpoint causes process and I/O time to be spent by both LGWR and DBWR. More frequent checkpoints decrease the recovery time necessary if a database failure occurs, as well as reducing the work necessary to perform each individual checkpoint. You must weigh both of these factors when deciding on the correct checkpoint interval. Several parameters govern the occurrence of database checkpoints.
LOG_CHECKPOINT_INTERVAL and LOG_CHECKPOINT_TIMEOUT are two parameters that can change the checkpoint intervalthat is, the time or situation necessary for a database checkpoint to be triggered. LOG_CHECKPOINT_INTERVAL, when set, causes a checkpoint to be triggered when a number of OS blocks (not Oracle blocks) is written to redo. LOG_CHECKPOINT_TIMEOUT, when set, causes a checkpoint to occur after the interval (in seconds) specified for the parameter.
These parameters should be used with care. If LOG_CHECKPOINT_INTERVAL is used, it should be set so that the number of OS blocks that trigger a checkpoint are relative to the size of the redo log group. Remember that when a redo log group fills, a checkpoint is triggered. Be careful that you do not set a LOG_CHECKPOINT_INTERVAL value that causes more checkpoints to occur than necessary, or forces checkpoints when they are not needed. For example, consider a redo log group of 3MB, and a LOG_CHECKPOINT_INTERVAL set to 2.5MB. When 2.5MB are written to the redo logs, the LOG_CHECKPOINT_INTERVAL value causes a checkpoint to occur. In addition, when the redo log group fills (after only 0.5MB have been written), another checkpoint occurs. In essence, two checkpoints will occur right after each other.
You can also control the frequency of checkpoints by sizing your redo log groups accordingly. If you size your logs so that a log switch occurs every hour, you only have one checkpoint an hour from redo log group switches. If your groups are sized so that checkpoints occur every five minutes, however, you waste a lot of process and I/O time performing the related checkpoints.