A sample output of the sar command is shown below. The basic format of the command is
sar -u n t
where n is the sampling interval in seconds and t is the number of samples to take.
$ sar -uM 2 10

HP-UX arms1 B.10.10 U 9000/819    09/04/97

19:53:57     cpu    %usr    %sys    %wio   %idle
19:53:59       0      38       9      40      12
               1      43      10      37      10
          system      40      10      39      11
19:54:01       0      51      14      29       6
               1      45      17      33       6
          system      48      15      31       6
19:54:03       0      42       8      44       6
               1      44       6      42       8
          system      43       7      43       7
19:54:05       0      52      10      34       4
               1      34      17      42       6
          system      43      14      38       6
19:54:07       0      30      14      45      11
               1      37      10      46       7
          system      34      12      46       9
19:54:09       0      44      10      38       8
               1      44       6      43       6
If the CPU overview shows a lot of idle time, the CPU may not be utilized to the fullest. If the %wio, %usr, and %sys columns have low values and the %idle column has a high value, little or nothing is running on the system.
If the CPU spends more time in system mode than in user mode, that also needs to be checked. As a rule of thumb, aim for user CPU time that is at least double the system CPU time; otherwise, you need to probe further. In the previous output, the %wio column shows a high value (roughly 30 to 45 percent); this suggests that in the current scenario the CPUs themselves are not the problem but that there could be a bottleneck on the disks. You may need to run the sar -d command to find out which disk is carrying the I/O load.
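The rules of thumb above can be sketched as a small check over sar -u samples. This is only an illustrative sketch: the text says "high" without giving numbers, so the two thresholds (30 percent for %wio, 80 percent for %idle) and the names CpuSample and assess are assumptions made for this example, not values or tools from sar or from this chapter.

```python
from dataclasses import dataclass

# Assumed, illustrative cut-offs; the text only says "high".
HIGH_WIO_PCT = 30
HIGH_IDLE_PCT = 80


@dataclass
class CpuSample:
    """One line of sar -u output, all values in percent."""
    usr_pct: int
    sys_pct: int
    wio_pct: int
    idle_pct: int


def assess(sample):
    """Return a list of findings for one sample, per the rules of thumb."""
    findings = []
    if sample.idle_pct >= HIGH_IDLE_PCT:
        findings.append("mostly idle")
    # Rule of thumb from the text: user time should be at least 2x system time.
    if sample.usr_pct < 2 * sample.sys_pct:
        findings.append("user time is less than double system time")
    if sample.wio_pct >= HIGH_WIO_PCT:
        findings.append("high I/O wait: check disks with sar -d")
    return findings


# The "system" rows from the sar -uM output shown earlier.
SYSTEM_ROWS = [
    CpuSample(40, 10, 39, 11),
    CpuSample(48, 15, 31, 6),
    CpuSample(43, 7, 43, 7),
    CpuSample(43, 14, 38, 6),
    CpuSample(34, 12, 46, 9),
]

for row in SYSTEM_ROWS:
    print(row, assess(row))
```

With these assumed thresholds, every system-wide row in the sample is flagged only for I/O wait, which matches the reading given in the text.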
As mentioned earlier, every process that is ready to execute waits in a queue called the run queue. If too many jobs are waiting on the run queue, it is an indication that resource-intensive jobs are running and the CPU cannot cope with the demands of the system. The following sample shows the run queue alongside CPU utilization; it is another good starting point for checking whether the CPUs are bogged down with the processes running on the system:
$ sar -qu 2 10

HP-UX arms1 B.10.10 U 9000/819    09/06/97

17:35:37 runq-sz %runocc swpq-sz %swpocc
            %usr    %sys    %wio   %idle
17:35:39     1.0      25     0.0       0
              39      11      43       6
17:35:41     0.0       0     0.0       0
              46      10      40       5
17:35:43     1.0      25     0.0       0
              17       5      55      24
17:35:45     1.0      25     0.0       0
              38       9      42      10
17:35:47     1.0      25     0.0       0
              39       9      45       8
17:35:49     0.0       0     0.0       0
              36       6      52       6
17:35:51     1.0      25     0.0       0
              22      10      51      16
17:35:53     1.0      25     0.0       0
              44       7      46       3
17:35:55     2.0      25     0.0       0
              41       8      45       6
17:35:57     0.0       0     0.0       0
              25       8      50      17

Average      1.1      18     0.0       0
Average       35       8      47      10
Table 22.2 explains the meaning of the different columns in the output.
Table 22.2 Explanation of Columns in the sar -qu Output
Column Name | Column Description |
runq-sz | The size of the run queue. It does not include processes that are sleeping or waiting for I/O to complete; it does include processes that are in memory and waiting to be run. |
%runocc | The percentage of time the run queue is occupied by processes waiting to be executed. |
swpq-sz | The average length of the swap queue during the interval the monitoring was done. Processes that are ready to be run but have been swapped out are included in this count. |
%swpocc | The percentage of time the swap queue of runnable processes (processes swapped out but ready to run) was occupied. |
From the previous output, you can see that the run queue size is mostly 1 and never exceeds 2; that is, during each monitoring interval at most one or two processes were waiting in the queue for the CPU. The run queue was occupied at most 25 percent of the time. You can conclude that at this point the CPU is very lightly loaded. As a rule of thumb, be on the lookout for a run queue size in excess of 5 or 6. If the queue grows beyond these values, either reduce the number of running processes, increase the number of CPUs, or upgrade the existing CPU. You can identify which processes are waiting for CPU resources or loading the CPU by using the techniques described in the next section.
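The run queue check described above can be sketched as a short script that scans sar -q data lines and reports the intervals in which runq-sz exceeds a limit. This is a sketch under stated assumptions: it expects lines in the form "HH:MM:SS runq-sz %runocc swpq-sz %swpocc" (the layout of sar -q alone, without the interleaved -u lines), the limit of 5 is taken from the rule of thumb in the text, and the names parse_runq and overloaded are hypothetical helpers written for this example.

```python
RUNQ_LIMIT = 5  # rule of thumb from the text: watch for a run queue above 5 or 6

# A few lines copied from the sar -qu sample (queue columns only).
SAMPLE = """\
17:35:37 runq-sz %runocc swpq-sz %swpocc
17:35:39 1.0 25 0.0 0
17:35:41 0.0 0 0.0 0
17:35:55 2.0 25 0.0 0
"""


def parse_runq(text):
    """Return (time, runq_sz, runocc_pct) tuples from sar -q data lines."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Data lines have a timestamp followed by four numeric columns.
        if len(fields) != 5 or fields[0].count(":") != 2:
            continue
        try:
            samples.append((fields[0], float(fields[1]), float(fields[2])))
        except ValueError:
            continue  # header line: column names are not numbers
    return samples


def overloaded(samples, limit=RUNQ_LIMIT):
    """Return the timestamps at which the run queue exceeded the limit."""
    return [time for time, runq_sz, _ in samples if runq_sz > limit]


samples = parse_runq(SAMPLE)
print("largest run queue:", max(runq_sz for _, runq_sz, _ in samples))
print("intervals over limit:", overloaded(samples))
```

The script reports per-interval values rather than recomputing an average, because the Average line printed by sar may be weighted differently from a plain mean of the samples.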
There are several ways to find heavy CPU users on a system. You can either use the operating system tools to identify them or use the information stored in the Oracle system tables. In this section, you examine both options. In fact, the two routes can be used together to cross-check each other and pin down a heavy CPU user on the system.
Using the Oracle Route
The Oracle dynamic performance tables provide a wealth of information about what is happening on the system. V$SYSSTAT and V$SESSTAT are two of the dynamic performance views that you use to find the information you require.
A general listing of the values stored in V$SYSSTAT is given in Listing 22.1. The information we are interested in here is CPU usage, but this view can be used for a number of other monitoring purposes.
Listing 22.1 Using V$SYSSTAT to Find System Statistics
SQL> select * from v$sysstat order by class,statistic#;

STATISTIC# NAME                                          CLASS      VALUE
---------- ---------------------------------------- ---------- ----------
         0 logons cumulative                                 1       2051
         1 logons current                                    1         68
         2 opened cursors cumulative                         1      72563
         3 opened cursors current                            1        636
         4 user commits                                      1     289212
         5 user rollbacks                                    1       7299
         6 user calls                                        1    2574300
         7 recursive calls                                   1    4726090
         8 recursive cpu usage                               1    1334927
         9 session logical reads                             1  205058382
        10 session stored procedure space                    1          0
        12 CPU used by this session                          1    4896925
        13 session connect time                              1 6.2794E+11
        15 session uga memory                                1 8599758248
        16 session uga memory max                            1  169781672
        20 session pga memory                                1  311833920
        21 session pga memory max                            1  324695784
       101 serializable aborts                               1          0
       133 bytes sent via SQL*Net to client                  1  206511175
       134 bytes received via SQL*Net from client            1  174496695