The Performance Counters

Ulysses

Ulysses displays the following performance counters:

(The 'View' menu gives you access to some settings that change the look and feel of the display.)


CPU: number and percent busy: The top of the gauge gives the CPU number, starting from zero. The bottom of the gauge shows the percentage of time that the processor is executing application or operating system processes other than Idle. This counter is a primary indicator of processor activity. It is calculated by measuring the time that the processor spends executing the thread of the Idle process in each sample interval, and subtracting that value from 100%. Each processor has an Idle thread which consumes (or tallies) cycles when no other threads are ready to run.
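The subtraction described above can be sketched in a few lines of Python. This only illustrates the arithmetic and is not how Ulysses itself is implemented; the function name percent_busy, its parameters, and the sample data are all made up for the example.

```python
def percent_busy(idle_seconds, interval_seconds):
    """Return the percent of a sample interval a CPU spent on non-Idle work.

    Both arguments are hypothetical inputs: the time the Idle thread ran
    during the interval, and the interval's length, in the same units.
    """
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    idle_percent = 100.0 * idle_seconds / interval_seconds
    # Clamp so timer jitter cannot push the result outside 0..100.
    return max(0.0, min(100.0, 100.0 - idle_percent))


# One gauge per processor, numbered from zero as on the display.
idle_by_cpu = [0.75, 0.25]  # seconds idle in a 1-second sample (made-up data)
for cpu_number, idle in enumerate(idle_by_cpu):
    print(f"CPU {cpu_number}: {percent_busy(idle, 1.0):.0f}% busy")
```

With the made-up samples, CPU 0 was idle for three quarters of the interval and so reads 25% busy, while CPU 1 reads 75% busy.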

This measurement isn't the be-all and end-all that some think it is, but it can tell you a couple of things. If it is less than 50% when your system is misbehaving, a processor bottleneck is probably not the cause. When it gets close to 100%, the system is showing signs of stress. The better indicator of processor stress, though, is the CPU queue length.

Interrupts: This gauge shows the average number of hardware interrupts that the processor is receiving and servicing per second. This value is an indirect indicator of the activity of devices that generate interrupts, such as the system clock, the mouse, disk drivers, data communication lines, network interface cards and sound cards. These devices normally interrupt the processor when they have completed a task or require attention. Normal thread execution is suspended during interrupts. Most system clocks interrupt the processor every 10 milliseconds, creating a background of interrupt activity. I have not been able to measure hardware interrupts per specific device, so this is the next best indicator of the level of I/O activity.

CPU Queue: This shows the number of threads waiting in the processor queue. There is a single queue even on computers with multiple processors. Therefore, you may need to divide this value by the number of processors servicing the workload. Unlike the disk counters, this counter shows ready threads only, not threads that are running. A sustained processor queue of greater than two threads generally indicates processor congestion.

I consider this measurement a boundary indicator. When there are entries in this queue, the processor is not keeping up. A queue length of more than two or three can indicate a CPU that is going to have trouble meeting the user's performance requirements. This is a better indicator of processor capacity issues than the percent busy.
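A minimal sketch of the queue rule of thumb follows. It assumes that "sustained" means every sample in a window exceeds the threshold, and that the threshold of two applies per processor; both are interpretations, and the function names are hypothetical.

```python
def queue_per_processor(queue_length, processor_count):
    """Spread the single system-wide ready queue across the processors."""
    if processor_count < 1:
        raise ValueError("need at least one processor")
    return queue_length / processor_count


def looks_congested(queue_samples, processor_count, threshold=2.0):
    """True if every sample exceeds the threshold per processor.

    Requiring all samples to exceed it is this sketch's stand-in for a
    'sustained' queue; a single spike does not count.
    """
    if not queue_samples:
        return False
    return all(
        queue_per_processor(q, processor_count) > threshold
        for q in queue_samples
    )


# Made-up readings from a two-processor machine.
print(looks_congested([6, 7, 5], processor_count=2))  # every sample above 2 each
print(looks_congested([6, 1, 5], processor_count=2))  # queue drained mid-window
```

The first call reports congestion because each processor's share stays above two threads; the second does not, because the queue emptied out in the middle of the window.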

Ram Avail: Shows the amount of physical memory, in bytes, available to processes running on the computer. It is calculated by summing the amount of space on the zeroed, free, and standby memory lists. Free memory is ready for use; zeroed memory consists of pages of memory filled with zeros to prevent later processes from seeing data used by a previous process; standby memory is memory that has been removed from a process's working set (its physical memory) en route to disk but is still available to be recalled.

I divide this by the physical memory size to get the fraction displayed on the pie chart. It is of no concern if the memory is heavily utilized. That's what it's for. Products that purport to "Free up RAM" serve no useful purpose except to assist the credulous in emptying their wallets. Windows has sophisticated processes for determining what should be in physical memory, and it does its own 'compacting' and RAM management based on system state and activity level. The system is not under stress due to lack of physical memory until the free physical memory is under 10 megabytes. The available bytes indicator will tell you if you have more than enough memory, and hence would benefit little from a memory upgrade. The rate of page faults is a better indicator of when more memory would be helpful.
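The pie-chart fraction and the 10 megabyte rule of thumb can be sketched as below. The function names are hypothetical, and treating a megabyte as 1024 * 1024 bytes is an assumption; the text does not say which definition is meant.

```python
MEGABYTE = 1024 * 1024  # assumption: binary megabytes


def available_fraction(available_bytes, physical_bytes):
    """Fraction of physical memory currently available (0.0 to 1.0)."""
    if physical_bytes <= 0:
        raise ValueError("physical memory size must be positive")
    return available_bytes / physical_bytes


def memory_stressed(available_bytes, floor_bytes=10 * MEGABYTE):
    """Apply the rule of thumb from the text: stress begins under 10 MB."""
    return available_bytes < floor_bytes


# Illustrative figures: 700 MB available out of 1024 MB installed.
available = 700 * MEGABYTE
physical = 1024 * MEGABYTE
print(f"{available_fraction(available, physical):.0%} available")
print("stressed" if memory_stressed(available) else "not stressed")
```

With these figures roughly two thirds of memory is available, far above the 10 megabyte floor, so the second line prints "not stressed".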

Page Faults: Shows the average number of pages faulted per second. It is measured in numbers of pages faulted; because only one page is faulted in each fault operation, this is also equal to the number of page fault operations. Since a page fault takes something along the lines of 200,000 times longer than a DRAM access, a large number of page faults indicates that a memory upgrade would improve performance. Having said that, it is not possible to eliminate page faults completely. There will always be a baseline of such activity. I have 1 GB of RAM with more than 700 MB of memory available, and there are still page faults.
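To see why even a small fault rate matters, here is a rough back-of-the-envelope sketch. It assumes every fault costs the full 200,000 DRAM accesses quoted above and that all other accesses cost exactly one; the function name and the sample probabilities are illustrative only.

```python
FAULT_COST = 200_000  # one page fault, in DRAM-access units (figure from the text)


def average_access_cost(fault_probability, fault_cost=FAULT_COST):
    """Average cost of a memory access, where a plain DRAM access costs 1."""
    if not 0.0 <= fault_probability <= 1.0:
        raise ValueError("fault_probability must be between 0 and 1")
    return (1.0 - fault_probability) + fault_probability * fault_cost


# Made-up fault probabilities per memory access.
for p in (0.0, 1e-6, 1e-4):
    print(f"fault probability {p:g}: {average_access_cost(p):.2f}x a DRAM access")
```

Under these assumptions, one fault per million accesses already makes the average access about 20% slower, and one per ten thousand makes it about 21 times slower.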

File Cache: This shows the percentage of read requests that were satisfied from the file cache and did not require a physical disk read. The lines between Physical RAM and Physical disk are blurred by the use of virtual memory, on the one hand, and the File Cache on the other. Windows maintains a cache for disk files in physical memory. Windows adjusts the size of this cache based on system workload and physical memory installed, and it represents another benefit from having a lot of physical memory.
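The hit percentage is a simple ratio, sketched below. The function and parameter names are hypothetical, and returning None when no reads occurred in the interval is a choice made for this sketch, not something the text specifies.

```python
def cache_hit_percent(cache_hits, total_reads):
    """Percent of read requests served from the file cache.

    Returns None when there were no reads in the sample interval, since a
    percentage of nothing is undefined.
    """
    if total_reads == 0:
        return None
    if not 0 <= cache_hits <= total_reads:
        raise ValueError("cache_hits must be between 0 and total_reads")
    return 100.0 * cache_hits / total_reads


# Made-up sample: 450 of 500 reads never touched the physical disk.
print(cache_hit_percent(450, 500))
```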

The dark side of this is the "lazy-write" aspect of the file cache. If the power goes out while a disk write is in cache but has not been committed to disk, that update will be lost. This is one of the reasons to have a UPS and do the time-consuming software shutdown of Windows, as the cache is then flushed to disk.

Disk Drive: number, name, percent busy, and bytes per second: Each physical drive is numbered starting from 0, and the logical drive letters it contains are shown (to the extent that they fit on the width of the gauge). Since the performance constraint in a hard drive is the physical movement of the actuator, it is of little interest to measure the activity by logical drive.

As with the processor, busyness in and of itself is not a bad thing. The busier a disk gets, however, the more likely it is that it will not be able to keep up with I/O requests. As with the CPU, this counter will tell you if your disks are not being stressed at all, but the better indicator of possible trouble is the size of the disk queue. I also show the rate of bytes per second that the disk is handling, which will give you an idea of the data volumes being moved.

Disk Drive: Disk Queue: Shows the average number of write requests that were queued for the selected disk during the sample interval. A queue is an indicator that the disk drive is not able to keep up with the workload, and due to its slowness relative to the rest of the system, such delays are particularly costly. If queuing is accompanied by a high level of page faults or file cache misses, additional memory may prove more helpful than a disk upgrade.

Net Card, Loopback: Since the network card can also consume PCI bus bandwidth, I added this gauge, which measures the TCP bytes per second. There will be an indicator for each physical network card, and one for the TCP loopback device. The loopback device is not a physical device, but I wanted to see what it is doing to get a better idea of whether it impacts performance. The first line under the gauge shows what Windows has identified as the card's data rate. I have a 10/100 card, and it shows 95. I don't have any more details on how that number is calculated.