Monitoring Commands
nmon vmstat iostat sar topas svmon filemon mpstat rmss netpmon
vmstat
The vmstat command is useful for obtaining an overall picture of CPU, paging, and memory usage. The following is a sample report produced by the vmstat command:

    # vmstat 5 2
    kthr     memory             page              faults        cpu
    ----- ----------- ------------------------ ------------ -----------
     r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
     1  1 197167 477552   0   0   0   7   21   0 106 1114 451  0  0 99  0
     0  0 197178 477541   0   0   0   0    0   0 443 1123 442  0  0 99  0
Remember that the first report from the vmstat command displays cumulative activity since the last system boot. The second report shows activity for the first 5-second interval.
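Because the first report is an average since boot, a script that acts on vmstat output usually wants only the last row. The following is a minimal sketch, assuming the 17-column layout shown in the sample above (other AIX levels or partition types may print additional columns). The function name vmstat_last_cpu is hypothetical.

```shell
# Print "us sy id wa" from the last data row of vmstat output read on stdin.
# Assumption: data rows have exactly 17 whitespace-separated numeric fields,
# as in the sample report above.
vmstat_last_cpu() {
    awk 'NF == 17 && $1 ~ /^[0-9]+$/ { row = $14 " " $15 " " $16 " " $17 }
         END { if (row != "") print row }'
}
```

On a live system this could be called as: vmstat 5 2 | vmstat_last_cpu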
iostat
The iostat command is the fastest way to get a first impression of whether or not the system has a disk I/O-bound performance problem. This tool also reports CPU statistics.
Flags

    -a    Specifies adapter throughput report.
    -d    Specifies drive report only.
    -m    Specifies statistics for paths.
    -t    Specifies tty/cpu report only.
    -z    Resets the disk input/output statistics.

    # iostat 2 2

    tty:      tin         tout    avg-cpu: % user % sys % idle % iowait
              0.0          0.8                8.4    2.6   88.5      0.5
              0.0         80.2                4.5    3.0   92.1      0.5

    Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
    hdisk3           0.0        0.3      0.0     258032    224266
    hdisk2           0.1        1.1      0.0     258088   1658678
To display the adapter information
    # iostat -t -a -D

    System configuration: lcpu=120 drives=27 paths=252 vdisks=0 tapes=0

    Adapter:
    fcs0      xfer:      bps      tps    bread    bwrtn
                        1.1M     47.7   468.3K   668.8K

    Adapter:
    fcs1      xfer:      bps      tps    bread    bwrtn
                      800.2K     34.7   330.9K   469.3K
To display disk statistics including queue info in long list format
    # iostat -lD

    System configuration: lcpu=120 drives=27 paths=252 vdisks=0

    Disks:                 xfers                                  read                                     write                                     queue
             ---------------------------------- ----------------------------------------- ----------------------------------------- -----------------------------------------
                %tm    bps    tps  bread  bwrtn    rps    avg    min    max   time   fail    wps    avg    min    max   time   fail    avg    min    max    avg    avg   serv
                act                                      serv   serv   serv   outs                 serv   serv   serv   outs          time   time   time   wqsz   sqsz  qfull
    hdisk1      0.6  58.6K   12.3   1.8K  56.8K    0.1    3.8    0.0    0.0      0      0   12.2    5.4    0.8   92.3      0      0    0.5    0.0   47.0    0.0    0.0    0.2
    hdisk0      0.5  63.6K   12.5   6.9K  56.7K    0.3    2.5    2.6    7.1      0      0   12.2    5.0    0.3  100.0      0      0    0.4    0.0   50.3    0.0    0.0    0.2
    hdisk8      5.0 908.7K   81.5 203.2K 705.5K   17.1   17.6    0.2  284.0      0      0   64.5  540.5    0.3   40.7      0      0   2.3S    0.0   26.6  215.0   36.0   21.3
    hdisk2      1.8  55.4K    3.4  45.9K   9.6K    2.8   12.4    0.2   11.8      0      0    0.6  156.5    0.3   11.6      0      0    4.5    0.0    0.0    0.0    0.0    0.0
    hdisk6      5.3 931.6K   85.3 254.3K 677.3K   22.5   13.9    0.2  420.5      0      0   62.8  563.9    0.2   38.0      0      0   1.2S    0.0   10.4  107.0   36.0   18.7
    hdisk7      5.0 944.0K   85.5 208.7K 735.3K   17.3   17.5    0.2  313.6      0      0   68.2  497.0    0.3   43.7      0      0   2.2S    0.0   27.2  215.0   35.0   22.0
svmon
The svmon command provides a more in-depth analysis of memory usage. It is more informative, but also more intrusive, than the vmstat and ps commands. The svmon command captures a snapshot of the current state of memory.
The memory consumption is reported using the inuse, free, pin, virtual and paging space counters.
- The inuse counter represents the number of used frames.
- The free counter represents the number of free frames from all memory pools.
- The pin counter represents the number of pinned frames, that is, frames that cannot be swapped.
- The virtual counter represents the number of pages allocated in the system virtual space.
- The paging space counter represents the number of pages reserved or used on paging spaces.
Flags
    -G    Global report
    -U    User report
    -P    Process report
    -i    Defines the interval and the number of intervals, e.g. -i 1 5
To find out the total memory/swap and free memory/swap available in an AIX system
    # svmon -G
                   size       inuse        free         pin     virtual
    memory      3932160     3914793       17367      444363     1609451
    pg space    1048576        6622

                   work        pers        clnt
    pin          444363           0           0
    in use      1609451           0     2305342

    PageSize   PoolSize       inuse        pgsp         pin     virtual
    s    4 KB         -     3787625        6622      370027     1482283
    m   64 KB         -        7948           0        4646        7948

    # pagesize
    4096
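The sizes are reported in 4 KB frames, so the above system has 15 GB of physical memory (3932160 frames, roughly 16.1 GB in decimal units) and 4 GB of paging space (1048576 frames).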
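The conversion from the frame counts in the svmon -G report to megabytes is simple arithmetic. The following is a minimal sketch, assuming a shell with 64-bit integer arithmetic and the 4096-byte frame size reported by pagesize above. The function name frames_to_mb is hypothetical.

```shell
# Convert a count of memory frames to whole megabytes (rounded down).
# $1 = number of frames, $2 = frame size in bytes (as printed by pagesize)
frames_to_mb() {
    echo $(( $1 * $2 / 1048576 ))
}

frames_to_mb 3932160 4096   # memory size
frames_to_mb 17367 4096     # memory free
frames_to_mb 1048576 4096   # paging space size
frames_to_mb 6622 4096      # paging space in use
```

For the sample report this gives 15360 MB of real memory and 4096 MB of paging space.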
A memory leak can be detected with the svmon command, by looking for processes whose working segment continually grows. A leak in a kernel segment can be caused by an mbuf leak or by a device driver, kernel extension, or even the kernel. To determine if a segment is growing, use the svmon command with the -i option to look at a process or a group of processes and see if any segment continues to grow.
    # svmon -P 13548 -i 1 2

       Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd LPage
     13548 pacman            8535     2178      847     8533      N     N     N

      Vsid      Esid Type Description              LPage  Inuse   Pin Pgsp Virtual
         0         0 work kernel seg                   -   4375  2176  847    4375
     48412         2 work process private              -   2357     2    0    2357
     6c01b         d work shared library text          -   1790     0    0    1790
     4c413         f work shared library data          -     11     0    0      11
     3040c         1 pers code,/dev/prodlv:4097        -      2     0    -       -

    ginger :svmon -P 13548 -i 1 3

       Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit Mthrd LPage
     13548 pacman            8589     2178      847     8587      N     N     N

      Vsid      Esid Type Description              LPage  Inuse   Pin Pgsp Virtual
         0         0 work kernel seg                   -   4375  2176  847    4375
     48412         2 work process private              -   2411     2    0    2411
     6c01b         d work shared library text          -   1790     0    0    1790
     4c413         f work shared library data          -     11     0    0      11
     3040c         1 pers code,/dev/prodlv:4097        -      2     0    -       -
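Comparing snapshots by eye becomes tedious with many segments. The following is a minimal sketch that automates the comparison, assuming the svmon -P segment layout shown in the sample output, where the last four columns are Inuse, Pin, Pgsp, and Virtual. The function name svmon_growing_segments is hypothetical. It reads consecutive snapshots on stdin and prints the Vsid, the first Inuse value, and the last Inuse value for every segment whose Inuse count grew.

```shell
# Report segments whose Inuse count is larger in the last snapshot than in
# the first. Assumption: segment rows have the type (work, pers, clnt) in
# field 3 and Inuse as the fourth field from the end.
svmon_growing_segments() {
    awk '$3 ~ /^(work|pers|clnt)$/ && NF >= 8 {
             vsid = $1
             inuse = $(NF - 3)
             if (!(vsid in first)) { first[vsid] = inuse; order[++n] = vsid }
             last[vsid] = inuse
         }
         END {
             for (i = 1; i <= n; i++) {
                 v = order[i]
                 if (last[v] + 0 > first[v] + 0)
                     print v, first[v], last[v]
             }
         }'
}
```

On a live system this could be called as: svmon -P 13548 -i 1 10 | svmon_growing_segments. For the two sample snapshots it reports only the process private segment 48412, which grew from 2357 to 2411 frames. A single growing interval is not proof of a leak; the point is to watch for segments that keep appearing in this list over a long run.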
filemon
The filemon command monitors a trace of file system and I/O system events and reports performance statistics for files, virtual memory segments, logical volumes, and physical volumes. filemon is useful to those whose applications are believed to be disk-bound and who want to know where and why.
The filemon command shows the load on different disks, logical volumes, and files in great detail.
The trcstop command is used to stop the filemon monitoring.
The syntax of the filemon command is:
    filemon [-o output_file] [-O levels] [-u] [-v]

    -O [lv | pv | vm | lf | all]    Levels to report (lf - logical file level, vm - virtual memory level, lv - logical volume level, pv - physical volume level)
    -u                              Reports on files that were opened prior to the start of the trace daemon
If the output file is not specified, the output is sent to standard output.
To start the filemon monitoring for 1 min.
# filemon -uo filemon.out -O all ; sleep 60; trcstop
To find out the most active Logical Volumes
    # awk '/Most Active Logical Volumes/,/^$/' filemon.out
    Most Active Logical Volumes
    ------------------------------------------------------------------------
      util  #rblk  #wblk   KB/s  volume                   description
    ------------------------------------------------------------------------
      0.04   4208      0   34.9  /dev/paging00            paging
      0.04   4000      0   33.2  /dev/hd6                 paging
      0.01   1680  11408  108.6  /dev/oralvr32            /oracle/R32
      0.00      0    264    2.2  /dev/hd8                 jfs2log
To find out most active Files
# awk '/Most Active Files/,/^$/' filemon.out
To find out most active physical Volumes
# awk '/Most Active Physical Volumes/,/^$/' filemon.out
rmss
The rmss command provides you with a means to simulate different sizes of real memory that are smaller than your actual machine, without having to extract and replace memory boards or reconfigure memory using logical partitions.
To change the memory size to 500 MB,
    # rmss -c 500
    Simulated memory size changed to 500 Mb.
To reset the memory size to the real memory size of the machine, enter:
# rmss -r
Tuning Commands
vmo ioo no nice and renice vmtune defragfs
The /etc/tunables commands
To manage its files in the /etc/tunables directory, new commands have been added to AIX. They are as follows:
tuncheck: This command validates a file either to be applied immediately or at reboot time (-r flag). It checks the ranges, dependencies, and prompts to run bosboot if required. Run this command if you copy a file to a new system, or edit it with an editor such as vi.
tunsave: This command saves all current values to a file, optionally including the nextboot file.
tunrestore: This command applies values from a file, either immediately, or at the next reboot (-r flag). With the -r flag, it validates and copies the file over the current nextboot file.
tundefault: This command resets all parameters to their default value. It can be applied at the next reboot with the -r flag.
ioo, vmo and no commands:
These commands are used to set or display current or next boot values of different tuning parameters.
- ioo for IO tuning parameters
- vmo for Virtual Memory Manager parameters
- no for network tuning parameters
These commands can also make permanent changes or defer changes until the next reboot. When a permanent or nextboot value is changed using these commands, the /etc/tunables/nextboot file is updated automatically with the new values (if the new value is different from the default value).
The following flags are common for ioo, vmo and no commands.
    -L [tunable]          Lists the characteristics of one or all tunables
    -d tunable            Resets 'tunable' to its default value
    -o tunable            Displays the current value of 'tunable'
    -o tunable=<value>    Sets 'tunable' to the new value
    -D                    Resets all tunables to their default values
    -p                    Changes apply to both current and reboot values (/etc/tunables/nextboot file updated)
    -r                    Changes apply to reboot values only (/etc/tunables/nextboot file updated)
Examples:
    # vmo -p -o minfree=1200 -o maxfree=1280
    # ioo -r -o maxpgahead=64 -o j2_minPageReadAhead=8
    # no -r -o rfc1323=1 -o tcp_recvspace=262144 -o tcp_sendspace=262144
    # cat /etc/tunables/nextboot
    vmo:
            minfree = "1200"
            maxfree = "1280"
            minperm% = "10"
            maxperm% = "40"
            maxclient% = "40"

    ioo:
            j2_nBufferPerPagerDevice = "1024"

    no:
            tcp_recvspace = "65536"
            tcp_sendspace = "65536"
            tcp_pmtu_discover = "0"
            udp_pmtu_discover = "0"
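When checking many systems it can help to pull a single value out of the nextboot file. The following is a minimal sketch, assuming the stanza layout shown above (a stanza name followed by a colon, then lines of the form name = "value"). The function name nextboot_get is hypothetical and is not one of the AIX tunables commands.

```shell
# Print the value of one tunable from a nextboot-style stanza file on stdin.
# $1 = stanza name (for example vmo), $2 = tunable name (for example minfree)
nextboot_get() {
    awk -v stanza="$1" -v name="$2" '
        # A stanza header is a single word ending in a colon, e.g. "vmo:"
        NF == 1 && $1 ~ /:$/ { current = substr($1, 1, length($1) - 1); next }
        current == stanza && $1 == name && $2 == "=" {
            value = $3
            gsub(/"/, "", value)   # strip the surrounding double quotes
            print value
        }'
}
```

On a live system this could be called as: nextboot_get vmo minfree < /etc/tunables/nextboot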
minfree: Minimum acceptable number of real-memory page frames in the free list. When the size of the free list falls below this number, the VMM begins stealing pages. It continues stealing pages until the size of the free list reaches maxfree.
maxfree: Maximum size to which the free list will grow by VMM page-stealing. The size of the free list may exceed this number as a result of processes terminating and freeing their working-segment pages or the deletion of files that have pages in memory.
minperm: If the percentage of real memory occupied by file pages falls below this level, the page-replacement algorithm steals both file and computational pages, regardless of repage rates.
maxperm: If the percentage of real memory occupied by file pages rises above this level, the page-replacement algorithm steals only file pages.
maxclient: If the percentage of real memory occupied by file pages is above this level, the page-replacement algorithm steals only client pages.
aio (Asynchronous IO)
AIO is an AIX software subsystem that allows processes to issue I/O operations without waiting for the I/O to finish. Because I/O operations and application processing run concurrently, the I/O effectively runs in the background, which improves performance. This is particularly important in a database environment.
- Prior to AIX 6.1, AIO is a device whose details are stored in the ODM and managed using the ‘chdev’ command.
- From AIX 6.1 and above, AIO is no longer a device, and is managed using the ‘ioo’ command.
- AIO is a prerequisite of Oracle, and must be ‘enabled’ prior to installing Oracle.
Prior to AIX 6.1, AIO is enabled as follows:
    # smit aio

or

    # chdev -l aio0 -a autoconfig=available -a minservers=100 -a maxservers=100 -a maxreqs=9152
    # mkdev -l aio0
    # chdev -l posix_aio0 -a autoconfig=available
    # mkdev -l posix_aio0
    # aioo                ### To manage aio parameters
minservers: Minimum number of kernel processes dedicated to asynchronous I/O processing.
maxservers: Maximum number of kernel processes dedicated to AIO processing.
maxreqs: Maximum number of asynchronous I/O requests that can be outstanding at one time.
autoconfig: The state to which AIO is to be configured during system initialization. The possible values are "defined", which means that AIO cannot be used, and "available", which means that AIO can be used.
From AIX 6.1 and above, AIO is activated ‘dynamically’ as and when a program makes a call to AIO, so it is no longer necessary to manually enable AIO. The ‘ioo’ command is used only to change the properties of AIO.
    # ioo -a
    aio_active = 0
    aio_maxreqs = 65536
    aio_maxservers = 30
    aio_minservers = 3
    aio_server_inactivity = 300
    j2_atimeUpdateSymlink = 0
    j2_dynamicBufferPreallocation = 16
    j2_inodeCacheSize = 400
    j2_maxPageReadAhead = 128
    j2_maxRandomWrite = 0
    j2_metadataCacheSize = 400
    j2_minPageReadAhead = 2
    j2_nPagesPerWriteBehindCluster = 32
    j2_nRandomCluster = 0
    j2_syncPageCount = 0
    j2_syncPageLimit = 16
    lvm_bufcnt = 9
    maxpgahead = 8
    maxrandwrt = 0
    numclust = 1
    numfsbufs = 196
    pd_npages = 65536
    posix_aio_active = 0
    posix_aio_maxreqs = 65536
    posix_aio_maxservers = 30
    posix_aio_minservers = 3
    posix_aio_server_inactivity = 300
Disk IO pacing (High water-mark and Low Water-mark)
AIX 6.1 enables I/O pacing by default. In AIX 5.3, you needed to explicitly enable this feature.
It does this by setting the minpout and maxpout attributes of sys0 to 4096 and 8193, respectively.
Disk-I/O pacing is intended to prevent programs that generate very large amounts of output from saturating the system's I/O facilities and causing the response times of less-demanding programs to deteriorate.
When a process tries to write to a file that already has high-water mark pending writes, the process is put to sleep until enough I/Os have completed to make the number of pending writes less than or equal to the low-water mark. The logic of I/O-request handling does not change. The output from high-volume processes is slowed down somewhat.
The maxpout parameter specifies the number of pages that can be scheduled in the I/O state to a file before the threads are suspended. The minpout parameter specifies the minimum number of scheduled pages at which the threads are woken up from the suspended state. Prior to AIX 6.1, the default value for both the maxpout and minpout parameters is 0, which means that the I/O pacing feature is disabled. Changes to the system-wide values of the maxpout and minpout parameters take effect immediately without rebooting the system.
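The two water marks act as a hysteresis: a writer is suspended when pending writes reach maxpout and stays suspended until they drain to minpout. The following is a minimal sketch of that decision logic only, written to illustrate the description above. It is not the kernel implementation, and the function name pacing_decision and its arguments are hypothetical.

```shell
# Decide whether a writing thread may proceed.
# $1 = pages currently pending for the file
# $2 = maxpout (high-water mark), $3 = minpout (low-water mark)
# $4 = 1 if the thread is already suspended, 0 otherwise
# Prints "run" or "sleep".
pacing_decision() {
    pending=$1 maxpout=$2 minpout=$3 sleeping=$4
    if [ "$maxpout" -eq 0 ]; then
        echo run                      # maxpout=0 means pacing is disabled
    elif [ "$sleeping" -eq 1 ]; then
        # A suspended thread wakes only at or below the low-water mark.
        if [ "$pending" -le "$minpout" ]; then echo run; else echo sleep; fi
    else
        # A running thread is suspended at the high-water mark.
        if [ "$pending" -ge "$maxpout" ]; then echo sleep; else echo run; fi
    fi
}
```

With maxpout=8193 and minpout=4096, a thread that finds 5000 pages pending keeps running if it was running, but stays asleep if it was already suspended. The gap between the two marks is what keeps a heavy writer from being woken and suspended on every completed I/O.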
Setting the maxpout and minpout values as mount options overrides the system-wide settings for that file system. You can exclude a file system from system-wide I/O pacing by mounting it with both values set explicitly to 0. The following command is an example:

    mount -o minpout=0,maxpout=0 /<file system>
To change the high water-mark level
# chdev -a maxpout=20 -l sys0
Network Monitoring
netpmon Monitors activity and reports statistics on network I/O and network-related CPU usage. The netpmon command monitors a trace of system events, and reports on network activity and performance during the monitored interval. By default, the netpmon command runs in the background while one or more application programs or system commands are being executed and monitored. The netpmon command automatically starts and monitors a trace of network-related system events in real time. By default, the trace is started immediately; optionally, tracing may be deferred until the user issues a trcon command. When tracing is stopped by a trcstop command, the netpmon command generates all specified reports and exits.
    # netpmon
    Run trcstop command to signal end of trace.
    Fri Mar 23 10:08:43 2012
    System: AIX 6.1 Node: nbmedia200 Machine: 00C7C24E4C00

    # trcstop
    [netpmon: Reporting started]

    ========================================================================

    Process CPU Usage Statistics:
    -----------------------------
                                                       Network
    Process (top 20)             PID  CPU Time   CPU %   CPU %
    ----------------------------------------------------------
    netpmon                  2294244    6.8347   9.820   0.000
    ps                       3080288    0.0156   0.022   0.000
    ps                        983804    0.0146   0.021   0.000
    ps                       3145998    0.0143   0.021   0.000
    ps                       3015104    0.0129   0.019   0.000
    trcstop                  3080290    0.0053   0.008   0.000
    ksh                      1835464    0.0051   0.007   0.000
    topasrec                 2228292    0.0051   0.007   0.000
    ----------------------------------------------------------
    Total (all processes)               6.9672  10.011   0.003
    Idle time                          59.3796  85.319

    ========================================================================

    First Level Interrupt Handler CPU Usage Statistics:
    ---------------------------------------------------
                                                       Network
    FLIH                              CPU Time   CPU %   CPU %
    ----------------------------------------------------------
    data page fault                     0.0495   0.071   0.000
    UNKNOWN                             0.0170   0.024   0.000
    PPC decrementer                     0.0024   0.003   0.000
    external device                     0.0000   0.000   0.000
    queued interrupt                    0.0000   0.000   0.000
    ----------------------------------------------------------
    Total (all FLIHs)                   0.0689   0.099   0.000

    ========================================================================

    Network Device-Driver Statistics (by Device):
    ---------------------------------------------
                            ----------- Xmit -----------   -------- Recv ---------
    Device                   Pkts/s  Bytes/s  Util  QLen   Pkts/s  Bytes/s   Demux
    ------------------------------------------------------------------------------
    ethernet 0                 0.69       74  0.1%  0.00     1.61      243  0.0000
    ethernet 1                 0.00        0  0.0%  0.00     0.23      136  0.0000

    ========================================================================