
TOPAS - NMON:


TOPAS

Reports selected local and remote system statistics. The topas command requires the bos.perf.tools and perfagent.tools file sets to be installed on the system.

Navigation between the columns is possible with the arrow keys (<-, ->, ...)

------------------------

Default View (small letters):

- c n d f p – cool!!!    = CPU Network Disk Filesystem Processes

- c --> CPUs           c --> graph
                       c --> all
                       c --> off

- n --> networks       n --> totals
                       n --> each interface
                       n --> off

- d --> disks          d --> totals
                       d --> each disk
                       d --> off

- f --> filesystems    f --> totals
                       f --> each filesystem
                       f --> off

- p --> processes      p --> top 20 processes
                       p --> off

- if present: t = tape, w=WLM, @=WPARs
- a=reset all

------------------------

Detailed View (capital letters):

- D --> Disks in full detail    m --> Multi-path I/O (only if it is used)
                                d --> adapter view (scsi, vscsi)

                      in adapter view (d):
                                f --> disks (devices) attached to that adapter (first navigate to an adapter with the arrow keys, then hit f)
                                v --> virtual adapters only (vscsi)
               
- E --> Ethernet (shows adapters, SEA, Eth. Chan....) (SEA interface must be in up state)

- F --> Filesystems (more details than "f")

- L --> LPAR settings (SMT, physc, %entc..) and individual CPUs (logical CPUs) usage

- P --> Processes details (CPU%, TIME, Page space usage; page space usage is shown only here!!!)

- T --> Tape, if there is an ATAPE device attached

- V --> Volume group statistics
                       in volume group view:
                                f --> LVs in the VG (first navigate to a vg then hit f)

- W -->  WLM then @ --> WPAR

------------------------

CEC or Cross Partitions View or Whole Machine

topas -C    or    topas and hit C

On the HMC, "Allow performance information collection" should be enabled for the LPAR.

topas -C might not be able to locate partitions residing on other subnets. To circumvent this, create a $HOME/Rsi.hosts file containing the fully qualified host names for each partition (including domains), one host per line.
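A minimal sketch of creating such a file. The helper name write_rsi_hosts and the host names in the example call are placeholders, not taken from a real system:

```shell
# write_rsi_hosts FILE HOST...
# Writes one fully qualified host name per line to FILE,
# replacing any previous content of FILE.
write_rsi_hosts() {
    file=$1
    shift
    : > "$file"                      # create or truncate the file
    for host in "$@"; do
        printf '%s\n' "$host" >> "$file"
    done
}

# Example call (placeholder host names):
# write_rsi_hosts "$HOME/Rsi.hosts" lpar1.example.com lpar2.example.com
```

After that, topas -C started by the same user should pick up the partitions listed in the file.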



- s and d --> Shared CPU and Dedicated CPU sections
- g --> Global
- m --> Memory pool = AMS stats from Hypervisor (select 0, then hit f; note: only 1 pool)
- p --> CPU pool stats
- v --> VIO Server/Client disk (use f to select the VIOS)

------------------------

Navigation between topas and nmon:



------------------------

NMON (Nigel's Monitor):


c, C    CPU usage (c: small view  C:large view)
l -> #  it shows physical cpu usage
m       memory and paging statistics
n       network interface view
k       kernel statistics

t       processes --> [1=Basic 2=CPU 3=Perf 4=Size 5=I/O 6=Cmds]
A       AIO processes

.       displays only busy disks and processes

D       disk statistics (read/write KB/s)
d       disk statistics with graph (same as D just with graph)
a       adapter I/O statistics (read/write KB/s, %busy)
^       Fibre channel adapter statistics (fcstat; press ^, then hit a key, e.g. space)

j       jfs view
V       volume group statistics (read/write KB/s)

p       shared processor logical partition view
O       Shared Ethernet adapter statistics ("O" means OCean, SEA=sea)


nmon -k < disklist >    Reports only the disks in the disk list. (e.g. nmon -k hdisk1,hdisk2 only with original nmon)


If you use the same set of keys every time the nmon command is started, you can place the keys in the NMON shell variable.
For example, you can run the following command:
export NMON=mcd    (it will display by default memory, CPU, disk statistics)

------------------------

Capturing NMON data to file:

nmon -f -s "seconds" -c "count"

capture of a busy hour with 10 seconds interval: nmon -f -s 10 -c 360
(60minsx60secs=3600 secs-> with a 10 seconds interval it is 360 snapshots)

capture of a day with 5 mins interval: nmon -f -s 300 -c 288
(86400 seconds in a day divided by 300 (5 mins)= 288 snapshots)

For a detailed graph you can increase the count to 600-700, but there is no value in going above that.
(Every point in a graph takes 3 pixels, so 600x3=1800 pixels are needed on your screen to see that graph.)
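The interval and count arithmetic above can be sketched in shell. The function names snapshot_count and graph_width_px are made up for illustration, and the 3 pixels per point come from the note above:

```shell
# snapshot_count DURATION_SECONDS INTERVAL_SECONDS
# Prints the value to pass to -c so that the capture covers the whole
# duration when snapshots are taken every INTERVAL_SECONDS (-s).
snapshot_count() {
    echo $(( $1 / $2 ))
}

# graph_width_px COUNT
# Prints the screen width needed for the graph at 3 pixels per point.
graph_width_px() {
    echo $(( $1 * 3 ))
}

snapshot_count 3600 10      # busy hour, 10-second interval -> 360
snapshot_count 86400 300    # full day, 5-minute interval   -> 288
graph_width_px 600          # 600 points                    -> 1800
```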

-m <dir>    give output directory to nmon file
-T          captures top processes with command arguments (-t captures only top processes without command arguments)
-N          add NFS stats
-^          add FC stats
-O          add VIOS SEA stats
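A small sketch that puts these options together. It only prints the command line instead of running it, so it can be checked first. The function name nmon_capture_cmd and the directory /tmp/nmon are assumptions for illustration:

```shell
# nmon_capture_cmd DIR DURATION_SECONDS INTERVAL_SECONDS
# Prints an nmon capture command that writes to DIR, includes top
# processes with command arguments (-T) and covers the given duration.
nmon_capture_cmd() {
    dir=$1
    duration=$2
    interval=$3
    count=$(( duration / interval ))   # number of snapshots for -c
    echo "nmon -f -T -m $dir -s $interval -c $count"
}

# One day at 5-minute intervals, written to a hypothetical directory:
nmon_capture_cmd /tmp/nmon 86400 300
# prints: nmon -f -T -m /tmp/nmon -s 300 -c 288
```

Add -N, -^ or -O to the printed command if NFS, FC or SEA statistics are needed as well.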

------------------------

Some performance information (related to nmon):
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Power+Systems/page/nmon_FAQ

nmon reports more than 100% CPU utilisation for a process:
Unlike AIX commands, nmon reports the CPU use of a process per CPU. If your process is, for example, taking 250% then it is using 2.5 CPUs and must be multi-threaded. This is far better than the AIX tools because the percentages on larger machines make it very hard to determine if a process is using a whole CPU. On a 64 CPU machine a single process uselessly spinning on the CPU takes up 1.56% of the total CPU - this makes it very unclear what is going on.
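The two numbers in this explanation can be reproduced with a few lines of shell and awk. The function names are invented for this sketch:

```shell
# cpus_in_use PERCENT
# Converts the per-process CPU% shown by nmon into a number of CPUs.
cpus_in_use() {
    awk -v p="$1" 'BEGIN { printf "%.2f\n", p / 100 }'
}

# machine_share NCPUS
# Percentage of the whole machine that one fully busy CPU represents.
machine_share() {
    awk -v n="$1" 'BEGIN { printf "%.2f\n", 100 / n }'
}

cpus_in_use 250     # 2.50 CPUs
machine_share 64    # 1.56 percent of a 64 CPU machine
```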


adapter busy goes over 100%:
There are no adapter stats in AIX. They are derived from the disk stats. The adapter busy% is simply the sum of the disk busy%.
So if the adapter busy% is, for example, 350% then you have 3.5 disks busy on that adapter. Or it could be 7 disks at 50% busy or 14 disks at 25% or ....
There is no way to determine the real adapter busy: the adapter has a dedicated on-board CPU that is always busy, and we don't run nmon on these adapter CPUs to find out what they are really doing!!
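How the adapter busy% adds up can be illustrated with a short sketch. The function name adapter_busy is hypothetical, and it expects one disk busy% value per line on standard input:

```shell
# adapter_busy
# Reads one disk busy% value per line from stdin and prints their sum,
# which is how the adapter busy% described above is derived.
adapter_busy() {
    awk '{ sum += $1 } END { printf "%d\n", sum }'
}

# 7 disks at 50% busy each:
printf '%s\n' 50 50 50 50 50 50 50 | adapter_busy    # 350
```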


CPU wait is too high:
CPU "waiting for I/O" means the CPU is idle but has a disk I/O outstanding. Historically this was used to highlight that your application is being held up by slow disks or disk problems. In the Wait for I/O state the CPU is actually free to do other work and is NOT looping waiting for the disk - it in fact actioned the adapter to perform the disk I/O, put the calling process to sleep and carried on. If there is no other process, it is in the same loop as in the Idle state, i.e. it is available to do other things.

In benchmarks, Wait for I/O is seen positively as an opportunity - we can throw in more work to boost throughput. In fact, faster CPUs would mean even higher wait values.

free memory is near zero:
This is just how AIX works and is perfectly normal. All of memory will be soaked up with copies of filesystem blocks after a reasonable length of time and the free memory will be near zero. AIX will then use the lrud (least recently used daemon) process to keep the free list at a reasonable level. If you see the lrud process taking more than 30% of a CPU then you need to investigate and make memory parameter changes.
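A rough sketch of checking that 30% threshold. It assumes lines of the form "CPU% command" on standard input, for example taken from a process listing that includes kernel processes. How that listing is produced is left open here, and the function name lrud_over_limit is made up:

```shell
# lrud_over_limit LIMIT
# Reads "CPU% command" lines from stdin (assumed format) and prints the
# lrud lines whose CPU% is above LIMIT.
lrud_over_limit() {
    awk -v limit="$1" '$2 == "lrud" && $1 + 0 > limit + 0 { print }'
}

# Sample input with made-up values:
printf '%s\n' '35.0 lrud' '2.0 syncd' | lrud_over_limit 30
# prints: 35.0 lrud
```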

------------------------

16 comments:

  1. Hi,

    I've allocated 1 physical processor (core) to AIX LPAR. But when i generate nmon report using nmon analyser, it is showing as 2 CPUs.

    CPU_SUMM tab
    PCPU_ALL
    PCPU01
    PCPU02
    CPU_ALL
    CPU_01
    CPU_02

    when i ran lparstat -i
    desired or entitled capacity is only 1

    but why nmon report is showing as 2 CPUs

    please explain.

    Replies
    1. Hi, this is because of SMT (Simultaneous Multi Threading). Here you can find some description: http://aix4admins.blogspot.hu/2011/08/commands-and-processes-process-you-use.html.

    2. Do you have any document/procedure to refer . I actually want to configure NMON in my AIX servers.

  2. Hi , thanks for all your great info on this Blog, really appreciate it.

    I have a question, how can i place the keys in the NMON shell variable you mentioned above in this article.

    For example, you can run the following command:
    export NMON=mcd (it will display by default memory, CPU, disk statistics)

    please help. thanks.

    Replies
    1. Hi, I found this export NMON... setting in the .kshrc file in root's home directory. You can modify it there or add your own version.

  3. Hi, I would like to know that what's the difference between IBM Performance Management for Power Systems and NMON data collection and the reports..

  4. How to check memory usage in each application used.

  5. How to check job on last days ?.

    example : I want to know job on date 7th May 2014 & time 09.00 PM till 11.00 PM

    Pla advice.

  6. Hello, when I run nmon it shows the following error:

    # nmon
    ERROR: Assert Failure in file="nmonDisplay.c" in function="init_cpu_stats" at line=3165
    ERROR: Reason=System call returned -1
    ERROR: Expression=[[cpudata = perfstat_cpu(&perfid, p->cpus, sizeof_perfstat_cpu_t, absolute_max_cpus)]]
    ERROR: errno=109
    ERROR: errno means : Function not implemented
    ERROR: Sizeof cpu=616 cpu_total=696 disk=496 diskadapter=200 diskpath=312 disktotal=192, memory=352 netbuff=128 netif=240 netiftotal=80 paging=248 partition=656 protocol=728
    You have mail in /usr/spool/mail/root
    #

  7. Hi,

    If i use "-d " option in nmon, whether it will give disk service time graphs or disk statistics kb/s graph?. What is command to get service time graphs in the nmon report?

  8. I want to get the IBM power720 system info similar to VMware ESXi in CSV file. How can i get it using nmon or topas?
    nmon analyser data i dont want to use. I need raw data in some readable form.

  9. Nmon hold all the disk when running and give disk busy error while deleting any disk. Is there any fix for this issue

    Replies
    1. The workaround is to kill nmon process temporarily, then do necessary action (like disk driver update), then restart nmon.

  10. Can NMON be used to display the particular application performance and not including the complete server performance. Like we have one perl script, it takes around 1 hr to calculate some simulations , we want to measure the performance. Please suggest if we can do the same using NMON?

  11. Hi, I can capture data using nmon -f, but I cannot extract the memory usage or cpu usage from the file. Can any one help me point out in which part I can get the data ?
