We can't tune what is not being taxed, we can't tune what can't be tracked.
(we tune the intensities not the sleepy times)
If it runs fast enough we are done (you can't tune it forever)
- date; uname -a; id; oslevel -s; lparstat -i
- check hardware (prtconf | more -- lsdev | grep vail -- lscfg | grep +)
- think to think or move the data workload
- check near-static structures (lvm, paging space, settings)
- check historical data (events, errors)
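The checks above are easier to compare later if their output is saved at the time. A minimal sketch, assuming a POSIX shell; the helper name collect_baseline and the file naming scheme are illustrative only and not part of any AIX tool:

```shell
# collect_baseline DIR CMD...   (hypothetical helper)
# Runs each command string and saves its output to DIR/<sanitized name>.out.
collect_baseline() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for cmd in "$@"; do
        # Turn the command string into a safe file name.
        name=$(printf '%s\n' "$cmd" | sed 's/[^A-Za-z0-9]/_/g')
        # A missing command just leaves its error text in the file; the loop goes on.
        sh -c "$cmd" > "$outdir/$name.out" 2>&1
    done
    return 0
}
```

Possible use: collect_baseline /tmp/baseline "uname -a" "oslevel -s" "lparstat -i" "vmo -L" "ioo -L"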
Memory - VMM
- shared memory segments: ipcs -bm (most of the time not all of the memory is allocated)
the reported values show the maximum size to which the memory segments can grow (they are a major component of computational memory)
- uptime; vmstat -s (it increments since boot)
- uptime; vmstat -v (I/O goes through fsbufs -> then pbufs (each of them can be exhausted))
- vmo -L; ioo -L
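To spot exhausted fsbufs or pbufs in the vmstat -v output, it can help to print only the non-zero "blocked" counters. A small sketch, assuming the counter lines contain the phrase "blocked with no" (as in typical AIX output); the function name blocked_io is made up for this example:

```shell
# blocked_io: read `vmstat -v` text on stdin and print only the
# "blocked with no ..." lines whose counter is greater than zero.
blocked_io() {
    awk '/blocked with no/ && $1 + 0 > 0 { print }'
}
```

Possible use: vmstat -v | blocked_io. The counters accumulate since boot, so compare two runs taken some minutes apart to see whether a value is still growing.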
I/O - LVM:
- df -k (how much content is governed by 1 inode)
- tech-stack map: RAIDset ->LUN->LVM(VG:lv:fs/with options) -> logical content
- iostat -a, iostat -D
- lvmo -a -v <vgname> (pbufs can be checked and increased if needed)
- iostat -AQ 2 (asynchronous I/O stats)
- uptime; ps -ekf | egrep "syncd|lrud|nfsd|biod|wait"
compare the accumulated time of lrud with that of syncd: if lrud is greater, it should grab your attention (if lower, it is fine)
if lrud is high, it is scanning and freeing and scanning and freeing...
lrud has high priority, so while it is running not much other work can be done (reduce lrud activity to let other processing run)
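The lrud/syncd comparison can be sketched as a small filter. It assumes its input is one "command time" pair per line, with the time in [dd-][hh:]mm:ss form; something like ps -ek -o comm= -o time= may produce that on your system, but treat that invocation as an assumption and check it first. The function name compare_lrud_syncd is made up:

```shell
# compare_lrud_syncd: read "command time" pairs on stdin and compare
# the summed CPU time of lrud with that of syncd.
compare_lrud_syncd() {
    awk '
    # Convert [dd-][hh:]mm:ss to seconds.
    function secs(t,    d, p, n, i, total, days) {
        days = 0
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        total = 0
        for (i = 1; i <= n; i++) total = total * 60 + p[i]
        return days * 86400 + total
    }
    BEGIN { lrud = 0; syncd = 0 }
    $1 == "lrud"  { lrud  += secs($2) }
    $1 == "syncd" { syncd += secs($2) }
    END {
        verdict = (lrud > syncd) ? "ATTENTION" : "OK"
        printf "%s lrud=%ds syncd=%ds\n", verdict, lrud, syncd
    }'
}
```

ATTENTION only means lrud has accumulated more time than syncd since boot; it is a hint to look at memory settings, not a diagnosis.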
- ps -kelmo THREAD (shows the threaded world)
- ps guww (shows processes in descending %CPU order; RSS: size in real memory, SZ: size in virtual memory, STIME: start time, TIME: accumulated CPU time)
- ps gvww (shows in ascending PID (PGIN:how many pages are moved))
- ps -ef | grep -v "Oct 20" (the day of boot is grepped out, so the output shows which processes have been started since then)
- ps -ef | grep "LOCAL=NO" (for Oracle client sessions)
- netstat -ss (check non-zero values)
- netstat -v (queue overflow)
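For the df -k item in the list above, "how much content is governed by 1 inode" is simply used KB divided by used inodes. A sketch, assuming the AIX-style column order Filesystem, 1024-blocks, Free, %Used, Iused, %Iused, Mounted on; the function name kb_per_inode is made up:

```shell
# kb_per_inode: read `df -k` text on stdin and print, per mount point,
# the used KB divided by the number of used inodes.
# Assumed columns: Filesystem 1024-blocks Free %Used Iused %Iused Mounted-on
kb_per_inode() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && $5 > 0 {
        printf "%s %.1f\n", $7, ($2 - $3) / $5
    }'
}
```

Possible use: df -k | kb_per_inode. Lines without numeric values (for example /proc) are skipped.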
6 in 1 tool:
- vmstat -Iwt 2
if CPU bound -> tprof is used to spot the processes which are using the most CPU
if memory bound -> svmon is used to help to find what is using the most memory
if i/o bound -> filemon will help to find what is causing all of the disk activity
CPU wait is too high, how can I reduce it?
A CPU in Wait for I/O mode is not a problem. The CPU is actually in Idle mode, but it has noted that there is disk I/O outstanding, so it is reported as Wait instead of Idle. Many workloads that process data faster than it can be read from disk will show up as high Wait. A CPU in Wait for I/O mode is fully available to run more application code.
In benchmarks, Wait for I/O is seen positively as an opportunity: we can throw in more work to boost throughput.
Any workload in which the CPU does little work compared to the volume of disk I/O is going to give you high Wait for I/O.
If this high Wait for I/O is a sudden change from the normal pattern then it needs investigating and you should make sure as many disks as possible are involved in the disk I/O.
In fact, faster CPUs would mean even higher Wait values.
Which process consumes most memory?
Run topas -P; you can tab to the "page space" column to sort on it. It is called the "page space" column because it shows the memory usage that is backed by that amount of paging space (which is the size of the process in memory).
Which process has used the DISK I/O most frequently?
Start nmon --> t for top processes -->Hit 5 to list them in I/O order, then look at the Char I/O column
Free memory is near zero, how do I free more memory?
This is just how AIX works and is perfectly normal. All memory will be soaked up with copies of filesystem blocks after a reasonable length of time and the free memory will be near zero. If your file systems cache is a large percentage of memory then you are avoiding disk I/O. This is a good thing. You should NOT try to reduce it - this could damage performance.
AIX will then use the lrud process to keep the free list at a reasonable level. If you see the lrud process taking more than 30% of a CPU then you need to investigate and make memory parameter changes.
Paging space usage is 20%, how does this affect performance?
20% of the paging space can be allocated without any actual I/O taking place. You need to look at the paging stats to determine whether paging I/O is actually happening. Allocating paging space by itself does not have a performance impact.
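One way to check whether paging I/O is actually happening is to compare two vmstat -s snapshots, since those counters increment since boot. A sketch, assuming the snapshots contain the lines "paging space page ins" and "paging space page outs"; the function name paging_delta and the file paths are made up:

```shell
# paging_delta BEFORE AFTER: compare two saved `vmstat -s` outputs and
# print how many paging space page ins/outs happened between them.
paging_delta() {
    awk '
    /paging space page ins/  { ins[FILENAME]  = $1 }
    /paging space page outs/ { outs[FILENAME] = $1 }
    END {
        printf "page-ins=%d page-outs=%d\n", ins[ARGV[2]] - ins[ARGV[1]], outs[ARGV[2]] - outs[ARGV[1]]
    }' "$1" "$2"
}
```

Possible use: vmstat -s > /tmp/vs.1; sleep 60; vmstat -s > /tmp/vs.2; paging_delta /tmp/vs.1 /tmp/vs.2. If both deltas stay at zero, the allocated paging space is not causing I/O.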