HW - CPU, PROCESSES

CPU - PROCESSES:

Planar: The board where the CPUs and memory DIMMs are located (sometimes called "Processor book", "CPU board", "CPU planar" ...).
Socket: P9 boards contain at most 4 CPU sockets, which connect the CPUs to the board. (Processor modules are plugged into the sockets on the board.)
Processor Module: A single rectangular entity (6-7 cm) which is placed into the socket on the board and contains the CPU cores.

Power 8 technology had 2 module types: SCM (Single Chip Module) and DCM (Dual Chip Module). A chip means an integrated circuit with some number of cores. E850 servers used DCM modules, which means that 2 chips (each chip containing 6 cores) were combined within a DCM, so this module had 12 cores. E880 servers used SCMs, where a 12-core module contained only 1 chip. All Power9 servers are SCM servers, and each module can contain 4-12 cores. (Each core has 8 hardware threads, which are utilized by SMT in AIX.)

CPU, processor, chip and core can have different meanings, which is sometimes confusing, but the main point is how many cores we have in a server. This picture shows an E950 logical system diagram, where the green P9 boxes are the sockets or modules, which contain the P9 processors or CPUs or cores.


------------------------

Physical - Virtual - Logical CPU:

Physical Processors (PP) (or Physical cores, or PC in vmstat output) are cores which are manufactured into the machine by IBM when we buy a Power server. Virtual Processors (VP) are assigned to an LPAR manually when an LPAR is created. Logical Processors (LP) are created automatically by AIX, depending on the SMT setting.
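
As a rough illustration of how the three levels relate: the number of logical CPUs AIX presents is the number of virtual processors multiplied by the SMT level. The helper name logical_cpus is made up for this sketch; it only does the arithmetic and does not query the LPAR.

```shell
# Hypothetical helper: logical CPUs = virtual processors x SMT threads per core.
# It does not read any setting from the system.
logical_cpus() {
    vps=$1      # virtual processors assigned to the LPAR
    smt=$2      # SMT level (1, 2, 4 or 8)
    echo $(( vps * smt ))
}

logical_cpus 4 8    # 4 VPs at SMT8
logical_cpus 2 4    # 2 VPs at SMT4
```

With 2 VPs at SMT4 this gives 8 logical CPUs, numbered 0-7 in tools like mpstat.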

------------------------

SMT (Simultaneous multithreading)

SMT permits multiple independent threads to be executed on a CPU core to better utilize its resources. IBM Power processors have supported SMT technology since Power 5 (SMT2). SMT2 means that 2 independent threads can run simultaneously on 1 CPU core. Power 7 processors introduced SMT4, and Power 8/9 support SMT8. This feature usually allows multithreaded applications to run faster.

Within a CPU (core, cpu or processor, these are the same thing) there are multiple execution units, for example the floating point arithmetic unit and the load and store execution units. A single thread would use only 1 or 2 of those units at any point in time, so most of the execution units within a core would not be utilized. With multi-threading, 2, 4 or 8 threads can run in a core at the same time. One of them uses the floating point unit while another is doing load and store ... If there are collisions, one of them is delayed, so the first (primary) thread is stronger than the others (secondary, tertiary).

The smtctl command controls the maximum number of threads configured per core. (Each thread is called a Logical CPU in AIX.)
smtctl             <--displays smt settings
smtctl -t 8        <--change smt threads to 8 (no reboot required)

smtctl changes the SMT mode of all processor cores in the partition.  The SMT mode specified is the maximum SMT level, and not a fixed level. AIX dynamically changes the SMT level up to the maximum permitted. During periods where there are few software threads available to run, the operating system can dynamically reduce the SMT mode.

Intelligent SMT threads:
AIX default behaviour is to use all the VPs for maximum performance. If the workload grows it will use up all VPs (CPU cores) quickly, but AIX first uses SMT thread 1 on all CPU cores before allocating work to the 2nd, 3rd and 4th SMT threads.

SMT threads can be seen as Logical CPUs on AIX. If SMT=4 then 1 VP shows up as 4 Logical CPUs. There is an SMT feature called "intelligent SMT threads". If there are not enough processes to run on all SMT threads (configured mode is SMT=4), the mode will be dynamically switched to 2 or 1.

mpstat or topas -L shows it:

In the "lpa" column a "-" sign marks SMT threads that are turned off

# mpstat 2 

cpu  min  maj  mpc  int   cs  ics   rq  mig lpa sysc us sy wa id   pc  %ec  lcs
  0    0    0    0  265   35   24    2    0 100   64 100  0  0  0 0.63 31.7   99
  1    0    0    0   12   12    0    0    0 100    9  0  0  0 100 0.12  6.1   22
  2    0    0    0    9    0    0    0    0   -    0  0  0  0 100 0.12  6.1   20  <--this SMT thread is turned off
  3    0    0    0    9    0    0    0    0   -    0  0  0  0 100 0.12  6.1   19  <--this SMT thread is turned off
  4    0    0    0  100   12    8    1    0 100    0 100  0  0  0 0.64 31.8   99
  5    0    0    0   19   59    0    0    0 100    9  0  0  0 100 0.12  6.1   69
  6    0    0    0    9    0    0    0    0   -    0  0  0  0 100 0.12  6.1    9  <--this SMT thread is turned off
  7    0    0    0    9    0    0    0    0   -    0  0  0  0 100 0.12  6.1    9  <--this SMT thread is turned off
ALL    0    0    0  432  118   32    3    0   0   82 63  0  0 37 2.00 999.8  346
--------------------------------------------------------------------------------
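
To count the switched-off SMT threads without reading the table by eye, a small awk filter can be used. This is a sketch that assumes the column layout of the sample above ("lpa" as the 10th field); the function name count_idle_smt is made up for the example.

```shell
# Count logical CPUs whose "lpa" column (10th field of the mpstat output) is "-".
# Assumption: the column layout matches the sample above. The header and the
# ALL summary line are skipped because their first field is not a number.
count_idle_smt() {
    awk '$1 ~ /^[0-9]+$/ && $10 == "-" { n++ } END { print n + 0 }'
}

# Header plus two rows taken from the sample output, fed in via a here document.
count_idle_smt <<'EOF'
cpu  min  maj  mpc  int   cs  ics   rq  mig lpa sysc us sy wa id   pc  %ec  lcs
  1    0    0    0   12   12    0    0    0 100    9  0  0  0 100 0.12  6.1   22
  2    0    0    0    9    0    0    0    0   -    0  0  0  0 100 0.12  6.1   20
EOF
```

On a live system the same function could be fed from a single mpstat interval (for example "mpstat 2 1 | count_idle_smt"), provided the columns are in the same order.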

--------------------------------------------

Raw vs Scaled throughput mode

Raw and scaled throughput modes control how threads are dispatched to multiple cores. In raw throughput mode, a multithreaded application will use the primary threads first on each core before utilizing the secondary, tertiary etc. threads. (In the picture below, thread #1 will use Logical CPU0 on the first core and thread #2 will use Logical CPU0 on the second core.)



In scaled throughput mode the application will use all SMT threads on a core (primary, secondary etc.) before going to the next core (so fewer VPs are utilized).

How many threads will be utilized depends on the schedo tunable "vpm_throughput_mode". There are 5 options (0, 1, 2, 4 and 8) that can be set dynamically. Modes 0 and 1 cause the AIX partition to run in raw throughput mode, and modes 2-8 use the scaled throughput mode. The default is raw throughput mode (vpm_throughput_mode=0). vpm_throughput_mode defines how many threads will be dispatched to the same processor (based on the max SMT value configured by smtctl) before enabling another processor.

If we have 8 cores, and an application uses 8 threads, then in raw throughput mode the threads are dispatched on the primary processor threads. The secondary and tertiary processor threads are utilized only when the application thread count exceeds 8 and 16 respectively. (In this example SMT 4 has been configured with smtctl.)


For scaled throughput mode SMT2 (vpm_throughput_mode=2), SMT4 (vpm_throughput_mode=4) and SMT8 (vpm_throughput_mode=8) can be used. These options control SMT usage on each virtual processor (core) before unfolding another VP. (A higher value of vpm_throughput_mode means that fewer cores are unfolded.)

8 application threads example with various vpm_throughput_mode settings.
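
The effect of the different settings can be approximated with a small calculation. This is a simplified model of the description above, not the actual AIX dispatcher; the function name cores_used and its argument order are made up for the sketch.

```shell
# Simplified model (an assumption, not the AIX dispatcher): how many cores are
# unfolded for a number of runnable threads under a given vpm_throughput_mode.
# Arguments: threads, cores in the partition, SMT level set by smtctl, mode.
cores_used() {
    threads=$1; cores=$2; smt=$3; mode=$4
    if [ "$mode" -le 1 ]; then
        per_core=1                      # raw mode: primary threads first
    elif [ "$mode" -lt "$smt" ]; then
        per_core=$mode                  # scaled mode: this many threads per core
    else
        per_core=$smt                   # cannot exceed the configured SMT level
    fi
    needed=$(( (threads + per_core - 1) / per_core ))   # round up
    [ "$needed" -gt "$cores" ] && needed=$cores
    echo "$needed"
}

# 8 application threads on 8 cores with SMT4 configured
for mode in 0 2 4; do
    echo "vpm_throughput_mode=$mode: $(cores_used 8 8 4 "$mode") cores"
done
```

In this model the 8 threads occupy 8 cores in raw mode, 4 cores with vpm_throughput_mode=2 and 2 cores with vpm_throughput_mode=4.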


A higher vpm_throughput_mode packs more software threads onto each core, which can lengthen application response times, but at the same time fewer virtual processors are unfolded, which reduces the overall demand on the shared processing pool.

------------------------

IMPORTANT:

- Prior to AIX6 TL4: if only a single hw thread was busy, the processor was reported as 100% utilized (this is misleading, because the secondary threads were not utilized)

- AIX6 TL4 and later: the potential capacity of the unused hw threads is reported as idle time for the processor

------------------------

VPM (virtual processor management)

VPM is an AIX feature which monitors the CPU utilization and calculates the number of virtual processors (VP) needed. (Once per second it checks the CPU usage and adds 20% for headroom.) If the CPU usage is below 80%, a VP will be disabled (folded); if the CPU usage is above 80%, a VP will be enabled (unfolded).

In this example a partition has 8 VPs and initially it is idle, so only 1 VP is enabled (unfolded).


At 0 seconds we start 8 threads. At this time only 1 VP is unfolded, so all threads are running on one processor. At 1 second an additional VP gets unfolded (since the CPU utilization exceeds the folding threshold). The same happens every second until all eight VPs are unfolded (enabled).

(This scenario assumes we use the default AIX behavior, which is the raw throughput mode. The number of unfolded VPs would be lower when scaled throughput mode is used, because the 8 threads would be placed on 4 or 2 VPs.)

For most workloads, unfolding one VP per second is sufficient. However, some workloads are time sensitive and require available CPUs immediately. Disabling VPM folding would be one option, but that would cause all VPs to be active all the time (whether they are needed or not). Another way is to unfold more VPs per second, which can be done through the schedo tunable "vpm_xvcpus".
If we set vpm_xvcpus to 2, each second it will unfold 1 VP plus 2 additional VPs (the 2 VPs come from vpm_xvcpus=2).



Using the same example with vpm_xvcpus=2: at the beginning 3 of the 8 VPs are unfolded due to the setting vpm_xvcpus=2. At 0 seconds we start a workload that consists of 8 threads. The 8 application threads are now spread across the 3 unfolded processors. At 1 second VPM determines that one additional virtual processor is needed. 3 were enabled before, so with the 2 extra VPs from vpm_xvcpus the new calculated value is 6. At 2 seconds, all virtual processors are unfolded.
This example uses vpm_xvcpus=2 to demonstrate how quickly virtual processors are unfolded through vpm_xvcpus tuning; however, a vpm_xvcpus value of 1 is usually sufficient for response time sensitive workloads.
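
The two unfolding scenarios above can be reproduced with a short loop. It is a sketch built only from the description in this section (1 VP plus vpm_xvcpus extra VPs per second under sustained load); the function name unfold_timeline is made up, and the real VPM calculation also takes the measured utilization into account.

```shell
# Simplified timeline of VP unfolding (an assumption based on the text above):
# each second VPM unfolds 1 VP plus vpm_xvcpus extra VPs, up to the number of
# VPs configured for the partition. Arguments: total VPs, vpm_xvcpus.
unfold_timeline() {
    total=$1; xvcpus=$2
    unfolded=$(( 1 + xvcpus ))          # idle partition: 1 VP plus the extras
    [ "$unfolded" -gt "$total" ] && unfolded=$total
    sec=0
    while :; do
        echo "t=${sec}s unfolded=$unfolded"
        [ "$unfolded" -ge "$total" ] && break
        unfolded=$(( unfolded + 1 + xvcpus ))
        [ "$unfolded" -gt "$total" ] && unfolded=$total
        sec=$(( sec + 1 ))
    done
}

unfold_timeline 8 2
```

For 8 VPs and vpm_xvcpus=2 this prints 3, 6 and 8 unfolded VPs at 0, 1 and 2 seconds; with vpm_xvcpus=0 it takes until second 7 to reach all 8.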

------------------------

Context Switch:
It is inherent in any multiprocessing operating system, because different application threads share a CPU. Every time one thread leaves a CPU and a new thread is dispatched to it, a context switch occurs. The environment of the leaving thread has to be saved and the environment of the new thread has to be re-established. High context switch rates cause a lot of extra work (overhead) for the CPU, which can be a problem.

------------------------


PROCESS:

You use commands to tell the operating system what task you want it to perform. When commands are entered, they are recognized by a command interpreter (also known as a shell), and the task is processed.

A program or command that is actually running on the computer is referred to as a process.

The common types of processes:

Foreground processes
Processes that require a user to start them or to interact with them. Programs and commands run as foreground processes by default.

Background processes
Processes that are run  independently of a user. To run a process in the background, type the name of the command with the appropriate parameters and flags, followed by an ampersand (&). When a process is running in the background, you can perform additional tasks by entering other commands at the command prompt. Most processes direct their output to standard output (stdout), even when they run in the background. Because the output from a background process can interfere with your other work on the system, it is usually good practice to redirect the output of a background process to a file.
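
A minimal sketch of the practice described above: the command is started with & and both stdout and stderr go to a file. The subshell with sleep stands in for a real long-running command, and the file name under /tmp is just an example.

```shell
# Run a command in the background and send its output to a file instead of
# the terminal. The path below is an example name, not a convention.
bg_out=/tmp/bg_demo.$$.out

( echo "started"; sleep 1; echo "finished" ) > "$bg_out" 2>&1 &
bg_pid=$!               # $! holds the PID of the last background command

echo "the prompt is free while PID $bg_pid runs"
wait "$bg_pid"          # only needed here so the file is complete before reading
cat "$bg_out"
```

In an interactive session the wait is normally left out; the output can be checked in the file at any time.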

Daemon processes
Daemons are processes that run unattended. They are constantly in the background and are available at all times. Daemons are started usually when the system starts, and they run until the system stops. A daemon process typically performs system services. For example qdaemon (provides access to system resources such as printers) and sendmail are daemons.

Zombie processes
A zombie process is a dead process that is no longer executing but still has an entry in the process table (in other words, it has a PID number). Zombie processes have been killed or have exited, and they remain in the process table until the parent process collects their exit status, the parent process dies, or the system is shut down and restarted. Zombie processes display as <defunct> when listed by the ps command. A zombie cannot be removed with kill; if its parent never collects it and cannot be stopped, a reboot is the remaining option.

Thread
Each process is made up of one or more kernel threads. A thread is a single sequential flow of control. Rather than duplicating the environment of a parent process, as done via fork, all threads within a process use the same address space and can communicate with each other through variables.

------------------------------

kill

The kill command sends a signal to a running process, which normally stops the process. The signal given (a number like -9) determines the effect on the running process:
-15: Sends a notification to the program to terminate, it is the default
-9: Kills the application without notification
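
The difference between the two signals can be seen with a child process that installs a handler. This is a sketch: the child script, its messages and the log file name are made up for the demonstration.

```shell
# A child that catches signal 15 (SIGTERM) can clean up before exiting.
# Signal 9 (SIGKILL) cannot be caught, so with kill -9 no cleanup would run.
log=/tmp/term_demo.$$

sh -c 'trap "echo cleanup done; exit 0" TERM
       echo ready
       while :; do sleep 1; done' > "$log" &
child=$!

sleep 2                 # give the child time to install its trap
kill -15 "$child"       # same as plain "kill <pid>"
wait "$child"
cat "$log"
```

The log ends with the line written by the handler. Replacing -15 with -9 would stop the child immediately and that line would never be written.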


------------------------------

nohup (no hang up)

The nohup command prevents a process from being killed if you log off from the system before it completes. If the process is already running, nohup -p modifies the specified process to ignore all hangup (SIGHUP) signals. It is good for long-running processes, as it can be combined with & to run a process in the background and prevent it from being killed at logoff. If you do not redirect output, nohup will redirect it by default to the file nohup.out.

nohup alt_disk_copy -d hdisk1 -B &      it will start alt_disk_copy in the background with nohup
nohup -p <pid>                          modifies given process to ignore hangup signals

------------------------------

Process priority

A priority is a number assigned to a thread. The kernel maintains a priority value (0-255). A smaller priority value indicates a more important thread. Real time thread priorities are lower than 40.

Nice value
A nice value is a priority adjustment factor added to the base user priority of 40 (for non-fixed priority threads). The nice value is used by the system to calculate the current priority of a running process. The first process in the system (init) has a nice value of 20, and therefore an effective priority of 60. (PRI heading in the below output) A foreground process has a nice value of 20 (24 for a background process).

ps -el                 shows process priorities
ps -ekl                shows process priorities including kernel processes
ps -kmo THREAD         shows processes with their threads priorities


root@aix31: / # ps -el    <--shows the nice values under the NI heading (-- means it is running with fixed prio.)
       F S UID    PID   PPID   C PRI NI ADDR    SZ    WCHAN    TTY  TIME CMD
  200003 A   0      1      0   0  60 20 7037000   784               -  0:39 init
  200103 A   0 311326 352456   0  24 -- 81d8400  4676               -  1:15 xmtopas
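
The relation between NI and PRI in the listing above can be written as a one-line calculation. This is a simplification: it gives the starting priority of a non-fixed-priority thread and ignores the CPU usage penalty the scheduler adds while the thread runs. The function name base_priority is made up for the sketch.

```shell
# Starting priority of a non-fixed-priority thread: base user priority (40)
# plus the nice value. The run-time CPU penalty is deliberately left out.
base_priority() {
    nice_value=$1
    echo $(( 40 + nice_value ))
}

base_priority 20    # foreground default, e.g. init in the listing above
base_priority 24    # process started in the background from ksh
```

The first call gives 60, matching the PRI column of init; the second gives 64.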

The nice value can be set at process creation time by using the nice command. If the process is already running, the renice command is used.
(ksh will add automatically 4 to the default nice value (20) if a process is started in the background (&))

The nice value can range from 0 to 39, with 39 being the lowest priority.
nice -10 <command>         add 10 to current nice value (lower priority)
nice --10 <command>        subtract 10 from current nice value (higher priority)

The renice value can be -20 to 20. (1 to 20: lowers the priority, 0: sets to the base scheduling priority, -20 to -1: raises the priority)
renice 10 -p <pid>         add 10 to the default nice value (20) (lower priority)
renice -n 10 -p <pid>      add 10 to current nice value  (lower priority)
renice -10 -p <pid>        subtract 10 from the default nice value (20) (higher priority)
renice -n -10 -p <pid>     subtract 10 from current nice value (higher priority)
                           (-n: increment is added to the current nice value, not default)
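
To keep the increments straight, the resulting nice value can be worked out by hand or with a few lines of shell. The helper new_nice is hypothetical and only does the arithmetic within the 0-39 range given above; it does not call nice or renice.

```shell
# Hypothetical helper: nice value after applying an increment, kept within
# the 0-39 range. Arguments: current nice value, increment (may be negative).
new_nice() {
    current=$1; incr=$2
    n=$(( current + incr ))
    [ "$n" -lt 0 ] && n=0
    [ "$n" -gt 39 ] && n=39
    echo "$n"
}

new_nice 20 10      # like: nice -10 <command>
new_nice 20 -10     # like: nice --10 <command>
new_nice 24 30      # would exceed the range, so it stays at 39
```

The three calls give 30, 10 and 39.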

---------------------------

CPU infos:

lscfg | grep proc            shows how many (virtual) processors we have (lsdev -Cc processor also shows how many virtual processors we have)
bindprocessor -q             shows how many logical (SMT) processors we have
lsattr -El procX             shows the processor settings
pmcycles -m                  shows the processors speed (if smt is enabled it will show for all the logical processors)
smtctl                       shows how many processors we have (and whether SMT is turned on or not)

---------------------------

Process handling:

Ctrl-C or Ctrl-Backspace   cancels a foreground process

ps                         lists processes (by default lists only processes started from the current terminal)
    -e                     every process running on the system
    -f                     full listing (username, PPID...)
    -L <pid>               lists all processes whose PPID is <pid>
    -u <user>              lists all processes running under <user>
    -T                     lists the tree of a given process (shows the children of a given process)

ps -elmo THREAD            lists processes and its threads (shows pids and the threads (tid) which belong to a given process)

proctree <pid>             displays the process tree of the specified process

kill <pid>                 notification to the process to terminate (it is using the default, 15, signal)
kill -9 <pid>              kills the process without notification
kill -1 <pid>              sends the hangup signal (HUP); many daemons reread their config files on it, other processes terminate
                           (when a background process is running and you log off a hangup signal is sent)
kill -2 <pid>              interrupt signal (same as ctrl+c)
kill -l                    lists all the signals supported by kill (cat /usr/include/sys/signal.h will show as well, with details)

ls -R / > ls.out &         starts ls in the background (standard output is ls.out)
nohup ls -R / > ls.out &   nohup allows a background process to continue after logging off the system
                           (if output isn't redirected, it will create nohup.out)
echo "<command>" | at now  this also starts in the background (and you can log off)
jobs                       lists which processes are running in the background

nohup alt_disk_copy -d hdisk1 -B & runs in the background and ignores hangup (the kill command can still stop it)

Restarting a stopped foreground process (jobs command):
1. Ctrl-Z                  suspends a foreground process, its PID is still in the process table (the job is stopped, not terminated)
2. jobs                    this will list stopped processes
[1] + Stopped (SIGTSTP)        ./myscript    <--you will see a line like this (here #1 is the job id)
3. fg %1                   puts the given job into the foreground (bg %1 puts it into the background)


Restarting a stopped foreground process (ps -ef <pid>):
1.Ctrl-Z                   suspends a foreground process, its PID is still in the process table (the job is stopped, not terminated)
2.ps -ef | grep <PROC.NAME>    find the process ID (PID)
3.fg <PID>                 restarts that stopped process (it will go to foreground)

Removing a background process:
1.find / -type f > output &    run the find command in the background
2.ps                      lists the PID numbers
3.kill <PID>              cancel the process
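
The same steps as a small script. Here sleep stands in for the long-running find, and the function name kill_demo is made up; kill -0 sends no signal and is used only to test whether the PID still exists.

```shell
# Start a background process, cancel it with the default signal (15),
# then verify that it is gone.
kill_demo() {
    sleep 60 &                          # stand-in for a long-running command
    pid=$!
    kill "$pid"                         # default signal 15 (SIGTERM)
    wait "$pid" 2>/dev/null || true     # collect the exit status, no zombie remains
    if kill -0 "$pid" 2>/dev/null; then
        echo "still running"
    else
        echo "gone"
    fi
}

kill_demo
```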

------------------------------

The operating system allows you to manipulate the input and output (I/O) of data to and from your system. For example you can specify to read input entered on the keyboard (standard input) or to read input from a file. Or you can specify to write output data to the screen (standard output) or to write it to a file.

When a command begins running, it usually expects that the following files are already open: standard input, standard output and standard error. A number, called a file descriptor, is associated with each of these files:

0     represents standard input (stdin)
1     represents standard output (stdout)
2     represents standard error (stderr)

The redirection symbols and their meanings:
<     redirects input (stdin) (< filename is added to the end of the command)
>     redirects output (stdout) (> filename is added to the end of the command)
>>    appends output
<<    inline input (here document)
2>    redirects error output (stderr)
1>&2  redirects stdout to stderr
2>&1  redirects stderr to stdout


mail denise < letter1        sends the file letter1 to user denise with the mail command
echo $PATH > path1           saves the value of the PATH variable on the file path1
cat file2 >> file1           append file2 to file1 (the cat command can concatenate files, not only display them)
ls -l file1 2> list1         save the stderr to file list1 (if file1 does not exist)
ls *.dat *.txt > files.out 2> files.err    (files.out: stdout, files.err: stderr)
command > output 2>&1        saves all the output (stdout and stderr) in one single file
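
The redirections above can be tried safely with a function that writes one line to each stream. The function name both and the directory under /tmp are made up for this sketch.

```shell
# A small function that writes one line to stdout and one line to stderr.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

dir=/tmp/redir_demo.$$
mkdir -p "$dir"

both > "$dir/out" 2> "$dir/err"     # stdout and stderr into separate files
both > "$dir/all" 2>&1              # both streams into one file
both >> "$dir/out" 2>/dev/null      # append stdout, discard stderr

wc -l < "$dir/out"                  # input redirection: wc reads the file as stdin
```

The order matters in the combined form: "> file 2>&1" first points stdout at the file and then sends stderr to the same place, while "2>&1 > file" would leave stderr on the terminal.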

25 comments:

Anonymous said...

Hi, can you explain about virtual CPU / logical CPU / physical CPU ?

aix said...

Hi, next weekend I'll do some update on this blog, and I'll publish some description about your question as well.
I'll post here the link where will be the answer... please be patient for a few more days, thanks :)

Anonymous said...

along with virtual CPU / logical CPU / physical CPU, could you explain thread and core too?

aix said...

OK, I try my best.

aix said...

I uploaded some info (please check at the top). Some description will follow later.

Siva said...

Hi,

In my aix server, have seeing lot of application process with PPID 1, Please advice how to kill the process

While doing ps -ef , there are no defunct process.

Regards,
Siva

aix said...

Hi, if those are application processes, I would ask application team, to investigate or stop application.

Unknown said...

Hi,

In my environment regularly we are getting CPU Utilisation alerts, can you please advice how to check the list of processes information which are consuming more CPU time.

Please help, Thanks in Advance..

Wisnu Wiyantoro said...

Thank You so much,.. it's helpful

Anonymous said...

thanks You for sharing,
And i have a question, My server have 8 Physical processors, and SMT=4, when monitoring, i saw, CPU 2 and 3 with lpa is "-" and are idle but "pc" parameter is "0.23", i guessed that SMT thread 2 and 3 still holding some of processor unit.So these CPUs is waste?

could you explain for me why and how improve CPU's utilize?

Anonymous said...

Thank you... Useful..

Anonymous said...

Very precise article. Easy understand the concepts. Thanks for sharing and continue the good work.

Anand Subramanian said...

Thank you, you have given good information in a precise way . I am bookmarking your blog :)

Anonymous said...

This is good, thanks for sharing :)

Unknown said...

Thank you for the clear info on your blog.
Can you please give more details about physc and entitlement in topas. And how it's related to vp or lp

Unknown said...

Hi, Is there a easy way to know how CPUs are being used and how many are remaining on the lpar?

surendra said...

Really good article and thanks:)

Vishu AIX said...

Hi First of all i want to thanks for this nice blog ..
I have one doubt if CPU = core then what is mean of 2 core 4 core and so on CPU . how we can identify from OS side .. what is my understanding if SMT value is 4 means we our processor is 4 core CPU if 8 then it is 8 core processor .. pls let me know am i right ???

Anonymous said...

Such a nice blog. Really helpful. Thank You !!

Akash Agarwal said...

Can anyone tell how to determine whether a process is hung or not ?
Thanks in advance.

Unknown said...

Can anyone please decribe, what is stale process and how can we find and remove it from process table.
Is zombie and stale are same?

Unknown said...

Hello! Can anyone explain which smt=? should i set for system with 50 physical CPUs and over 3000 Oracle's process...?

aix said...

I think would be good to ask Oracle, what is their recommendation. I have found this document which has a section about smt tuning on Oracle: http://www-01.ibm.com/support/docview.wss?uid=tss1wp102440&aid=1
Hope This helps

Frank0757 said...

Thanks, man. Having search online for 3 hours, this seems like the ultimate guide. Since I am trying to gather information from 4 AIX servers, the introduction of the hierarchy of physical processor -> virtual processor -> logical processor is really helpful.