Using X11 forwarding in SSH
The SSH protocol has the ability to securely forward X Window System applications over your encrypted SSH connection, so that you can run an application on the SSH server machine and have it put its windows up on your local machine without sending any X network traffic in the clear.
In order to use this feature, you will need an X display server for your Windows machine, such as Cygwin/X, X-Win32, or Exceed. This will probably install itself as display number 0 on your local machine; if it doesn't, the manual for the X server should tell you what it does do.
You should then tick the ‘Enable X11 forwarding’ box in the Tunnels panel before starting your SSH session. The ‘X display location’ box is blank by default, which means that PuTTY will try to use a sensible default such as :0, which is the usual display location where your X server will be installed. If that needs changing, then change it.
Now you should be able to log in to the SSH server as normal. To check that X forwarding has been successfully negotiated during connection startup, you can check the PuTTY Event Log. It should say something like this:
2001-12-05 17:22:01 Requesting X11 forwarding
2001-12-05 17:22:02 X11 forwarding enabled
If the remote system is Unix or Unix-like, you should also be able to see that the DISPLAY environment variable has been set to point at display 10 or above on the SSH server machine itself:
fred@unixbox:~$ echo $DISPLAY
unixbox:10.0
--------------------
Overview of the X server:
I think your problem is a confusion about how X works, so a few clarifications first:
An "X-Server" is a process which handles and manages a certain (physically available) display. This usually runs on a *client*. Think of an "X-Server" as sort of a driver for a graphics card. (X-Server is where the Keyboard, Video & Mouse were attached.)
An "X-Client" is a process which uses an X-Server to display (a window with) some information on it. This usually runs on the server. An example would be "xterm" or "aixterm" or "Mozilla", etc.
To tell your xclient which Xserver to use there is an environment variable DISPLAY, which is set pointing to your Xserver:
export DISPLAY="mymachine.withxserverrunning.com:0.0"
means use the Xserver running on this machine and managing display 0 (there could be several) and use screen 0 (mymachine.withxserverrunning.com:0.1 would be screen 1), since displays could consist of several screens (this is: monitors handled by graphics cards). As you see, unlike in Windoze one doesn't need multiheaded graphics cards with dual (several) monitor capabilities to span a graphical environment over several screens, this can be done by run-of-the-mill X-Servers and windowmanagers running on top of them.
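The host:display.screen layout described above can be taken apart with plain shell parameter expansion. This is only a minimal sketch: parse_display is a hypothetical helper name (not a standard command), and it assumes the value contains a colon followed by a numeric display.

```shell
#!/bin/sh
# parse_display: hypothetical helper that splits a DISPLAY value
# of the form host:display.screen into its three parts.
parse_display() {
    value=$1
    case $value in
        *:*) ;;
        *) echo "not a DISPLAY value: $value" >&2; return 1 ;;
    esac
    host=${value%:*}        # text before the last colon (empty means the local machine)
    rest=${value##*:}       # text after the last colon: display[.screen]
    display=${rest%%.*}     # display number
    case $rest in
        *.*) screen=${rest#*.} ;;
        *)   screen=0 ;;    # the screen defaults to 0 when it is omitted
    esac
    echo "host=${host:-local} display=$display screen=$screen"
}

parse_display "mymachine.withxserverrunning.com:0.1"
parse_display "localhost:10.0"
parse_display ":0"
```

Calling it as parse_display "$DISPLAY" on the remote host shows at a glance which machine and display an X client would try to reach.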
You can run your X-Server directly on your server only if you have a graphical terminal (an "lft") attached to it. Check your inventory (man lscfg, man lsdev) to find out if you have one.
If you have none (this is the common case, as servers usually don't come with graphics cards), you will have a machine you work on (if you have to endure common working conditions this is a Windoze machine, if you are lucky this is a real computer running some real OS, Linux or AIX for instance). On this machine (client.yournet.com) you start your X-Server. Start a local X-client (a window), then use Telnet or a similar program to log in to your host (host.yournet.com).
On this host issue an "export DISPLAY=client.yournet.com:0.0" and then an "xterm &".
A window with an xterm should pop up on your display. This xterm is not running on your local machine, but on the server. The process on the server only uses your screen (via your X-Server) to display its content. You can check that by issuing "kill -9 %1" in the first window, which makes the second window vanish.
If it doesn't work as described, issue an "xhost +" on your client machine. X-Windows contains a mechanism that limits access to an X-Server's resources to a defined group of hosts (which is empty by default); the command enables any host to use the screen.
--------------------
X11 forwarding:
(In PuTTY, X11 forwarding has to be enabled and an X server (e.g. Xming) has to be running.)
0. Xming
1. ssh settings:
in sshd_config (/etc/ssh) set: X11Forwarding yes
stopsrc -s sshd; startsrc -s sshd
2. install X11
in /mnt/5300-00/installp/ppc: smitty install:
-X11.base.5.3.0.0.I (this will install some requisites as well from apps, fonts...)
-X11.apps (it contains a startx, xauth, xhost commands)
do an update to the needed TL level
3. startx
4. then login again:
ssh -X root@aix40
it did this: 1356-364 /usr/bin/X11/xauth: creating new authority file /.Xauthority
5. xclock :)))))
echo $DISPLAY showed: localhost:10.0 (I did not set it at all)
(export DISPLAY=localhost:10.0 is probably not needed at all)
(It happened that xclock worked as root but not as another user. After copying the .Xauthority file from root, it worked.)
--------------
Hostname:Number.Screen
Hostname - where the display is physically attached
Number - ID number of the display server on that host machine
Screen - number of the screen on that display server
xhost command???
-----------------------------------
If everything looks OK, but you receive this:
root@bb_lpar: / # xclock
Error: Can't open display:
Probably the only problem is that you did not use -X: ssh -X root@servername
When I used -X, the DISPLAY variable was configured automatically:
(I did not set up anything. With -X I could see this; before using -X I received an empty line.)
root@bb_lpar: / # echo $DISPLAY
localhost:10.0
-----------------------------------
X server problems:
(This is not edited, I received these errors when I tried to config X)
X11.base is needed
./firefox
errors I have received:
1. Gtk-WARNING **: cannot open display <--(after setting X11Forwarding yes I received other errors)
someone suggested this: xhost +LOCAL (it gives all non-network connections access to the display)
2. Gtk-WARNING **: cannot open display: 0.0 <--suggested solution: export DISPLAY=:0.0
3. Xlib: connection to ":0.0" refused by server
Xlib: No protocol specified
After I gave these commands:
xauth list
startx
xclock <--until I gave startx, xclock command did not work
export DISPLAY=localhost:10.0
xhost + localhost
export DISPLAY=10.10.100.96:0.0
xinit
-----------------------------------
Xlib: connection to "localhost:10.0" refused by server
Xlib: Invalid MIT-MAGIC-COOKIE-1 key
Error: Can't open display: localhost:10.0
root@aix10: / # env
DISPLAY=localhost:10.0
You can see in 'ps -ef' that display :10 is already in use:
root@aix10: / # ps -ef | grep ":10"
root 643132 123006 0 Nov 10 - 79:10 /etc/ncs/llbd
root 852170 1458410 0 May 22 - 1:10 /usr/lpp/OV/lbin/eaagt/opcmsga
yyxxxxx 999524 1188014 0 10:45:15 - 0:00 /usr/lpp/CTXSmf/slib/ctxlogin -display :10
Solution is to set in /etc/ssh/sshd_config:
X11DisplayOffset 70
Displays will then start from 70 and hopefully will not interfere with Citrix.
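For forwarded displays, sshd listens on TCP port 6000 plus the display number, so display :10 maps to port 6010 and, with the offset above, the first forwarded display :70 maps to port 6070. A small sketch of that arithmetic follows; x11_port is a hypothetical helper name, and whether another product (such as the Citrix process shown above) actually occupies the matching TCP port is an assumption that has to be checked on the system itself.

```shell
#!/bin/sh
# x11_port: hypothetical helper that prints the TCP port
# belonging to an X11 display number (6000 + display).
x11_port() {
    echo $((6000 + $1))
}

x11_port 10    # default first forwarded display
x11_port 70    # first forwarded display with X11DisplayOffset 70
```

The resulting port number can then be looked for in the output of netstat -an to see whether something is already listening there.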
-----------------------------------
When doing ssh -X user@host, I received these:
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: No xauth data; using fake authentication data for X11 forwarding.
$ xclock
X11 connection rejected because of wrong authentication.
X connection to localhost:11.0 broken (explicit kill or server shutdown).
However xclock with ssh -Y user@host worked fine.
After adding "ForwardX11Trusted yes" to /etc/ssh/ssh_config on the client (the machine I was connecting from), it also worked with ssh -X. (This line was missing from ssh_config, so I added it.)
-----------------------------------
I/O - AIO, DIO, CIO, RAW

Filesystem I/O
AIX has special features to enhance the performance of filesystem I/O for general-purpose file access. These features include read ahead, write behind and I/O buffering. Oracle employs its own I/O optimization and buffering, which in most cases are redundant to those provided by AIX file systems. Oracle uses buffer cache management (data blocks buffered in shared memory), and AIX uses virtual memory management (data buffered in virtual memory). If both try to manage data caching, the result is wasted memory and CPU, and suboptimal performance.
It is generally better to let Oracle manage I/O buffering, because it has information about the context and so can better optimize memory usage.
Asynchronous I/O
A read can be considered synchronous if a disk operation is required to read the data into memory. In this case, application processing cannot continue until the I/O operation is complete. Asynchronous I/O allows applications to initiate read or write operations without being blocked, since all I/O operations are done in the background. This can improve performance, because I/O operations and application processing can run simultaneously.
Asynchronous I/O on filesystems is handled through kernel processes called aioserver (each I/O is handled by a single kproc).
The minimum number of servers (aioserver) configured when asynchronous I/O is enabled is 1 (minservers). Additional aioservers are started when more asynchronous I/O is requested. The maximum number of servers is controlled by maxservers. aioserver kernel threads do not go away once started, until the system reboots (so "ps -k" shows the maximum number of aio servers that were needed concurrently at some time in the past).
How many should you configure?
The rule of thumb is to set the maximum number of servers (maxservers) to ten times the number of disks or ten times the number of processors. minservers would be set to half of this amount. Other than having some more kernel processes hanging around that don't really get used (using a small amount of kernel memory), there is little risk in oversizing maxservers, so don't be afraid to bump it up.
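The rule of thumb can be written down as a few lines of shell arithmetic. This is a minimal sketch: aio_sizing is a hypothetical helper, and taking the larger of the two counts as the base is an assumption, since the rule itself only says "disks or processors". The result is a total count; the lsattr output below describes maxservers as a per-CPU value, so the total may need dividing by the CPU count before it is used.

```shell
#!/bin/sh
# aio_sizing: hypothetical helper applying the rule of thumb
#   maxservers = 10 x (number of disks or number of processors)
#   minservers = maxservers / 2
# Assumption: the larger of the two counts is used as the base.
aio_sizing() {
    disks=$1
    cpus=$2
    if [ "$disks" -gt "$cpus" ]; then
        base=$disks
    else
        base=$cpus
    fi
    maxservers=$((base * 10))
    minservers=$((maxservers / 2))
    echo "maxservers=$maxservers minservers=$minservers"
}

aio_sizing 8 4    # 8 disks, 4 processors
```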
root@aix31: / # lsattr -El aio0
autoconfig defined STATE to be configured at system restart True
fastpath enable State of fast path True
kprocprio 39 Server PRIORITY True
maxreqs 4096 Maximum number of REQUESTS True
maxservers 10 MAXIMUM number of servers per cpu True
minservers 1 MINIMUM number of servers True
maxreqs <-max number of aio requests that can be outstanding at one time
maxservers <-if you have 4 CPUs then the max count of aio kernel threads would be 40
minservers <-this amount will start at boot (this is not per CPU)
Oracle takes full advantage of Asynchronous I/O provided by AIX, resulting in faster database access.
on AIX 6.1: ioo is handling aio (ioo -a)
mkdev -l aio0                       enables the AIO device driver (smitty aio)
aioo -a                             shows the value of minservers, maxservers... (or lsattr -El aio0)
chdev -l aio0 -a maxservers='30'    changes the maxservers value to 30 (it will show the new value, but it will be active only after reboot)
ps -k | grep aio | wc -l            shows how many aio servers are running
                                    (these are not necessarily in use, maybe many of them are just hanging there)
pstat -a                            shows the asynchronous I/O servers by name
iostat -AQ 2 2                      shows if any aio is in use by filesystems
iostat -AQ 1 | grep -v " 0 "        omits the empty lines
                                    it shows which filesystems are active in regard to aio;
                                    the count column shows how much aio each filesystem requested
                                    (it is good to see which fs is aio intensive)
root@aix10: /root # ps -kf | grep aio <--it will show the accumulated CPU time of each aio process
root 127176 1 0 Mar 24 - 15:07 aioserver
root 131156 1 0 Mar 24 - 14:40 aioserver
root 139366 1 0 Mar 24 - 14:51 aioserver
root 151650 1 0 Mar 24 - 14:02 aioserver
It is good to compare the times of these processes to see whether more aioservers are needed. If the times are nearly identical (only a few minutes' difference), all of them are being used to the maximum, so more processes are needed.
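The comparison can be done mechanically by converting the TIME column to seconds and looking at the spread between the least and the most used server. This is only a sketch under assumptions: aio_time_spread is a hypothetical helper, it expects lines laid out like the sample above (TIME as the second-to-last field, in MM:SS or HH:MM:SS form), and it reads the lines from standard input.

```shell
#!/bin/sh
# aio_time_spread: hypothetical helper. Reads "ps -kf"-style lines on stdin,
# takes the second-to-last field as accumulated CPU time and reports the
# smallest and largest value in seconds plus the difference between them.
aio_time_spread() {
    awk '
    {
        n = split($(NF - 1), t, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + t[i]
        if (NR == 1 || secs < min) min = secs
        if (NR == 1 || secs > max) max = secs
    }
    END {
        if (NR > 0)
            printf "servers=%d min=%ds max=%ds spread=%ds\n", NR, min, max, max - min
    }'
}

# Sample input copied from the listing above.
aio_time_spread <<'EOF'
root 127176 1 0 Mar 24 - 15:07 aioserver
root 131156 1 0 Mar 24 - 14:40 aioserver
root 139366 1 0 Mar 24 - 14:51 aioserver
root 151650 1 0 Mar 24 - 14:02 aioserver
EOF
```

A small spread relative to the totals suggests that all servers are equally busy. On a live system the input would come from something like ps -kf | grep '[a]ioserver', where the bracket keeps the grep process itself out of the list.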
------------------------------------
iostat -A reports back asynchronous I/O statistics
aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait
     10.2  0.0     5     0    4096            20.6   4.5    64.7     10.3
avgc: This reports back the average global asynchronous I/O request per second of the interval you specified.
avfc: This reports back the average fastpath request count per second for your interval.
------------------------------------
Changing aio parameters:
You can set the values online with no interruption of service, BUT they will not take effect until the next time the kernel is booted.
1. lsattr -El aio0 <-- check current setting for aio0 device
2. chdev -l aio0 -a maxreqs=<value> -P <-- set the value of maxreqs permanently for next reboot
3. restart server
------------------------------------
0509-036 Cannot load program aioo because...
if you receive this:
root@aix30: / # aioo -a
exec(): 0509-036 Cannot load program aioo because of the following errors:
0509-130 Symbol resolution failed for aioo because:
....
the aio0 device is probably in defined state:
root@aix30: / # lsattr -El aio0
autoconfig defined STATE to be configured at system restart True
You should make it available with: mkdev -l aio0
(and also change it for future restarts: chdev -l aio0 -a autoconfig=available, or with 'smitty aio')
------------------------------------
DIRECT I/O (DIO)
Direct I/O is an alternative non-caching policy which causes file data to be transferred directly to the disk from the application or directly from the disk to the application without going through the VMM file cache.
Direct I/O reads cause synchronous reads from the disk, whereas with the normal cached policy the reads may be satisfied from the cache. This can result in poor performance if the data was likely to be in memory under the normal caching policy.
Direct I/O can be enabled: mount -o dio
If JFS2 DIO or CIO options are active, no filesystem cache is being used for Oracle .dbf and/or online redo logs files.
Databases normally manage data caching at application level, so they do not need the filesystem to implement this service for them. The use of the file buffer cache results in undesirable overhead, since data is first moved from the disk to the file buffer cache and from there to the application buffer. This "double-copying" of data results in additional CPU and memory consumption.
JFS2 supports DIO as well as CIO. The CIO model is built on top of DIO. For JFS2 based environments, CIO should always be used (instead of DIO) in those situations where bypassing the filesystem cache is appropriate.
JFS DIO should only be used:
On Oracle data (.dbf) files, where DB_BLOCK_SIZE is 4k or greater. (Use of JFS DIO on any other files, e.g. redo logs or control files, is likely to result in a severe performance penalty.)
------------------------------
CONCURRENT I/O (CIO)
The inode lock imposes write serialization at the file level. JFS2 (by default) employs serialization mechanisms to ensure the integrity of data being updated. An inode lock is used to ensure that there is at most one outstanding write I/O to a file at any point in time; reads are not allowed during the write because they may result in reading stale data.
Oracle implements its own I/O serialization mechanisms to ensure data integrity, so JFS2 offers the Concurrent I/O option. Under CIO, multiple threads may simultaneously perform reads and writes on a shared file. Applications that do not enforce serialization should not use CIO (data corruption or performance issues can occur).
CIO invokes direct I/O, so it has all the other performance considerations associated with direct I/O. With standard direct I/O, inodes are locked to prevent a condition where multiple threads might try to change the contents of a file simultaneously. Concurrent I/O bypasses the inode lock, which allows multiple threads to read and write data concurrently to the same file.
CIO includes the performance benefits previously available with DIO, plus the elimination of the contention on the inode lock.
Concurrent I/O should only be used:
Oracle .dbf files, online redo logs and/or control files.
When used for online redo logs or control files, these files should be isolated in their own JFS2 filesystems, with agblksize=512.
Filesystems containing .dbf files, should be created with:
-agblksize=2048 if DB_BLOCK_SIZE=2k
-agblksize=4096 if DB_BLOCK_SIZE>=4k
(Failure to implement these agblksize values is likely to result in a severe performance penalty.)
Do not, under any circumstances, use the CIO mount option for the filesystem containing the Oracle binaries (!!!).
Additionally, do not use DIO/CIO options for filesystems containing archive logs or any other files not discussed here.
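The agblksize recommendations above can be summed up as a small lookup. This is a minimal sketch: agblksize_for is a hypothetical helper, the kind names (dbf, redo, control) are labels chosen for illustration, and DB_BLOCK_SIZE is passed in bytes. Anything the notes do not cover is rejected rather than guessed.

```shell
#!/bin/sh
# agblksize_for: hypothetical helper returning the agblksize suggested in
# these notes for a CIO-mounted JFS2 filesystem.
#   $1 = kind of Oracle files held (dbf, redo, control)
#   $2 = DB_BLOCK_SIZE in bytes (only needed for dbf)
agblksize_for() {
    kind=$1
    db_block_size=${2:-0}
    case $kind in
        redo|control)
            echo 512 ;;
        dbf)
            if [ "$db_block_size" -ge 4096 ]; then
                echo 4096
            elif [ "$db_block_size" -eq 2048 ]; then
                echo 2048
            else
                echo "no agblksize listed for DB_BLOCK_SIZE=$db_block_size" >&2
                return 1
            fi ;;
        *)
            echo "CIO is not advised here for: $kind" >&2
            return 1 ;;
    esac
}

agblksize_for dbf 8192
agblksize_for dbf 2048
agblksize_for redo
```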
Applications that use raw logical volumes for data storage don't encounter inode lock contention since they don't access files.
fsfastpath should be enabled to initiate aio requests directly to LVM or disk, for maximum performance (aioo -a)
------------------------------
RAW I/O
When using raw devices with Oracle, the devices are either raw logical volumes or raw disks. When using raw disks, the LVM layer is bypassed. The use of raw lv's is recommended for Oracle data files, unless ASM is used. ASM has the capability to create data files, which do not need to be mapped directly to disks. With ASM, using raw disks is preferred.
Dump - Core
AIX generates a system dump when a severe error occurs. A system dump creates a picture of the system's memory contents. If the AIX kernel crashes kernel data is written to the primary dump device. After a kernel crash AIX must be rebooted. During the next boot, the dump is copied into a dump directory (default is /var/adm/ras). The dump file name is vmcore.x (x indicates a number, e.g. vmcore.0)
When installing the operating system, the dump device is automatically configured. By default, the primary device is /dev/hd6, which is a paging logical volume, and the secondary device is /dev/sysdumpnull.
A rule of thumb is when a dump is created, it is about 1/4 of the size of real memory. The command "sysdumpdev -e" will also provide an estimate of the dump space needed for your machine. (Estimation can differ at times with high load, as kernel space is higher at that time.)
When a system dump is occurring, the dump image is not written to disk in mirrored form. A dump to a mirrored lv results in an inconsistent dump and therefore, should be avoided. The logic behind this fact is that if the mirroring code itself were the cause of the system crash, then trusting the same code to handle the mirrored write would be pointless. Thus, mirroring a dump device is a waste of resources and is not recommended.
Since the default dump device is the primary paging lv, you should create a separate dump lv if you mirror your paging lv (which is suggested). If a valid secondary dump device exists and the primary dump device cannot be reached, the secondary dump device will accept the dump information intended for the primary dump device.
IBM recommendation:
All I can recommend you is to force a dump the next time the problem should occur. This will enable us to check which process was hanging or what caused the system to not respond any more. You can do this via the HMC using the following steps:
Operations -> Restart -> Dump
As a general recommendation you should always force a dump if a system is hanging. There are only very few cases in which we can determine the reason for a hanging system without having a dump available for analysis.
-------------------------------------------
Traditional vs Firmware-assisted dump:
Up to POWER5 only traditional dumps were available; the introduction of POWER6 processor-based systems allowed system dumps to be firmware-assisted. When performing a firmware-assisted dump, system memory is frozen and the partition rebooted, which allows a new instance of the operating system to complete the dump.
Traditional dump: it is generated before the partition is rebooted.
(When the system crashes, it tries to copy the memory content to the dump device at that moment.)
Firmware-assisted dump: it takes place when the partition is restarting.
(When the system crashes, memory is frozen, the hypervisor (firmware) allocates a new memory area in RAM, and the contents of memory are copied there. Then during reboot they are copied from this new memory area to the dump device.)
Firmware-assisted dump offers improved reliability over the traditional dump, by rebooting the partition and using a new kernel to dump data from the previous kernel crash.
When an administrator attempts to switch from a traditional to firmware-assisted system dump, system memory is checked against the firmware-assisted system dump memory requirements. If these memory requirements are not met, then the "sysdumpdev -t" command output reports the required minimum system memory to allow for firmware-assisted dump to be configured. Changing from traditional to firmware-assisted dump requires a reboot of the partition for the dump changes to take effect.
Firmware-assisted system dumps can be one of these types:
Selective memory dump: Selective memory dumps are triggered by or use AIX instances that must be dumped.
Full memory dump: The whole partition memory is dumped without any interaction with an AIX instance that is failing.
-------------------------------------------
Use the sysdumpdev command to query or change the primary or secondary dump devices.
- Primary: usually used when you wish to save the dump data
- Secondary: can be used to discard dump data (that is, /dev/sysdumpnull)
Flags for sysdumpdev command:
-l list the current dump destination
-e estimates the size of the dump (in bytes)
-p primary
-s secondary
-P make change permanent
-C turns on compression
-c turns off compression
-L shows info about last dump
-K turns on: always allow system dump
sysdumpdev -P -p /dev/dumpdev change the primary dumpdevice permanently to /dev/dumpdev
root@aix1: /root # sysdumpdev -l
primary /dev/dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump TRUE <--if it is FALSE, it can be changed in smitty sysdumpdev
dump compression ON <--if it is OFF, sysdumpdev -C changes it to ON (-c changes it to OFF)
Other commands:
sysdumpstart starts a dump (smitty dump)(it will do a reboot as well)
kdb analyzes the dump
/usr/lib/ras/dumpcheck checks if dump device and copy directory are able to receive the system dump
If dump device is a paging space, it verifies if enough free space exists in the copy dir to hold the dump
If dump device is a logical volume, it verifies it is large enough to hold a dump
(man dumpcheck)
-------------------------------------------
Creating a dump device
1. sysdumpdev -e <--shows an estimation, how much space is required for a dump
2. mklv -t sysdump -y lg_dumplv rootvg 3 hdisk0 <--it creates a sysdump lv with 3 PPs
3. sysdumpdev -Pp /dev/lg_dumplv <--making it as a primary device (system will use this lv now for dumps)
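Step 2 needs a number of physical partitions (3 in the example). A minimal sketch of how that number could be derived from the byte estimate of step 1 and the PP size of the volume group follows. pps_for_dump is a hypothetical helper, and the sample values (a 300000000-byte estimate, 128 MB partitions) are assumptions for illustration, not output from a real system. awk does the arithmetic so that large byte counts do not depend on the integer width of the shell.

```shell
#!/bin/sh
# pps_for_dump: hypothetical helper that rounds a dump size estimate up to
# whole physical partitions.
#   $1 = estimated dump size in bytes (as reported by sysdumpdev -e)
#   $2 = physical partition size of the volume group in MB
pps_for_dump() {
    awk -v bytes="$1" -v pp_mb="$2" 'BEGIN {
        pp_bytes = pp_mb * 1048576
        n = int(bytes / pp_bytes)
        if (n * pp_bytes < bytes) n++    # round up to the next whole partition
        print n
    }'
}

pps_for_dump 300000000 128    # assumed sample values
```

The result would be used as the partition count in the mklv call of step 2. Leaving some headroom is sensible, because the estimate grows with system load.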
-------------------------------------------
System dump initiated by a user
!!!reboot will take place automatically!!!
1. sysdumpstart -p <--initiates a dump to the primary device
(Reboot will be done automatically)
(If a dedicated dump device is used, user initiated dumps are not copied automatically to copy directory.)
(If paging space is used for dump, then dump will be copied automatically to /var/adm/ras)
2. sysdumpdev -L <--shows dump took place on the primary device, time, size ... (errpt will show as well)
3. savecore -d /var/adm/ras <--copy last dump from system dump device to directory /var/adm/ras (if paging space is used this is not needed)
-------------------------------------------
How to move dumplv to another disk:
We want to move from hdisk1 to hdisk0:
1. lslv -l dumplv <--checking which disk
2. sysdumpdev -l <--checking sysdump device (primary was here /dev/dumplv)
3. sysdumpdev -Pp /dev/sysdumpnull <--changing primary to sysdumpnull (secondary, it is a null device) (lsvg -l rootvg shows closed state)
4. migratepv -l dumplv hdisk1 hdisk0 <--moving it from hdisk1 to hdisk0
5. sysdumpdev -Pp /dev/dumplv <--changing back to the primary device
-------------------------------------------
The largest dump device is too small: (LABEL: DMPCHK_TOOSMALL IDENT)
1. Dumpcheck runs from crontab
# crontab -l | grep dump
0 15 * * * /usr/lib/ras/dumpcheck >/dev/null 2>&1
2. Check if there are any errors:
# errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
E87EF1BE 0703150008 P O dumpcheck The largest dump device is too small.
E87EF1BE 0702150008 P O dumpcheck The largest dump device is too small.
3. If you find a new error message, find the dump lv:
# sysdumpdev -l
primary /dev/dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump TRUE
dump compression ON
List dumplv from rootvg:
# lsvg -l rootvg|grep dumplv
dumplv dump 8 8 1 open/syncd N/A
4. Extend with 1 PP
# extendlv dumplv 1
dumplv dump 9 9 1 open/syncd N/A
Run the check again at the end:
OK -> done
Not OK -> Extend with 1 PP again.
-------------------------------------------
changing the autorestart attribute of the systemdump:
(smitty chgsys as well)
1. lsattr -El sys0 -a autorestart
autorestart true Automatically REBOOT system after a crash True
2. chdev -l sys0 -a autorestart=false
sys0 changed
----------------------------------------------
CORE FILE:
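errpt shows which program produced the core file; if not:
- use the "strings" command (for example: "strings core | grep _=")
- or the lquerypv command (for example: "lquerypv -h core 6b0 64")
man syscorepath
syscorepath -p /tmp <--sets /tmp as the directory where core files are written
syscorepath -g <--shows the current core file directory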
AIX generates a system dump when a severe error occurs. A system dump creates a picture of the system's memory contents. If the AIX kernel crashes kernel data is written to the primary dump device. After a kernel crash AIX must be rebooted. During the next boot, the dump is copied into a dump directory (default is /var/adm/ras). The dump file name is vmcore.x (x indicates a number, e.g. vmcore.0)
When installing the operating system, the dump device is automatically configured. By default, the primary device is /dev/hd6, which is a paging logical volume, and the secondary device is /dev/sysdumpnull.
A rule of thumb is when a dump is created, it is about 1/4 of the size of real memory. The command "sysdumpdev -e" will also provide an estimate of the dump space needed for your machine. (Estimation can differ at times with high load, as kernel space is higher at that time.)
When a system dump is occurring, the dump image is not written to disk in mirrored form. A dump to a mirrored lv results in an inconsistent dump and therefore, should be avoided. The logic behind this fact is that if the mirroring code itself were the cause of the system crash, then trusting the same code to handle the mirrored write would be pointless. Thus, mirroring a dump device is a waste of resources and is not recommended.
Since the default dump device is the primary paging lv, you should create a separate dump lv, if you mirror your paging lv (which is suggested.)If a valid secondary dump device exists and the primary dump device cannot be reached, the secondary dump device will accept the dump information intended for the primary dump device.
IBM recommendation:
All I can recommend you is to force a dump the next time the problem should occur. This will enable us to check which process was hanging or what caused the system to not respond any more. You can do this via the HMC using the following steps:
Operations -> Restart -> Dump
As a general recommendation you should always force a dump if a system is hanging. There are only very few cases in which we can determine the reason for a hanging system without having a dump available for analysis.
Traditional vs Firmware-assisted dump:
Up to POWER5 only traditioanl dumps were available, and the introduction of the POWER6 processor-based systems allowed system dumps to be firmware assisted. When performing a firmware-assisted dump, system memory is frozen and the partition rebooted, which allows a new instance of the operating system to complete the dump.
Traditional dump: it is generated before partition is rebooted.
(When system crashed, memory content is trying to be copied at that moment to dump device)
Firmware-assisted dump: it takes place when the partition is restarting.
(When system crashed, memory is frozen, and by hypervisor (firmware) new memory space is allocated in RAM, and the contents of memory is copied there. Then during reboot it is copied from this new memory area to the dump device.)
Firmware-assisted dump offers improved reliability over the traditional dump, by rebooting the partition and using a new kernel to dump data from the previous kernel crash.
When an administrator attempts to switch from a traditional to firmware-assisted system dump, system memory is checked against the firmware-assisted system dump memory requirements. If these memory requirements are not met, then the "sysdumpdev -t" command output reports the required minimum system memory to allow for firmware-assisted dump to be configured. Changing from traditional to firmware-assisted dump requires a reboot of the partition for the dump changes to take effect.
Firmware-assisted system dumps can be one of these types:
Selective memory dump: Selective memory dumps are triggered by or use of AIX instances that must be dumped.
Full memory dump: The whole partition memory is dumped without any interaction with an AIX instance that is failing.
-------------------------------------------
Use the sysdumpdev command to query or change the primary or secondary dump devices.
- Primary: usually used when you wish to save the dump data
- Secondary: can be used to discard dump data (that is, /dev/sysdumpnull)
Flags for sysdumpdev command:
-l list the current dump destination
-e estimates the size of the dump (in bytes)
-p primary
-s secondary
-P make change permanent
-C turns on compression
-c turns off compression
-L shows info about last dump
-K turns on: alway allow system dump
sysdumpdev -P -p /dev/dumpdev change the primary dumpdevice permanently to /dev/dumpdev
root@aix1: /root # sysdumpdev -l
primary /dev/dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump TRUE <--if it is on FALSE then in smitty sysdumpdev it can be change
dump compression ON <--if it is on OFF then sysdumpdev -C changes it to ON-ra (-c changes it to OFF)
Other commands:
sysdumpstart starts a dump (smitty dump)(it will do a reboot as well)
kdb it analysis the dump
/usr/lib/ras/dumpcheck checks if dump device and copy directory are able to receive the system dump
If dump device is a paging space, it verifies if enough free space exists in the copy dir to hold the dump
If dump device is a logical volume, it verifies it is large enough to hold a dump
(man dumpcheck)
-------------------------------------------
Creating a dump device
1. sysdumpdev -e <--shows an estimation, how much space is required for a dump
2. mklv -t sysdump -y lg_dumplv rootvg 3 hdisk0 <--it creates a sysdump lv with 3 PPs
3. sysdumpdev -Pp /dev/lg_dumplv <--making it as a primary device (system will use this lv now for dumps)
-------------------------------------------
System dump initiaded by a user
!!!reboot will take place automatically!!!
1. sysdumpstart -p <--initiates a dump to the primary device
(Reboot will be done automatically)
(If a dedicated dump device is used, user initiated dumps are not copied automatically to copy directory.)
(If paging space is used for dump, then dump will be copied automatically to /var/adm/ras)
2. sysdumpdev -L <--shows dump took place on the primary device, time, size ... (errpt will show as well)
3. savecore -d /var/adm/ras <--copy last dump from system dump device to directory /var/adm/ras (if paging space is used this is not needed)
-------------------------------------------
How to move dumplv to another disk:
We want to move from hdisk1 to hdisk0:
1. lslv -l dumplv <--checking which disk
2. sysdumpdev -l <--checking sysdump device (primary was here /dev/dumplv)
3. sysdumpdev -Pp /dev/sysdumpnull <--changing primary to sysdumpnull (secondary, it is a null device) (lsvg -l roovg shows closed state)
4. migratepv -l dumplv hdisk1 hdisk0 <--moving it from hdisk1 to hdisk0
5. sysdumpdev -Pp /dev/dumplv <--changing back to the primary device
-------------------------------------------
The largest dump device is too small: (LABEL: DMPCHK_TOOSMALL IDENT)
1. Dumpcheck runs from crontab
# crontab -l | grep dump
0 15 * * * /usr/lib/ras/dumpcheck >/dev/null 2>&1
2. Check if there are any errors:
# errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
E87EF1BE 0703150008 P O dumpcheck The largest dump device is too small.
E87EF1BE 0702150008 P O dumpcheck The largest dump device is too small.
3. If you find a new error message, find the dump device:
# sysdumpdev -l
primary /dev/dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump TRUE
dump compression ON
List dumplv from rootvg:
# lsvg -l rootvg|grep dumplv
dumplv dump 8 8 1 open/syncd N/A
4. Extend with 1 PP
# extendlv dumplv 1
dumplv dump 9 9 1 open/syncd N/A
Run problem check at the end
OK -> done
Not OK -> Extend with 1 PP again.
-------------------------------------------
changing the autorestart attribute of the systemdump:
(smitty chgsys as well)
1. lsattr -El sys0 -a autorestart
autorestart true Automatically REBOOT system after a crash True
2. chdev -l sys0 -a autorestart=false
sys0 changed
----------------------------------------------
CORE FILE:
errpt shows which program, if not:
- use the "strings" command (for example: "strings core | grep _=")
- or the lquerypv command (for example: "lquerypv -h core 6b0 64")
man syscorepath
syscorepath -p /tmp
syscorepath -g
CRONTAB:
The cron daemon, which translates to Chronological Data Event Monitor, is a program that schedules jobs to run automatically at a specific time and date. The /etc/inittab file contains all the AIX startup programs, including the cron daemon. The init process in AIX starts the cron daemon, or cron, from the inittab file during the initialization process of the operating system.
You can submit jobs, or events, to cron by doing one of the following:
Use the at and batch facilities to submit jobs for one-time execution.
Use the crontab files to execute jobs at regularly scheduled intervals (hourly, daily, weekly, and so on).
By default, cron can concurrently run 100 events of equal importance. The /usr/adm/cron/queuedefs file allows you to change this limit. An example entry:
c.200j10n120w
| |   |  |
| |   |  wait period (in seconds)
| |   nice value
| jobs
cron
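The fields of such an entry can be pulled apart with plain shell parameter expansion. A minimal sketch, assuming all three values (jobs, nice, wait) are present in the entry; parse_queuedef is a hypothetical helper name, not an AIX command.

```shell
# parse_queuedef ENTRY
# Splits a queuedefs entry like "c.200j10n120w" into its parts.
parse_queuedef() {
    entry=$1
    queue=${entry%%.*}      # text before the dot: the queue (c = cron)
    rest=${entry#*.}        # "200j10n120w"
    jobs=${rest%%j*}        # number before "j": concurrent jobs
    rest=${rest#*j}         # "10n120w"
    nice=${rest%%n*}        # number before "n": nice value
    rest=${rest#*n}         # "120w"
    wait_s=${rest%%w*}      # number before "w": wait period in seconds
    echo "queue=$queue jobs=$jobs nice=$nice wait=$wait_s"
}

parse_queuedef c.200j10n120w
```

For the entry above this prints queue=c jobs=200 nice=10 wait=120.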
At regularly scheduled intervals, cron reads the crontab files in the /var/spool/cron/crontabs directory. These files contain the jobs submitted by users and are named after the individual users; for example, /var/spool/cron/crontabs/john contains John's scheduled jobs.
When changes are made to the files in the crontabs directory, the cron daemon must be notified to reread them.
/var/adm/cron/log cron daemon creates a log of its activities
/var/adm/cron/cron.deny Any user can use cron except those listed in this file
/var/adm/cron/cron.allow Only users listed in this file can use cron (root user included)
crontab -l Lists the contents of your current crontab file
crontab -e Edits your current crontab file (when the file is saved, the cron daemon is automatically refreshed)
crontab -r Removes your crontab file from the crontab directory
crontab -v check crontab submission time
crontab mycronfile submit your crontab file to /var/spool/cron/crontabs directory
crontab file format:
minute hour day_of_month month weekday command
0-59 0-23 1-31 1-12 0-6 Sun-Sat shell command
* * * * * /bin/script.sh schedule a job to run every minute
0 1 15 * * /fullbackup 1 am on the 15th of every month
0 0 * * 1-5 /usr/sbin/backup start the backup command at midnight, Mo - Fr
0,15,30,45 6-17 * * 1-5 /home/script1 execute script1 every 15 minutes between 6AM and 5PM, Mo - Fr
0 1 1 * * find /tmp -name 'TRACE*' -mtime +270 -exec rm {} \\; >/dev/null 2>&1 it will delete files older than 9 months
(\\; <-- the double "\" is needed so that ";" is interpreted correctly)
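To see how a crontab line breaks down into the five time fields plus the command, the line can be split with the shell's read. This is only an illustration of the field order; explain_cron_line is a hypothetical helper and does not validate the values.

```shell
# explain_cron_line LINE
# Labels the six whitespace-separated parts of a crontab entry.
explain_cron_line() {
    # read splits on whitespace; the last variable gets the rest of the line
    printf '%s\n' "$1" | {
        read -r minute hour dom month weekday command
        echo "minute=$minute hour=$hour day_of_month=$dom month=$month weekday=$weekday command=$command"
    }
}

explain_cron_line '0,15,30,45 6-17 * * 1-5 /home/script1'
```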
----------------------------
AT:
at submits a job for cron to run at a specific time in the future
(at -f /home/root/bb_at -t 2007122503)
echo "<command>" | at now this starts in the background (and you can log off)
at now +2 mins
banner hello > /dev/pts/0
<ctrl-d> (at now + 1 minute,at 5 pm Friday )
/var/adm/cron/at.deny allows any users except those listed in this file to use the at command.
/var/adm/cron/at.allow allows only those users listed in this file to use the at command (including root).
at -l Lists at jobs
atq [user] Views other user's jobs (Only root can use this command.)
at -r Cancels an at job
atrm job Cancels an at job by job number
atrm user Cancels an at job by the user (root can use it for any user; users can cancel their jobs.)
atrm Cancels all at jobs belonging to the user invoking the atrm command
batch submits a job to be run in the background when the processor load is low
Backup - Restore
MKSYSB (for rootvg, OS):
(only for rootvg and only for mounted filesystems)
mksysb -i /testfs/mksysb.0725 creates an installable image of the rootvg (from NIM it can be restored)
The restorevgfiles and listvgbackup -r commands perform identical operations and should be considered interchangeable
listvgbackup -f /mnt/aix11.mksysb -r /etc/resolv.conf restores the file /etc/resolv.conf from the specified backup
------------------------------------------------
BACKUP - RESTORE (for filesystems):
find /bckfs -print | backup -i -f /dev/rmt0 backup all the files and subdirs
find: generates a list of all the files
-i: files will be read from standard input
restore extracts files from archives created with the backup command.
JFS2 snapshot:
creates a point in time image, it is very quick and very small
------------------------------------------------
SAVEVG - RESTVG (for volume groups):
savevg -f /bckfs/backup.0725 bckvg backs up all files belonging to bckvg to the specified file
restvg -f /bckfs/backup.0725 restores the vg and all the files that were saved with savevg
(it creates the vg, lv, fs...)
------------------------------------------------
TAPE:
tctl sends subcommands to a tape device (e.g. move forward/backward)
tcopy copies magnetic tapes
Commands
PowerVM Editions:
http://www-912.ibm.com/pod/pod
Under the VET code:
C2DBF2AD8D3427F6CA1F00002C20004110
-Express 0000
-Standard 2C00
-Enterprise 2C20
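A small sketch for reading the edition out of a VET code. It assumes the edition digits sit at characters 25-28 of the code, which is where 2C20 appears in the example above; that position is an assumption, so treat it with care. vet_edition is a hypothetical helper name.

```shell
# vet_edition VETCODE
# Maps characters 25-28 of the VET code to the edition names listed above.
vet_edition() {
    code=$(printf '%s\n' "$1" | cut -c25-28)
    case $code in
        0000) echo "Express" ;;
        2C00) echo "Standard" ;;
        2C20) echo "Enterprise" ;;
        *)    echo "unknown ($code)" ;;
    esac
}

vet_edition C2DBF2AD8D3427F6CA1F00002C20004110
```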
-----------------
VIOS service package definitions
Fix Pack
A Fix Pack updates your VIOS release to the latest level. A Fix Pack update can contain product enhancements, new function and fixes.
Service Pack
A Service Pack applies to only one (the latest) VIOS level. A Service Pack contains critical fixes for issues found between Fix Pack releases. A Service Pack does not update the VIOS to a new level and it can only be applied to the Fix Pack release for which it is specified.
Interim Fix
An Interim Fix (iFix) applies to only one (the latest) VIOS level and provides a fix for a specific issue.
-----------------
Virtual I/O Server is a special partition that is not intended to run end-user applications, and should only provide login for system administrators. Virtual I/O Server allows the sharing of physical resources between supported AIX partitions to allow more efficient utilization and flexibility for using physical storage and network devices.
-----------------
User padmin:
The primary administrator on the VIO Server is the user padmin. It has a restricted shell (it cannot change out of its home directory, /home/padmin) with vios commands.
The oem_setup_env command will place the padmin user in a non-restricted root shell with a home directory in the /home/padmin directory. The user can then run any command available to the root user. This is not a supported Virtual I/O Server administration method. The purpose of this command is to allow installation of vendor software, such as device drivers.
By default the ioscli commands are not available for the root user. All ioscli commands are in fact calls of /usr/ios/cli/ioscli with the command as argument. (You see this if you list the aliases of the padmin user.)
You can use all ioscli commands as user root by prefixing them with /usr/ios/cli/ioscli. (/usr/ios/cli/ioscli lsmap -all)
You can set an alias:
alias i=/usr/ios/cli/ioscli
i lsmap -all
Typing exit will return the user to the Virtual I/O Server prompt.
------------------------------------
ioslevel shows vio server level
installios installs the Virtual I/O Server. This command is run from the HMC.
backupios creates an installable image of the root volume group (saves almost everything)
viosbr creates backups from user defined virtual device configs (saves only the mappings, virt. devices..)
viosbr -view -file <file name> displays the contents of a backup file (which was made earlier with viosbr)
savevgstruct it will make a backup of a volume group structure
lsgcl shows the history of commands that have been run on the vio server (gcl: global command log)
cfgdev devices are recognized after running cfgdev
cfgassist on vio server as padmin brings up smitty style menu for doing several tasks
chkdev -dev hdisk4 -verbose show if attached device can be migrated from physical adapter to virtual adapter (PHYS2VIRT_CAP.. -> yes)
lsdev lists all devices
lsdev -slots lists I/O slot information for built-in adapters (those are not hot-pluggable but DLPAR capable)
lsdev -virtual lists virtual devices
lsdev -type adapter same as lsdev -Cc adapter
lsdev -dev vhost0 -vpd same as lscfg -vpl vhost0 in AIX (but on vio as padmin user lscfg does not work)
lsdev -dev ent4 -attr shows attributes of the devices (same as lsattr -El ent4)
chdev -dev fscsi0 -attr fc_err_recov=fast_fail dyntrk=yes -perm changes the attributes (same as chdev -l fscsi0 -a fc_err_recov...)
lsmap -all lists all vscsi devices
lsmap -all -npiv lists npiv adapters (with slot numbers (aka adapter ID))
lsmap -all -net lists virtual ethernet adapters (with slot numbers (aka adapter ID))
(U8204.E8A.0680E82-V1-C14-T1 <--C14 is the adapter ID)
lsmap -vadapter vhost0 shows info about a specific vscsi adapter
lsmap -vadapter vfchost0 -npiv shows info about a specific npiv adapter
lsmap -vadapter ent11 -net shows info about a specific virtual ethernet device (it will not work with sea or physical devices)
lsvg -lv rootvg same as lsvg -l rootvg
lspv shows all available hdisk devices
lspv -free shows hdisks which are free to be used as backing devices
lspv -size shows hdisks with sizes
rmdev -dev vhost0 -recursive removing a virtual device, e.g. vhost0 (-recursive: removes all still attached child devices as well)
rmvdev -vdev <backing dev.> removes the connection between the backing dev. (it can be a physical dev. or an lv) and the virtual SCSI adapter
(the backing device name can be obtained from the lsmap output)
mkvdev -vlan ent9 -tagid 200 creates a vlan tagged interface over the ent9 interface (ent9 can be a SEA adapter)
netstat -v | grep Prio it will show if this vio server is active or the other vio server is active (regarding network)
license -swma once helped, when vio commands did not want to work
viosecure -firewall on -reload enable firewall with default config (enables: https, http, rmc,ssh, ftp...)
viosecure -firewall -view display current firewall rules
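The adapter ID mentioned for the lsmap output (C14 in U8204.E8A.0680E82-V1-C14-T1) can be cut out of a location code with sed. A minimal sketch, assuming the code contains a single "-C<number>" part; slot_from_loc is a hypothetical helper name.

```shell
# slot_from_loc LOCATION_CODE
# Prints the digits that follow "-C" in a location code.
slot_from_loc() {
    printf '%s\n' "$1" | sed -n 's/.*-C\([0-9][0-9]*\).*/\1/p'
}

slot_from_loc U8204.E8A.0680E82-V1-C14-T1
```

For the location code above this prints 14.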
------------------------------------
mirroring vio server rootvg:
1. extendvg -f rootvg hdisk2 <--if pp limitation problem: chvg -factor 6 rootvg, then extendvg
2. mirrorios -defer hdisk2 <--mirror rootvg to hdisk2; -defer is used, as no need to reboot since VIOS 1.5 :)
(use the -f only if required, which will do a reboot without prompting you to continue)
3. bootlist -mode normal -ls <--checking bootlist
------------------------------------
Debugging VIO problems:
I. truss:
$ oem_setup_env
# truss /usr/ios/cli/ioscli <failing_padmin_command> (or truss -feal /usr/ios/cli/ioscli <failing_padmin_command>)
---------
II. CLI_DEBUG=33
By exporting CLI_DEBUG=33, we can see which AIX command is used in the background of VIO command. After running that AIX command we can get more info.
For example:
1. mkvdev -vdev hdiskpower1 -vadapter vhost0 -dev testdisk <--running this command does not give enough info about the problem
*******************************************************************************
The command's response was not recognized. This may or may not indicate a problem.
*******************************************************************************
2. export CLI_DEBUG=33 <--exporting CLI_DEBUG=33
3. mkvdev -vdev hdiskpower1 -vadapter vhost0 -dev testdisk <--running again, we can see which AIX command will be invoked
AIX: "lspv -l hdiskpower1 2>&1 | grep 0516-320"
AIX: "export LANG=C;/usr/sbin/pooladm -I pool querydisk /dev/hdiskpower1"
AIX: "/usr/sbin/lquerypv -V hdiskpower1"
AIX: "mkdev -V hdiskpower1 -p vhost0 -l testdisk " <--this command will be needed
*******************************************************************************
The command's response was not recognized. This may or may not indicate a problem.
*******************************************************************************
4. oem_setup_env <--enabling root environment
5. mkdev -V hdiskpower1 -p vhost0 -l testdisk <--running AIX command and it shows more detailed info
Method error (/usr/lib/methods/cfg_vt_scdisk):
0514-012 Cannot open a file or device.
In this case the cause was that I had forgotten to set no_reserve for the given disk on the other VIOS. The other VIOS was already using this disk (with a reservation), so the disk was locked and could not be configured here.
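When the debug output is long, the underlying AIX commands can be filtered out of it. A sketch that assumes the debug lines look exactly like the ones shown above (AIX: "..."); aix_commands is a hypothetical helper that reads the captured output on stdin.

```shell
# aix_commands
# Prints only the text between the quotes of lines of the form: AIX: "..."
aix_commands() {
    sed -n 's/^AIX: "\(.*\)"[[:space:]]*$/\1/p'
}

# Sample input copied from the debug output above
aix_commands <<'EOF'
AIX: "lspv -l hdiskpower1 2>&1 | grep 0516-320"
AIX: "/usr/sbin/lquerypv -V hdiskpower1"
The command's response was not recognized. This may or may not indicate a problem.
EOF
```

Each printed line is a command that can then be run by hand after oem_setup_env, as in step 5.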
------------------------------------
Creating a client LPAR via VIO server:
planned devices for client LPAR:
1 virtual Ethernet
2 SCSI:
-for virtual disk
-for virtual optical device (cd)
1. find out what will be the client partition id (on HMC), because it will be needed when creating adapters for the new client LPAR
2. on VIO server (HMC) create virtual server adapters:
- virtual Ethernet: for me it was enough for inter-LPAR communication, so tagging is not needed (only remember the PVID)
- virtual SCSI for hdisk and optical device (cd): here the planned client partition id should be set
3. on VIO server configure the devices and mappings, after that 'cfgdev':
-virtual Ethernet:
set ip: chdev -l en19 -a netaddr=10.10.10.26 -a netmask=255.255.255.0 -a state=up
-virtual SCSI for hdisk:
map disk for rootvg: mkvdev -vdev hdisk45 -vadapter vhost1 -dev bb_lpar_rootvg
-virtual optical device:
create a file backed optical device, for iso images: mkvdev -fbo -vadapter vhost1
copy the iso image to /var/vio/VMLibrary (lsrep)
load the image into the vtopt0 device: loadopt -vtd vtopt0 -disk dvd.1022A4_OBETA_710.iso (lsmap -all will show it)
4. create client LPAR on HMC:
partition id should be as planned (to match with the above created adapters)
set processor, memory... physical I/O is not needed
create virtual Ethernet and SCSI (it should be untagged as created on VIO Server with the same PVID)
create SCSI adapter: 1 is enough for disk and optical device
LHEA is not needed and other settings were not changed
5. activate profile
go into SMS -> choose cdrom -> install AIX
6. on the new client LPAR
set ip, hostname, routing...
set ip: chdev -l en0 -a netaddr=10.10.10.25 -a netmask=255.255.255.0 -a state=up
check ping from vio, then ssh is possible to new LPAR
------------------------------------
Network Time Protocol config:
1. vi /home/padmin/config/ntp.conf <--edit ntp.conf file
content should look like this:
server ptbtime1.ptb.de
server ptbtime2.ptb.de
driftfile /home/padmin/config/ntp.drift
tracefile /home/padmin/config/ntp.trace
logfile /home/padmin/config/ntp.log
2. startnetsvc xntpd <--start xntpd daemon
3. cat /home/padmin/config/ntp.log <--check log for errors
if you see this:
time error 3637.530348 is way too large <--if the difference between local and timeserver time is too large, synchronization cannot occur
4. chdate 1206093607 <--change clock manually
Thu Dec 6 09:36:16 CST 2007
5. cat /home/padmin/config/ntp.log <--check log again
synchronized to 9.3.4.7, stratum=2
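The argument given to chdate in step 4 is a packed timestamp. Judging from the date printed after it (Dec 6 09:36 2007), it is read here as mmddHHMMyy. The sketch below decodes such a value for checking before it is applied; decode_chdate is a hypothetical helper and assumes a two-digit year in the 2000s.

```shell
# decode_chdate STAMP
# Turns an mmddHHMMyy stamp into a readable "20yy-mm-dd HH:MM" string.
decode_chdate() {
    stamp=$1
    mm=$(printf '%s\n' "$stamp" | cut -c1-2)    # month
    dd=$(printf '%s\n' "$stamp" | cut -c3-4)    # day
    HH=$(printf '%s\n' "$stamp" | cut -c5-6)    # hour
    MM=$(printf '%s\n' "$stamp" | cut -c7-8)    # minute
    yy=$(printf '%s\n' "$stamp" | cut -c9-10)   # two-digit year
    echo "20$yy-$mm-$dd $HH:$MM"
}

decode_chdate 1206093607
```

For the value used above this prints 2007-12-06 09:36.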
------------------------------------
VIRTUAL SCSI
Virtual SCSI is based on a client/server relationship. The Virtual I/O Server owns the physical resources and acts as server or, in SCSI terms, target device. The client logical partitions access the virtual SCSI backing storage devices provided by the Virtual I/O Server as clients.
Virtual SCSI server adapters can be created only in Virtual I/O Server. For HMC-managed systems, virtual SCSI adapters are created and assigned to logical partitions using partition profiles.
The vhost SCSI adapter is the same as a normal SCSI adapter. You can have multiple disks assigned to it. Usually one virtual SCSI server adapter mapped to one virtual SCSI client adapter will be configured, mapping backing devices through to individual LPARs. It is possible to map these virtual SCSI server adapters to multiple LPARs, which is useful for creating virtual optical and/or tape devices, allowing removable media devices to be shared between multiple client partitions.
on VIO server:
root@vios1: / # lsdev -Cc adapter
vhost0 Available Virtual SCSI Server Adapter
vhost1 Available Virtual SCSI Server Adapter
vhost2 Available Virtual SCSI Server Adapter
The client partition accesses its assigned disks through a virtual SCSI client adapter. The virtual SCSI client adapter sees the disks, logical volumes or file-backed storage through this virtual adapter as virtual SCSI disk devices.
on VIO client:
root@aix21: / # lsdev -Cc adapter
vscsi0 Available Virtual SCSI Client Adapter
root@aix21: / # lscfg -vpl hdisk2
hdisk2 U9117.MMA.06B5641-V6-C13-T1-L890000000000 Virtual SCSI Disk Drive
In SCSI terms:
virtual SCSI server adapter: target
virtual SCSI client adapter: initiator
(Analogous to server client model, where client is the initiator.)
Physical disks presented to the Virtual I/O Server can be exported and assigned to a client partition in a number of different ways:
- The entire disk is presented to the client partition.
- The disk is divided into several logical volumes, which can be presented to a single client or multiple different clients.
- With the introduction of Virtual I/O Server 1.5, files can be created on these disks and file-backed storage can be created.
- With the introduction of Virtual I/O Server 2.2 Fixpack 24 Service Pack 1 logical units from a shared storage pool can be created.
The IVM and HMC environments present two different interfaces for storage management under different names. The Storage Pool interface under IVM is essentially the same as LVM under HMC, and the terms are sometimes used interchangeably. So volume group can refer to both volume groups and storage pools, and logical volume can refer to both logical volumes and storage pool backing devices.
Once these virtual SCSI server/client adapter connections have been set up, one or more backing devices (whole disks, logical volumes or files) can be presented using the same virtual SCSI adapter.
When using Live Partition Mobility, storage needs to be assigned to the Virtual I/O Servers on the target server.
----------------------------
File Backed Virtual SCSI Devices
Virtual I/O Server (VIOS) version 1.5 introduced file-backed virtual SCSI devices. These virtual SCSI devices serve as disks or optical media devices for clients.
In the case of file-backed virtual disks, clients are presented with a file from the VIOS that they access as a SCSI disk. With file-backed virtual optical devices, you can store, install and back up media on the VIOS, and make it available to clients.
----------------------------
Check VSCSI adapter mapping on client:
root@bb_lpar: / # echo "cvai" | kdb | grep vscsi <--cvai is a kdb subcommand
read vscsi_scsi_ptrs OK, ptr = 0xF1000000C01A83C0
vscsi0 0x000007 0x0000000000 0x0 aix-vios1->vhost2 <--shows which vhost is used on which vio server for this client
vscsi1 0x000007 0x0000000000 0x0 aix-vios1->vhost1
vscsi2 0x000007 0x0000000000 0x0 aix-vios2->vhost2
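The kdb output above can be reduced to a simple client adapter -> VIOS -> vhost list. A minimal sketch, assuming the output columns look like the sample shown here (mapping in the 5th column); vscsi_map is a hypothetical helper name.

```shell
# vscsi_map: read "cvai" output on stdin and print
#   <client adapter> <VIO server> <vhost adapter>
# Assumption: the 5th column has the form <vios>-><vhost>, as in the sample above.
vscsi_map() {
    awk '
        $1 ~ /^vscsi[0-9]+$/ && $5 ~ /->/ {
            split($5, part, "->")      # part[1] = VIOS name, part[2] = vhost
            print $1, part[1], part[2]
        }
    '
}
```

On the client it could be fed with: echo "cvai" | kdb | vscsi_map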
----------------------------
Managing VSCSI devices (server-client mapping)
1. HMC -> VIO Server -> DLPAR -> Virtual Adapter (create vscsi adapter, name the client which can use it, then create the same in profile)
(the profile can be updated: configuration -> save current config.)
(in case of an optical device, tick the "Any client partition can connect" option)
2. HMC -> VIO Client -> DLPAR -> Virtual Adapter (create the same adapter as above, the ids should be mapped, do it in the profile as well)
3. cfgdev (VIO server), cfgmgr (client) <--it will bring up vhostX on vio server, vscsiX on client
4. create needed disk assignments:
-using physical disks:
mkvdev -vdev hdisk34 -vadapter vhost0 -dev vclient_disk <--giving a name with the -dev flag makes identification easier
rmvdev -vdev <backing dev.> <--backing dev. can be checked with lsmap -all (here hdisk34; vclient_disk is the virtual target device)
-using logical volumes:
mkvg -vg testvg_vios hdisk34 <--creating vg for lv
lsvg <--listing vgs
reducevg <vg> <disk> <--removing a disk from a vg (removing the last disk deletes the vg)
mklv -lv testlv_client testvg_vios 10G <--creating an lv that will be mapped to the client
lsvg -lv <vg> <--lists lvs under a vg
rmlv <lv> <--removes an lv
mkvdev -vdev testlv_client -vadapter vhost0 -dev <any_name> <--giving a name with the -dev flag makes identification easier
(here the backing device is an lv (testlv_client))
rmvdev -vdev <back. dev.> <--removes an assignment to the client
-using logical volumes just with storage pool commands:
(vg=sp, lv=bd)
mksp <vgname> <disk> <--creating a vg (sp)
lssp <--listing storage pools (vgs)
chsp -add -sp <sp> PhysicalVolume <--adding a disk to the sp (vg)
chsp -rm -sp bb_sp hdisk2 <--removing hdisk2 from bb_sp (storage pool)
mkbdsp -bd <lv> -sp <vg> 10G <--creates an lv with given size in the sp
lssp -bd -sp <vg> <--lists lvs in the given vg (sp)
rmbdsp -bd <lv> -sp <vg> <--removes an lv from the given vg (sp)
mkvdev..., rmvdev... also applies
-using file backed storage pool
first a normal (LV) storage pool should be created with: mkvg or mksp, after that:
mksp -fb <fb sp name> -sp <vg> -size 20G <--creates a file backed storage pool in the given storage pool with given size
(it will look like an lv, and a fs will be created automatically as well)
lssp <--it will show as FBPOOL
chsp -add -sp clientData -size 1G <--increase the size of the file storage pool (clientData) by 1G
mkbdsp -sp fb_testvg -bd fb_bb -vadapter vhost2 10G <--it will create a file backed device and assigns it to the given vhost
mkbdsp -sp fb_testvg -bd fb_bb1 -vadapter vhost2 -tn balazs 8G <--it will also specify a virt. target device name (-tn)
lssp -bd -sp fb_testvg <--lists the lvs (backing devices) of the given sp
rmbdsp -sp fb_testvg -bd fb_bb1 <--removes the given lv (bd) from the sp
rmsp <file sp name> <--removes the given file storage pool
removing a vhost adapter together with all of its assignments:
rmdev -dev vhost1 -recursive
----------------------------
On client partitions, MPIO for virtual SCSI devices currently supports only failover mode (which means only one path is active at a time):
root@bb_lpar: / # lsattr -El hdisk0
PCM PCM/friend/vscsi Path Control Module False
algorithm fail_over Algorithm True
----------------------------
Multipathing with dual VIO config:
on VIO SERVER:
# lsdev -dev <hdisk_name> -attr <--checking disk attributes
# lsdev -dev <fscsi_name> -attr <--checking FC attributes
# chdev -dev fscsi0 -attr fc_err_recov=fast_fail dyntrk=yes -perm <--reboot is needed for these
fc_err_recov=fast_fail <--in case of a link event IO will fail immediately
dyntrk=yes <--allows the VIO server to tolerate cabling changes in the SAN
# chdev -dev hdisk3 -attr reserve_policy=no_reserve <--each disk must be set to no_reserve
reserve_policy=no_reserve <--if this is configured, both VIO servers can present the same disk to the client
on VIO client:
# chdev -l vscsi0 -a vscsi_path_to=30 -a vscsi_err_recov=fast_fail -P <--path timeout checks the health of the VIOS and detects if the VIO Server adapter isn't responding
vscsi_path_to=30 <--by default it is disabled (0), each client adapter must be configured, minimum is 30
vscsi_err_recov=fast_fail <--failover will happen immediately rather than delayed
# chdev -l hdisk0 -a queue_depth=20 -P <--it must match the queue depth value used for the physical disk on the VIO Server
queue_depth <--it determines how many requests will be queued on the disk
# chdev -l hdisk0 -a hcheck_interval=60 -a hcheck_mode=nonactive -P <--health check automatically updates the path state
(otherwise a failed path must be re-enabled manually)
hcheck_interval=60 <--how often hcheck runs (in seconds), each disk must be configured (hcheck_interval=0 means it is disabled)
hcheck_mode=nonactive <--hcheck is performed on nonactive paths (paths with no active IO)
Never set the hcheck_interval lower than the read/write timeout value of the underlying physical disk on the Virtual I/O Server. Otherwise, an error detected by the Fibre Channel adapter causes new healthcheck requests to be sent before the running requests time out.
The minimum recommended value for the hcheck_interval attribute is 60 for both Virtual I/O and non Virtual I/O configurations.
In the event of adapter or path issues, setting the hcheck_interval too low can cause severe performance degradation or possibly cause I/O hangs.
It is best not to configure more than 4 to 8 paths per LUN (to avoid too much health check IO), and to set the hcheck_interval to 60 in the client partition and on the Virtual I/O Server.
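The hcheck_interval recommendation above can be checked from lsattr output. A minimal sketch, assuming the attribute name is in the first column and its value in the second (as in the lsattr samples in these notes); hcheck_status is a hypothetical helper name and the threshold of 60 is taken from the recommendation above.

```shell
# hcheck_status: read "lsattr -El <hdisk>" output on stdin and print one word:
#   ok       - hcheck_interval is 60 or higher
#   too_low  - hcheck_interval is set, but lower than 60
#   disabled - hcheck_interval is 0
#   missing  - no hcheck_interval attribute was found
hcheck_status() {
    awk '
        $1 == "hcheck_interval" {
            found = 1
            value = $2 + 0             # force numeric comparison
            if (value == 0)      print "disabled"
            else if (value < 60) print "too_low"
            else                 print "ok"
        }
        END { if (!found) print "missing" }
    '
}
```

On the client it could be used as: lsattr -El hdisk0 | hcheck_status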
----------------------------
TESTING PATH PRIORITIES:
By default all paths are defined with priority 1, meaning that traffic will go through the first path.
To control which path is used, the path priority has to be changed.
In this test the priority of the vscsi0 path remains at 1, so it is the primary path.
The priority of the vscsi1 path is changed to 2, so it has lower priority.
PREPARATION ON CLIENT:
# lsattr -El hdisk1 | grep hcheck
hcheck_cmd test_unit_rdy <--hcheck is configured, so path should come back automatically from failed state
hcheck_interval 60
hcheck_mode nonactive
# chpath -l hdisk1 -p vscsi1 -a priority=2 <--I changed priority=2 on vscsi1 (by default both paths are priority=1)
# lspath -AHE -l hdisk1 -p vscsi0
priority 1 Priority True
# lspath -AHE -l hdisk1 -p vscsi1
priority 2 Priority True
So, configuration looks like this:
VIOS1 -> vscsi0 -> priority 1
VIOS2 -> vscsi1 -> priority 2
TEST 1:
1. ON VIOS2: # lsmap -all <--checking disk mapping on VIOS2
VTD testdisk
Status Available
LUN 0x8200000000000000
Backing device hdiskpower1
...
2. ON VIOS2: # rmdev -dev testdisk <--removing disk mapping from VIOS2
3. ON CLIENT: # lspath
Enabled hdisk1 vscsi0
Failed hdisk1 vscsi1 <--it will show a failed path on vscsi1 (this is coming from VIOS2)
4. ON CLIENT: # errpt <--error report will show "PATH HAS FAILED"
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
DE3B8540 0324120813 P H hdisk1 PATH HAS FAILED
5. ON VIOS2: # mkvdev -vdev hdiskpower1 -vadapter vhost0 -dev testdisk <--re-create the disk mapping on VIOS2
6. ON CLIENT: # lspath <--in 30 seconds path will come back automatically
Enabled hdisk1 vscsi0
Enabled hdisk1 vscsi1 <--because of hcheck, path came back automatically (no manual action was needed)
7. ON CLIENT: # errpt <--error report will show path has been recovered
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
F31FFAC3 0324121213 I H hdisk1 PATH HAS RECOVERED
TEST 2:
I did the same on VIOS1 (rmdev of the disk mapping), which has path priority 1 (IO is going there by default)
ON CLIENT: # lspath
Failed hdisk1 vscsi0
Enabled hdisk1 vscsi1
ON CLIENT: # errpt <--an additional disk operation error will be in errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
DCB47997 0324121513 T H hdisk1 DISK OPERATION ERROR
DE3B8540 0324121513 P H hdisk1 PATH HAS FAILED
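During tests like these, the failed paths can be picked out of the lspath output. A minimal sketch, assuming the three-column output shown above (status, disk, adapter); failed_paths is a hypothetical helper name.

```shell
# failed_paths: read lspath output on stdin and print "<disk> <adapter>"
# for every path whose status is "Failed". Prints nothing if all paths are fine.
failed_paths() {
    awk '$1 == "Failed" { print $2, $3 }'
}
```

On the client it could be used as: lspath | failed_paths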
----------------------------
How to change a VSCSI adapter on client:
# lspath
Enabled hdisk0 vscsi0
Enabled hdisk0 vscsi2 <--we want to change vscsi2 to vscsi1
On VIO client:
1. # rmpath -p vscsi2 -d <--remove paths from vscsi2 adapter
2. # rmdev -dl vscsi2 <--remove adapter
On VIO server:
3. # lsmap -all <--check assignment and vhost device
4. # rmdev -dev vhost0 -recursive <--remove assignment and vhost device
On HMC:
5. Remove deleted adapter from client (from profile too)
6. Remove deleted adapter from VIOS (from profile too)
7. Create new adapter on client (in profile too) <--cfgmgr on client
8. Create new adapter on VIOS (in profile too) <--cfgdev on VIO server
On VIO server:
9. # mkvdev -vdev hdiskpower0 -vadapter vhost0 -dev rootvg_hdisk0 <--create new assignment
# lspath
Enabled hdisk0 vscsi0
Enabled hdisk0 vscsi1 <--vscsi1 is there (cfgmgr may be needed)
----------------------------
Assigning and moving DVD RAM between LPARS
1. lsdev -type optical <--check if VIOS owns an optical device (you should see something like: cd0 Available SATA DVD-RAM Drive)
2. lsmap -all <--to see if cd0 is already mapped and which vhost to use for assignment (lsmap -all | grep cd0)
3. mkvdev -vdev cd0 -vadapter vhost0 <--it will create vtoptX as a virtual target device (check with lsmap -all )
4. cfgmgr (on client lpar) <--bring up cd0 device on client (before moving cd0, rmdev the device on the client first)
5. rmdev -dev vtopt0 -recursive <--to move cd0 to another client, remove assignment from vhost0
6. mkvdev -vdev cd0 -vadapter vhost1 <--create new assignment to vhost1
7. cfgmgr (on other client lpar) <--bring up cd0 device on other client
(Because the VIO server adapter is configured with the "Any client partition can connect" option, these adapter pairs are not suited for client disks.)
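The VIOS side of the move (steps 5 and 6) can be written down as a small dry-run helper that only prints the commands instead of running them. This is a sketch under assumptions: optical_move_plan is a made-up name, and the device names must be taken from lsmap -all first.

```shell
# optical_move_plan <vtd> <backing_dev> <new_vhost>
# Prints (does not execute) the VIOS commands from steps 5 and 6 above.
# The optical device must already be removed on the old client (see step 4).
optical_move_plan() {
    if [ "$#" -ne 3 ]; then
        echo "usage: optical_move_plan <vtd> <backing_dev> <new_vhost>" >&2
        return 1
    fi
    echo "rmdev -dev $1 -recursive"        # step 5: remove the old assignment
    echo "mkvdev -vdev $2 -vadapter $3"    # step 6: assign to the new vhost
}
```

For example, optical_move_plan vtopt0 cd0 vhost1 prints the two commands used in the steps above, which can be reviewed before running them on the VIO server.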
----------------------------
NPIV (Virtual Fibre Channel Adapter)
With NPIV, you can configure the managed system so that multiple logical partitions can access independent physical storage through the same physical fibre channel adapter. (NPIV means N_Port ID Virtualization. N_Port ID is a storage term, for node port ID, to identify ports on the node (FC adapter) in the SAN.)
To access physical storage in a typical storage area network (SAN) that uses fibre channel, the physical storage is mapped to logical units (LUNs) and the LUNs are mapped to the ports of physical fibre channel adapters. Each physical port on each physical fibre channel adapter is identified using one worldwide port name (WWPN).
NPIV is a standard technology for fibre channel networks that enables you to connect multiple logical partitions to one physical port of a physical fibre channel adapter. Each logical partition is identified by a unique WWPN, which means that you can connect each logical partition to independent physical storage on a SAN.
To enable NPIV on the managed system, you must create a Virtual I/O Server logical partition (version 2.1, or later) that provides virtual resources to client logical partitions. You assign the physical fibre channel adapters (that support NPIV) to the Virtual I/O Server logical partition. Then, you connect virtual fibre channel adapters on the client logical partitions to virtual fibre channel adapters on the Virtual I/O Server logical partition. A virtual fibre channel adapter is a virtual adapter that provides client logical partitions with a fibre channel connection to a storage area network through the Virtual I/O Server logical partition. The Virtual I/O Server logical partition provides the connection between the virtual fibre channel adapters on the Virtual I/O Server logical partition and the physical fibre channel adapters on the managed system.
The following figure shows a managed system configured to use NPIV:
on VIO server:
root@vios1: / # lsdev -Cc adapter
fcs0 Available 01-00 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
fcs1 Available 01-01 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
vfchost0 Available Virtual FC Server Adapter
vfchost1 Available Virtual FC Server Adapter
vfchost2 Available Virtual FC Server Adapter
vfchost3 Available Virtual FC Server Adapter
vfchost4 Available Virtual FC Server Adapter
on VIO client:
root@aix21: /root # lsdev -Cc adapter
fcs0 Available C6-T1 Virtual Fibre Channel Client Adapter
fcs1 Available C7-T1 Virtual Fibre Channel Client Adapter
Two unique WWPNs (world-wide port names) starting with the letter "c" are generated by the HMC for the VFC client adapter. The pair is critical and both must be zoned if Live Partition Mobility is planned to be used. The virtual I/O client partition uses one WWPN to log into the SAN at any given time. The other WWPN is used when the client logical partition is moved to another managed system using PowerVM Live Partition Mobility.
lscfg -vpl fcsX will show only the first WWPN
fcstat fcsX will show only the active WWPN
Both commands show only one WWPN: fcstat always shows the active WWPN that is currently in use (which will change after an LPM), while lscfg shows, as a static value, only the first WWPN assigned to the HBA.
Configure one VFC client adapter per physical port per client partition; a maximum of 64 active VFC client adapters is supported per physical port. There is always a one-to-one relationship between the virtual Fibre Channel client adapter and the virtual Fibre Channel server adapter.
The difference between traditional redundancy with SCSI adapters and the NPIV technology using virtual Fibre Channel adapters is that the redundancy occurs on the client, because only the client recognizes the disk. The Virtual I/O Server is essentially just a pass-through managing the data transfer through the POWER hypervisor. When using Live Partition Mobility, storage moves to the target server without requiring a reassignment (unlike with virtual SCSI), because the virtual Fibre Channel adapters have their own WWPNs that move with the client partitions to the target server.
If an FC client adapter is created with DLPAR and then created again in the profile to make it persistent across restarts, a different pair of virtual WWPNs is generated for the adapter in the profile. To prevent this undesired situation, which would require another SAN zoning and storage configuration, make sure to save any virtual Fibre Channel client adapter DLPAR changes into a new partition profile by selecting Configuration -> Save Current Configuration, and change the default partition profile to the new profile.
-----------------------------------------------------
Check NPIV adapter mapping on client:
root@bb_lpar: / # echo "vfcs" | kdb <--vfcs is a kdb subcommand
...
NAME ADDRESS STATE HOST HOST_ADAP OPENED NUM_ACTIVE
fcs0 0xF1000A000033A000 0x0008 aix-vios1 vfchost8 0x01 0x0000 <--shows which vfchost is used on vio server for this client
fcs1 0xF1000A0000338000 0x0008 aix-vios2 vfchost6 0x01 0x0000
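As with the vscsi check, the kdb output can be reduced to a client adapter -> VIOS -> vfchost list. A minimal sketch, assuming the column order of the sample above (HOST in the 4th, HOST_ADAP in the 5th column); vfc_map is a hypothetical helper name.

```shell
# vfc_map: read "vfcs" output on stdin and print
#   <client fcs adapter> <VIO server> <vfchost adapter>
# Assumption: columns are NAME ADDRESS STATE HOST HOST_ADAP ..., as in the sample.
vfc_map() {
    awk '$1 ~ /^fcs[0-9]+$/ && NF >= 5 { print $1, $4, $5 }'
}
```

On the client it could be fed with: echo "vfcs" | kdb | vfc_map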
-----------------------------------------------------
NPIV creation and how they are related together:
FCS0: Physical FC Adapter installed on the VIOS
VFCHOST0: Virtual FC (Server) Adapter on VIOS
FCS0 (on client): Virtual FC adapter on VIO client
Creating NPIV adapters:
0. install physical FC Adapters to VIO Servers
1. HMC -> VIO Server -> DLPAR -> Virtual Adapter (don't forget profile (save current))
2. HMC -> VIO Client -> DLPAR -> Virtual Adapter (the ids should be mapped, don't forget profile)
3. cfgdev (VIO server), cfgmgr (client) <--it will bring up the new adapter vfchostX on vio server, fcsX on client
4. check status:
lsdev -dev vfchost* <--lists virtual FC server adapters
lsmap -vadapter vfchost0 -npiv <--gives more detail about the specified virtual FC server adapter
lsdev -dev fcs* <--lists physical FC server adapters
lsnports <--checks NPIV readiness (fabric=1 means npiv ready)
5. vfcmap -vadapter vfchost0 -fcp fcs0 <--mapping the virtual FC adapter to the VIO's physical FC
6. lsmap -all -npiv <--checks the mapping
7. HMC -> VIO Client -> get the WWN of the adapter <--if no LPM will be used only the first WWN is needed
8. SAN zoning
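The NPIV readiness check in step 4 (lsnports) can be filtered so that only the ready ports are listed. This is a sketch under assumptions: it expects the first output line to be a header that contains a column named "fabric" and the adapter name in the first column; npiv_ready_ports is a made-up helper name.

```shell
# npiv_ready_ports: read lsnports output on stdin and print the names of the
# ports whose "fabric" column is 1 (fabric=1 means NPIV ready).
# Assumption: line 1 is a header row and one of its columns is called "fabric".
npiv_ready_ports() {
    awk '
        NR == 1 {
            # find the position of the "fabric" column from the header
            for (i = 1; i <= NF; i++) if ($i == "fabric") col = i
            next
        }
        col && $col == 1 { print $1 }
    '
}
```

On the VIO server it could be used as: lsnports | npiv_ready_ports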
-----------------------------------------------------
Replacement of a physical FC adapter with NPIV
1. identify the adapter
$ lsdev -dev fcs4 -child
name status description
fcnet4 Defined Fibre Channel Network Protocol Device
fscsi4 Available FC SCSI I/O Controller Protocol Device
2. unconfigure the mappings
$ rmdev -dev vfchost0 -ucfg
vfchost0 Defined
3. FC adapters and their child devices must be unconfigured or deleted
$ rmdev -dev fcs4 -recursive -ucfg
fscsi4 Defined
fcnet4 Defined
fcs4 Defined
4. diagmenu
DIAGNOSTIC OPERATING INSTRUCTIONS -> Task Selection -> Hot Plug Task -> PCI Hot Plug Manager -> Replace/Remove a PCI Hot Plug Adapter.
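Before starting the hot plug task it helps to confirm that nothing under the adapter is still Available. A minimal sketch, assuming the three-column layout (name, status, description) of the lsdev -dev fcs4 -child listing in step 1; the function name still_configured is made up, and the sample is the listing from step 1, taken before the rmdev in step 3.

```shell
#!/bin/sh
# Sketch: report child devices that are still configured (status Available).
# Assumes the layout shown in step 1: a header row, then "name status description".

still_configured() {
    # Skip the header row; print the name of every device whose status is Available.
    awk 'NR > 1 && $2 == "Available" { print $1 }'
}

# Sample input copied from step 1 above.
sample='name status description
fcnet4 Defined Fibre Channel Network Protocol Device
fscsi4 Available FC SCSI I/O Controller Protocol Device'

busy=$(printf '%s\n' "$sample" | still_configured)
if [ -n "$busy" ]; then
    echo "still configured: $busy"
else
    echo "all child devices are Defined"
fi
# prints: still configured: fscsi4
```

After step 3 has run, the same check against fresh lsdev output should report that all child devices are Defined.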
-----------------------------------------------------
With NPIV, you can configure the managed system so that multiple logical partitions can access independent physical storage through the same physical fibre channel adapter. (NPIV means N_Port ID Virtualization. N_Port ID is a storage term for node port ID; it identifies the ports of a node (FC adapter) in the SAN.)
To access physical storage in a typical storage area network (SAN) that uses fibre channel, the physical storage is mapped to logical units (LUNs) and the LUNs are mapped to the ports of physical fibre channel adapters. Each physical port on each physical fibre channel adapter is identified using one worldwide port name (WWPN).
NPIV is a standard technology for fibre channel networks that enables you to connect multiple logical partitions to one physical port of a physical fibre channel adapter. Each logical partition is identified by a unique WWPN, which means that you can connect each logical partition to independent physical storage on a SAN.
To enable NPIV on the managed system, you must create a Virtual I/O Server logical partition (version 2.1, or later) that provides virtual resources to client logical partitions. You assign the physical fibre channel adapters (that support NPIV) to the Virtual I/O Server logical partition. Then, you connect virtual fibre channel adapters on the client logical partitions to virtual fibre channel adapters on the Virtual I/O Server logical partition. A virtual fibre channel adapter is a virtual adapter that provides client logical partitions with a fibre channel connection to a storage area network through the Virtual I/O Server logical partition. The Virtual I/O Server logical partition provides the connection between the virtual fibre channel adapters on the Virtual I/O Server logical partition and the physical fibre channel adapters on the managed system.
The following figure shows a managed system configured to use NPIV:
on VIO server:
root@vios1: / # lsdev -Cc adapter
fcs0 Available 01-00 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
fcs1 Available 01-01 8Gb PCI Express Dual Port FC Adapter (df1000f114108a03)
vfchost0 Available Virtual FC Server Adapter
vfchost1 Available Virtual FC Server Adapter
vfchost2 Available Virtual FC Server Adapter
vfchost3 Available Virtual FC Server Adapter
vfchost4 Available Virtual FC Server Adapter
on VIO client:
root@aix21: /root # lsdev -Cc adapter
fcs0 Available C6-T1 Virtual Fibre Channel Client Adapter
fcs1 Available C7-T1 Virtual Fibre Channel Client Adapter
Two unique WWPNs (world-wide port names) starting with the letter "c" are generated by the HMC for each VFC client adapter. The pair is critical, and both WWPNs must be zoned if Live Partition Mobility is planned. The virtual I/O client partition uses one WWPN to log into the SAN at any given time. The other WWPN is used when the client logical partition is moved to another managed system using PowerVM Live Partition Mobility.
lscfg -vpl fcsX shows only the first WWPN
fcstat fcsX shows only the active WWPN
Each command shows a single WWPN. fcstat always shows the WWPN that is currently in use (which changes after an LPM), whereas lscfg shows a static value: the first WWPN assigned to the HBA.
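Comparing the two values tells you which WWPN of the pair the partition is currently logged in with. The following is a rough sketch, assuming you have already copied the WWPN out of the lscfg and fcstat output by hand; the function names norm_wwpn and wwpn_in_use are made up, and the WWPN values in the usage lines are hypothetical.

```shell
#!/bin/sh
# Sketch: decide whether the first or the second WWPN of a VFC client adapter is in use.
# Inputs are WWPN strings copied by hand from lscfg (first WWPN) and fcstat (active WWPN).

# Normalise a WWPN: lower-case it, drop an optional 0x prefix and any colons.
norm_wwpn() {
    printf '%s\n' "$1" | tr 'A-Z' 'a-z' | sed -e 's/^0x//' -e 's/://g'
}

wwpn_in_use() {
    first=$(norm_wwpn "$1")    # value reported by lscfg -vpl fcsX
    active=$(norm_wwpn "$2")   # value reported by fcstat fcsX
    if [ "$first" = "$active" ]; then
        echo "first WWPN active"
    else
        # Per the text above, the only other candidate is the second WWPN of the pair.
        echo "second WWPN active"
    fi
}

# Hypothetical values, for illustration only.
wwpn_in_use C0507601A6730012 0xc0507601a6730012
wwpn_in_use C0507601A6730012 0xC0507601A6730013
# prints:
# first WWPN active
# second WWPN active
```

A result of "second WWPN active" is what you would expect to see after the partition has been moved with LPM.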
The guideline is one VFC client adapter per physical port per client partition, with a maximum of 64 active VFC client adapters per physical port. There is always a one-to-one relationship between the virtual Fibre Channel client adapter and the virtual Fibre Channel server adapter.
The difference between traditional redundancy with SCSI adapters and NPIV with virtual Fibre Channel adapters is that with NPIV the redundancy is handled on the client, because only the client recognizes the disk. The Virtual I/O Server is essentially a pass-through that manages the data transfer through the POWER hypervisor. With Live Partition Mobility, the storage moves to the target server without requiring reassignment (unlike virtual SCSI), because the virtual Fibre Channel adapters have their own WWPNs, which move with the client partition to the target server.