AIX for System Administrators: June 2011

NETWORK - SSH, X11

Using X11 forwarding in SSH

The SSH protocol has the ability to securely forward X Window System applications over your encrypted SSH connection, so that you can run an application on the SSH server machine and have it put its windows up on your local machine without sending any X network traffic in the clear.

In order to use this feature, you will need an X display server for your Windows machine, such as Cygwin/X, X-Win32, or Exceed. This will probably install itself as display number 0 on your local machine; if it doesn't, the manual for the X server should tell you what it does do.

You should then tick the ‘Enable X11 forwarding’ box in the Tunnels panel before starting your SSH session. The ‘X display location’ box is blank by default, which means that PuTTY will try to use a sensible default such as :0, which is the usual display location where your X server will be installed. If that needs changing, then change it.

Now you should be able to log in to the SSH server as normal. To check that X forwarding has been successfully negotiated during connection startup, you can check the PuTTY Event Log. It should say something like this:

2001-12-05 17:22:01 Requesting X11 forwarding
2001-12-05 17:22:02 X11 forwarding enabled

If the remote system is Unix or Unix-like, you should also be able to see that the DISPLAY environment variable has been set to point at display 10 or above on the SSH server machine itself:

fred@unixbox:~$ echo $DISPLAY
unixbox:10.0

--------------------

Overview of the X server:

I think your problem is a confusion about how X works, so a few clarifications first:

An "X-Server" is a process which handles and manages a certain (physically available) display. This usually runs on a *client*. Think of an "X-Server" as sort of a driver for a graphics card. (The X-Server is where the keyboard, video and mouse are attached.)

An "X-Client" is a process which uses an X-Server to display (a window with) some information on it. This usually runs on the server. An example would be "xterm" or "aixterm" or "Mozilla", etc.

To tell your xclient which Xserver to use there is an environment variable DISPLAY, which is set pointing to your Xserver:
export DISPLAY="mymachine.withxserverrunning.com:0.0"

means use the Xserver running on this machine and managing display 0 (there could be several) and use screen 0 (mymachine.withxserverrunning.com:0.1 would be screen 1), since displays could consist of several screens (this is: monitors handled by graphics cards). As you see, unlike in Windoze one doesn't need multiheaded graphics cards with dual (several) monitor capabilities to span a graphical environment over several screens, this can be done by run-of-the-mill X-Servers and windowmanagers running on top of them.

You can run your X-Server directly on your server only if you have a graphical terminal (an "lft") attached to it. Check your inventory (man lscfg, man lsdev) to find out if you have one.

If you have none (this is the common case, as servers usually don't come with graphics cards), you will have a machine you work on (if you have to endure common working conditions this is a Windoze machine, if you are lucky this is a real computer running some real OS, Linux or AIX for instance). On this machine (client.yournet.com) you start your X-Server. Start a local X-client (a window), then use some Telnet or similar program to log in to your host (host.yournet.com).

On this host issue an "export DISPLAY=client.yournet.com:0.0" and then an "xterm &".

A window should pop up on your display with an xterm. This xterm is not running on your local machine, but on the server. The process on the server only uses your screen (via your X-Server) to display its content. You can check that by issuing "kill -9 %1" in the first window, which would make the second window vanish.

If it doesn't work as described, issue an "xhost +" on your client machine. X-Windows contains a mechanism to limit access to an X-Server's resources to a defined group of hosts (which is empty by default); the command will enable any host to use the screen.

--------------------

X11 forwarding:
(in PuTTY, X11 forwarding should be enabled and an X server (e.g. Xming) has to be running)

0. Xming

1. ssh settings:
    in sshd_config (/etc/ssh) set: X11Forwarding yes
    stopsrc -s sshd; startsrc -s sshd

2. install X11
    in /mnt/5300-00/installp/ppc: smitty install:
   -X11.base.5.3.0.0.I (this will install some requisites as well from apps, fonts...)
   -X11.apps    (it contains a startx, xauth, xhost commands)
   do an update to the needed TL level

4. startx

5. then login again:
    ssh -X root@aix40
    it did this: 1356-364 /usr/bin/X11/xauth: creating new authority file /.Xauthority

6. xclock :)))))
    echo $DISPLAY showed: localhost:10.0 (I did not set it at all)

(export DISPLAY=localhost:10.0 is perhaps not needed at all)
(It happened that under root xlock worked, but as another user it didn't. After copying the .Xauthority file (from root) it worked.)

--------------
Hostname:Number.Screen

Hostname - where the display is physically attached
Number - ID number of the display server on that host machine
Screen - number of the screen on that host server
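
The Hostname:Number.Screen layout can be taken apart with plain shell parameter expansion. This is only a sketch: parse_display is a made-up helper name, and it assumes that a missing screen means screen 0 and that an empty hostname means the local machine.

```shell
# Hypothetical helper: split a DISPLAY value into host, display number and screen.
parse_display() {
    d=$1
    host=${d%:*}          # everything before the last colon (may be empty)
    rest=${d##*:}         # everything after the last colon, e.g. "10.0"
    number=${rest%%.*}    # display number
    case $rest in
        *.*) screen=${rest#*.} ;;   # explicit screen number
        *)   screen=0 ;;            # assumption: no screen given means screen 0
    esac
    echo "host=${host:-local} display=$number screen=$screen"
}

# Example: parse_display "$DISPLAY"
```

With the value shown earlier, parse_display unixbox:10.0 prints host=unixbox display=10 screen=0.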

xhost command???

-----------------------------------

If everything looks OK, but you receive this:

root@bb_lpar: / # xclock
Error: Can't open display:

Probably the only problem is that you did not use -X: ssh -X root@servername.
When I used -X, the DISPLAY variable was configured automatically:
(I did not set up anything; when I used -X I could see this, but before using -X I received an empty line.)

root@bb_lpar: / # echo $DISPLAY
localhost:10.0
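
The empty-DISPLAY case above can be caught before starting an X client. This is a minimal sketch; check_display is a hypothetical helper name, and the hint it prints simply repeats the advice from this note.

```shell
# Hypothetical helper: complain if DISPLAY is empty, otherwise show its value.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is empty: log in again with ssh -X" >&2
        return 1
    fi
    echo "DISPLAY=$DISPLAY"
}

# Example: check_display && xclock
```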

-----------------------------------

X server problems:
(This is not edited, I received these errors when I tried to config X)

X11.base is needed

./firefox
errors I have received:
1. Gtk-WARNING **: cannot open display        <--(after setting X11Forwarding yes I received other errors)
someone suggested this: xhost +LOCAL (it gives all non-network connections access to the display)

2. Gtk-WARNING **: cannot open display: 0.0   <--suggested solution: export DISPLAY=:0.0

3. Xlib: connection to ":0.0" refused by server
Xlib: No protocol specified

After I gave these commands:
xauth list

startx
xclock                            <--until I gave startx, xclock command did not work

export DISPLAY=localhost:10.0
xhost + localhost
export DISPLAY=10.10.100.96:0.0
xinit

-----------------------------------

Xlib: connection to "localhost:10.0" refused by server
Xlib: Invalid MIT-MAGIC-COOKIE-1 key
Error: Can't open display: localhost:10.0

root@aix10: / # env
DISPLAY=localhost:10.0

You can see in 'ps -ef' that display :10 is already in use:

root@aix10: / # ps -ef | grep ":10"
    root 643132 123006   0   Nov 10      - 79:10 /etc/ncs/llbd
    root 852170 1458410   0   May 22      - 1:10 /usr/lpp/OV/lbin/eaagt/opcmsga
yyxxxxx 999524 1188014   0 10:45:15      - 0:00 /usr/lpp/CTXSmf/slib/ctxlogin -display :10

Solution is to set in /etc/ssh/sshd_config:
X11DisplayOffset 70

Displays will then start from 70 and hopefully will not interfere with Citrix.

-----------------------------------

When doing ssh -X user@host, I received these:

Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: No xauth data; using fake authentication data for X11 forwarding.

$ xclock
X11 connection rejected because of wrong authentication.
X connection to localhost:11.0 broken (explicit kill or server shutdown).
However xclock with ssh -Y user@host worked fine.

After adding "ForwardX11Trusted yes" to /etc/ssh/ssh_config on the client (where I was coming from), it worked well with ssh -X. (This line was missing from ssh_config, so I added it.)

-----------------------------------

HW - IO

I/O - AIO, DIO, CIO, RAW

Filesystem I/O

AIX has special features to enhance the performance of filesystem I/O for general-purpose file access. These features include read ahead, write behind and I/O buffering. Oracle employs its own I/O optimization and buffering that in most cases are redundant to those provided by AIX file systems. Oracle uses buffer cache management (data blocks buffered to shared memory), and AIX uses virtual memory management (data buffered in virtual memory). If both try to manage data caching, it results in wasted memory and CPU, and in suboptimal performance.

It is generally better to allow Oracle to manage I/O buffering, because it has information regarding the context, so it can better optimize memory usage.

Asynchronous I/O

A read can be considered to be synchronous if a disk operation is required to read the data into memory. In this case, application processing cannot continue until the I/O operation is complete. Asynchronous I/O allows applications to initiate read or write operations without being blocked, since all I/O operations are done in the background. This can improve performance, because I/O operations and application processing can run simultaneously.

Asynchronous I/O on filesystems is handled through a kernel process called: aioserver (in this case each I/O is handled by a single kproc)

The minimum number of servers (aioserver) configured when asynchronous I/O is enabled is 1 (minservers). Additional aioservers are started when more asynchronous I/O is requested. The maximum number of servers is controlled by maxservers. aioserver kernel threads do not go away once started, until the system reboots (so with "ps -k" we can see the maximum number of aio servers that were needed concurrently at some time in the past).

How many should you configure?
The rule of thumb is to set the maximum number of servers (maxservers) equal to ten times the number of disks or ten times the number of processors. MinServers would be set at half of this amount. Other than having some more kernel processes hanging around that don't really get used (using a small amount of kernel memory), there is little risk in oversizing MaxServers, so don't be afraid to bump it up.
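
The rule of thumb is simple arithmetic, sketched below. aio_rule_of_thumb is a hypothetical helper name; its argument is whichever count you base the rule on (disks or processors). Remember that in the lsattr output below maxservers is a per-CPU value, so the total would have to be divided by the number of CPUs before passing it to chdev.

```shell
# Hypothetical helper: ten servers per disk (or processor), minservers half of that.
aio_rule_of_thumb() {
    count=$1                 # number of disks or processors
    max=$((count * 10))
    min=$((max / 2))
    echo "maxservers=$max minservers=$min"
}

# Example: aio_rule_of_thumb 4
```

For 4 disks this suggests maxservers=40 and minservers=20.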

root@aix31: / # lsattr -El aio0
autoconfig defined STATE to be configured at system restart True
fastpath   enable State of fast path                       True
kprocprio 39      Server PRIORITY                          True
maxreqs    4096    Maximum number of REQUESTS               True
maxservers 10      MAXIMUM number of servers per cpu        True
minservers 1       MINIMUM number of servers                True

maxreqs    <-max number of aio requests that can be outstanding at one time
maxservers <-if you have 4 CPUs then the max count of aio kernel threads would be 40
minservers <-this amount will start at boot (this is not per CPU)

Oracle takes full advantage of Asynchronous I/O provided by AIX, resulting in faster database access.

on AIX 6.1: ioo is handling aio (ioo -a)

mkdev -l aio0                 enables the AIO device driver (smitty aio)
ioo -a   shows the value of minservers, maxservers...(or lsattr -El aio0)
chdev -l aio0 -a maxservers='30'    changes the maxservers value to 30 (it will show the new value, but it will be active only after reboot)
ps -k    | grep aio | wc -l   shows how many aio servers are running
                              (these are not necessarily in use, maybe many of them are just hanging there)
pstat -a            shows the asynchronous I/O servers by name

iostat -AQ 2 2                it will show if any aio is in use by filesystems
iostat -AQ 1 | grep -v "            0              "    it will omit the empty lines
                          it shows which filesystems are active in regard to aio.
                          the count column shows how much aio each filesystem requested
                      (it is good for seeing which fs is aio intensive)

root@aix10: /root # ps -kf | grep aio        <--it will show the accumulated CPU time of each aio process
    root 127176       1   0   Mar 24      - 15:07 aioserver
    root 131156       1   0   Mar 24      - 14:40 aioserver
    root 139366       1   0   Mar 24      - 14:51 aioserver
    root 151650       1   0   Mar 24      - 14:02 aioserver

It is good to compare the times of these processes to see whether more aioservers are needed. If the times are nearly identical (only a few minutes' difference), it means all of them are used maximally, so more processes are needed.
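
The comparison can be done with a little awk instead of by eye. This is only a sketch: aio_time_spread is a made-up name, and it assumes the "ps -kf" layout shown above, with the accumulated time (MM:SS or HH:MM:SS) in the second to last column and the process name "aioserver" in the last one.

```shell
# Hypothetical helper: read "ps -kf" output on stdin and report the smallest and
# largest accumulated CPU time (in seconds) among the aioserver kprocs.
aio_time_spread() {
    awk '$NF == "aioserver" {
        n = split($(NF-1), p, ":")          # TIME column, e.g. 15:07
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + p[i]
        if (count == 0 || secs < min) min = secs
        if (count == 0 || secs > max) max = secs
        count++
    }
    END {
        if (count > 0)
            printf "servers=%d min=%ds max=%ds spread=%ds\n", count, min, max, max - min
    }'
}

# Example: ps -kf | aio_time_spread
```

A small spread compared to the totals would be the "nearly identical" case described above.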

------------------------------------

iostat -A                   reports back asynchronous I/O statistics

aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait
     10.2 0.0     5     0    4096            20.6   4.5   64.7     10.3

avgc: This reports back the average global asynchronous I/O request per second of the interval you specified.
avfc: This reports back the average fastpath request count per second for your interval.

------------------------------------

Changing aio parameters:
You can set the values online, with no interruption of service, BUT they will not take effect until the next time the kernel is booted.

1. lsattr -El aio0                <-- check current setting for aio0 device
2. chdev -l aio0 -a maxreqs=<value> -P     <-- set the value of maxreqs permanently for next reboot
3. restart server

------------------------------------

0509-036 Cannot load program aioo because...

if you receive this:
root@aix30: / # aioo -a
exec(): 0509-036 Cannot load program aioo because of the following errors:
        0509-130 Symbol resolution failed for aioo because:
    ....

probably aioserver is in defined state:

root@aix30: / # lsattr -El aio0
autoconfig defined STATE to be configured at system restart True

You should make it available with: mkdev -l aio0
(and also change it for future restarts: chdev -l aio0 -a autoconfig=available, or with 'smitty aio')

------------------------------------

DIRECT I/O (DIO)

Direct I/O is an alternative non-caching policy which causes file data to be transferred directly to the disk from the application or directly from the disk to the application without going through the VMM file cache.

Direct I/O reads cause synchronous reads from the disk, whereas with the normal cached policy the reads may be satisfied from the cache. This can result in poor performance if the data was likely to be in memory under the normal caching policy.

Direct I/O can be enabled: mount -o dio

If JFS2 DIO or CIO options are active, no filesystem cache is being used for Oracle .dbf and/or online redo logs files.

Databases normally manage data caching at application level, so they do not need the filesystem to implement this service for them. The use of the file buffer cache results in undesirable overhead, since data is first moved from the disk to the file buffer cache and from there to the application buffer. This "double-copying" of data results in additional CPU and memory consumption.

JFS2 supports DIO as well as CIO. The CIO model is built on top of DIO. For JFS2 based environments, CIO should always be used (instead of DIO) for those situations where the bypass of filesystem cache is appropriate.

JFS DIO should only be used:
On Oracle data (.dbf) files, where DB_BLOCK_SIZE is 4k or greater. (Use of JFS DIO on any other files (e.g. redo logs, control files) is likely to result in a severe performance penalty.)

------------------------------

CONCURRENT I/O (CIO)

The inode lock imposes write serialization at the file level. JFS2 (by default) employs serialization mechanisms to ensure the integrity of data being updated. An inode lock is used to ensure that there is at most one outstanding write I/O to a file at any point in time; while a write is outstanding, reads are not allowed because they may result in reading stale data.
Oracle implements its own I/O serialization mechanisms to ensure data integrity, so JFS2 offers the Concurrent I/O option. Under CIO, multiple threads may simultaneously perform reads and writes on a shared file. Applications that do not enforce serialization should not use CIO (data corruption or performance issues can occur).

CIO invokes direct I/O, so it has all the other performance considerations associated with direct I/O. With standard direct I/O, inodes are locked to prevent a condition where multiple threads might try to change the contents of a file simultaneously. Concurrent I/O bypasses the inode lock, which allows multiple threads to read and write data concurrently to the same file.

CIO includes the performance benefits previously available with DIO, plus the elimination of the contention on the inode lock.

Concurrent I/O should only be used:
On Oracle .dbf files, online redo logs and/or control files.

When used for online redo logs or control files, these files should be isolated in their own JFS2 filesystems, with agblksize=512.
Filesystems containing .dbf files, should be created with:
    -agblksize=2048 if DB_BLOCK_SIZE=2k
    -agblksize=4096 if DB_BLOCK_SIZE>=4k

(Failure to implement these agblksize values is likely to result in a severe performance penalty.)
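
The agblksize choices above can be written down as a small lookup. This is just a sketch of the rules as stated in these notes; agblksize_for and its "dbf"/"redo" arguments are hypothetical names, and DB_BLOCK_SIZE is expected in bytes.

```shell
# Hypothetical helper: print the agblksize suggested above for a JFS2 filesystem.
#   agblksize_for redo          -> online redo logs or control files
#   agblksize_for dbf <bytes>   -> .dbf files, depending on DB_BLOCK_SIZE
agblksize_for() {
    kind=$1
    db_block_size=${2:-0}
    case $kind in
        redo) echo 512 ;;
        dbf)
            if [ "$db_block_size" -ge 4096 ]; then
                echo 4096        # DB_BLOCK_SIZE >= 4k
            else
                echo 2048        # DB_BLOCK_SIZE = 2k
            fi ;;
        *) echo "unknown kind: $kind" >&2; return 1 ;;
    esac
}

# Example: agblksize_for dbf 8192
```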

Do not, under any circumstances, use the CIO mount option for the filesystem containing the Oracle binaries (!!!).
Additionally, do not use DIO/CIO options for filesystems containing archive logs or any other files not discussed here.

Applications that use raw logical volumes for data storage don't encounter inode lock contention since they don't access files.

fsfastpath should be enabled to initiate aio requests directly to LVM or disk, for maximum performance (aioo -a)
------------------------------

RAW I/O

When using raw devices with Oracle, the devices are either raw logical volumes or raw disks. When using raw disks, the LVM layer is bypassed. The use of raw lv's is recommended for Oracle data files, unless ASM is used. ASM has the capability to create data files, which do not need to be mapped directly to disks. With ASM, using raw disks is preferred.

HW - DUMP

Dump - Core

AIX generates a system dump when a severe error occurs. A system dump creates a picture of the system's memory contents. If the AIX kernel crashes, kernel data is written to the primary dump device. After a kernel crash AIX must be rebooted. During the next boot, the dump is copied into a dump directory (default is /var/adm/ras). The dump file name is vmcore.x (x indicates a number, e.g. vmcore.0).

When installing the operating system, the dump device is automatically configured. By default, the primary device is /dev/hd6, which is a paging logical volume, and the secondary device is /dev/sysdumpnull.

A rule of thumb is when a dump is created, it is about 1/4 of the size of real memory. The command "sysdumpdev -e" will also provide an estimate of the dump space needed for your machine. (Estimation can differ at times with high load, as kernel space is higher at that time.)

When a system dump is occurring, the dump image is not written to disk in mirrored form. A dump to a mirrored lv results in an inconsistent dump and therefore, should be avoided. The logic behind this fact is that if the mirroring code itself were the cause of the system crash, then trusting the same code to handle the mirrored write would be pointless. Thus, mirroring a dump device is a waste of resources and is not recommended.

Since the default dump device is the primary paging lv, you should create a separate dump lv if you mirror your paging lv (which is suggested). If a valid secondary dump device exists and the primary dump device cannot be reached, the secondary dump device will accept the dump information intended for the primary dump device.

IBM recommendation:
All I can recommend you is to force a dump the next time the problem should occur. This will enable us to check which process was hanging or what caused the system to not respond any more. You can do this via the HMC using the following steps:
Operations -> Restart -> Dump
As a general recommendation you should always force a dump if a system is hanging. There are only very few cases in which we can determine the reason for a hanging system without having a dump available for analysis.

-------------------------------------------

Traditional vs Firmware-assisted dump:

Up to POWER5 only traditional dumps were available; the introduction of POWER6 processor-based systems allowed system dumps to be firmware-assisted. When performing a firmware-assisted dump, system memory is frozen and the partition rebooted, which allows a new instance of the operating system to complete the dump.

Traditional dump: it is generated before the partition is rebooted.
(When the system crashes, AIX tries to copy the memory contents to the dump device at that moment.)

Firmware-assisted dump: it takes place when the partition is restarting.
(When the system crashes, memory is frozen, the hypervisor (firmware) allocates a new memory area in RAM, and the contents of memory are copied there. Then, during reboot, they are copied from this new memory area to the dump device.)

Firmware-assisted dump offers improved reliability over the traditional dump, by rebooting the partition and using a new kernel to dump data from the previous kernel crash.

When an administrator attempts to switch from a traditional to firmware-assisted system dump, system memory is checked against the firmware-assisted system dump memory requirements. If these memory requirements are not met, then the "sysdumpdev -t" command output reports the required minimum system memory to allow for firmware-assisted dump to be configured. Changing from traditional to firmware-assisted dump requires a reboot of the partition for the dump changes to take effect.

Firmware-assisted system dumps can be one of these types:

Selective memory dump: Selective memory dumps are triggered by or use of AIX instances that must be dumped.
Full memory dump: The whole partition memory is dumped without any interaction with an AIX instance that is failing.

-------------------------------------------

Use the sysdumpdev command to query or change the primary or secondary dump devices.
    - Primary:    usually used when you wish to save the dump data
    - Secondary: can be used to discard dump data (that is, /dev/sysdumpnull)

Flags for sysdumpdev command:
    -l                list the current dump destination
    -e                estimates the size of the dump (in bytes)
    -p                primary
    -s                secondary
    -P                make change permanent
    -C                turns on compression
    -c                turns off compression
    -L                shows info about last dump
    -K                turns on: always allow system dump

sysdumpdev -P -p /dev/dumpdev    change the primary dumpdevice permanently to /dev/dumpdev

root@aix1: /root # sysdumpdev -l
primary              /dev/dumplv
secondary            /dev/sysdumpnull
copy directory       /var/adm/ras
forced copy flag     TRUE
always allow dump    TRUE      <--if it is FALSE then it can be changed in smitty sysdumpdev
dump compression     ON    <--if it is OFF then sysdumpdev -C changes it to ON (-c changes it to OFF)


Other commands:

sysdumpstart            starts a dump (smitty dump)(it will do a reboot as well)
kdb                   analyzes the dump
/usr/lib/ras/dumpcheck checks if dump device and copy directory are able to receive the system dump
If dump device is a paging space, it verifies if enough free space exists in the copy dir to hold the dump
If dump device is a logical volume, it verifies it is large enough to hold a dump
(man dumpcheck)

-------------------------------------------
SNAP:

snap
   -a               copies all system config. information to /tmp/ibmsupt directory tree
   -c               creates a compressed pax image (snap.pax.Z) of all files in the /tmp/ibmsupt directory tree
   -g               gather general information

   -e                for HACMP, it runs clverification and gathers the data creating a snap

1. snap -r            <--removes old snap from /tmp/ibmsupt
2. snap -gc       <--creates a new snap file

Reading a compressed snap file:
1. snap -ac                   <--creates a compressed snap file (/tmp/ibmsupt/snap.pax.Z)
2. uncompress snap.pax.Z       <--uncompresses it, we will have a snap.pax file
3. pax -rvf snap.pax           <--unpacks the files, after which they can be read

-------------------------------------------

Creating a dump device

1. sysdumpdev -e                          <--shows an estimation, how much space is required for a dump
2. mklv -t sysdump -y lg_dumplv rootvg 3 hdisk0    <--it creates a sysdump lv with 3 PPs
3. sysdumpdev -Pp /dev/lg_dumplv            <--makes it the primary device (the system will now use this lv for dumps)
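
The "3 PPs" in step 2 depends on the estimate and on the PP size of rootvg. A rough way to derive it is sketched below. dump_pps_needed is a hypothetical helper; it takes the estimated dump size in bytes (the number reported by sysdumpdev -e) and the PP size in MB, both typed in by hand, and rounds up.

```shell
# Hypothetical helper: how many physical partitions are needed to hold a dump.
#   $1 = estimated dump size in bytes, $2 = PP size in MB
dump_pps_needed() {
    bytes=$1
    pp_bytes=$(( $2 * 1024 * 1024 ))
    echo $(( (bytes + pp_bytes - 1) / pp_bytes ))   # integer division, rounded up
}

# Example: dump_pps_needed 700000000 256
```

With an estimate of 700000000 bytes and 256MB PPs this gives 3. Since the estimate can be higher under load, adding some headroom is probably wise.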

-------------------------------------------

System dump initiated by a user

!!!reboot will take place automatically!!!

1. sysdumpstart -p        <--initiates a dump to the primary device
(Reboot will be done automatically)
(If a dedicated dump device is used, user initiated dumps are not copied automatically to copy directory.)
(If paging space is used for dump, then dump will be copied automatically to /var/adm/ras)
2. sysdumpdev -L           <--shows dump took place on the primary device, time, size ... (errpt will show as well)
3. savecore -d /var/adm/ras    <--copy last dump from system dump device to directory /var/adm/ras (if paging space is used this is not needed)

-------------------------------------------

How to move dumplv to another disk:

We want to move from hdisk1 to hdisk0:

1. lslv -l dumplv                      <--checking which disk
2. sysdumpdev -l                   <--checking sysdump device (primary was here /dev/dumplv)
3. sysdumpdev -Pp /dev/sysdumpnull   <--changing primary to sysdumpnull (secondary, it is a null device) (lsvg -l rootvg shows closed state)
4. migratepv -l dumplv hdisk1 hdisk0 <--moving it from hdisk1 to hdisk0
5. sysdumpdev -Pp /dev/dumplv    <--changing back to the primary device

-------------------------------------------
The largest dump device is too small: (LABEL: DMPCHK_TOOSMALL IDENT)

1. Dumpcheck runs from crontab
# crontab -l | grep dump
0 15 * * * /usr/lib/ras/dumpcheck >/dev/null 2>&1

2. Check if there are any errors:
# errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
E87EF1BE 0703150008 P O dumpcheck The largest dump device is too small.
E87EF1BE 0702150008 P O dumpcheck The largest dump device is too small.

3. If you find new error message, find dumplv:
# sysdumpdev -l
primary /dev/dumplv
secondary /dev/sysdumpnull
copy directory /var/adm/ras
forced copy flag TRUE
always allow dump TRUE
dump compression ON

List dumplv from rootvg:

# lsvg -l rootvg|grep dumplv
dumplv dump 8 8 1 open/syncd N/A

4. Extend with 1 PP
# extendlv dumplv 1
dumplv dump 9 9 1 open/syncd N/A

Run problem check at the end
OK -> done
Not OK -> Extend with 1 PP again.
-------------------------------------------
changing the autorestart attribute of the systemdump:
(smitty chgsys as well)

1. lsattr -El sys0 -a autorestart
autorestart true Automatically REBOOT system after a crash True

2. chdev -l sys0 -a autorestart=false
sys0 changed

----------------------------------------------

CORE FILE:

errpt shows which program; if not:
- use the "strings" command (for example: "strings core | grep _=")
- or the lquerypv command (for example: "lquerypv -h core 6b0 64")

man syscorepath
syscorepath -p /tmp
syscorepath -g

AIX - CRONTAB

CRONTAB:

The cron daemon, which translates to Chronological Data Event Monitor, is a program that schedules jobs to run automatically at a specific time and date. The /etc/inittab file contains all the AIX startup programs, including the cron daemon. The init process in AIX starts the cron daemon, or cron, from the inittab file during the initialization process of the operating system.

You can submit jobs, or events, to cron by doing one of the following:
Use the at and batch facilities to submit jobs for one-time execution.
Use the crontab files to execute jobs at regularly scheduled intervals (hourly, daily, weekly, and so on).

By default, cron can concurrently run 100 events of equal importance. The /usr/adm/cron/queuedefs file allows you to change this schedule.

c.200j10n120w
| |    |   |
| |    |   wait period (in seconds)
| |    nice value
| jobs
cron
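
The fields of such a queuedefs entry can be pulled apart with sed. This is only a sketch: parse_queuedef is a made-up helper, and it only handles the full form shown above (queue letter, jobs, nice value and wait period all present).

```shell
# Hypothetical helper: split a queuedefs entry such as c.200j10n120w into its fields.
parse_queuedef() {
    printf '%s\n' "$1" |
        sed -n 's/^\([a-z]\)\.\([0-9]*\)j\([0-9]*\)n\([0-9]*\)w$/queue=\1 jobs=\2 nice=\3 wait=\4/p'
}

# Example: parse_queuedef c.200j10n120w
```

For the entry above it prints queue=c jobs=200 nice=10 wait=120. Lines that do not match the full form produce no output.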

At regularly scheduled intervals, cron looks for and reads the crontab files that are located in the directory /var/spool/cron/crontabs.

These files contain jobs submitted by users. For example, the file /var/spool/cron/crontabs/john contains John's jobs that are scheduled to be run by cron. The files in this directory are named for the individual users. When changes are made to the files, the cron daemon must be notified to reread the files. If you open the crontab with "crontab -e", the cron daemon is notified when you save and quit, and it re-reads that file.

/var/spool/cron/crontabs users' crontab files are located here
/var/adm/cron/log        cron daemon creates a log of its activities
/var/adm/cron/cron.deny    Any user can use cron except those listed in this file
/var/adm/cron/cron.allow   Only users listed in this file can use cron (root user included)

crontab -l                 Lists the contents of your current crontab file
crontab -e                 Edits your current crontab file (when the file saved, the cron daemon is automatically refreshed.)
crontab -r                 Removes your crontab file from the crontab directory
crontab -v               check crontab submission time
crontab mycronfile         submit your crontab file to /var/spool/cron/crontabs directory

crontab file format:
minute    hour    day_of_month    month        weekday        command
0-59    0-23    1-31            1-12        0-6 Sun-Sat     shell command

* * * * * /bin/script.sh        schedule a job to run every minute
0 1 15 * * /fullbackup    1 am on the 15th of every month
0 0 * * 1-5 /usr/sbin/backup    start the backup command at midnight, Mo - Fr
0,15,30,45 6-17 * * 1-5 /home/script1                               execute script1 every 15 minutes between 6AM and 5PM, Mo - Fr
0 1 1 * * find /tmp -name 'TRACE*' -mtime +270 -exec rm {} \\; >/dev/null 2>&1 it will delete files older than 9 months
                                            (\\; <-- the double "\" is needed so that ";" is interpreted correctly)
----------------------------

Crontab: Adding and Removing lines with script:

To add a line to cron (in this example "0 1 * * * /tmp/test.sh >> /tmp/test.log")
crontab -l | awk '{print} END {print "0 1 * * * /tmp/test.sh >> /tmp/test.log"}' | crontab

To remove a line from cron (in this example any lines that match "/tmp/test.sh >> /tmp/test.log")
crontab -l | sed '\!/tmp/test.sh >> /tmp/test.log!d' | crontab
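
Running the "add" one-liner twice puts the same entry in twice. A slightly more careful variant is sketched below. cron_add_line and cron_remove_line are hypothetical helper names; both work as filters on stdin, so they fit into the same "crontab -l | ... | crontab" pipe as above. They assume the entry contains no backslashes, because awk -v would interpret them as escapes.

```shell
# Hypothetical helper: pass the crontab through and append the line only if it
# is not already present (exact, whole-line match).
cron_add_line() {
    awk -v line="$1" '{ print } $0 == line { found = 1 } END { if (!found) print line }'
}

# Hypothetical helper: drop every line that contains the given text
# (plain substring match, no regular expression).
cron_remove_line() {
    awk -v pat="$1" 'index($0, pat) == 0'
}

# Example:
#   crontab -l | cron_add_line "0 1 * * * /tmp/test.sh >> /tmp/test.log" | crontab
#   crontab -l | cron_remove_line "/tmp/test.sh >> /tmp/test.log" | crontab
```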

----------------------------

AT:

at                          submits a job for cron to run at a specific time in the future
                      (at -f /home/root/bb_at -t 2007122503)
echo "<command>" | at now   this starts in the background (and you can log off)

at now +2 mins
banner hello > /dev/pts/0
<ctrl-d>                    (at now + 1 minute,at 5 pm Friday )

/var/adm/cron/at.deny     allows any users except those listed in this file to use the at command.
/var/adm/cron/at.allow      allows only those users listed in this file to use the at command (including root).
at -l                   Lists at jobs
atq [user]                  Views other user's jobs (Only root can use this command.)
at -r                       Cancels an at job
atrm job                    Cancels an at job by job number
atrm user                   Cancels an at job by the user (root can use it for any user; users can cancel their jobs.)
atrm                    Cancels all at jobs belonging to the user invoking the atrm command

batch                   submits a job to be run in the background when the processor load is low

AIX - Backup

Backup - Restore

Rootvg backup (mksysb)

The mksysb command creates a backup of the rootvg, which can be used to reinstall the system or restore it to another system. It backs up only the rootvg and only those filesystems which are mounted (nfs mounted filesystems are not being backed up.) The created backup file can be restored from NIM server or the mksysb file can be converted to bootable DVD images using mkdvd command.

----------------------------------------

/image.data

If mksysb is started with the -i flag, it calls the mkszfile command first, which collects information about logical volumes, filesystems etc. All this information is written to the /image.data file. After that the mksysb command backs up only the file systems specified in the /image.data file. (The saved information allows the bosinstall routine to recreate the logical volume information during restore as it existed before the backup.) This flag is highly recommended; otherwise an already existing image.data file may be used, which may not reflect the actual state of the system.

It is also possible to manually change the content of /image.data. After running the mkszfile command, we can edit the created /image.data file to customize how the mksysb will be restored. For example we can change the PP size of rootvg from 256MB to 128MB, so after restore rootvg will have a 128MB PP size. If we customize an already created /image.data file, the -i flag should not be used with the mksysb command, because it would regenerate the file.

# cat /image.data
image_data:
IMAGE_TYPE= bff
DATE_TIME= Sat Nov 2 13:15:22 CET 2019
UNAME_INFO= AIX my-aixserv 2 7 00C5A0D04B00
PRODUCT_TAPE= no
USERVG_LIST=
PLATFORM= chrp
OSLEVEL= 7.2.3.15
OSLEVEL_R= 7200-03
OSLEVEL_S= 7200-03-02-1845
CPU_ID= 00C4C2D05A00
LPAR_ID= 10

vg_data:
VGNAME= rootvg
PPSIZE= 256 <-- this can be changed
VARYON= yes
VG_SOURCE_DISK_LIST= hdisk0
QUORUM= 2
ENH_CONC_CAPABLE= no
CONC_AUTO= no
BIGVG= no
TFACTOR= 1
CRITVG= no

----------------------------------------

/etc/exclude.rootvg

It is also possible to exclude files and directories from the backup. To do so, create the /etc/exclude.rootvg file with a list of patterns for the files and directories to exclude. If the mksysb command is started with the -e flag, /etc/exclude.rootvg is applied during mksysb creation.

for example:
^./tmp/     backs up /tmp itself, but not its contents (at restore an empty /tmp will be created)
            (the leading "^." anchors the pattern at /, so other directories which have /tmp in the pathname are not excluded)
^./tmp      does not back up /tmp at all (at restore an empty /tmp won't be created)
/temp/      excludes the contents of every /temp/ directory on the system (there is no "^." to make the pattern start from /)
old$        $ indicates that the match should end at the end of the line

# cat /etc/exclude.rootvg
^./tmp/
^./opt/jenkins/
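
To get a feeling for how these patterns behave, they can be tried with grep against a few sample paths. This is only a sketch: it assumes that mksysb matches the patterns grep-style against path names written relative to "./" (as the "^./" examples above suggest), and the helper name would_exclude is made up for this example.

```shell
# Hypothetical helper: print those paths (one per line on stdin, written as
# ./path) that the given pattern matches, i.e. that would be excluded.
would_exclude() {
    # grep returns 1 when nothing matches; treat that as success
    grep -e "$1" || true
}

# Only the contents of /tmp match; /tmp itself and /var/tmp do not.
printf '%s\n' ./tmp ./tmp/a.log ./var/tmp/b.log ./tmpfiles/c | would_exclude '^./tmp/'

# Without the trailing slash, /tmp itself matches, but so does /tmpfiles.
printf '%s\n' ./tmp ./tmp/a.log ./var/tmp/b.log ./tmpfiles/c | would_exclude '^./tmp'
```

The second call shows why the trailing slash matters: a pattern like ^./tmp also catches every other top-level name that merely starts with "tmp".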

----------------------------------------

mksysb -i /mnt/aixserv.mksysb                       creates an mksysb at the given location (usually an NFS mount)
mksysb -ie /mnt/aixserv.mksysb                      creates an mksysb using the exclude file

lsmksysb -lf /mnt/my_mksysb                         lists the header of the mksysb (VG, size, oslevel, lv info)
lsmksysb -f /mnt/my_mksysb                          lists all the files in the mksysb
lsmksysb -f /mnt/my_mksysb -r ./etc/filesystems     restores the /etc/filesystems file

The restorevgfiles and listvgbackup -r commands perform identical operations and should be considered interchangeable.
listvgbackup -f /mnt/aix11.mksysb -r /etc/resolv.conf     restores the file /etc/resolv.conf from the specified backup

----------------------------------------

Full mksysb backup to 1 single large ISO image

If we want a bootable ISO image of an LPAR in 1 large ISO file, unfortunately it is not so easy to accomplish. With mkdvd it is possible to create mksysb images in DVD format, but the maximum size of a volume is 4.7GB, and there is no way to stop mkdvd from cutting a larger mksysb file into DVD-sized volumes.

A workaround is to create a virtual optical device on the VIO server, assign it to the LPAR, and on the LPAR side write the mksysb directly to /dev/cd0.
(This method uses the UDF format by default, so the 4.7GB DVD size limit does not apply; the image can be for example 20GB or more.)

1. df -tk `lsvgfs rootvg` | awk '{total+=$3} END {printf "Estimated mksysb size: %d bytes, %.2f GB\n", total*1024, total/1024/1024}'
                                                      on VIO client: estimate the size of the mksysb

2. mkvdev -fbo -vadapter vhostX                       on VIOS: create the vtopt device
3. mkvopt -name aixserv_mksysb.udf -size 10G          create the ISO file with the size estimated before
4. loadopt -vtd vtoptX -disk aixserv_mksysb.udf       load the ISO file into the vtopt device

5. vi /etc/exclude.rootvg: ^./opt/jenkins/            on VIO client: update /etc/exclude.rootvg with the necessary entries
6. cfgmgr                                             bring up the cd device on the VIO client (assigned previously from VIOS, lsdev will show it)
7. mksysb -ie /dev/cd0                                start the backup directly to /dev/cd0
8. rmdev -dl cd0                                      delete cd0

9. unloadopt -vtd vtoptX                              on VIOS: unload the ISO image
10. rmvdev -vtd vtoptX                                remove the vtopt device
11. mv /var/vio/VMLibrary/aixserv_mksysb.udf /mnt     save the ISO to the NFS mount
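
The size given to mkvopt in step 3 has to be a whole number of GB, so the estimate from step 1 can be rounded up directly. A small sketch of that, assuming df output with a single header line and the used space in KB in the third column (as in the awk command of step 1); the function name used_gb_rounded_up is made up for this example, and no extra headroom is added.

```shell
# Hypothetical helper: read "df -tk" style output on stdin (header line first,
# used KB in column 3) and print the total used space rounded up to whole GB.
used_gb_rounded_up() {
    awk 'NR > 1 { total += $3 }
         END {
             gb = total / 1024 / 1024
             whole = int(gb)
             if (gb > whole) whole++
             print whole
         }'
}

# Intended use on the VIO client (not executed here):
#   df -tk `lsvgfs rootvg` | used_gb_rounded_up
```

The printed number can then be used as the -size value (with a G suffix) for mkvopt, possibly with some GB added on top to be safe.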

----------------------------------------

mkdvd

The mkdvd command creates multi-volume DVDs from different backup formats: mksysb, savevg, savewpar. For example regarding mksysb, it can create an mksysb image from the rootvg and then convert it to bootable DVD images or it can use a previously created mksysb image which will be converted to DVD images. The same way it can also create multi-volume DVDs from a savevg, or savewpar backup image. By default the created DVD images can be used to boot up an LPAR and then restore its content (similar to an mksysb restore). Each created volume has a maximum size of 4.7GB, so if an mksysb is greater than this size, it will be cut into multiple volumes.

-m mksysbimage     Specifies the location of a previously created mksysb image which should be converted to a DVD image
-M mksysbtarget    If the -m flag is not specified, the mkdvd command calls mksysb, and the created backup is stored in this location
                   (if no location is specified, the /mkcd/mksysbimage filesystem is created, where the mksysb is temporarily stored)
-S                 Stops the mkdvd command before it burns the DVD images, without removing the image files
                   (by default mkdvd removes the images, but in this case the images remain in the directory given by the -I flag, or in the /mkcd/cd_images directory)

mkdvd -S -M /mnt -I /mnt                                           creates an mksysb backup in bootable ISO image format
                                                                   (-R should be the same as -S, -e exclude file could be used)
mkdvd -R -S -m /mnt/aixserv2/aixserv2.mksysb -I /mnt/aixserv2      creates an ISO image from the specified mksysb file

----------------------------------------

device /dev/cd0 does not appear to be ready

root@aix-mgmt:/mgmt # mkdvd -m /mgmt/mksysb_5439724 -U -d /dev/cd0
Initializing mkdvd log: /var/adm/ras/mkcd.log...
Verifying command parameters...
0512-332 mkdvd: Device /dev/cd0 does not appear to be ready. For
information about possible causes, see /usr/lpp/bos.sysmgt/mkcd.README.txt
Continuing...
The device is not ready for operation.
check_cd_ready[41]: /dev/cd0: 0403-016 Cannot find or open the file.
0512-399 mkdvd: Unable to create UDF media.
Cleaning up...

Solution:
loadopt had not been run on the VIO server, so we tried to "burn a cd" while the "cd tray" was empty.

----------------------------------------

0512-399 mkdvd: Unable to create UDF media

mkdvd -m /home/labuser/mksysb_5439724 -U -d /dev/cd0
Initializing mkdvd log: /var/adm/ras/mkcd.log...
Verifying command parameters...
0512-399 mkdvd: Unable to create UDF media.
Cleaning up...

The solution was to run the udfcreate command below; after that mkdvd was successful:
# udfcreate -d /dev/cd0

----------------------------------------

/usr/sbin/bosboot[6]: dirname: not found

root@ls-aix-test: / # mkdvd -m /home/labuser/mksysb_5439724 -U -d /dev/cd0
Initializing mkdvd log: /var/adm/ras/mkcd.log...
Verifying command parameters...
Populating the CD or DVD file system...
Building chrp boot image...
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/usr/sbin/bosboot[6]: dirname: not found
/tmp/bosboot_14024764_15499/usr/lib/drivers/cfs.ext: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/planar_pal_chrp: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/pci/sisraid_dd: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/udfs.ext: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/scdisk: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/vdev_busdd: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/vscsi_initdd: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/scsidiskpin: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/scsidisk: No such file or directory
/tmp/bosboot_14024764_15499/usr/lib/drivers/scdiskpin: No such file or directory
Filesystem Helper: Implementation-specific error, code = (112)

A detailed solution is at this link:
https://www.ibm.com/developerworks/community/forums/html/topic?id=d06862fa-1566-4c34-9575-d74147c4d904

In general, the "coreutils" rpm package overwrites the AIX "dirname" command.

2 steps will be needed:
- install the latest coreutils (in newer coreutils this bug has been fixed, but that does not repair a system where dirname has already been overwritten)
- from another AIX (same oslevel) where coreutils is not installed, copy over the dirname binary from /usr/bin/dirname

----------------------------------------

devices.scsi.disk.diag.com

Cannot find file system or does not match filter: ramdisk0
Populating the CD or DVD file system...
0512-323 mkdvd: The following files are required for the
creation of the CD or DVD image and are not available on the source system:
/usr/lpp/diagnostics/bin/uformat devices.scsi.disk.diag.com

The files can be installed from the listed filesets.
0512-321 mkdvd: Error populating the CD or DVD file system
using the /usr/lpp/bosinst/cdfs.required.list proto file.
Cleaning up...

The solution was to install devices.scsi.disk.diag from the devices.scsi.disk fileset below (the other one is a prerequisite):

-rw-r--r-- 1 root system 2651 Oct 14 07:31 .toc
-rw-r--r-- 1 root system 3171328 Oct 14 07:30 devices.pci.14107802
-rw-r--r-- 1 root system 1426432 Oct 14 07:30 devices.scsi.disk

# installp -apgXYd . devices.scsi.disk

----------------------------------------

The earlier general idea was to convert the mksysb to UDF (DVD) format, which has no 4.7 GB maximum size limit. Unfortunately this does not work: mkdvd always cuts an image into pieces.
The plan was that once we have this image, we can load it on the VIO server and boot and restore from there.

Backup:

1. Take a normal mksysb on an NFS share
- mount nfs share
- mksysb -i /mnt/ls-aix-j11b720.mksysb

2. On VIO a virtual optical device has been assigned to aix-mgmt
- (if needed, on VIO create a virtual optical device (vtopt) for aix-mgmt) mkvdev -fbo -vadapter vhost1
- (on VIO create the image file where the mksysb will be written) mkvopt -name jdk_iso -size 60G
- load the virtual optical file: loadopt -disk jdk_iso -vtd vtopt0

3. On aix-mgmt convert/copy the mksysb to UDF file format on /dev/cd0
- mkdvd -U… will use UDF format: mkdvd -m /home/labuser/mksysb_5439724 -U -d /dev/cd0

----------------------------------------

Volume Group Backup (savevg, restvg)

savevg -f /bckfs/backup.0725 bckvg      backs up all files belonging to bckvg to the specified file
restvg -f /bckfs/backup.0725            restores the vg and all the files that were saved with savevg
                                        (it creates the vg, lv, fs...)

----------------------------------------

Filesystem backup (bckfs)

find /bckfs -print | backup -i -f /dev/rmt0      backs up all the files and subdirs
find: generates a list of all the files
-i: the list of files is read from standard input

restore extracts files from archives created with the backup command.

JFS2 snapshot:
creates a point in time image, it is very quick and very small

----------------------------------------

POWERVM - COMMANDS

Commands

PowerVM Editions:
http://www-912.ibm.com/pod/pod

Under the VET code:
C2DBF2AD8D3427F6CA1F00002C20004110

-Express      0000
-Standard   2C00
-Enterprise   2C20
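
In the sample VET code above the edition digits (2C20) are characters 25-28 of the string. Assuming they are always at that position (which is only a guess based on this one example), the edition can be read out with a small function like the one below; the name vet_edition is made up for this example.

```shell
# Hypothetical helper; ASSUMPTION: the edition digits are characters 25-28
# of the VET code, as in the sample code shown above.
vet_edition() {
    code=$(printf '%s' "$1" | cut -c25-28)
    case $code in
        0000) echo Express ;;
        2C00) echo Standard ;;
        2C20) echo Enterprise ;;
        *)    echo "unknown ($code)" ;;
    esac
}

vet_edition C2DBF2AD8D3427F6CA1F00002C20004110    # prints Enterprise
```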

-----------------

VIOS service package definitions

Fix Pack
A Fix Pack updates your VIOS release to the latest level. A Fix Pack update can contain product enhancements, new function and fixes.

Service Pack
A Service Pack applies to only one (the latest) VIOS level. A Service Pack contains critical fixes for issues found between Fix Pack releases. A Service Pack does not update the VIOS to a new level and it can only be applied to the Fix Pack release for which it is specified.

Interim Fix
An Interim Fix (iFix) applies to only one (the latest) VIOS level and provides a fix for a specific issue.
-----------------

Virtual I/O Server is a special partition that is not intended to run end-user applications, and should only provide login for system administrators. Virtual I/O Server allows the sharing of physical resources between supported AIX partitions to allow more efficient utilization and flexibility for using physical storage and network devices.

-----------------

User padmin:

The primary administrator on the VIO Server is the user padmin. It has a restricted shell (the user cannot change directory out of the home directory, /home/padmin) with VIOS commands.

The oem_setup_env command will place the padmin user in a non-restricted root shell with a home directory in the /home/padmin directory. The user can then run any command available to the root user. This is not a supported Virtual I/O Server administration method. The purpose of this command is to allow installation of vendor software, such as device drivers. (It is an environment to set up oem device drivers = oem_setup_env.)

By default the ioscli commands are not available for the root user. All ioscli commands are in fact calls of /usr/ios/cli/ioscli with the command as argument. (You see this if you list the aliases of the padmin user.)

You can use all ioscli commands as user root by prepending /usr/ios/cli/ioscli. (/usr/ios/cli/ioscli lsmap -all)

You can set an alias:
alias i=/usr/ios/cli/ioscli
i lsmap -all

Typing exit will return the user to the Virtual I/O Server prompt.

------------------------------------

/home/ios/logs/ioscli_global.trace shows history what was happening on the system (from commands point of view)

ioslevel               shows vio server level
installios               installs the Virtual I/O Server. This command is run from the HMC.
backupios                creates an installable image of the root volume group (saves almost everything)
viosbr                   creates backups from user defined virtual device configs (saves only the mappings, virt. devices..)
viosbr -view -file <file name>    displays the contents of a backup file (which was made before with viosbr)
savevgstruct             it will make a backup of a volume group structure

lsgcl                    shows history of which commands have been run on the vio server (gcl: global command log)

cfgdev                   devices are recognized after running cfgdev
cfgassist                on vio server as padmin brings up smitty style menu for doing several tasks
chkdev -dev hdisk4 -verbose        show if attached device can be migrated from physical adapter to virtual adapter (PHYS2VIRT_CAP.. -> yes)
chkdev -field NAME PHYS2VIRT_CAPABLE -fmt : show if disks can be used for vscsi (YES:yes, NO:no, NA:disk is already in use by VSCSI)

good overview of SEA sharing mode and adapters state:
entstat -all ent10 | grep -e " Priority" -e "Virtual Adapter" -e " State:" -e "High Availability Mode"

good overview of SEA and adapters VLAN id, MAC address, Link status:
entstat -all ent10 | grep -e "(ent" -e "Type:" -e "Address:" -e "Link Status:" -e "Link State:" -e "Switch" -e " ent"

lsdev                    lists all devices
lsdev -slots         lists I/O slot information for built-in adapters (those are not hot-pluggable but DLPAR capable)
lsdev -virtual           lists virtual devices
lsdev -type disk -virtual          lists virtual target devices (lists vscsi disks by the name what was given at mkvdev...-dev...)
lsdev -type adapter      same as lsdev -Cc adapter
lsdev -dev vhost0 -vpd    same as lscfg -vpl vhost0 in AIX (but on vio as padmin user lscfg does not work)
lsdev -dev ent4 -attr    shows attributes of the devices (same as lsattr -El ent4)

chdev -dev fscsi0 -attr fc_err_recov=fast_fail dyntrk=yes -perm    changes the attributes (same as chdev -l fscsi0 -a fc_err_recov...)

lsmap -all               lists all vscsi devices
lsmap -all -npiv           lists npiv adapters (with slot numbers (aka adapter ID))
lsmap -all -net    lists virtual ethernet adapters (with slot numbers (aka adapter ID))

lsmap -vadapter vhost0           shows infos about a specific vscsi adapter
lsmap -vadapter vfchost0 -npiv    shows infos about a specific npiv adapter
lsmap -vadapter ent11 -net       shows infos about a specific virtual ethernet device (with sea or physical devices will not work)

lsmap -all -field SVSA Physloc "Client Partition ID" VTD -fmt ":" lists vscsi details

lsmap -all -npiv -field Name Physloc ClntID ClntName Status "FC name" -fmt ":" lists npiv details

lsmap -all -vnic -field Name Physloc ClntID ClntName "Backing device" -fmt ":" lists vnic details

lsvg -lv rootvg     same as lsvg -l rootvg

lspv                   shows all available hdisk devices
lspv -free             shows hdisks which are free to be used as backing devices
lspv -size             shows hdisks with sizes

mkvdev -vdev hdiskX -vadapter vhostX -dev <vtd_name> create mapping between vhost and disk (vscsi disk assignment to vio client)
mkvdev -vlan ent9 -tagid 200       creates a vlan tagged interface over the ent9 interface (ent9 can be a SEA adapter)

rmvdev -vtd <vtd>                  removing connection between virt. target dev. (a physical dev. or an lv) and the virtual SCSI adapter
                       (vtd can be found in lsmap output) (rmvdev -vdev <backing dev.> also works)
rmdev                        removes or unconfigures a device (rmvdev command can be replaced by rmdev, and rmdev is more universal)
rmdev -dev vhostX -recursive -ucfg put in defined state vhost adapter and its child devices (lsdev -dev vhostX -child)
                   (-recursive: do actions on children as well, -ucfg: put only in defined state, do not delete device)
rmdev -pdev vhost13              deletes only the child devices of the given parent device (pdev) (vhostX will be still in available state)

vfcmap -vadapter vfchostX -fcp fcsX mapping the virtual FC adapter to the VIO's physical FC
vfcmap -vadapter vfchostX -fcp       remove mappings of given vfchost adapter

viosecure -firewall on -reload     enable firewall with default config (enables: https, http, rmc,ssh, ftp...)
viosecure -firewall view       display current firewall rules

license -swma                 once helped, when vio commands did not want to work
remote_management            this command enables VIOS to be remotely managed by a NIM master

/usr/ios/cli/ioscli ioslevel run vios commands as root
viosvrcmd -m MSname -p VIOS1 -c "ioslevel" run vios commands from HMC CLI
------------------------------------

Info about virtual devices:

root@bb_lpar: /root # lscfg -l vscsi0
vscsi0           U8204.E8A.0680E95-V3-C2-T1 Virtual SCSI Client Adapter

8204.E8A                   <--managed system type/model that contains this partition
0680E95                      <--serial number of the managed system
V3                       <--partition id (on HMC LPAR id)
C2                       <--slot number of this adapter (on HMC adapter id)
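
The same breakdown can be done in a script with plain shell parameter expansion. A minimal sketch, assuming the location code always has the U<type.model>.<serial>-V<lpar>-C<slot>-... layout shown above; the function name parse_physloc is made up for this example.

```shell
# Hypothetical helper: split a location code like U8204.E8A.0680E95-V3-C2-T1
parse_physloc() {
    loc=${1#U}              # drop the leading "U"
    unit=${loc%%-*}         # 8204.E8A.0680E95
    rest=${loc#*-}          # V3-C2-T1
    model=${unit%.*}        # type/model of the managed system
    serial=${unit##*.}      # serial number of the managed system
    lpar=${rest%%-*}        # V3
    rest=${rest#*-}         # C2-T1
    slot=${rest%%-*}        # C2
    echo "model=$model serial=$serial lpar_id=${lpar#V} slot=${slot#C}"
}

parse_physloc U8204.E8A.0680E95-V3-C2-T1
# prints: model=8204.E8A serial=0680E95 lpar_id=3 slot=2
```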

------------------------------------

mirroring vio server rootvg:
1. extendvg -f rootvg hdisk2        <--if pp limitation problem: chvg -factor 6 rootvg, then extendvg
2. mirrorios -defer hdisk2          <--mirror rootvg to hdisk2; -defer is used, as no need to reboot since VIOS 1.5 :)
                                (use the -f only if required, which will do a reboot without prompting you to continue)
3. bootlist -mode normal -ls        <--checking bootlist

------------------------------------

Debugging VIO problems:

I. truss:

$ oem_setup_env
# truss /usr/ios/cli/ioscli <failing_padmin_command> (or truss -feal /usr/ios/cli/ioscli <failing_padmin_command>)

---------

II. CLI_DEBUG=33

By exporting CLI_DEBUG=33, we can see which AIX command is used in the background of VIO command. After running that AIX command we can get more info.

For example:
1. mkvdev -vdev hdiskpower1 -vadapter vhost0 -dev testdisk       <--running this command does not give enough info about the problem
*******************************************************************************
The command's response was not recognized. This may or may not indicate a problem.
*******************************************************************************

2. export CLI_DEBUG=33                                  <--exporting CLI_DEBUG=33

3. mkvdev -vdev hdiskpower1 -vadapter vhost0 -dev testdisk        <--running again, we can see which AIX command will be invoked
AIX: "lspv -l hdiskpower1 2>&1 | grep 0516-320"
AIX: "export LANG=C;/usr/sbin/pooladm -I pool querydisk /dev/hdiskpower1"
AIX: "/usr/sbin/lquerypv -V hdiskpower1"
AIX: "mkdev -V hdiskpower1 -p vhost0 -l testdisk "           <--this command will be needed
*******************************************************************************
The command's response was not recognized. This may or may not indicate a problem.
*******************************************************************************

4. oem_setup_env                                   <--enabling root environment
5. mkdev -V hdiskpower1 -p vhost0 -l testdisk        <--running AIX command and it shows more detailed info
Method error (/usr/lib/methods/cfg_vt_scdisk):
        0514-012 Cannot open a file or device.

In this case the cause was that I had forgotten to set no_reserve for the given disk on the other VIOS. The other VIOS was already using this disk (with a reservation), so the disk was locked and could not be configured here.
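
If a VIO command calls many AIX commands in the background, the AIX: "..." lines can be filtered out of the debug output. A small sketch, assuming the debug lines look exactly like in the example above; the function name aix_commands is made up for this example.

```shell
# Hypothetical helper: read ioscli debug output on stdin and print only the
# AIX commands, without the AIX: prefix and the surrounding quotes.
aix_commands() {
    sed -n 's/^AIX: "\(.*\)"[[:space:]]*$/\1/p'
}

# Intended use as padmin with CLI_DEBUG=33 exported (not executed here):
#   mkvdev -vdev hdiskpower1 -vadapter vhost0 -dev testdisk 2>&1 | aix_commands
```

The last command printed is usually the one worth running by hand in oem_setup_env, as in step 5 above.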

------------------------------------

Creating a client LPAR via VIO server:

planned devices for client LPAR:
1 virtual Ethernet
2 SCSI:
    -for virtual disk
    -for virtual optical device (cd)

1. find out what will be the client partition id (on HMC), because it will be needed when creating adapters for the new client LPAR
2. on VIO server (HMC) create virtual server adapters:
    - virtual Ethernet: for me it was enough for inter LPAR communication, so tagging is not needed (only remember PVID)
    - virtual SCSI for hdisk and optical device (cd): here the planned client partition id should be set
3. on VIO server configure the devices and mappings, after that 'cfgdev':
    -virtual Ethernet:
        set ip: chdev -l en19 -a netaddr=10.10.10.26 -a netmask=255.255.255.0 -a state=up

    -virtual SCSI for hdisk:
        map disk for rootvg: mkvdev -vdev hdisk45 -vadapter vhost1 -dev bb_lpar_rootvg

    -virtual optical device:
        create a file backed optical device, for iso images: mkvdev -fbo -vadapter vhost1
        copy the iso image to /var/vio/VMLibrary (lsrep)
        load the image into the vtopt0 device: loadopt -vtd vtopt0 -disk dvd.1022A4_OBETA_710.iso (lsmap -all will show it)

4. create client LPAR on HMC:
    partition id should be as planned (to match with the above created adapters)
    set processor, memory... physical I/O is not needed
    create virtual Ethernet and SCSI (it should be untagged as created on VIO Server with the same PVID)
    create SCSI adapter: 1 is enough for disk and optical device
    LHEA is not needed and other settings were not changed

5. activate profile
    go into SMS -> choose cdrom -> install AIX

6. on the new client LPAR
    set ip, hostname, routing...
    set ip: chdev -l en0 -a netaddr=10.10.10.25 -a netmask=255.255.255.0 -a state=up

    check ping from vio, then ssh is possible to new LPAR

------------------------------------

Network Time Protocol config:

1. vi /home/padmin/config/ntp.conf               <--edit (or create) ntp.conf file as padmin (maybe root will need access to file as well)

    content should look like this:
    server ptbtime1.ptb.de
    server ptbtime2.ptb.de
    driftfile /home/padmin/config/ntp.drift
    tracefile /home/padmin/config/ntp.trace
    logfile /home/padmin/config/ntp.log

2. startnetsvc xntpd                           <--start xntpd daemon
3. cat /home/padmin/config/ntp.log               <--check log for errors
    if you see this:
    time error 3637.530348 is way too large        <--if difference between local and timeserver time is too large synchronization cannot occur

4. chdate 1206093607                       <--change clock manually
    Thu Dec 6 09:36:16 CST 2007

5. cat /home/padmin/config/ntp.log:                <--check log again
    synchronized to 9.3.4.7, stratum=2

6. ps -ef | grep ntp <--it should show /home/padmin/config/ntp.conf (not /etc/ntp.conf)
root 4390928 2818268 0 14:46:11 - 0:00 /usr/sbin/xntpd -x -c /home/padmin/config/ntp.conf

------------------------------------

I/O hosting requires a hosting partition - boot not permitted
exit called ok

This happened to me when I created an AIX LPAR, but a VIO image was installed on the disk that I wanted to boot from.
(I guess this error pops up when there is a mismatch between the LPAR type and the booting image.)

POWERVM - VSCSI

VIRTUAL SCSI

Virtual SCSI is based on a client/server relationship. The Virtual I/O Server owns the physical resources and acts as server or, in SCSI terms, target device. The client logical partitions access the virtual SCSI backing storage devices provided by the Virtual I/O Server as clients.

Virtual SCSI server adapters can be created only in Virtual I/O Server. For HMC-managed systems, virtual SCSI adapters are created and assigned to logical partitions using partition profiles.

The vhost SCSI adapter is the same as a normal SCSI adapter. You can have multiple disks assigned to it. Usually one virtual SCSI server adapter mapped to one virtual SCSI client adapter will be configured, mapping backing devices through to individual LPARs. It is possible to map these virtual SCSI server adapters to multiple LPARs, which is useful for creating virtual optical and/or tape devices, allowing removable media devices to be shared between multiple client partitions.

on VIO server:
root@vios1: / # lsdev -Cc adapter
vhost0 Available       Virtual SCSI Server Adapter
vhost1 Available       Virtual SCSI Server Adapter
vhost2 Available       Virtual SCSI Server Adapter

The client partition accesses its assigned disks through a virtual SCSI client adapter. The virtual SCSI client adapter sees the disks, logical volumes or file-backed storage through this virtual adapter as virtual SCSI disk devices.

on VIO client:
root@aix21: / # lsdev -Cc adapter
vscsi0 Available Virtual SCSI Client Adapter

root@aix21: / # lscfg -vpl hdisk2
hdisk2           U9117.MMA.06B5641-V6-C13-T1-L890000000000 Virtual SCSI Disk Drive

In SCSI terms:
virtual SCSI server adapter: target
virtual SCSI client adapter: initiator
(Analogous to server client model, where client is the initiator.)

Physical disks presented to the Virtual I/O Server can be exported and assigned to a client partition in a number of different ways:
- The entire disk is presented to the client partition.
- The disk is divided into several logical volumes, which can be presented to a single client or multiple different clients.
- With the introduction of Virtual I/O Server 1.5, files can be created on these disks and file-backed storage can be created.
- With the introduction of Virtual I/O Server 2.2 Fixpack 24 Service Pack 1 logical units from a shared storage pool can be created.

The IVM and HMC environments present 2 different interfaces for storage management under different names. Storage Pool interface under IVM is essentially the same as LVM under HMC. (These are used sometimes interchangeably.) So volume group can refer to both volume groups and storage pools, and logical volume can refer to both logical volumes and storage pool backing devices.

Once these virtual SCSI server/client adapter connections have been set up, one or more backing devices (whole disks, logical volumes or files) can be presented using the same virtual SCSI adapter.

When using Live Partition Mobility storage needs to be assigned to the Virtual I/O Servers on the target server.

----------------------------

Number of LUNs attached to a VSCSI adapter:

VSCSI adapters have a fixed queue depth that varies depending on how many VSCSI LUNs are configured for the adapter. There are 512 command elements of which 2 are used by the adapter, 3 are reserved for each VSCSI LUN for error recovery and the rest are used for IO requests. Thus, with the default queue_depth of 3 for VSCSI LUNs, that allows for up to 85 LUNs to use an adapter: (512 - 2) / (3 + 3) = 85.

So if we need higher queue depths for the devices, the number of LUNs per adapter is reduced. For example, if we want to use a queue_depth of 25, that allows 510/28 = 18 LUNs. We can configure multiple VSCSI adapters to handle many LUNs with high queue depths, each adapter requiring additional memory. A VIOC may have more than one VSCSI adapter connected to the same VIOS if more bandwidth is needed.
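
The calculation above can be repeated for any queue depth with shell arithmetic. A minimal sketch using the numbers from the text (512 command elements, 2 used by the adapter, 3 reserved per LUN); the function name max_luns is made up for this example.

```shell
# Hypothetical helper: maximum number of LUNs per VSCSI adapter
# for a given queue_depth, using (512 - 2) / (queue_depth + 3).
max_luns() {
    queue_depth=$1
    echo $(( (512 - 2) / (queue_depth + 3) ))
}

max_luns 3     # prints 85
max_luns 25    # prints 18
```

The shell does integer division, so the result is already rounded down to whole LUNs.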

Also, one should set the queue_depth attribute on the VIOC's hdisk to match that of the mapped hdisk's queue_depth on the VIOS.

Note that to change the queue_depth on an hdisk at the VIOS requires that we unmap the disk from the VIOC and remap it back, or a simpler approach is to change the values in the ODM (e.g. # chdev -l hdisk30 -a queue_depth=20 -P) then reboot the VIOS.

----------------------------

File Backed Virtual SCSI Devices

Virtual I/O Server (VIOS) version 1.5 introduced file-backed virtual SCSI devices. These virtual SCSI devices serve as disks or optical media devices for clients.

In the case of file-backed virtual disks, clients are presented with a file from the VIOS that they access as a SCSI disk. With file-backed virtual optical devices, you can store, install and back up media on the VIOS, and make it available to clients.

----------------------------

Check VSCSI adapter mapping on client:

root@bb_lpar: / # echo "cvai" | kdb | grep vscsi                   <--cvai is a kdb subcommand
read vscsi_scsi_ptrs OK, ptr = 0xF1000000C01A83C0
vscsi0     0x000007 0x0000000000 0x0                aix-vios1->vhost2        <--shows which vhost is used on which vio server for this client
vscsi1     0x000007 0x0000000000 0x0                aix-vios1->vhost1
vscsi2     0x000007 0x0000000000 0x0                aix-vios2->vhost2
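
For scripts, the kdb output can be reduced to one short line per adapter with awk. A sketch only, assuming the output has the layout shown above (adapter name in the first column, "vios->vhost" in the last one); the function name vscsi_map is made up for this example.

```shell
# Hypothetical helper: read "cvai" output on stdin and print
# "<vscsi adapter>: <vio server> <vhost adapter>" for every adapter line.
vscsi_map() {
    awk '$1 ~ /^vscsi[0-9]+$/ && $NF ~ /->/ {
        split($NF, a, "->")
        print $1 ": " a[1] " " a[2]
    }'
}

# Intended use on the client LPAR (not executed here):
#   echo "cvai" | kdb | vscsi_map
```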

Checking for a specific vscsi adapter (vscsi0):

root@bb_lpar: /root # echo "cvscsi\ncvai vscsi0"| kdb |grep -E "vhost|part_name"
priv_cap: 0x1 host_capability: 0x0 host_name: vhost2 host_location:
host part_number: 0x1   os_type: 0x3    host part_name: aix-vios1

----------------------------

Other way to find out VSCSI and VHOST adapter mapping:
If the whole disk is assigned to a VIO client, then PVID can be used to trace back connection between VIO server and VIO client.

1. root@bb_lpar: /root # lspv | grep hdisk0                              <--check pvid of the disk in question on client
   hdisk0          00080e82a84a5c2a                    rootvg

2. padmin@bb_vios1: /home/padmin # lspv | grep 5c2a                    <--check which disk has this pvid on vio server
   hdiskpower21     00080e82a84a5c2a                     None

3. padmin@bb_vios1: /home/padmin # lsmap -all -field SVSA "Backing Device" VTD "Client Partition ID" Status -fmt ":" | grep hdiskpower21
   vhost13:0x0000000c:hdiskpower21:pid12_vtd0:Available                      <--check vhost adapter of the given disk
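
Step 3 can be put into a small filter as well, which avoids grep matching similar names (for example hdiskpower2 inside hdiskpower21). A sketch, assuming the colon-separated lsmap output shown above with the vhost adapter in the first field; the function name vhost_for_disk is made up for this example.

```shell
# Hypothetical helper: read colon-separated lsmap output on stdin and print
# the vhost adapter (first field) of every line that has the given backing
# device as one of its other fields (exact match).
vhost_for_disk() {
    awk -F: -v disk="$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == disk) { print $1; next }
    }'
}

# Intended use as padmin (not executed here):
#   lsmap -all -field SVSA "Backing Device" VTD "Client Partition ID" Status -fmt ":" | vhost_for_disk hdiskpower21
```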

----------------------------

Managing VSCSI devices (server-client mapping)

We need a server adapter on the VIO server (vhost) and a client adapter on the LPAR (vscsi), correctly configured as a pair. After that, a disk, an lv or a virtual optical device can be assigned from the VIO server to the LPAR as a vscsi device.

3 ways to create adapters:
- HMC Enhanced GUI:
this is the easiest and recommended way. First choose the LPAR, then choose Virtual Storage in the left menu. There it is possible to add a Physical Volume, an SSP Volume or an LV, and during this operation all adapters and pairings are created automatically by the HMC.

- changing LPAR/VIO profile on HMC GUI:
It is possible to add these adapters in the profile, but in this case the pairing must be done carefully. Make sure to use the correct IDs on both the VIO and the LPAR side. After activating the profile, the adapters should show up on both the VIO and the LPAR side.

- HMC CLI:
With HMC commands it is possible to create these virtual adapters, but in this case the pairing must be done carefully as well (make sure the correct IDs are used).

After the VIO and LPAR pairing is done, a disk/lv can be assigned using VIO commands:
lsmap -all <--first check which vhost adapter will be needed in later commands

-using physical disks:
    mkvdev -vdev hdisk34 -vadapter vhost0 -dev vclient_disk    <--for easier identification it is useful to give a name with the -dev flag
    rmvdev -vdev <backing dev.>                        <--backing device can be checked with lsmap -all (here hdisk34; vclient_disk is the virtual target device, which can be removed with rmvdev -vtd vclient_disk)

-using logical volumes:
    mkvg -vg testvg_vios hdisk34             <--creating vg for lv
    lsvg                               <--listing vgs
    reducevg <vg> <disk>                   <--removing a disk from a vg (removing the last disk deletes the vg)

    mklv -lv testlv_client testvg_vios 10G       <--creating the lv that will be mapped to the client
    lsvg -lv <vg>                      <--lists lvs under a vg
    rmlv <lv>                        <--removes an lv

    mkvdev -vdev testlv_client -vadapter vhost0 -dev <any_name> <--for easier identification give a name with the -dev flag
    (here the backing device is an lv: testlv_client)
    rmvdev -vdev <back. dev.>              <--removes an assignment to the client
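
The logical volume case above is a fixed three-step sequence (vg, lv, mapping). As a rough sketch, the commands can be generated from a handful of parameters and reviewed before running them as padmin. The function only prints the commands and runs nothing; lv_mapping_cmds and the VTD name vclient_lv are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Prints (does not run) the VIOS commands that create a vg and an lv and map the lv to a client.
# Arguments: <disk> <vg> <lv> <size> <vhost> <vtd name>
lv_mapping_cmds() {
    disk=$1 vg=$2 lv=$3 size=$4 vhost=$5 vtd=$6
    echo "mkvg -vg $vg $disk"
    echo "mklv -lv $lv $vg $size"
    echo "mkvdev -vdev $lv -vadapter $vhost -dev $vtd"
}

# Values taken from the examples above; vclient_lv is an illustrative VTD name.
lv_mapping_cmds hdisk34 testvg_vios testlv_client 10G vhost0 vclient_lv
```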

-using logical volumes just with storage pool commands:
   (vg=sp, lv=bd)

    mksp <vgname> <disk>               <--creating a vg (sp)
    lssp                                 <--listing storage pools (vgs)
    chsp -add -sp <sp> PhysicalVolume        <--adding a disk to the sp (vg)
    chsp -rm -sp bb_sp hdisk2                  <--removing hdisk2 from bb_sp (storage pool)

    mkbdsp -bd <lv> -sp <vg> 10G               <--creates an lv with given size in the sp
    lssp -bd -sp <vg>                      <--lists lvs in the given vg (sp)
    rmbdsp -bd <lv> -sp <vg>           <--removes an lv from the given vg (sp)

    mkvdev..., rmvdev... also applies

-using file backed storage pool
    first a normal (LV) storage pool should be created with: mkvg or mksp, after that:
    mksp -fb <fb sp name> -sp <vg> -size 20G       <--creates a file backed storage pool with given storage pool and size
                                   (it will look like an lv, and a fs will be created automatically as well)
    lssp                                 <--it will show as FBPOOL
    chsp -add -sp clientData -size 1G              <--increase the size of the file storage pool (clientData) by 1G

    mkbdsp -sp fb_testvg -bd fb_bb -vadapter vhost2 10G    <--it will create a file backed device and assigns it to the given vhost
    mkbdsp -sp fb_testvg -bd fb_bb1 -vadapter vhost2 -tn balazs 8G <--it will also specify a virt. target device name (-tn)

    lssp -bd -sp fb_testvg                   <--lists the lvs (backing devices) of the given sp
    rmbdsp -sp fb_testvg -bd fb_bb1        <--removes the given lv (bd) from the sp
    rmsp <file sp name>                <--removes the given file storage pool

removing a vhost adapter together with all of its mappings:
rmdev -dev vhost1 -recursive
----------------------------

On client partitions, MPIO for virtual SCSI devices currently supports only failover mode (which means only one path is active at a time):
root@bb_lpar: / # lsattr -El hdisk0
PCM             PCM/friend/vscsi                 Path Control Module        False
algorithm       fail_over                        Algorithm                  True

----------------------------

Multipathing with dual VIO config:

on VIO SERVER:
# lsdev -dev <hdisk_name> -attr                <--checking disk attributes
# lsdev -dev <fscsi_name> -attr                                    <--checking FC attributes

# chdev -dev fscsi0 -attr fc_err_recov=fast_fail dyntrk=yes -perm   <--reboot is needed for these
    fc_err_recov=fast_fail                       <--in case of a link event IO will fail immediately
    dyntrk=yes                               <--allows the VIO server to tolerate cabling changes in the SAN

# chdev -dev hdisk3 -attr reserve_policy=no_reserve          <--each disk must be set to no_reserve
    reserve_policy=no_reserve                <--if this is configured, both vio servers can present the same disk to the client

on VIO client:
# chdev -l vscsi0 -a vscsi_path_to=30 -a vscsi_err_recov=fast_fail -P    <--path timeout checks health of VIOS and detects if VIO Server adapter isn't responding
    vscsi_path_to=30                            <--by default it is disabled (0), each client adapter must be configured, minimum is 30
    vscsi_err_recov=fast_fail                   <--failover will happen immediately rather than delayed

# chdev -l hdisk0 -a queue_depth=20 -P        <--it must match the queue depth value used for the physical disk on the VIO Server
    queue_depth                               <--it determines how many requests will be queued on the disk

# chdev -l hdisk0 -a hcheck_interval=60 -a hcheck_mode=nonactive -P    <--health check automatically updates the state of the paths
                                                   (otherwise a failed path must be re-enabled manually)
    hcheck_interval=60                        <--how often the hcheck runs (in seconds), each disk must be configured (hcheck_interval=0 means it is disabled)
    hcheck_mode=nonactive                     <--hcheck is performed on nonactive paths (paths with no active IO)

Never set the hcheck_interval lower than the read/write timeout value of the underlying physical disk on the Virtual I/O Server. Otherwise, an error detected by the Fibre Channel adapter causes new healthcheck requests to be sent before the running requests time out.

The minimum recommended value for the hcheck_interval attribute is 60 for both Virtual I/O and non Virtual I/O configurations.
In the event of adapter or path issues, setting the hcheck_interval too low can cause severe performance degradation or possibly cause I/O hangs.

It is best not to configure more than 4 to 8 paths per LUN (to avoid too much health check IO), and to set the hcheck_interval to 60 in the client partition and on the Virtual I/O Server.
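
The rules above (0 means disabled, never below the read/write timeout of the physical disk, recommended minimum 60) can be written as a small sanity check. This is only a sketch: check_hcheck is a hypothetical helper, and both values are passed in as arguments; reading them from the hcheck_interval attribute on the client and the rw_timeout attribute on the VIOS is an assumption about where they would come from.

```shell
#!/bin/sh
# check_hcheck <hcheck_interval> <read/write timeout of the physical disk on the VIOS>
# Prints a one-line verdict based on the rules described above.
check_hcheck() {
    interval=$1 rw_timeout=$2
    if [ "$interval" -eq 0 ]; then
        echo "disabled: hcheck_interval=0, failed paths must be re-enabled manually"
    elif [ "$interval" -lt "$rw_timeout" ]; then
        echo "too low: hcheck_interval=$interval is below the read/write timeout ($rw_timeout)"
    elif [ "$interval" -lt 60 ]; then
        echo "below recommended minimum: hcheck_interval=$interval (recommended: 60)"
    else
        echo "ok"
    fi
}

check_hcheck 60 30
check_hcheck 20 30
```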

----------------------------

TESTING PATH PRIORITIES:

By default all the paths are defined with priority 1, meaning that traffic will go through the first path.
If you want to control which path is used, the path priority has to be updated.
Priority of the VSCSI0 path remains at 1, so it is the primary path.
Priority of the VSCSI1 path will be changed to 2, so it will be lower priority.

PREPARATION ON CLIENT:
# lsattr -El hdisk1 | grep hcheck
hcheck_cmd      test_unit_rdy                      <--hcheck is configured, so path should come back automatically from failed state
hcheck_interval 60
hcheck_mode     nonactive

# chpath -l hdisk1 -p vscsi1 -a priority=2       <--I changed priority=2 on vscsi1 (by default both paths are priority=1)

# lspath -AHE -l hdisk1 -p vscsi0
    priority 1     Priority    True

# lspath -AHE -l hdisk1 -p vscsi1
    priority 2     Priority    True

So, configuration looks like this:
VIOS1 -> vscsi0 -> priority 1
VIOS2 -> vscsi1 -> priority 2
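
With more disks the same priority layout has to be repeated for each of them. The sketch below prints the chpath commands for one disk, giving the first listed adapter priority 1, the next one priority 2 and so on. It prints only and runs nothing; path_priority_cmds is a hypothetical helper name.

```shell
#!/bin/sh
# path_priority_cmds <disk> <adapter>...
# Prints one chpath command per adapter; priorities are assigned in the order given.
path_priority_cmds() {
    disk=$1
    shift
    prio=1
    for adapter in "$@"; do
        echo "chpath -l $disk -p $adapter -a priority=$prio"
        prio=$((prio + 1))
    done
}

path_priority_cmds hdisk1 vscsi0 vscsi1
```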

TEST 1:

1. ON VIOS2: # lsmap -all                               <--checking disk mapping on VIOS2
    VTD                   testdisk
    Status                Available
    LUN                   0x8200000000000000
    Backing device        hdiskpower1
    ...

2. ON VIOS2: # rmdev -dev testdisk                      <--removing disk mapping from VIOS2

3. ON CLIENT: # lspath
    Enabled hdisk1 vscsi0
    Failed hdisk1 vscsi1                               <--it will show failed path on vscsi1 (this is coming from VIOS2)

4. ON CLIENT: # errpt                                   <--error report will show "PATH HAS FAILED"
    IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    DE3B8540   0324120813 P H hdisk1         PATH HAS FAILED

5. ON VIOS2: # mkvdev -vdev hdiskpower1 -vadapter vhost0 -dev testdisk    <--configure back disk mapping from VIOS2

6. ON CLIENT: # lspath                                    <--in 30 seconds path will come back automatically
    Enabled hdisk1 vscsi0
    Enabled hdisk1 vscsi1                               <--because of hcheck, path came back automatically (no manual action was needed)

7. ON CLIENT: # errpt                                   <--error report will show path has been recovered
    IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    F31FFAC3   0324121213 I H hdisk1         PATH HAS RECOVERED

TEST 2:

I did the same on VIOS1 (rmdev...), which serves the path with priority 1 (IO is going there by default).

ON CLIENT: # lspath
    Failed hdisk1 vscsi0
    Enabled hdisk1 vscsi1

ON CLIENT: # errpt                                      <--an additional disk operation error will be in errpt
    IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    DCB47997   0324121513 T H hdisk1         DISK OPERATION ERROR
    DE3B8540   0324121513 P H hdisk1         PATH HAS FAILED
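
During tests like these it is handy to list only the failed paths. A minimal sketch, assuming the three-column lspath output shown above (state, disk, adapter); failed_paths is a hypothetical helper name.

```shell
#!/bin/sh
# Reads "lspath" output on stdin and prints "<disk> <adapter>" for every path in Failed state.
failed_paths() {
    awk '$1 == "Failed" { print $2, $3 }'
}

# Sample lines copied from TEST 2 above.
printf '%s\n' \
    'Failed hdisk1 vscsi0' \
    'Enabled hdisk1 vscsi1' | failed_paths
```

On a client this would be used as: lspath | failed_paths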

----------------------------

How to change a VSCSI adapter on client:

# lspath
Enabled hdisk0 vscsi0
Enabled hdisk0 vscsi2                                              <--we want to change vscsi2 to vscsi1

On VIO client:
1. # rmpath -p vscsi2 -d                                   <--remove paths from vscsi2 adapter
2. # rmdev -dl vscsi2                                              <--remove adapter

On VIO server:
3. # lsmap -all                                                  <--check assignment and vhost device
4. # rmdev -dev vhost0 -recursive                                      <--remove assignment and vhost device

On HMC:
5. Remove deleted adapter from client (from profile too)
6. Remove deleted adapter from VIOS (from profile too)
7. Create new adapter on client (in profile too)                <--cfgmgr on client
8. Create new adapter on VIOS (in profile too)                      <--cfgdev on VIO server

On VIO server:
9. # mkvdev -vdev hdiskpower0 -vadapter vhost0 -dev rootvg_hdisk0    <--create new assignment

# lspath
Enabled hdisk0 vscsi0
Enabled hdisk0 vscsi1                                              <--vscsi1 is there (cfgmgr may be needed)

----------------------------

Assigning and moving DVD RAM between LPARS

1. lsdev -type optical                    <--check if VIOS owns optical device (you should see something like: cd0 Available SATA DVD-RAM Drive)
2. lsmap -all                       <--to see if cd0 is already mapped and which vhost to use for assignment (lsmap -all | grep cd0)
3. mkvdev -vdev cd0 -vadapter vhost0      <--it will create vtoptX as a virtual target device (check with lsmap -all )

4. cfgmgr (on client lpar)                <--bring up cd0 device on client (before moving cd0 to another client, remove the device on this client first with rmdev)

5. rmdev -dev vtopt0 -recursive           <--to move cd0 to another client, remove assignment from vhost0
6. mkvdev -vdev cd0 -vadapter vhost1    <--create new assignment to vhost1

7. cfgmgr (on other client lpar)        <--bring up cd0 device on other client
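
Before step 5 the name of the virtual target device (vtoptX) has to be known. The sketch below looks it up in lsmap -all style output by backing device. It assumes the stanza layout shown in the lsmap sample earlier on this page (a "VTD" line followed later by a "Backing device" line); vtd_for_backing_dev is a hypothetical helper name, and the sample input is the testdisk stanza from the path priority test, not a real cd0 mapping.

```shell
#!/bin/sh
# Reads "lsmap -all" output on stdin and prints the VTD whose backing device matches exactly.
vtd_for_backing_dev() {
    awk -v dev="$1" '
        $1 == "VTD" { vtd = $2 }
        $1 == "Backing" && $2 == "device" && $3 == dev { print vtd }
    '
}

# Sample stanza copied from the lsmap output in TEST 1 above.
printf '%s\n' \
    '    VTD                   testdisk' \
    '    Status                Available' \
    '    LUN                   0x8200000000000000' \
    '    Backing device        hdiskpower1' | vtd_for_backing_dev hdiskpower1
```

On the VIO server this would be used as: lsmap -all | vtd_for_backing_dev cd0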

(Because the VIO server adapter is configured with the "Any client partition can connect" option, these adapter pairs are not suited for client disks.)

----------------------------
