Mirror Pool
Starting with AIX 6.1 TL2, so-called mirror pools were introduced, which make it possible to divide the physical volumes of a scalable volume group into separate pools. Mirror pools group the physical volumes of a scalable volume group so that a mirror copy of a logical volume can be restricted to allocate partitions only from the physical volumes of a specified group.
A mirror pool is made up of one or more physical volumes. Each physical volume can only belong to one mirror pool at a time. When creating a logical volume, each copy of the lv being created can be assigned to a mirror pool.
A mirror pool name can be up to 15 characters long and is unique within the volume group it belongs to. Therefore, two separate volume groups could use the same name for their mirror pools.
Any changes to mirror pool characteristics will not affect partitions allocated before the changes were made. The reorgvg command should be used after mirror pool changes are made to move the allocated partitions to conform to the mirror pool restrictions.
------------------------------------------------------
Strict Mirror Pool:
When strict mirror pools are enabled, any logical volume created in the volume group must have mirror pools enabled for each copy of the logical volume. (If this is enabled, all of the logical volumes in the volume group must use mirror pools.)
mkvg -M y -S <hdisk list> creating a vg with strict mirror pool
chvg -M y <vg name> turns the strict mirror pool setting on for a vg (chvg -M n ... will turn it off)
lsvg <vg name> shows mirror pool strictness (at the end of the output: MIRROR POOL STRICT: on)
------------------------------------------------------
Super strict Mirror Pool:
A super strict allocation policy can be set so that the partitions allocated for one mirror cannot share a physical volume with the partitions from another mirror. With this setting each mirror pool must contain at least one copy of each logical volume.
mkvg -M s -S <hdisk list> creating a vg with super strict setting
chvg -M s <vg name> turns the super strict setting on for a vg (chvg -M n ... will turn it off)
lsvg <vg name> shows mirror pool strictness (at the end of the output: MIRROR POOL STRICT: super)
------------------------------------------------------
Creating/Removing/Renaming a Mirror Pool (adding disk to a Mirror Pool):
mkvg -S -p PoolA -y bbvg hdisk2 hdisk4 <--creating a new VG with a mirror pool (-y gives the vg name)
extendvg -p PoolA bbvg hdisk6 <--extending a VG with a disk (while adding disks to mirror pools)
If we already have a vg:
root@bb_lpar: / # lsvg -P bbvg <--lists the mirror pool that each physical volume in the volume group belongs to
Physical Volume Mirror Pool
hdisk6 None
hdisk7 None
root@bb_lpar: / # chpv -p PoolA hdisk6 <--creating mirror pool with the given disks (disks should be part of a vg)
root@bb_lpar: / # chpv -p PoolB hdisk7 (or if the mirror pool already exists, it will add the specified disk to the pool)
root@bb_lpar: / # lsvg -P bbvg
Physical Volume Mirror Pool
hdisk6 PoolA
hdisk7 PoolB
root@bb_lpar: / # chpv -P hdisk7 <--removes the physical volume from the mirror pool
root@bb_lpar: / # lsvg -P bbvg
Physical Volume Mirror Pool
hdisk6 PoolA
hdisk7 None
root@bb_lpar: / # chpv -m PoolC hdisk6 <--changes the name of the mirror pool
root@bb_lpar: / # lsvg -P bbvg
Physical Volume Mirror Pool
hdisk6 PoolC
hdisk7 None
------------------------------------------------------
Creating/Mirroring lv to a Mirror Pool:
mklv -c 2 -p copy1=PoolA -p copy2=PoolB bbvg 10 <--creates an lv (with default name:lv00) in the given mirror pools with the given size
mklvcopy -p copy2=PoolB bblv 2 <--creates a 2nd copy of an lv to the given mirror pool
mirrorvg -p copy2=PoolB -c 2 bbvg <--mirrors the whole vg to the given mirror pool
------------------------------------------------------
Adding/Removing an lv to/from a Mirror Pool:
root@bb_lpar: / # lsvg -m bbvg <--shows lvs of a vg with mirror pools
Logical Volume Copy 1 Copy 2 Copy 3
bblv None None None
root@bb_lpar: / # chlv -m copy1=PoolA bblv <--enables mirror pools to the given copy of an lv
root@bb_lpar: / # chlv -m copy2=PoolB bblv
root@bb_lpar: / # lsvg -m bbvg <--checking again the layout
Logical Volume Copy 1 Copy 2 Copy 3
bblv PoolA PoolB None
root@bb_lpar: / # chlv -M 1 bblv <--disables mirror pools for the given copy of the lv
root@bb_lpar: / # lsvg -m bbvg <--checking the layout again
Logical Volume Copy 1 Copy 2 Copy 3
bblv None PoolB None
------------------------------------------------------
Viewing Mirror Pools:
lsmp bbvg <--lists the mirror pools of the given vg
lspv hdisk6 <--shows PV characteristics (the last line shows the mirror pool the pv belongs to)
lspv -P <--shows all PVs in the system (with mirror pools)
lsvg -P bbvg <--shows the PVs of a VG (with mirror pools)
lsvg -m bbvg <--shows the LVs of a VG (with mirror pools)
lslv bblv <--shows LV characteristics (at the end it shows the lv copies and the mirror pools)
------------------------------------------------------
Correct steps for creating and removing a mirror pool (completely):
A mirror pool is a separate entity from LVM. (I imagine it as a small database which keeps the rules and strictness, so the underlying LVM commands succeed or fail based on those rules.) It can happen that you remove the 2nd copy of an LV with rmlvcopy (so it is not in LVM anymore), but mirror pool commands will still show it as an existing copy. So make sure the LVM commands (mirrorvg, mklvcopy...) and the mirror pool commands (chpv -p, chlv -m copy1=.., chvg -M...) stay in sync all the time!
Mirror pool information is stored in 3 places: PV, LV and VG.
If you need to create or remove a mirror pool, make sure the mirror pool entry is taken care of in all 3 places.
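As a quick sanity check of the LV level, the `lsvg -m` output can be scanned for copies that are not consistently assigned to pools. A minimal sketch, assuming the output format shown in the examples above; the here-doc-style sample variable stands in for running `lsvg -m <vg>` on a live AIX system (the lv name cclv is hypothetical):

```shell
#!/bin/sh
# Sketch: flag LVs where copy 1 has a mirror pool but copy 2 does not
# (or vice versa). The sample mimics `lsvg -m <vg>` output.
sample='Logical Volume   Copy 1   Copy 2   Copy 3
bblv             PoolA    None     None
cclv             PoolA    PoolB    None'

# fields on data lines: $1=lv, $2=copy1 pool, $3=copy2 pool, $4=copy3 pool
echo "$sample" | awk 'NR>1 { if (($2=="None") != ($3=="None")) print $1 ": copies not consistently pooled" }'
```

On a real system the pipe would start with the command itself (lsvg -m bbvg | awk ...); a 2-copy lv with both pools set is left alone, since Copy 3 being None is normal there.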
Creating mirror pool on a VG which is already mirrored at LVM level:
0. check if the mirrors are OK (each copy on a separate disk)
1. chpv -p <poolname> <diskname> <--add disks to the mirror pool
# lspv hdisk0 | grep MIRROR
MIRROR POOL: PoolA
2. chlv -m copy1=PoolA fslv00 <--add lv to the given pool (add all lvs to both pools: copy1 and copy2)
# lslv fslv00 | grep MIRROR
COPY 1 MIRROR POOL: PoolA
COPY 2 MIRROR POOL: PoolB
COPY 3 MIRROR POOL: None
3. chvg -M <strictness> <vgname> <--set strictness for the VG (usually chvg -M s ...)
# lsvg testvg | grep MIRROR
MIRROR POOL STRICT: super
------------------------------------------------------
Removing mirror pool from a system:
1. chvg -M n <vgname> <--turn off strictness
# lsvg testvg | grep MIRROR
MIRROR POOL STRICT: off
2. chlv -M 2 <lvname> <--remove 2nd copy of the LV from mirror pool (remove 1st copy as well: chlv -M 1...)
# lslv fslv00 | grep MIRROR
COPY 1 MIRROR POOL: PoolA
COPY 2 MIRROR POOL: None
COPY 3 MIRROR POOL: None
Only when every mirror pool has been removed at LV level:
3. chpv -P <diskname> <--remove disk from mirror pool (do it with all disks)
# lspv hdiskpower0| grep MIRROR
MIRROR POOL: None
4. check with lsvg -m <vgname>
------------------------------------------------------
If you remove the mirror pool from a disk while it still exists at LV level (steps 2 and 3 done in the wrong order), you will get this:
# chpv -P hdiskpower0
0516-1010 chpv: Warning, the physical volume hdiskpower0 has open
logical volumes. Continuing with change.
0516-1812 lchangepv: Warning, existing allocation violates mirror pools.
Consider reorganizing the logical volume to bring it into compliance.
# lsvg -m testvg
Logical Volume Copy 1 Copy 2 Copy 3
loglv00 None None None
fslv00 None None <--it will show incorrect data (Pool was not deleted at LV level)
fslv01 None None None
# chlv -M 1 fslv00 <--remove pool from LV level (copy 1)
# lsvg -m testvg <--it will show correct info
Logical Volume Copy 1 Copy 2 Copy 3
loglv00 None None None
fslv00 None None None
fslv01 None None None
------------------------------------------------------
Changing from one Mirror Pool to another:
If you have a good working system with mirror pools (A and B) and are requested to remove the disks of Pool A and assign new disks as Pool C:
My suggestion:
1. remove mirror pools totally from the system: from VG, LV and PV level
2. remove unnecessary mirror at LVM level (unmirrorvg from the disks of Pool A)
3. delete disks on the system (from Pool A) and assign new disks to the system (Pool C)
4. create LVM mirror to the new disks on Pool C (mirrorvg)
5. create the new mirror pools on the remaining and the new disks (PV, LV and VG level)
------------------------------------------------------
0516-622 extendlv: Warning, cannot write lv control block data.
0516-1812 lchangelv: Warning, existing allocation violates mirror pools.
Consider reorganizing the logical volume to bring it into compliance.
This can come up when you want to increase an fs (or lv), but the lv layout on the disks does not fully follow the mirror pool restrictions. (For example, an lp exists on a disk in one pool, but it should reside in the other pool.)
The reorgvg command can solve this (it can run for a long time):
reorgvg <vg name> <lv name>
Sometimes reorgvg can't solve it and you have to find manually where the problem is:
1. check lv - mirror pool distribution:
root@aixdb2: /root # lsvg -m P_NAVISvg
Logical Volume Copy 1 Copy 2 Copy 3
p_admlv VMAX_02 VMAX_03 None
p_datlv VMAX_02 VMAX_03 None
p_archlv VMAX_02 VMAX_03 None
...
As you can see, all the 1st copies belong to VMAX_02 and the 2nd copies to VMAX_03
2. check disk - mirror pool distribution
root@aixdb2: /root # lspv -P
Physical Volume Volume Group Mirror Pool
hdiskpower1 P_NAVISvg VMAX_03 <--it should contain only 2nd copy of lvs
hdiskpower2 P_NAVISvg VMAX_03 <--it should contain only 2nd copy of lvs
...
hdiskpower18 P_NAVISvg VMAX_02 <--it should contain only 1st copy of lvs
hdiskpower19 P_NAVISvg VMAX_02 <--it should contain only 1st copy of lvs
hdiskpower20 P_NAVISvg VMAX_02 <--it should contain only 1st copy of lvs
3. check lv - disk distribution
From the output of lsvg -M <vg name> you can see which disk the 1st and 2nd copy of an lv resides on.
After that you can check whether that disk belongs to the correct mirror pool or not.
This will sort the disks with lvs on them and show which copy (1st or 2nd) is there:
root@aixdbp2: /root # lsvg -M P_NAVISvg | awk -F: '{print $1,$2,$4}'| awk '{print $1,$3,$4}'| sort -u | sort -tr +1 -n
P_NAVISvg:
hdiskpower18 t_datlv 1
hdiskpower18 t_oralv 1
hdiskpower19 p_datlv 2 <--2nd copy of p_datlv resides on hdiskpower19, but hdiskpower19 should contain only 1st copy
hdiskpower19 p_oralv 1
hdiskpower19 t_archlv 1
(the above command, lsvg -M...sort -tr +1 -n, was written for hdiskpower disks (-tr: delimiter is 'r'))
(if you have only hdisk, you can change it to lsvg -M...sort -tk +1 -n, or if you omit this sort, the command should work as well)
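The same disk / lv / copy summary can also be produced with a single awk and no sort key trickery (the `+1` sort syntax is obsolete and rejected by newer sort versions). A sketch, assuming `lsvg -M` lines of the form `disk:pp lv:lp:copy` as shown above; the sample variable stands in for the real `lsvg -M P_NAVISvg` output, which may additionally contain free-PP lines that this filter skips:

```shell
#!/bin/sh
# Sketch: summarize which copy (1st/2nd) of each lv sits on which disk,
# from `lsvg -M`-style lines. Sample data taken from the outputs above.
sample='P_NAVISvg:
hdiskpower18:100 t_datlv:1:1
hdiskpower19:889 p_datlv:9968:2
hdiskpower19:890 p_datlv:9969:2
hdiskpower19:12 p_oralv:1:1'

# split on both ':' and ' ': fields are disk, pp, lv, lp, copy;
# lines without all five fields (vg header, free PPs) are skipped
echo "$sample" | awk -F'[: ]' 'NF>=5 {print $1, $3, $5}' | sort -u
```

The output is one line per disk/lv/copy combination, so a 2nd copy sitting on a 1st-copy disk stands out immediately.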
4. migrating the wrong lps to a correct disk
checking lps of an lv:
root@aixdb2: /root # lspv -M hdiskpower19 | grep p_datlv
hdiskpower19:889 p_datlv:9968:2
hdiskpower19:890 p_datlv:9969:2
hdiskpower19:891 p_datlv:9970:2
After finding the correct disk with free pps (e.g. lspv -M <disk> will show you the free pps):
root@aixdb2: /root # migratelp p_datlv/9968/2 hdiskpower2/329
(Sometimes for migratelp it is not enough to give the disk name only (e.g. hdiskpower2); the pp number is needed as well (e.g. hdiskpower2/329))
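When many lps of one copy are misplaced, the migratelp commands can be generated instead of typed one by one. A sketch: the target disk name is a hypothetical disk in the correct pool, the sample variable stands in for the real `lspv -M hdiskpower19 | grep p_datlv` output, and the commands are only printed for review, not executed:

```shell
#!/bin/sh
# Sketch: build migratelp commands from `lspv -M`-style lines
# of the form  disk:pp lv:lp:copy
TARGET=hdiskpower2            # hypothetical destination disk in the correct pool
sample='hdiskpower19:889 p_datlv:9968:2
hdiskpower19:890 p_datlv:9969:2
hdiskpower19:891 p_datlv:9970:2'

# print (not run) one "migratelp lv/lp/copy target" per misplaced lp
echo "$sample" | awk -F'[: ]' -v t="$TARGET" '{printf "migratelp %s/%s/%s %s\n", $3, $4, $5, t}'
```

If a specific target pp is needed (as noted above), the printf can be extended to append /<pp> to the target; review the printed commands before running them.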
Practical Guide to AIX (and PowerVM, PowerHA, PowerVC, HMC, DevOps ...)
HA - COMMANDS
Commands:
odmget HACMPlogs shows where the log files are
odmget HACMPcluster shows cluster version
odmget HACMPnode shows info from nodes (cluster version)
/etc/es/objrepos HACMP ODM files
Log files:
(changing log files path: C-SPOC > Log Viewing and Management)
/var/hacmp/adm/cluster.log main PowerHA log file (errors,events,messages ... /usr/es/adm/cluster.log)
/var/hacmp/adm/history/cluster.mmddyy shows only the EVENTS, generated daily (/usr/es/sbin/cluster/history/cluster.mmddyyyy)
/var/hacmp/log/clinfo.log records the activity of the clinfo daemon
/var/hacmp/log/clstrmgr.debug debug info about the cluster (clstrmgr.debug.long also exists); IBM support uses these
/var/hacmp/log/clutils.log summary of nightly verification
/var/hacmp/log/cspoc.log shows more info of the smitty c-spoc command (good place to look if a command fails)
/var/hacmp/log/hacmp.out similar to cluster.log, but more detailed (with all the output of the scripts)
/var/hacmp/log/loganalyzer/loganalyzer.log log analyzer (clanalyze) stores outputs there
/var/hacmp/clverify shows the results of the verifications (verification errors are logged here)
/var/log/clcomd/clcomd.log contains every connect request between the nodes and return status of the requests
RSCT Logs:
/var/ha/log RSCT logs are here
/var/ha/log/nim.topsvcs... the heartbeats are logged here (comm. is OK between the nodes)
clRGinfo Shows the state of RGs (in earlier HACMP clfindres was used)
clRGinfo -p shows the node that has temporarily the highest priority (POL)
clRGinfo -t shows the delayed timer information
clRGinfo -m shows the status of the application monitors of the cluster
resource group states can be: online, offline, acquiring, releasing, error, unknown
cldump (or clstat -o) detailed info about the cluster (realtime, shows cluster status) (clstat requires a running clinfo)
cldisp detailed general info about the cluster (not realtime) (cldisp | egrep 'start|stop', lists start/stop scripts)
cltopinfo Detailed information about the network of the cluster (this shows the data in DCD not in ACD)
cltopinfo -i good overview, same as cllsif: this also lists cluster interfaces; cllsif was used prior to HACMP 5.1
cltopinfo -m shows heartbeat statistics, missed heartbeats (-m is no longer available on PowerHA 7.1)
clshowres Detailed information about the resource group(s)
cllsserv Shows which scripts will be run in case of a takeover
clrgdependency -t PARENT_CHILD -sl shows parent child dependencies of resource groups
clshowsrv -v shows status of the cluster daemons (very good overview!!!)
lssrc -g cluster lists the running cluster daemons
ST_STABLE: cluster services running with resources online
NOT_CONFIGURED: cluster is not configured or node is not synced
ST_INIT: cluster is configured but not active on this node
ST_JOINING: cluster node is joining the cluster
ST_VOTING: cluster nodes are voting to decide event execution
ST_RP_RUNNING: cluster is running a recovery program
RP_FAILED: recovery program event script is failed
ST_BARRIER: clstrmgr is in between events waiting at the barrier
ST_CBARRIER: clstrmgr is exiting a recovery program
ST_UNSTABLE: cluster is unstable usually due to an event error
lssrc -ls clstrmgrES shows if cluster is STABLE or not, cluster version, Dynamic Node Priority (pgspace free, disk busy, cpu idle)
lssrc -ls topsvcs shows the status of individual diskhb devices, heartbeat intervals, failure cycle (missed heartbeats)
lssrc -ls grpsvcs gives info about connected clients and the number of groups
lssrc -ls emsvcs shows the resource monitors known to the event management subsystem
lssrc -ls snmpd shows info about snmpd
halevel -s shows PowerHA level (from 6.1)
lscluster list CAA cluster configuration information
-c cluster configuration
-d disk (storage) configuration
-i interfaces configuration
-m node configuration
mkcluster create a CAA cluster
chcluster change a CAA cluster configuration
rmcluster remove a CAA cluster configuration
clcmd <command> it will run given <command> on both nodes (for example: clcmd date)
cl_ping pings all the adapters of the given list (e.g.: cl_ping -w 2 aix21 aix31 (-w: wait 2 seconds))
cldiag HACMP troubleshooting tool (e.g.: cldiag debug clstrmgr -l 5 <--shows clstrmgr heartbeat infos)
cldiag vgs -h nodeA nodeB <--this checks the shared vg definitions on the given nodes for inconsistencies
clmgr offline cluster WHEN=now MANAGE=offline STOP_CAA=yes stop cluster and CAA as well (after maintenance start with START_CAA=yes)
clmgr view report cluster TYPE=html FILE=/tmp/powerha.report create the HTML report
clanalyze -a -p "diskfailure" analyzes PowerHA logs for applicationfailure, interfacefailure, networkfailure, nodefailure...
lvlstmajor lists the available major numbers for each node in the cluster
------------------------------------------------------
/usr/es/sbin/cluster/utilities/get_local_nodename shows the name of this node within the HACMP
/usr/es/sbin/cluster/utilities/clexit.rc this script halts the node if the cluster manager daemon stops incorrectly
------------------------------------------------------
Remove HACMP:
1. stop cluster on both nodes
2. remove the cluster configuration ( smitty hacmp) on both nodes
3. remove cluster filesets (starting with cluster.*)
------------------------------------------------------
If you are planning to do crash-test, do it with halt -q or reboot -q
shutdown -Fr will not work, because it stops hacmp and the resource groups gracefully (rc.shutdown), so no takeover will occur
------------------------------------------------------
clhaver - clcomd problem:
If there are problems during cluster startup or synchronization and verification, and you see something like this:
1800-106 An error occurred:
connectconnect: : Connection refusedConnection refused
clhaver[113]: cl_socket(aix20)clhaver[113]: cl_socket(aix04): : Connection refusedConnection refused
Probably there is a problem with clcomd.
1. check if it is running: clshowsrv -v or lssrc -a | grep clcomd
refresh or start it: refresh -s clcomdES or startsrc -s clcomdES
2. check log file: /var/hacmp/clcomd/clcomd.log
you can see something like this: CONNECTION: REJECTED(Invalid address): aix10: 10.10.10.100->10.10.10.139
for me the solution was:
-update the /usr/sbin/cluster/etc/rhosts file on both nodes (I added all IPs of both servers (except the service ip + service backup ip))
-refresh -s clcomdES
------------------------------------------------------
When trying to bring up a resource group in HACMP, I got the following errors in the hacmp.out log file:
cl_disk_available[187] cl_fscsilunreset fscsi0 hdiskpower1 false
cl_fscsilunreset[124]: openx(/dev/hdiskpower1, O_RDWR, 0, SC_NO_RESERVE): Device busy
cl_fscsilunreset[400]: ioctl SCIOLSTART id=0X11000 lun=0X1000000000000 : Invalid argument
To resolve this, you will have to make sure that the SCSI reset disk method is configured in HACMP. For example, when using EMC storage:
Make sure emcpowerreset is present in /usr/lpp/EMC/Symmetrix/bin/emcpowerreset.
Then add new custom disk method:
smitty hacmp -> Ext. Conf. -> Ext. Res. Conf. -> HACMP Ext. Resources Conf. -> Conf. Custom Disk Methods -> Add Cust. Disk
* Disk Type (PdDvLn field from CuDv) [disk/pseudo/power]
* Method to identify ghost disks [SCSI3]
* Method to determine if a reserve is held [SCSI_TUR]
* Method to break a reserve [/usr/lpp/EMC/Symmetrix/bin/emcpowerreset]
Break reserves in parallel true
* Method to make the disk available [MKDEV]
------------------------------------------------------
Once I had a problem with commands 'cldump' and 'clstat -o' (version 5.4.1 SP3)
cldump: Waiting for the Cluster SMUX peer (clstrmgrES)
to stabilize...
Can not get cluster information.
Solution was:
-checked all the daemons mentioned below (clinfo, clcomd, snmpd...) and started what was missing
-after that I did: refresh -s clstrmgrES (cldump and clstat were OK only after this refresh had been done)
-once I had a problem with clstat -a (though clinfo was running); after refresh -s clinfoES it was OK again
(This can also be good: stopsrc -s clinfoES && sleep 2 && startsrc -s clinfoES)
things that can be checked regarding snmp:
-clinfoES and clcomdES:
clshowsrv -v
-snmpd and mibd daemons (if not active startsrc can start it)
root@aix20: / # lssrc -a | egrep 'snm|mib'
snmpmibd tcpip 552998 active
aixmibd tcpip 524418 active
hostmibd tcpip 430138 active
snmpd tcpip 1212632 active
(hostmibd does not need to be active all the time)
-snmpd conf and log files
root@aix20: / # ls -l /etc | grep snmp
-rw-r----- 1 root system 2302 Aug 16 2005 clsnmp.conf
-rw-r--r-- 1 root system 37 Jun 16 16:18 snmpd.boots
-rw-r----- 1 root system 10135 Aug 11 2009 snmpd.conf
-rw-r----- 1 root system 2693 Aug 11 2009 snmpd.peers
-rw-r----- 1 root system 10074 Jun 16 16:22 snmpdv3.conf
drwxrwxr-x 2 root system 256 Aug 11 2009 snmpinterfaces
-rw-r----- 1 root system 1816 Aug 11 2009 snmpmibd.conf
root@aix20: / # ls -l /var/tmp | grep snmp
-rw-r--r-- 1 root system 83130 Jun 16 20:32 snmpdv3.log
-rw-r--r-- 1 root system 100006 Oct 01 2008 snmpdv3.log.1
-rw-r--r-- 1 root system 16417 Jun 16 16:19 snmpmibd.log
------------------------------------------------------
During PowerHA upgrade from 5.4.1 to 6.1 received these errors:
(it was an upgrade where we put into unmanage state the resource groups)
grep: can't open aixdb1
./cluster.es.cspoc.rte.pre_rm: ERROR
Cluster services are active on this node. Please stop all
cluster services prior to installing this software.
...
grep: can't open aixdb1
./cluster.es.client.rte.pre_rm: ERROR
Cluster services are active on this node. Please stop all
cluster services prior to installing this software.
Failure occurred during pre_rm.
Failure occurred during rminstal.
installp: An internal error occurred while attempting
to access the Software Vital Product Data.
Use local problem reporting procedures.
We checked where to find the script mentioned in the first ERROR:
root@aixdb1: / # find /usr -name cluster.es.client.rte.pre_rm -ls
145412 5 -rwxr-x--- 1 root system 4506 Feb 26 2009 /usr/lpp/cluster.es/inst_root/cluster.es.client.rte.pre_rm
Looking through the script, found these 2 lines:
LOCAL_NODE=$(odmget HACMPcluster 2>/dev/null | sed -n '/nodename = /s/^.* "\(.*\)".*/\1/p')
LC_ALL=C lssrc -ls clstrmgrES | grep "Forced down" | grep -qw $LOCAL_NODE
Checking these, after running the second line, the original error could be successfully recreated:
root@aixdb1: / # lssrc -ls clstrmgrES | grep "Forced down" | grep -qw $LOCAL_NODE
grep: can't open aixdb1
There were 2 entries in this variable, and that caused the error:
root@aixdb1: / # echo $LOCAL_NODE
aixdb1 aixdb1
root@aixdb1: / # odmget HACMPcluster
HACMPcluster:
id = 1315338110
name = "DFWEAICL"
nodename = "aixdb1" <--grep finds this entry
sec_level = "Standard"
sec_level_msg = ""
...
rg_distribution_policy = "node"
noautoverification = 1
clvernodename = "aixdb1" <--grep finds this entry as well (this is causing the trouble)
clverhour = 0
clverstartupoptions = 0
After googling what clvernodename is, we found out this field is set by "Automatic Cluster Configuration Verification", and setting it to Disabled will remove the additional entry from the ODM:
We checked it in smitty hacmp -> HACMP verification -> Automatic...:
* Automatic cluster configuration verification Enabled <--we changed it to disabled
* Node name aixdb1
* HOUR (00 - 23) [00]
Debug no
After this correction, smitty update_all was issued again. We received some similar errors (grep: can't open...), but when we retried smitty update_all it was all successful. (All the earlier Broken filesets were corrected, and we had the new PowerHA version without errors.)
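The underlying failure is a generic shell pitfall rather than anything HACMP-specific: when a variable holding two words is expanded unquoted after grep, the first word becomes the pattern and the second is treated as a file name. A minimal, hypothetical reproduction (node name as in the example above):

```shell
#!/bin/sh
# Hypothetical repro: LOCAL_NODE holds two entries (as the ODM query
# above produced), and the unquoted expansion breaks grep.
LOCAL_NODE="aixdb1 aixdb1"

# grep searches for pattern "aixdb1" in a file named "aixdb1":
echo "Forced down: aixdb1" | grep -qw $LOCAL_NODE 2>/dev/null
echo "unquoted: exit status $?"      # non-zero: grep tried to open a file

# with a deduplicated, single-word value the same test works as intended:
LOCAL_NODE=$(echo $LOCAL_NODE | awk '{print $1}')
echo "Forced down: aixdb1" | grep -qw "$LOCAL_NODE"
echo "deduplicated: exit status $?"
```

This is why disabling the automatic verification (which removed the duplicate clvernodename entry from the ODM) made the pre_rm script's grep succeed.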
------------------------------------------------------
Manual cluster switch:
1. varyonvg
2. mount FS (mount -t sapX11)(mount -t nfs)
3. check nfs: clshowres
if there are exported fs: exportfs -a
go to the nfs client: mount -t nfs
4. IP configure (ifconfig)
grep ifconfig /tmp/hacmp.out -> it will show the command:
IPAT via IP replacement: ifconfig en1 inet 10.10.110.11 netmask 255.255.255.192 up mtu 1500
IPAT via IP aliasing: ifconfig en3 alias 10.10.90.254 netmask 255.255.255.192
netmask can be found from ifconfig or cltopinfo -i
(removing ip: ifconfig en3 delete 10.10.90.254)
5. check routing (extra routes could be necessary)
(removing route: route delete -host 10.10.90.192 10.10.90.254 or route delete -net 10.10.90.192/26 10.10.90.254)
6. start applications
------------------------------------------------------
odmget HACMPlogs shows where are the log files
odmget HACMPcluster shows cluster version
odmget HACMPnode shows info from nodes (cluster version)
/etc/es/objrepos HACMP ODM files
Log files:
(changing log files path: C-SPOC > Log Viewing and Management)
/var/hacmp/adm/cluster.log main PowerHA log file (errors,events,messages ... /usr/es/adm/cluster.log)
/var/hacmp/adm/history/cluster.mmddyy shows only the EVENTS, generated daily (/usr/es/sbin/cluster/history/cluster.mmddyyyy)
/var/hacmp/log/clinfo.log records the activity of the clinfo daemon
/var/hacmp/log/clstrmgr.debug debug info about the cluster (clstrmg.debug.long also exists) IBM support using these
/var/hacmp/log/clutils.log summary of nightly verification
/var/hacmp/log/cspoc.log shows more info of the smitty c-spoc command (good place to look if a command fails)
/var/hacmp/log/hacmp.out similar to cluster.log, but more detailed (with all the output of the scripts)
/var/hacmp/log/loganalyzer/loganalyzer.log log analyzer (clanalyze) stores outputs there
/var/hacmp/clverify shows the results of the verifications (verification errors are logged here)
/var/log/clcomd/clcomd.log contains every connect request between the nodes and return status of the requests
RSCT Logs:
/var/ha/log RSCT logs are here
/var/ha/log/nim.topsvcs... the heartbeats are logged here (comm. is OK between the nodes)
clRGinfo Shows the state of RGs (in earlier HACMP clfindres was used)
clRGinfo -p shows the node that has temporarily the highest priority (POL)
clRGinfo -t shows the delayed timer information
clRGinfo -m shows the status of the application monitors of the cluster
resource groups state can be: online, offline, acquiring, releasing, error, unknown
cldump (or clstat -o) detailed info about the cluster (realtime, shows cluster status) (clstat requires a running clinfo)
cldisp detailed general info about the cluster (not realtime) (cldisp | egrep 'start|stop', lists start/stop scripts)
cltopinfo Detailed information about the network of the cluster (this shows the data in DCD not in ACD)
cltopinfo -i good overview, same as cllsif: this also lists cluster inetrfaces, it was used prior HACMP 5.1
cltopinfo -m shows heartbeat statistics, missed heartbeats (-m is no longer available on PowerHA 7.1)
clshowres Detailed information about the resource group(s)
cllsserv Shows which scripts will be run in case of a takeover
clrgdependency -t PARENT_CHILD -sl shows parent child dependencies of resource groups
clshowsrv -v shows status of the cluster daemons (very good overview!!!)
lssrc -g cluster lists the running cluster daemons
ST_STABLE: cluster services running with resources online
NOT_CONFIGURED: cluster is not configured or node is not synced
ST_INIT: cluster is configured but not active on this node
ST_JOINING: cluster node is joining the cluster
ST_VOTING: cluster nodes are voting to decide event execution
ST_RP_RUNNING: cluster is running a recovery program
RP_FAILED: recovery program event script is failed
ST_BARRIER: clstrmgr is in between events waiting at the barrier
ST_CBARRIER: clstrmgr is exiting a recovery program
ST_UNSTABLE: cluster is unstable usually due to an event error
lssrc -ls topsvcs shows the status of individual diskhb devices, heartbeat intervals, failure cycle (missed heartbeats)
lssrc -ls grpsvcs gives info about connected clients, number of groups)
lssrc -ls emsvcs shows the resource monitors known to the event management subsystem)
lssrc -ls snmpd shows info about snmpd
halevel -s shows PowerHA level (from 6.1)
lscluster list CAA cluster configuration information
-c cluster configuration
-d disk (storage) configuration
-i interfaces configuration
-m node configuration
mkcluster create a CAA cluster
chcluster change a CAA cluster configuration
rmcluster remove a CAA cluster configuration
clcmd <command> it will run given <command> on both nodes (for example: clcmd date)
cl_ping pings all the adapters of the given list (e.g.: cl_ping -w 2 aix21 aix31 (-w: wait 2 seconds))
cldiag HACMP troubleshooting tool (e.g.: cldiag debug clstrmgr -l 5 <--shows clstrmgr heartbeat infos)
cldiags vgs -h nodeA nodeB <--this checks the shared vgs definitions on the given node for inconsistencies
clmgr offline cluster WHEN=now MANAGE=offline STOP_CAA=yes stop cluster and CAA as well (after maintenance start with START_CAA=yes)
clmgr view report cluster TYPE=html FILE=/tmp/powerha.report create the HTML report
clanalyze -a -p "diskfailure" analyzes PowerHA logs for applicationfailure, interfacefailure, networkfailure, nodefailure...
lvlstmajor lists the available major numbers for each node in the cluster
/usr/es/sbin/cluster/utilities/get_local_nodename shows the name of this node within the HACMP
/usr/es/sbin/cluster/utilities/clexit.rc this script halt the node if the cluster manager daemon stopped incorrectly
------------------------------------------------------
Remove HACMP:
1. stop cluster on both nodes
2. remove the cluster configuration ( smitty hacmp) on both nodes
3. remove cluster filesets (startinf with cluster.*)
------------------------------------------------------
If you are planning to do crash-test, do it with halt -q or reboot -q
shutdown -Fr will not work, because it stops hacmp and resource groups garcefully (rc.shutdown), so no takeover will occur
------------------------------------------------------
clhaver - clcomd problem:
If there are problems during start up a cluster or synch. and verif., and you see something like this:
1800-106 An error occurred:
connectconnect: : Connection refusedConnection refused
clhaver[113]: cl_socket(aix20)clhaver[113]: cl_socket(aix04): : Connection refusedConnection refused
Probably there is a problem with clcomd.
1. check if if it is running: clshowsrv -v or lssrc -a | grep clcomd
refresh or start it: refresh -s clcomdES or startsrc -s clcomdES
2. check log file: /var/hacmp/clcomd/clcomd.log
you can see something like this: CONNECTION: REJECTED(Invalid address): aix10: 10.10.10.100->10.10.10.139
for me the solution was:
-update /usr/sbin/cluster/etc/rhosts file on both nodes (I added all ip's of both servers (except service ip + service backup ip))
-refresh -s clcomdES
------------------------------------------------------
When trying to bring up a resource group in HACMP, got the following errors in the hacmp.out log file.
cl_disk_available[187] cl_fscsilunreset fscsi0 hdiskpower1 false
cl_fscsilunreset[124]: openx(/dev/hdiskpower1, O_RDWR, 0, SC_NO_RESERVE): Device busy
cl_fscsilunreset[400]: ioctl SCIOLSTART id=0X11000 lun=0X1000000000000 : Invalid argument
To resolve this, you will have to make sure that the SCSI reset disk method is configured in HACMP. For example, when using EMC storage:
Make sure emcpowerreset is present in /usr/lpp/EMC/Symmetrix/bin/emcpowerreset.
Then add new custom disk method:
smitty hacmp -> Ext. Conf. -> Ext. Res. Conf. -> HACMP Ext. Resources Conf. -> Conf. Custom Disk Methods -> Add Cust. Disk
* Disk Type (PdDvLn field from CuDv) [disk/pseudo/power]
* Method to identify ghost disks [SCSI3]
* Method to determine if a reserve is held [SCSI_TUR]
* Method to break a reserve [/usr/lpp/EMC/Symmetrix/bin/emcpowerreset]
Break reserves in parallel true
* Method to make the disk available [MKDEV]
------------------------------------------------------
Once I had a problem with commands 'cldump' and 'clstat -o' (version 5.4.1 SP3)
cldump: Waiting for the Cluster SMUX peer (clstrmgrES)
to stabilize...
Can not get cluster information.
Solution was:
-checked all the below mentioned daemons (clinfo, clcomd,snmpd...) and started what was missing
-and after that I did: refresh -s clstrmgrES (cldump and clstat was OK only after this refresh has been done)
-once had a problem with clstat -a (but clinfo was running), after refresh -s clinfoES it was OK again
(This can be also good: stopsrc -s clinfoES && sleep 2 && startsrc -s clinfoES )
things what can be checked regarding snmp:
-clinfoES and clcomdES:
clshowsrv -v
-snmpd and mibd daemons (if not active startsrc can start it)
root@aix20: / # lssrc -a | egrep 'snm|mib'
snmpmibd tcpip 552998 active
aixmibd tcpip 524418 active
hostmibd tcpip 430138 active
snmpd tcpip 1212632 active
(hostmibd is not necessary all the time to be active)
-snmpd conf and log files
root@aix20: / # ls -l /etc | grep snmp
-rw-r----- 1 root system 2302 Aug 16 2005 clsnmp.conf
-rw-r--r-- 1 root system 37 Jun 16 16:18 snmpd.boots
-rw-r----- 1 root system 10135 Aug 11 2009 snmpd.conf
-rw-r----- 1 root system 2693 Aug 11 2009 snmpd.peers
-rw-r----- 1 root system 10074 Jun 16 16:22 snmpdv3.conf
drwxrwxr-x 2 root system 256 Aug 11 2009 snmpinterfaces
-rw-r----- 1 root system 1816 Aug 11 2009 snmpmibd.conf
root@aix20: / # ls -l /var/tmp | grep snmp
-rw-r--r-- 1 root system 83130 Jun 16 20:32 snmpdv3.log
-rw-r--r-- 1 root system 100006 Oct 01 2008 snmpdv3.log.1
-rw-r--r-- 1 root system 16417 Jun 16 16:19 snmpmibd.log
------------------------------------------------------
During a PowerHA upgrade from 5.4.1 to 6.1 we received these errors:
(it was an upgrade where we put the resource groups into unmanaged state)
grep: can't open aixdb1
./cluster.es.cspoc.rte.pre_rm: ERROR
Cluster services are active on this node. Please stop all
cluster services prior to installing this software.
...
grep: can't open aixdb1
./cluster.es.client.rte.pre_rm: ERROR
Cluster services are active on this node. Please stop all
cluster services prior to installing this software.
Failure occurred during pre_rm.
Failure occurred during rminstal.
installp: An internal error occurred while attempting
to access the Software Vital Product Data.
Use local problem reporting procedures.
We checked where to find the script from the first ERROR:
root@aixdb1: / # find /usr -name cluster.es.client.rte.pre_rm -ls
145412 5 -rwxr-x--- 1 root system 4506 Feb 26 2009 /usr/lpp/cluster.es/inst_root/cluster.es.client.rte.pre_rm
Looking through the script, found these 2 lines:
LOCAL_NODE=$(odmget HACMPcluster 2>/dev/null | sed -n '/nodename = /s/^.* "\(.*\)".*/\1/p')
LC_ALL=C lssrc -ls clstrmgrES | grep "Forced down" | grep -qw $LOCAL_NODE
Checking these, running the second line reproduced the original error:
root@aixdb1: / # lssrc -ls clstrmgrES | grep "Forced down" | grep -qw $LOCAL_NODE
grep: can't open aixdb1
There were 2 entries in this variable, and that caused the error:
root@aixdb1: / # echo $LOCAL_NODE
aixdb1 aixdb1
root@aixdb1: / # odmget HACMPcluster
HACMPcluster:
id = 1315338110
name = "DFWEAICL"
nodename = "aixdb1" <--grep finds this entry
sec_level = "Standard"
sec_level_msg = ""
...
rg_distribution_policy = "node"
noautoverification = 1
clvernodename = "aixdb1" <--grep finds this entry as well (this is causing the trouble)
clverhour = 0
clverstartupoptions = 0
After Googling what clvernodename is, we found out that this field is set by "Automatic Cluster Configuration Verification", and if that is set to Disabled it will remove the additional entry from the ODM:
We checked in smitty hacmp -> HACMP verification -> Automatic..:
* Automatic cluster configuration verification Enabled <--we changed it to disabled
* Node name aixdb1
* HOUR (00 - 23) [00]
Debug no
After this correction, smitty update_all was issued again. We received some similar errors (grep: can't open...), but when we retried smitty update_all, everything was successful. (All the earlier Broken filesets were corrected, and we had the new PowerHA version without errors.)
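The failing grep can be reproduced without touching a cluster; this sketch (sample stanza trimmed from the odmget output above) shows why $LOCAL_NODE ends up with two words:

```shell
# Sketch reproducing the root cause: a plain /nodename = / match hits
# both the "nodename" and the "clvernodename" stanzas, so LOCAL_NODE
# ends up holding two words and the later grep treats the second word
# as a filename ("grep: can't open aixdb1").
odm_sample='HACMPcluster:
        name = "DFWEAICL"
        nodename = "aixdb1"
        clvernodename = "aixdb1"'

LOCAL_NODE=$(printf '%s\n' "$odm_sample" | sed -n '/nodename = /s/^.* "\(.*\)".*/\1/p')
set -- $LOCAL_NODE              # unquoted word-split: 2 entries instead of 1
echo "entries in LOCAL_NODE: $#"
```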
------------------------------------------------------
Manual cluster switch:
1. varyonvg
2. mount FS (mount -t sapX11)(mount -t nfs)
3. check nfs: clshowres
if there are exported fs: exportfs -a
go to the nfs client: mount -t nfs
4. IP configure (ifconfig)
grep ifconfig /tmp/hacmp.out -> it will show the command:
IPAT via IP replacement: ifconfig en1 inet 10.10.110.11 netmask 255.255.255.192 up mtu 1500
IPAT via IP aliasing: ifconfig en3 alias 10.10.90.254 netmask 255.255.255.192
netmask can be found from ifconfig or cltopinfo -i
(removing ip: ifconfig en3 delete 10.10.90.254)
5. check routing (extra routes could be necessary)
(removing route: route delete -host 10.10.90.192 10.10.90.254 or route delete -net 10.10.90.192/26 10.10.90.254)
6. start applications
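The steps above can be sketched as a dry-run plan; VG, filesystem, and IP values here are placeholders loosely based on the examples, so review the printed plan and run each line manually on the real node:

```shell
# Hedged dry-run of a manual takeover: build the command list from
# placeholder values and only print it, never execute it blindly.
VG=sapvg
FS=/usr/sap/X11
IF=en3
IP=10.10.90.254
MASK=255.255.255.192
plan="varyonvg $VG
mount $FS
ifconfig $IF alias $IP netmask $MASK"
printf '%s\n' "$plan"
```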
------------------------------------------------------
HW - FIRMWARE
FIRMWARE MANAGEMENT FROM HMC
Managed Systems Firmware Update
The FSP (Flexible Service Processor) is responsible for the communication between the HMC and the Power system. The FSP runs a special piece of software called the firmware, which provides low-level control of the hardware. Firmware installation/update can be done through the HMC or through the OS, but if a system is managed by an HMC, then the FW update is done through the HMC. (IBM sometimes refers to this firmware as LIC: Licensed Internal Code.)
The service processor maintains two copies of the server firmware. One copy is considered the permanent copy and is stored on the permanent side. ("p" side). The other copy is considered the temporary copy and is stored on the temporary side ("t" side). It is recommended that you start and run the server from the temporary side. When you install a server firmware update, it is installed on the temporary side.
A FW update consists of 2 steps:
- installation: writing the new version into flash memory
- activation: making the new version run on the system
Not all updates can be activated concurrently:
- Concurrent: Apply and activate with partitions running.
- Deferred: Concurrent activation, but contains updates that affect the initial program load (IPL); these are not activated until the system is powered off and on.
- Disruptive: System shutdown and restart is required; (none of the update contents are activated until the next time you shut down and restart the server).
If a system runs on the permanent side, then a concurrent update is not possible. Regardless of the firmware update type, if you see this message during a firmware update, it means it will be a disruptive update: "The accept operation cannot be performed because all components are running on the permanent flash side."
When checking firmware level:
- Installed level: This level has been installed and will be activated (loaded into memory) after the managed system is powered off/on.
- Activated level: This is the level that is active and running in memory.
- Accepted level: This is the backup level. You can return to this level if you remove (updlic -o r ...) the current level.(this is the code on the permanent side.)
- Deferred level: It indicates the firmware level that contains unactivated deferred updates. (system restart is needed to activate these)
- Platform IPL level: This is the level the system was booted on. (After concurrent upd., activated level will change, but platf. IPL level remain unchanged)
# lslic -t sys -m <man. sys.>
activated_level=136,activated_spname=FW950.90, <-- this is active now
installed_level=136,installed_spname=FW950.90, <-- during reboot this will be activated
accepted_level=131,accepted_spname=FW950.80, <-- backup level (permanent side),
deferred_level=None,deferred_spname=None, <--this level contains unactivated updates
platform_ipl_level=87,platform_ipl_spname=FW950.20, <--this level was used during last reboot
curr_level_primary=136,curr_spname_primary=FW950.90, <--current (activated) level, of the primary service processor
temp_level_primary=136,temp_spname_primary=FW950.90, <-- temp side of the primary serv. proc.
perm_level_primary=131,perm_spname_primary=FW950.80, <-- perm. side of the primary serv. proc.
curr_level_secondary=136,curr_spname_secondary=FW950.90, <--curr. (activated) level of the secondary serv. proc. (P980/P1080... have redundant serv. processors: prim./sec.)
temp_level_secondary=136,temp_spname_secondary=FW950.90, <-- temp side of the secondary serv. proc.
perm_level_secondary=131,perm_spname_secondary=FW950.80, <-- perm. side of the secondary serv. proc.
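The key=value output above is easy to parse; this POSIX shell sketch (sample line copied from the listing) flags when the activated level differs from the platform IPL level, i.e. concurrent updates have been activated since the last boot:

```shell
# Sketch: parse lslic's comma-separated key=value output and compare
# the activated level against the platform IPL level. On the HMC, the
# sample string would come from: lslic -t sys -m <man_sys>
lslic_out='activated_level=136,activated_spname=FW950.90,installed_level=136,accepted_level=131,platform_ipl_level=87,platform_ipl_spname=FW950.20'

activated=$(printf '%s' "$lslic_out" | tr ',' '\n' | awk -F= '$1=="activated_level"{print $2}')
ipl=$(printf '%s' "$lslic_out" | tr ',' '\n' | awk -F= '$1=="platform_ipl_level"{print $2}')
if [ "$activated" != "$ipl" ]; then
    echo "running level $activated differs from last IPL level $ipl (concurrent updates applied)"
fi
```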
---------------------------------------------------------
updlic command
The updlic command can be used to update (or query) the system firmware from the HMC CLI.
An example to do a concurrent update from IBM website:
updlic -m <man. sys.> -o a -t sys -l latestconcurrent -r ibmwebsite -v
-o <action>
a: retrieve, install and activate a LIC update (previously activated updates will be automatically accepted, and accept means: copy T --> P)
i: retrieve, install but do not activate a LIC update
c: accept the currently activated LIC update (copy T --> P)
d: disruptively activate a LIC update
u: upgrade LIC to a new release (it will ask for a restart)
r: remove the most recently installed update and activate the previously accepted level (it brings back the level that is on the permanent side)
j: reject an installed LIC update (copy P to T)
(!!!this option (j) can be used ONLY if we first boot the system to the permanent side; once the P side is active, we can copy P to T.)
-t <type>
sys: for managed system only,
io: for I/O updates only,
all: for managed system and I/O updates,
sriov: for SR-IOV adapter updates
-l <level>
latest: update to latest level, even if disruptive,
latestconcurrent: update to latest concurrent level
sss: update to a specific level, even if disruptive (sss is a 3 character id, like level=136)
release_level: update to a specific release and level, even if disruptive. (950_136 ???)
-r <repository> : ibmwebsite, ftp/sftp, disk/mountpoint for internal hard disk
-q query if an update is concurrent or disruptive. (The update is not performed with this option, and exit codes are documented in man page of updlic)
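The flags above combine into one command line; this hedged sketch only assembles and prints the command for review (the managed system name "p780-01" is a placeholder), which is a safe habit before running updlic on a live HMC:

```shell
# Build the updlic command from the flags described above and print it
# for review; on the HMC you would then run the printed line yourself.
MS="p780-01"   # placeholder managed system name
CMD="updlic -m $MS -o a -t sys -l latestconcurrent -r ibmwebsite -v"
echo "about to run: $CMD"
```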
-----------------------------------------------------
SR-IOV Firmware Update
SR-IOV shared-mode adapters are updated during the system firmware update. If you are updating the system firmware concurrently, the new SR-IOV firmware is not automatically activated to prevent any unexpected outage.
Two types of firmware are required for adapters in SR-IOV shared-mode:
1. adapter driver firmware: it is used for configuring and managing the adapter.
2. adapter firmware: it enables the adapter to interface with the adapter driver firmware.
When you update the system firmware, the new system firmware might also contain an adapter driver update or an adapter firmware update, or both. The new SR-IOV level is not activated automatically while partitions are running, because of a temporary I/O outage that occurs when the firmware is activated; this way a time can be scheduled for the outage. The outage lasts about 1 minute per adapter for the adapter driver firmware, and about 5 minutes per adapter when you activate both the adapter driver and the adapter firmware. The best practice is to activate both simultaneously; you cannot activate only the adapter firmware.
New SR-IOV firmware can be activated with system boot, or during maintenance (for example if adapter is replaced during maintenance), or with manual activation from HMC (GUI/CLI) (Firmware update for adapters that are not in SR-IOV shared mode (in dedicated mode) can be done through the OS that owns the adapter or by using HMC as an IO Adapter update.)
-----------------------------------------------------
Commands:
Man. Sys.:
lslic -m <man_sys> -t sys show man. sys. firmware details (installed/activated level....)
lslic -m <man_sys> -t sys -l update -r ibmwebsite show if there are any fw updates available at ibm site, shows concurrency status too
(-l upgrade will check for upgrades)
updlic -m <man. sys.> -o a -t all -l latest -r ibmwebsite -q query whether the latest updates at IBM site are concurr. or disruptive
updlic -m <man_sys> -o k -v readiness check (check before update for any issues (events/hw failures...))
updlic -m <man_sys> -o a -t sys -l latestconcurrent -r ibmwebsite -v update man. sys. to the latest concurrent level from ibm site
updlic -m <man_sys> -o r -t sys -v removing last update (go back to previous level)
lslic -c list FW levels on HMC local disk (repository)
updlic -o p --ecnumber 01VM920 remove all FW levels belonging to this release from HMC disk
SR-IOV:
lslic -m <man_sys> -t sriov show sriov level and if there are any updates (update_available=0 means there are no new sriov updates)
(install_separate=1 means the adapter supports updating only the adapter driver firmware)
updlic -m <man_sys> -o f -t sriov --subtype adapterdriver,adapter -s adapt_id -v update (activate) sriov driver+adapter fw (5 min outage, lslic lists adapter id)
updlic -m <man_sys> -o f -t sriov --subtype adapterdriver -s adapt_id1,adapt_id2 -v update only adapter driver firmware for more adapters (1 min outage)
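The update_available/install_separate fields noted above can be checked in a script; field names follow the notes, but the rest of the sample line is illustrative, not real lslic output:

```shell
# Sketch: decide from lslic -t sriov key=value output whether an SR-IOV
# update is pending and whether a driver-only update is supported.
# Only update_available and install_separate come from the notes above;
# the sample line itself is illustrative.
sriov_out='adapter_id=1,update_available=1,install_separate=0'

avail=$(printf '%s' "$sriov_out" | tr ',' '\n' | awk -F= '$1=="update_available"{print $2}')
sep=$(printf '%s' "$sriov_out" | tr ',' '\n' | awk -F= '$1=="install_separate"{print $2}')
[ "$avail" = "1" ] && echo "SR-IOV update pending (driver-only update supported: $sep)"
```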
I/O Adapter:
lslic -m <man_sys> -t io -l update -r ibmwebsite check new levels for fc adapter, NVME devices...
(available_level will show new level)
updlic -m <man_sys> -o a -t io -l latestconcurrent -r ibmwebsite -v update io adapters to new level from ibm site
(can be done online, just not in peak hours)
(once a cluster was too sensitive to I/O latency, and it reacted during an FC adapter update)
-----------------------------------------------------
FIRMWARE MANAGEMENT FROM OS (AIX/VIO)
ADAPTER/SYSTEM FIRMWARE:
lsmcode -A displays microcode (same as firmware) level information for all supported devices
(Firmware: Software that has been written onto read-only memory (ROM))
lsmcode -c shows firmware level for system, processor
invscout:
it helps to show which firmware (microcode) should be updated:
1. download: http://public.dhe.ibm.com/software/server/firmware/catalog.mic
2. copy on the server to: /var/adm/invscout/microcode
3. run: invscout (it will collect data and creates: /var/adm/invscout/<hostname>.mup)
4. upload <hostname>.mup to: http://www14.software.ibm.com/webapp/set2/mds/fetch?page=mdsUpload.html
-----------------------------------------------------
SYSTEM FIRMWARE UPDATE:
(update is concurrent, upgrade is disruptive)
1. download from FLRT the files
2. copy the files to a NIM server (NFS export) or put them onto an FTP server (for me the package contained an xml and an rpm file)
3. make sure there are no deconfigured CPU/RAM or broken hardware devices in the server
4. On HMC -> LIC (Licensed Internal Code) Maintenance -> LIC Updates -> Change LIC for the current release
(if you want to do an upgrade (not update) choose: Upgrade Licensed Internal Code)
5. Choose the machine -> Change LIC (Licensed Internal Code) wizard
6. FTP site:
FTP site:10.20.10.10
User ID: root
Passw: <root pw>
Change Directory: set to the uploaded dir
7. follow the wizard (next, next..); it will be done after about 20-30 minutes
If you get "The repository does not contain any applicable upgrade updates HSCF0050" or "The selected repository does not contain any new updates.":
It can happen if the ftp user was not created by the official AIX script: /usr/samples/tcpip/anon.ftp
Create the ftp user with this script and try again.
-----------------------------------------------------
ADAPTER FIRMWARE LEVEL:
For FC adapter:
root@aix1: /root # lscfg -vpl fcs0
fcs0 P2-I2 FC Adapter
Part Number.................09P5079
EC Level....................A
Serial Number...............1C2120A894
Manufacturer................001C
Customer Card ID Number.....2765 <--it shows the feature code (could be like this: Feature Code/Marketing ID...5704)
FRU Number..................09P5080 <--identifies the adapter
Network Address.............10000000C92BC1EF
ROS Level and ID............02C039D0
Device Specific.(Z0)........2002606D
...
Device Specific.(Z9)........CS3.93A0 <--this is the same as ZB
Device Specific.(ZA)........C1D3.93A0
Device Specific.(ZB)........C2D3.93A0 <--to verify the firmware level ignore the first 3 characters in the ZB field (3.93A0)
Device Specific.(ZC)........00000000
Hardware Location Code......P2-I2
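The ZB trick above can be done with sed; this sketch extracts the firmware level from a ZB line (sample copied from the output above), dropping the first 3 characters as described:

```shell
# Sketch: pull the firmware level out of the "Device Specific.(ZB)"
# field of lscfg -vpl output; the first 3 characters of the value are
# ignored, as noted above.
zb='Device Specific.(ZB)........C2D3.93A0'
level=$(printf '%s\n' "$zb" | sed -e 's/.*\.\.\.\.//' -e 's/^...//')
echo "firmware level: $level"
```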
For Network adapter:
lscfg -vpl entX --> FRU line
ROM Level.(alterable).......GOL021 <--it is showing the firmware level
For Internal Disks:
lscfg -vpl hdiskX
Manufacturer................IBM
Machine Type and Model......HUS153014VLS300
Firmware (microcode) updates should be started from FLRT (Fix Level Recommendation Tool).
After selecting the system, it is possible to choose devices to update, and to read a description of how to do it.
Check the "FRU Number" with Google to get the FC code (this is needed to get the correct files on FLRT).
Basic steps for firmware upgrade:
0. check if /etc/microcode dir exists
1. download from FLRT to a temp dir the needed rpm package.
2. make the rpm package available to the system:
cd to the copied temp dir
rpm -ihv --ignoreos --force SAS15K300-A428-AIX.rpm
3. check:
rpm -qa
cd /etc/microcode --> ls -l will show the new microcode
compare the file size with the one in the document
compare the checksum with the one in the document: sum <filename> (sum /usr/lib/microcode/df1000fd-0002.271304)
4. diag -d fcs0 -T download <--it will install (download) the microcode
Description about: "M", "L", "C", "P" when choosing the microcode:
"M" is the most recent level of microcode found on the source.
It is later than the level of microcode currently installed on the adapter.
"L" identifies levels of microcode that are later than the level of microcode currently installed on the adapter.
Multiple later images are listed in descending order.
"C" identifies the level of microcode currently installed on the adapter.
"P" identifies levels of microcode that are previous to the level of microcode currently installed on the adapter.
Multiple previous images are listed in descending order.
5. lsmcode -A <--for checking
To back-level the firmware:
diag -d fcsX -T "download -f -l previous"