High Availability, RSCT and HACMP
In the past, computers had very limited performance, while scientific computing tasks required more and more CPU and I/O. The only way to satisfy these demanding requirements was to implement parallelism and spread the workload across multiple systems. Based on this concept, IBM developed software called PSSP (Parallel System Support Programs) in the 1990s. PSSP could provide high availability services for large configurations of servers (clusters).
Following the initial success of PSSP, it became apparent that its high availability functions could be used in other fields, so these clustering modules were externalized into a package called RSCT. (RSCT was originally an abbreviation of RISC System Cluster Technology, or RS/6000 Cluster Technology, which was later changed to Reliable Scalable Cluster Technology.)
When RSCT became available as an individual package, other programs, like HACMP, started to use it. HACMP (High Availability Cluster Multi-Processing) development started around the same time, to provide a high availability solution for applications. HACMP was first shipped in 1991, and at that time one of the prerequisites was to install the IBM PSSP software. Once Reliable Scalable Cluster Technology (RSCT) became available, HACMP adopted this cluster technology without the need for a PSSP installation. HACMP was rebranded to HACMP/ES (Enhanced Scalability), as it provided advantages over the "classic" version.
Following the evolution path, starting with AIX 5.1, RSCT was no longer a separate software product but was included in the operating system as a standard feature. This made the installation of HACMP/ES simpler, as the RSCT package was already part of the AIX installation. Starting with HACMP V5.1, there are no more HACMP classic versions. Later the name HACMP was replaced by PowerHA in Version 5.5 and then by PowerHA SystemMirror in Version 6.1. Starting with PowerHA SystemMirror V7.1, some functions of RSCT have been implemented in the AIX kernel, and these functions are called CAA (Cluster Aware AIX). This major change improves the reliability of PowerHA because the cluster service now runs in kernel space rather than user space.
The PowerHA cluster manager uses various sources to get information about possible failures:
- CAA and RSCT monitor the state of the network interfaces and devices.
- AIX LVM monitors the state of the disks, logical volumes, and volume groups.
- PowerHA application monitoring monitors the state of the applications.
====================================
PowerHA, RSCT and CAA
Since version 5.5, HACMP has been renamed to PowerHA (Power High Availability). The main concept behind PowerHA is the same as before with HACMP: it keeps applications running even if one (or more) components fail. So during PowerHA design the intention is to remove any Single Point of Failure (SPOF) from the environment by using redundant components and automated PowerHA procedures.
When a PowerHA cluster is configured, three different layers work together: PowerHA, RSCT and CAA. We need to configure only PowerHA, and it will take care of the other two layers (RSCT and CAA) as well. In normal situations, there is no need to use CAA or RSCT commands at all, because both layers are managed by PowerHA. To check whether the services of each layer are up, we can use different commands, like clmgr, lsrpdomain, and lscluster.
# clmgr -a state query cluster <--Checking if PowerHA is running
STATE="STABLE"
# lsrpdomain <--Checking if RSCT is running
Name OpState RSCTActiveVersion MixedVersions TSPort GSPort
CL1_N1_cluster Online 3.1.5.0 Yes 12347 12348
# lscluster -m | egrep "Node name|State of node" <--Checking if CAA is running
Node name: powerha-c2n1
State of node: UP
Node name: powerha-c2n2
State of node: UP NODE_LOCAL
If we stop PowerHA, the clmgr command will show "OFFLINE", but the RSCT and CAA commands will still show that their services are running. CAA and RSCT are stopped and started together. In a PowerHA environment, CAA and RSCT are started automatically by default during reboot. There are situations when we need to stop all three cluster components, for example when we must change the RSCT or CAA software. To stop all cluster components, use:
clmgr offline cluster STOP_CAA=yes
(When CAA is stopped manually, after reboot it has to be started manually again with the START_CAA argument.)
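The three checks shown earlier in this section can be combined into a small health summary. The following is only a sketch: the function names are made up for illustration, and each parser simply reads the command output on stdin, assuming the output looks like the samples above.

```shell
# Sketch: summarize the three cluster layers from command output.
# The function names are hypothetical; each parser reads text on stdin,
# so it can be tried without a running cluster.

# PowerHA: extract the value from a line such as STATE="STABLE"
powerha_state() {
    sed -n 's/^STATE="\([^"]*\)".*/\1/p'
}

# RSCT: print "domain opstate" for each peer domain row (skips the header)
rsct_domains() {
    awk 'NR > 1 && NF >= 2 { print $1, $2 }'
}

# CAA: count the nodes whose "State of node" value begins with UP
caa_nodes_up() {
    awk -F': *' '/State of node/ && $2 ~ /^UP/ { n++ } END { print n + 0 }'
}

# On a cluster node (as root) the parsers would be fed like this:
#   clmgr -a state query cluster | powerha_state
#   lsrpdomain | rsct_domains
#   lscluster -m | caa_nodes_up
```

Keeping the parsing separate from the commands makes it easy to test the logic against saved output before running it on a live node.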
====================================
Cluster services
In PowerHA, several services and daemons from the RSCT, CAA and PowerHA layers work together.
clshowsrv will show the following:
# clshowsrv -v
...
...
Status of the RSCT subsystems used by PowerHA SystemMirror:
Subsystem Group PID Status
cthags cthags 21627174 active
ctrmc rsct 12910990 active
Status of the PowerHA SystemMirror subsystems:
Subsystem Group PID Status
clstrmgrES cluster 15466992 active
clevmgrdES cluster 15139118 active
clinfoES cluster 18612536 active
Status of the CAA subsystems:
Subsystem Group PID Status
clcomd caa 14942498 active
clconfd caa 21758316 active
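To spot a stopped subsystem quickly, the listing above can be filtered. This is a minimal sketch with a hypothetical function name; it assumes the rows keep the SRC layout shown above and that a stopped subsystem is reported with the status "inoperative" (other, transitional states are ignored here).

```shell
# Print the name of every subsystem row whose last column is "inoperative".
# Rows look like "clstrmgrES cluster 15466992 active"; an inoperative
# subsystem has no PID, so its row has only three columns.
# (Hypothetical helper name; reads the listing on stdin.)
inactive_subsystems() {
    awk '$NF == "inoperative" { print $1 }'
}

# On a cluster node:
#   clshowsrv -v | inactive_subsystems
```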
PowerHA services:
- clstrmgrES (Cluster Manager daemon)
Cluster Manager is the main PowerHA daemon. It manages the cluster (controls resources, runs event scripts ...), monitors the nodes (using information from RSCT), and recovers from HW/SW failures. After PowerHA is installed, the clstrmgrES process is always running, regardless of whether the cluster is online or offline. The Cluster Manager depends on RSCT (Group Services, RMC): if RSCT fails to start, clstrmgrES fails as well. If clstrmgrES hangs or exits abnormally, SRC will issue halt -q (it executes the /usr/es/sbin/cluster/utilities/clexit.rc script to halt the system).
- clevmgrdES (Cluster Event Manager daemon)
The Cluster Event Manager daemon collects event reports from various sources (ahafs, snmpd, error report...) and passes them up to the cluster manager. PowerHA 7.1 introduced system events, which are handled by a new subsystem called clevmgrdES. The event manager daemon automatically monitors for system events like CAA repository disk offline or shared VGs offline. The rootvg system event monitors the loss of access to the rootvg. Previous versions of PowerHA/HACMP were unable to monitor this type of loss or perform a failover (for example when a SAN disk that was hosting the rootvg was lost). System event monitoring is now done at the kernel level (the /usr/lib/drivers/phakernmgr kernel extension), which is loaded by clevmgrdES.
- clinfoES (Cluster Information daemon)
Clinfo is a cluster monitoring program, which is based on SNMP. The cluster manager updates SNMP with information about the state of the cluster, and clinfo polls the SNMP processes for this information. Other applications (like clstat and cldump), which are clinfo client applications, then use the clinfo API functions to request and show details of the cluster. The clinfo daemon is optional.
When Clinfo starts, it reads the /usr/es/sbin/cluster/etc/clhosts file. This file lists the IPs of all available nodes. Clinfo searches through this file for an active SNMP process on a node, starting at the first IP address in the clhosts file. Once it locates an SNMP process, Clinfo starts to receive information about the state of the cluster from that SNMP process.
startsrc -s clinfoES starts clinfo (the /usr/es/sbin/cluster/etc/rc.cluster script also starts it, together with everything else)
stopsrc -s clinfoES stops clinfo
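The clhosts lookup described above is a "first address that answers wins" search. The sketch below only illustrates that idea and is not how Clinfo is implemented: the function name and the probe command are hypothetical, and skipping blank and "#" comment lines is an assumption about the file format.

```shell
# Return the first address from a clhosts-style file for which the probe
# command succeeds. Blank lines and "#" comment lines are skipped.
#   $1 = file with one address per line
#   $2 = probe command (hypothetical; receives the address as its argument)
first_responding_host() {
    file=$1
    probe=$2
    while read -r addr rest; do
        case $addr in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so the probe cannot consume the address list
        if "$probe" "$addr" </dev/null; then
            echo "$addr"
            return 0
        fi
    done < "$file"
    return 1
}
```

Because the file is read top to bottom, the order of the addresses decides which node is asked first.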
RSCT services: (more details are on RSCT page)
- cthagsd (Group Services)
Cluster Group Services is responsible for coordinating and monitoring changes across all cluster nodes, and ensures that all of them finish properly.
- rmcd (Resource Monitoring and Control)
Resource Monitoring and Control monitors resources (like disk space, CPU usage, processes etc.) and performs an action in response to a defined condition (event). The rc.cluster script in inittab ensures that the RMC subsystem is running.
CAA services:
- clcomd (Cluster Communications daemon)
Earlier versions of PowerHA used rsh to run commands on other nodes. As this was insecure, PowerHA now uses the CAA Cluster Communications daemon (clcomd) to control communication between the nodes. clcomd is used by PowerHA for cluster verification and synchronization, file collections, remote command execution, user and password administration, and C-SPOC tasks. clcomd must be running before any cluster services can be started, and it uses port 16191.
clcomd is started by inittab (the entry is created by PowerHA during install) and it is controlled by SRC: startsrc/stopsrc. A refresh (refresh -s clcomd) is used to reread the /etc/cluster/rhosts file and move the log files (/var/hacmp/clcomd/clcomd.log). The /etc/cluster/rhosts file is really needed before the cluster is first synchronized. At the start of the initial cluster configuration, the /etc/cluster/rhosts file is empty. The /etc/cluster/rhosts file (root.system 0600) should contain a list (one entry per line) of the trusted IP addresses (or hostnames) in the cluster. During the first synchronization, the rhosts file is used by clcomd, the ODM classes are populated and the CAA cluster is created. After the CAA cluster is created, the entries in /etc/cluster/rhosts are no longer needed (only when adding a new node to the cluster). However, do not remove the file. If you ever delete and re-create the cluster, or restore the cluster configuration from a snapshot, you will need to populate /etc/cluster/rhosts again.
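The two rhosts requirements mentioned above (one entry per line, mode 0600) can be checked before the first synchronization. This is a rough sketch with made-up function names; it only checks the layout and the permission bits, not whether the addresses are correct or who owns the file.

```shell
# Check a rhosts-style file: at most one entry per line.
# Prints the offending line numbers; returns non-zero if any are found.
# (Hypothetical helper name.)
check_rhosts_format() {
    awk 'NF > 1 { print "line " NR ": more than one entry"; bad = 1 }
         END { exit bad + 0 }' "$1"
}

# Succeeds only if the file is a regular file with mode 0600 (-rw-------).
# (Hypothetical helper name.)
has_mode_0600() {
    [ "$(ls -ld "$1" | cut -c1-10)" = "-rw-------" ]
}

# Example on a cluster node:
#   check_rhosts_format /etc/cluster/rhosts && has_mode_0600 /etc/cluster/rhosts
```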
Another function of clcomd is to run remote commands on the cluster nodes. Only a small set of PowerHA commands is trusted and allowed to run as root: these are the commands in /usr/es/sbin/cluster. The remaining commands do not have to run as root.
These use clcomd:
clrexec - Run specific and potentially dangerous commands.
cl_rcp - Copy AIX configuration files.
cl_rsh - Used by the cluster to run commands in a remote shell.
- clconfd (Cluster Configuration daemon)
The Cluster Configuration daemon (clconfd) keeps CAA cluster configuration information in sync. It wakes up every 10 minutes to synchronize any necessary cluster changes. Starting with PowerHA V7.2, clconfd also monitors the /var file system (by default the threshold is 75%, checked at 15-minute intervals).
ps -ef | grep clconfd check configured values used by clconfd
odmget -q "subsysname='clconfd'" SRCsubsys check clconfd config
chssys -s clconfd -a "-t 80 -i 10" change the clconfd monitoring threshold to 80% with a 10-minute interval
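To see what such a threshold means for the current state of /var, the same comparison can be done by hand. This sketch only illustrates the idea and is not how clconfd performs its check; the function names are hypothetical, and it assumes the POSIX output format of df -P (usage in the fifth column).

```shell
# Print the "Capacity" percentage (without the % sign) from `df -P` output
# read on stdin. Line 1 is the header, line 2 is the file system row.
# (Hypothetical helper name.)
df_used_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds when the usage has reached the threshold.
#   $1 = used percentage, $2 = threshold percentage
over_threshold() {
    [ "$1" -ge "$2" ]
}

# Example:
#   over_threshold "$(df -P /var | df_used_pct)" 75 && echo "/var is filling up"
```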
====================================
clverify (Cluster verification and synchronization)
The PowerHA "Verification and synchronization" task ensures that configurations are correct and consistent across the nodes in the cluster. PowerHA stores information in the ODM, and these ODM files must be consistent between nodes. During cluster verification, PowerHA collects configuration data from all cluster nodes and checks the consistency of these files. If verification is successful, the configuration can be synchronized across all nodes. During cluster synchronization, PowerHA copies the ODM from the local node to all remote nodes.
!!! It is very important to verify the cluster configuration regularly, and to always verify and synchronize from the node where the change occurred !!!
Automatic cluster verification happens each time cluster services are started, so PowerHA will detect and correct configuration issues even if we did not do a manual sync. Automatic cluster configuration monitoring also runs a verification every 24 hours by default.
/var/hacmp/clverify/clverify.log log of the automatic cluster configuration monitoring (runs every 24 hours, by default on the first node in alphabetical order)
/var/hacmp/clverify/fail stores data from the most recent failed verification attempt
/var/hacmp/clverify/pass stores data from the most recent passed verification attempt
It is possible to start verification manually without syncing (without changing anything) in smitty hacmp:
- smitty hacmp --> Problem Determination Tools --> PowerHA SystemMirror Verification --> Verify PowerHA SystemMirror Configuration
- smitty hacmp --> Custom Cluster Configuration --> Verify and Synchronize Cluster Configuration (Advanced).
(if cluster is inactive we can choose verification only)
====================================
Cluster Single Point of Control (C-SPOC)
C-SPOC is a useful tool to manage the cluster from a single point. With C-SPOC we can perform cluster-wide administration tasks from any active node in the cluster. The C-SPOC function is provided through SMIT menus and the CLI. The commands are in /usr/es/sbin/cluster/cspoc and their names start with the cli_ prefix. C-SPOC uses the Cluster Communications daemon (clcomd) to run commands on remote nodes. If this daemon is not running, the command might not be run and the C-SPOC operation might fail. The logs are in the cspoc.log file.
Some C-SPOC tasks:
- File collections: user defined set of files, which are kept in sync between nodes
- User administration: keeps in sync user IDs and passwords
- Shared storage management: LVM commands and changes are kept in sync between nodes
====================================
Hi :)
Need some help. Is there a way to change the node_id attribute of the HACMPnode ODM stanza?
Currently my nodes are showing as "2" and "3", but all my other clusters have 1 and 2. I need to change the node_id value for the problematic cluster to 1 and 2.
Any thoughts?
Thanks
Hi,
I never had to do that, but I would open an IBM call, as it looks strange....
Agreed - looks strange to me.. I even tried to change the node_id value in the HACMP ODM, but it goes back to 2 and 3 on verify and sync :(
Can I update netmon.cf on the live servers (while the RG is online)?
The netmon.cf file is read at cluster startup, since RSCT Topology Services scans the netmon.cf file during initialization. I think you can update the file, but your update will not take effect until you restart the cluster.
Hi,
Thanks for your prompt reply.
So, bringing services down with the unmanaged option and then restarting with the automatically manage RG option will activate the netmon.cf?
Thanks,
Hi, as far as I know cluster services are not restarted if you put the Resource Group to unmanaged and then manage it again.
You need a full stop of the cluster, then at the next start it will read the new config file.
Hi,
May I know when I can expect your GPFS articles in this blog?
Regards,
Siva
Hi, I don't have any systems with GPFS...when there is one where I can test a few things, I'll write about it.
Hi...
To start a cluster through SMIT, the command is #smitty cl_start or #smitty clstart.
In which version is it cl_start and in which version is it clstart?
Is the smitty path to start a cluster the same for 5.5 and 6.1?
I am a little bit confused, kindly help me.
Thank you
Hi, both of them are the same and it is working for me under 6.1, I guess it is the same with 5.5 (Why did you not try these by yourself?)
Thank you for replying.
I have recently completed my course. Now I don't have an IBM box to check. That's why I asked for your help.
Thank you
Hi..,
The df command shows the file system is 100% full, but the du command shows no files, what's the reason? Can u give me some solution?
Hi, can you post the outputs of those commands?
Hi,
I went through your VIO, NIM and General articles and they are very helpful, but I never did a HACMP configuration...with your blog I am able to understand the topology and definitions. Can you also please demonstrate or explain how to configure a 2 node cluster step by step (like which filesets or SW are needed, how to gather the IPs on node A and B and how to input them, and how to configure disk/network heartbeat, naming, VGs)?
I worked on a cluster which was already configured and performed cluster start/stop/sync/manual failover/failback/C-SPOC administration {creating users/adding PVs etc}, but never configured one.
If you can provide/demonstrate the procedure it will be very appreciated.
Hi, there is some description at this link: http://aix4admins.blogspot.hu/2011/08/build-and-configure-1.html.
hi
how to move resource groups in hacmp? what is the prerequisite for doing that?
Hi, what are the steps to go from "IPAT via IP replacement" to "IPAT via aliasing" with the least effort possible? Thank you
Could u please tell me what's the major difference between AIX p6 and p7 series???
Can you run C-SPOC commands when only one node is started in the cluster? Commands like unmirrorvg and mirrorvg. Will the 2nd node's ODM be updated?
Which IP is responsible for the heartbeat, the Service IP or the Boot IP? Can't get exact information in Redbooks & Google either.
Siva
The Boot IP is responsible for the heartbeat.
What are the differences between PowerHA 5.4/6.1 and 6.1/7.1?
Hi all. I'm asking for some help seeing if anyone has a blueprint or document that outlines how they keep their servers in sync. Not talking about the resource groups, but things like ssh or anything that is just local to each machine. Do you have a document that outlines what you do so no steps are missed?
Hi, I am not aware of any "install packages" sync tool in PowerHA, but for files you can use the "File Collections" in PowerHA. https://www.ibm.com/support/knowledgecenter/SSPHQG_7.2/admin/ha_admin_create_file_collection.html
Hello, I found the guide to understand what PowerHA is quite up-to-date, right?
On the other hand, now you can install a graphical interface and work more comfortably, right?
Hello, do you have some step by step instructions on Live Update/patching?