HA - CAA

CAA (Cluster Aware AIX)

CAA is an AIX feature introduced in AIX 6.1 TL6 and AIX 7.1 that makes it easy to create a cluster. CAA is not used as a stand-alone package; it is used with PowerHA or with Shared Storage Pools. It can be seen as a set of commands and services that other applications (like PowerHA or SSP) exploit to provide high availability and disaster recovery support. CAA does not provide application monitoring or resource failover capabilities; those are provided by PowerHA, for example. IBM PowerHA, SSP and even RSCT (Reliable Scalable Cluster Technology) use these built-in AIX clustering capabilities, which exist to simplify the configuration and management of high availability clusters.

CAA provides specific events, so applications can monitor these from any node in the cluster:
Node UP and node DOWN
Network adapter UP and DOWN
Network address change
Disk UP and DOWN
Predefined and user-defined events

CAA needs the following ports on all nodes for network communication:
4098 (for multicast)
6181
16191
42112
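To verify that these ports are in use on a node, a quick netstat filter helps. The sketch below runs against sample output (illustrative only); the real one-liner for an AIX cluster node is shown in the comment:

```shell
# Quick check that the CAA ports are in use on a node. The sample netstat
# output below is illustrative only; on a real AIX cluster node you would run:
#   netstat -an | egrep '\.(4098|6181|16191|42112)'
caa_ports='4098|6181|16191|42112'

netstat_sample='tcp4       0      0  *.16191                *.*                    LISTEN
tcp4       0      0  *.6181                 *.*                    LISTEN
udp4       0      0  *.4098                 *.*
udp4       0      0  *.42112                *.*'

printf '%s\n' "$netstat_sample" | grep -E "\.($caa_ports)"
```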

Checking CAA related daemons (services):
# lssrc -g caa
Subsystem Group PID Status
 clcomd caa 6553806 active
 clconfd caa 6619352 active

clcomd: The cluster communications daemon. Since PowerHA 7.1 it is a CAA service; before PowerHA 7.1 it was part of PowerHA (clcomdES). The rhosts file used by clcomd is /etc/cluster/rhosts; the old clcomdES rhosts file in the /usr/es/sbin/cluster/etc directory is not used.
clconfd: The clconfd subsystem runs on each node of the cluster. The clconfd daemon wakes up every 10 minutes to synchronize any necessary cluster changes.
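For clcomd to accept connections, /etc/cluster/rhosts must list the cluster nodes' addresses on every node, followed by a refresh of the daemon. A minimal sketch (the IP addresses are made-up examples; the staging file keeps the example harmless off-cluster):

```shell
# Stage a minimal rhosts file listing the cluster nodes' IP addresses
# (the addresses below are made-up examples). On each AIX node you would
# then install it and refresh the daemon:
#   cp /tmp/rhosts.staged /etc/cluster/rhosts
#   refresh -s clcomd
cat <<'EOF' > /tmp/rhosts.staged
10.10.10.11
10.10.10.12
EOF
cat /tmp/rhosts.staged
```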

Starting with AIX 7.1 TL2, a total cluster outage is no longer required to upgrade the cluster nodes:
A rolling upgrade of a cluster is done by taking a node offline and upgrading it to a new AIX technology level, while the other nodes remain active. After a node is upgraded, the node is rebooted and brought online by issuing the clctrl command. This process is repeated until all the nodes are upgraded. In a mixed cluster environment, CAA maintains compatibility with nodes that are still running prior AIX levels. New features are not enabled until all the cluster nodes are upgraded to the new technology level.
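The rolling-upgrade sequence above can be sketched as a per-node loop. The version below is a dry-run plan (the commands are only echoed, since clctrl exists on AIX only, and the cluster/node names are hypothetical):

```shell
# Dry-run plan of a rolling upgrade: stop CAA on one node, upgrade and
# reboot it, start CAA again, then move to the next node. Remove the echo
# prefixes to run the real commands on AIX (node names are examples).
CLUSTER=mycluster
for NODE in nodeA nodeB; do
    echo "clctrl -stop -n $CLUSTER -m $NODE"    # take the node offline
    echo "  (apply the new AIX technology level on $NODE, then reboot)"
    echo "clctrl -start -n $CLUSTER -m $NODE"   # bring the node back online
done
```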

=============================

CAA Repository disk

The cluster repository disk stores the cluster configuration data. It provides a central repository, so this disk must be accessible from all nodes in the cluster. A minimum disk size of 10 GB is preferred (1 GB may also be enough). This disk cannot be used for application storage or any other purpose; the use of LVM commands (mkvg, mklv...) is not supported with the cluster repository disk. The AIX LVM commands are single-node administrative commands and are not applicable in a clustered configuration. (The cluster repository disk must be compliant with the 512-byte block size.)

CAA stores the repository disk related information in the ODM CuAt, as part of the cluster information.

# odmget CuAt | grep -p cluster
CuAt:
        name = "cluster0"
        attribute = "node_uuid"
        value = "52a6b8be-fff8-11e5-8e37-56a1a7627864"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 3

CuAt:
        name = "cluster0"
        attribute = "clvdisk"
        value = "d7063c81-3f64-b5f7-d82b-fa8ed99bfe61"
        type = "R"
        generic = "DU"
        rep = "s"
        nls_index = 2

In case this ODM entry is missing (which can cause a node to fail to join the cluster), it can be repopulated (and the node forced to join the cluster) using the clusterconf command: clusterconf -r hdisk#  (where hdisk# is the repository disk)
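A guarded sketch of that check (hdisk1 stands in for the repository disk; odmget and clusterconf exist on AIX only, so the block degrades gracefully elsewhere):

```shell
# Check whether the repository disk attribute is present in the ODM.
# hdisk1 is an example name for the repository disk.
if command -v odmget >/dev/null 2>&1; then
    if odmget -q "name=cluster0 and attribute=clvdisk" CuAt | grep -q clvdisk; then
        STATUS="clvdisk entry present"
    else
        STATUS="clvdisk entry missing - repopulate with: clusterconf -r hdisk1"
    fi
else
    STATUS="odmget not available (not an AIX system)"
fi
echo "$STATUS"
```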

=============================

/aha (AHAFS)

Nodes that belong to a CAA cluster use the common Autonomic Health Advisor File System (AHAFS) for event notification. AHAFS is a pseudo file system used for synchronized information exchange and is implemented as an AIX kernel extension. AHAFS is mounted on /aha. It can monitor predefined and user-defined system events and automatically notifies registered users or processes about the occurrences of the following types of events:
- Modification of content of a file
- Usage of a file system that exceeds a user-defined threshold
- Death of a process
- Change in the value of a kernel tunable parameter
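A minimal sketch of consuming such an event from the shell: to watch a file for modification, create the matching .mon file under the modFile monitor factory, write the wait mode into it, and read it (the read blocks until the event fires). /tmp/testfile is a made-up path, real consumers typically use the programmatic open/write/select/read interface, and the block is guarded so it only attempts this where /aha exists:

```shell
# Watch /tmp/testfile for content changes via AHAFS (AIX only). The .mon
# path mirrors the monitored file's path under the modFile monitor factory.
if [ -d /aha ]; then
    MON=/aha/fs/modFile.monFactory/tmp/testfile.mon
    mkdir -p "$(dirname "$MON")"
    exec 3<> "$MON"                                 # one descriptor for write + read
    echo "CHANGED=YES;WAIT_TYPE=WAIT_IN_READ" >&3   # register for modification events
    cat <&3                                         # blocks here until /tmp/testfile changes
    exec 3>&-
else
    echo "/aha not mounted - AHAFS is only available on AIX"
fi
```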

=============================

Creating a cluster

The command mkcluster is used for creating a CAA cluster:
# mkcluster -n mycluster -m nodeA,nodeB -r hdisk1 -d hdisk2,hdisk3

The name of the cluster is mycluster, the nodes are nodeA and nodeB, the repository disk is hdisk1, and the shared disks are hdisk2 and hdisk3. When the cluster is ready, a special volume group (caavg_private) with new logical volumes and filesystems is created.

The following happens after issuing the mkcluster command:
- The cluster configuration is written to the cluster repository disk.
- Special volume groups, logical volumes, filesystems are created on the cluster repository disk. (caavg_private)
- Cluster services are made available to other applications like RSCT or PowerHA.
- Additional storage related tasks...
- A clusterwide multicast address is established.
- The node discovers and monitors the available communication interfaces.
- The cluster interacts with Autonomic Health Advisory File System (AHAFS) for clusterwide event distribution and makes messages available to PowerHA, RSCT...

CAA uses IP-based network communication and storage interface communication through Fibre Channel. When both types of communication are used, all nodes in the cluster can always communicate with any other node in the cluster, eliminating "split brain" incidents. If a node cannot communicate with the others, DMS (Dead Man Switch) timers are triggered.

The deadman switch is the action Cluster Aware AIX (CAA) takes when it detects that a node has become isolated, that is, when the node can no longer communicate via the network or the repository disk. Based on the deadman switch setting (the deadman_mode tunable), the AIX operating system can react differently. The DMS monitors I/O traffic, process health, etc. for a specific time (node_timeout), and after the timeout it can force a system shutdown or generate an Autonomic Health Advisor File System (AHAFS) event.

=============================

Deadman switch (DMS)

The deadman switch is the action CAA takes when it detects that a node has become isolated, that is, when the node can no longer communicate via the network or the repository disk. The purpose of the DMS is to protect the data on the external disks. The AIX operating system reacts differently depending on the DMS (deadman_mode) tunable, which can be set to force a system crash or generate an AHAFS event.

# clctrl -tune -L deadman_mode           <--check the current setting (clctrl -tune -h deadman_mode gives more details)
NAME                       DEF  MIN  MAX  UNIT  SCOPE
     ENTITY_NAME(UUID)                    CUR
--------------------------------------------------------------------------------
deadman_mode               a                    c n
     caa_cl(25ebea90-784a-11e1-a79d-b6fcc11f846f)  a
--------------------------------------------------------------------------------

When the value is set to "a" (assert), the node will crash upon the deadman timer popping.
When the value is set to "e" (event), an AHAFS event is generated.

By default, the CAA deadman_mode option is “a”. If the deadman timeout is reached, the node crashes immediately to prevent a partitioned cluster and data corruption. 
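For example, switching the cluster from crashing to only raising an event is a single tunable change. A sketch, guarded so it is harmless on a machine without CAA:

```shell
# Switch deadman_mode from assert ("a", crash the node) to event ("e",
# generate an AHAFS event only); the value is stored on the repository disk.
if command -v clctrl >/dev/null 2>&1; then
    clctrl -tune -o deadman_mode=e
    clctrl -tune -L deadman_mode    # verify the new setting
else
    echo "clctrl not available (not a CAA cluster node)"
fi
```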

=============================

Commands:

/var/adm/ras/syslog.caa                  caa log (in syslog: caa.info /var/adm/ras/syslog.caa rotate size 1m files 10)

lscluster -i                             lists interface configuration of the cluster
lscluster -i | egrep 'Node|Interface'    overview of cluster, all interfaces (network and disk heartbeat)
lscluster -m                             lists info about nodes in the cluster
lscluster -d                             lists disks in the cluster and their status
lscluster -s                             lists network statistics of a cluster
lscluster -c                             shows info about cluster configuration

mkcluster                                create a cluster
chcluster                                change a cluster configuration
rmcluster                                remove a cluster configuration
clcmd                                    run a command on all nodes of a cluster (clcmd date: it shows the date on all nodes)

lsattr -El cluster0                      lists IDs of cluster, disks

clctrl -stop -n mycluster -a             stop cluster on all nodes (stop cluster on 1 node: clctrl -stop -n mycluster -m myserver1)
clctrl -start -n mycluster -a            start cluster on all nodes (after completing maintenance)
clctrl -tune -L                          lists CAA related tunables (values stored in repository disk)
clctrl -tune -o <tunable>=<value>        modifies a tunable across cluster (new value will be active at the next start)

=============================

If you want to use the force option with CAA commands (it is not a -f flag), the environment variable CAA_FORCE_ENABLED has to be set to 1:
# export CAA_FORCE_ENABLED=1
# rmcluster -r hdisk2

(Using force with rmcluster will remove the repository disk and ignore all errors.)

=============================


Comments:

  Q: How do you varyoff the "caavg_private" VG without messing up the cluster? I need to update SDDPCM and I am not sure if removing the cluster would be a good choice.

  A: You can varyoff CAA (caavg_private) using 'clctrl -stop -n CLUSTERNAME -m NODENAME'.
     Use 'clctrl -start -n CLUSTERNAME -m NODENAME' to varyon. (Try this command from the other node of the cluster if you have a problem with varyon on the problematic node.)