
DARE - Snapshot

HACMP provides a facility that allows changes to cluster topology and resources to be made while the cluster is active. This facility is known as DARE, or to give it its full name, Dynamic Automatic Reconfiguration Event. It relies on three copies of the HACMP ODM.


Keep in mind: all SMIT actions affect only the DCD. Invoking a synchronization at this point causes the DCD to be copied to all cluster nodes and triggers a dynamic reconfiguration event. (The ACD is never changed directly by the user.)


An example:
When you configure an HACMP cluster, configuration data is stored in HACMP specific object classes in the Configuration Database (ODM). The AIX ODM object classes are stored in the default system configuration directory (DCD), /etc/es/objrepos. At cluster startup, HACMP copies HACMP-specific ODM classes into a separate directory called the Active Configuration Directory (ACD). If you synchronize the cluster (topology or resources) while the cluster manager is running on the local node, this action triggers a dynamic reconfiguration event. During Dynamic Reconfiguration the Default Configuration Directories (DCDs) on all cluster nodes are updated, and the data in the ACD is overwritten with the new configuration data, then the HACMP daemons are refreshed.
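The startup copy described above can be sketched with ordinary directories. This is a simulation only: it uses temporary directories instead of the real DCD (/etc/es/objrepos) and ACD (/usr/es/sbin/cluster/etc/objrepos/active is the documented default location), and `cp` stands in for what HACMP does internally.

```shell
#!/bin/sh
# Sketch of the startup copy: DCD -> ACD (simulated with temp dirs;
# on a real node the DCD is /etc/es/objrepos and the ACD is
# /usr/es/sbin/cluster/etc/objrepos/active).
WORK=$(mktemp -d)
DCD="$WORK/dcd"; ACD="$WORK/acd"
mkdir -p "$DCD" "$ACD"

# A stand-in for an HACMP-specific ODM class file, e.g. HACMPcluster.
echo "cluster config v1" > "$DCD/HACMPcluster"

# At cluster startup, HACMP copies the HACMP ODM classes DCD -> ACD,
# so the running daemons work from the active copy, not the default one.
cp "$DCD"/* "$ACD"/

RESULT=$(cat "$ACD/HACMPcluster")
echo "$RESULT"
rm -rf "$WORK"
```

The point of the separate ACD is that SMIT edits to the DCD never touch what the cluster manager is actually running with until you synchronize.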



HOW DARE WORKS:

1. Change topology or resources in smitty hacmp (this changes the DCD).

    ROLLBACK:
    If you want to discard the changes you made (in the DCD) (only if you have not synchronized yet):
    Problem Determ. Tools --> Restore HACMP Conf. Database from Active Conf.
    (during this, a snapshot of the DCD is taken in case it is needed in the future)
   
2. Synchronize topology or resources in smitty:
    -Verifying and Synchronizing (Standard): (this is under Initialization and Standard Conf.)
    (If you choose this, there is no opportunity to modify the process in any way.)

    or

    -Verifying and Synchronizing (Extended): (this is under Extended Conf.) Here you can choose:
    emulate or actual: emulate the change before actually making it
    force: this is dangerous and should be left at "no"
    verify changes only: does not synchronize, only verifies, so you can test whether a change is valid
    logging: verbose can be used if standard logging does not show why verification fails.

   During the synchronization:
   The DCD is copied to the SCD on all nodes -> a snapshot of the ACD is taken* -> the contents of the SCD are copied to the ACD -> clstrmgr is refreshed -> the SCD is deleted
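The synchronization flow above can be simulated with plain directories. This is a sketch only: real paths would be the DCD (/etc/es/objrepos), the staging SCD and active ACD under /usr/es/sbin/cluster/etc/objrepos, and the pre-change snapshot is produced by the cluster software, not by `cp`.

```shell
#!/bin/sh
# Sketch of the DARE synchronization flow, simulated with temp dirs:
# DCD -> SCD, snapshot of ACD, SCD -> ACD, clstrmgr refresh, SCD removed.
WORK=$(mktemp -d)
DCD="$WORK/dcd"; SCD="$WORK/scd"; ACD="$WORK/acd"; SNAP="$WORK/snapshots"
mkdir -p "$DCD" "$ACD" "$SNAP"
echo "new config"     > "$DCD/HACMPcluster"   # changed via SMIT
echo "running config" > "$ACD/HACMPcluster"   # current active configuration

# 1. The DCD is copied to a staging directory (SCD) on each node.
mkdir -p "$SCD" && cp "$DCD"/* "$SCD"/
# 2. A snapshot of the current ACD is taken (kept for rollback;
#    active.0.odm is the naming convention mentioned in the text).
cp "$ACD/HACMPcluster" "$SNAP/active.0.odm"
# 3. The SCD contents replace the ACD; clstrmgr would be refreshed here.
cp "$SCD"/* "$ACD"/
# 4. The SCD is deleted. While it exists, it acts as a DARE lock:
#    no further synchronization is permitted.
rm -rf "$SCD"

ACTIVE=$(cat "$ACD/HACMPcluster")
ROLLBACK=$(cat "$SNAP/active.0.odm")
echo "active=$ACTIVE rollback=$ROLLBACK"
rm -rf "$WORK"
```

Step 4 also explains the failure mode in rollback case 2 below: if the flow dies before the SCD is removed, the leftover staging directory blocks any further DARE until the lock is released.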

    ROLLBACK:
    1. If the DARE change is successful after synchronization but does not give the desired result, use the snapshot taken beforehand*
    snapshot path: /usr/es/sbin/cluster/snapshots (e.g. active.0.odm; 0 is the most recent)
    Extended Configuration --> Snapshot Conf. --> Apply Snapshot

    2. If a DARE change fails and the SCD is not cleared, the SCD acts as a lock and prevents further configuration changes.
    If an SCD exists on any cluster node, no further synchronization is permitted until it is deleted in smitty hacmp:
    Problem Determ. Tools --> Release Locks Set by Dyn. Reconf.


---------------------------------
Except for the cluster name, the add and delete functions can be used while the cluster is running.
---------------------------------

CLUSTER SNAPSHOT:

It records the cluster configuration.
Adding a snapshot: Extended Conf. --> Snapshot Conf. ...

The SNAPSHOTPATH environment variable controls the path where snapshots are stored.
Default path: /usr/es/sbin/cluster/snapshots
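A minimal sketch of redirecting snapshots to a different directory, assuming the SNAPSHOTPATH variable behaves as described above (the default applies when it is unset; the custom path below is hypothetical):

```shell
#!/bin/sh
# Sketch: choosing the snapshot directory via SNAPSHOTPATH.
# Exporting SNAPSHOTPATH before creating a snapshot redirects the
# .odm/.info files; when unset, the default directory is used.
SNAPSHOTPATH=/bigfs/hacmp_snapshots   # hypothetical custom location
export SNAPSHOTPATH

# Fall back to the documented default if the variable is not set.
TARGET="${SNAPSHOTPATH:-/usr/es/sbin/cluster/snapshots}"
echo "snapshots will be written to: $TARGET"
```

This also answers the question in comment 4 below: set and export the variable in the shell (or profile) of the user creating the snapshot before taking it.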

Applying a snapshot:
When a cluster snapshot is applied, it overwrites the existing HACMP ODM classes on all cluster nodes with the HACMP ODM classes contained in the snapshot.

A snapshot may be applied from any cluster node.

If a snapshot is applied to a running cluster, it triggers a cluster-wide DARE operation. In addition to modifying the default configuration directory (DCD), it also replaces the configuration data stored in the active configuration directory (ACD), making those changes the active configuration of the system.

"Un/Configure Cluster Resources?": If this option is set to yes, HACMP changes the definition of the resource in the ODM and performs any necessary configuration changes as well. (e.g. if a filesystem is removed, it is removed from the ODM and unmounted as well.)

9 comments:

  1. "Un/Configure Cluster Resources?": If this option set to yes, HACMP changes the definition of the resource in the ODM and it performs any configuration necessary as well. (e.g: if removing a filesystem, it will be removed from the ODM, and it will be unmounted as well.)

    Question is... Filesystem Information is not stored in ODM... Can you provide some more example on it ?

    Replies
    1. Please take into account that it is not a regular filesystem, but a filesystem which has been added to the cluster configuration. When you configure an HACMP cluster, configuration data is stored in HACMP-specific object classes in the Configuration Database (ODM). So, when you remove something which belongs to the cluster, the ODM will be updated too.

  2. Any idea how to solve the below issue without bringing down the cluster...

    An early response is highly appreciated :-)

    FYI -- while doing verif. & sync. I'm getting the below error ..

    Command: failed stdout: yes stderr: no

    Before command completion, additional instructions may appear below.



    cldare: Node(s) Us12Test in the cluster is(are) forced down. Dare cannot be run
    as long as any of the nodes in the cluster are forced down.

    Replies
    1. It says that one of the nodes in your cluster is in a forced-down state. If you check 'lssrc -ls clstrmgrES', it should show that a node is down. I guess that node has no Resource Groups, otherwise it would be more obvious. You should bring that node up. (But be aware: if your cluster version is old and it has a POL setting, it can happen that your RG will be moved to the other node....)

    2. Hi Aix, thanks for your reply ..
      When I check 'lssrc -ls clstrmgrES', it shows node Us12Test as forced down ... even though I have checked the cluster services and they are running ..
      When I do verif. & sync., it shows the error as in my earlier post.

      Cluster version 6.1.0.6

      any help..

    3. I think you should do a start cluster services on Us12Test. As far as I know, clstrmgrES is always running on PowerHA 6.1. Someone probably did an "unmanage resource group" on Us12Test earlier, and your cluster has been in this state since then.

      Some info about the relation between forced down and unmanage resource group: "Unmanage resource groups. The cluster services are stopped immediately. Resources that are online on the node are not stopped. Applications continue to run. This option is equivalent to the forced down option in previous releases. "

  3. Hi Balazs,

    Nice article keep it up !

    I found a typo on the blog .. the DCD path is incorrect ..
    An example:
    When you configure an HACMP cluster, configuration data is stored in HACMP specific object classes in the Configuration Database (ODM). The AIX ODM object classes are stored in the default system configuration directory (DCD), /etc/es/objrepos.

    Replies
    1. Hi Manoj, I'm not sure it is a typo; probably /etc/objrepos and /etc/es/objrepos are both valid:
      # ls /etc/es/objrepos
      HACMPcommadapter
      HACMPcommadapterOLD
      HACMPcommand
      HACMPcommandOLD
      ...

  4. help me to change snapshot default location
