Storage - VG, NFS

Enhanced Concurrent VG

PowerHA keeps its highly available filesystems on a special type of VG called an "enhanced concurrent volume group" (ECVG). This is basically a normal VG with the added ability to be accessed on all nodes. It can be used in either a concurrent or a non-concurrent setup.
In a concurrent setup: an application runs on all cluster nodes at the same time, and the ECVG enables concurrent access to the VG.
In a non-concurrent setup: an application runs on one node at a time, and the volume group is accessed by only one node at a time.

An ECVG can be varied on in two modes:
Active state: the VG behaves the same way as with a traditional varyon. Operations can be performed on the VG, and logical volumes and file systems can be mounted.
Passive state: only limited, read-only access to the volume group descriptor area (VGDA) and the logical volume control block (LVCB) is allowed.




So, in non-concurrent mode the node which owns the VG (where the application is running) can do read-write operations, while on the other node only limited read operations are possible (passive mode). These modes can be checked with the lsvg command:

# lsvg bb_vg
VOLUME GROUP:       bb_vg
VG STATE:           active  
VG PERMISSION:      read/write   <--this is the active node

# lsvg bb_vg
VOLUME GROUP:       bb_vg  
VG STATE:           active    
VG PERMISSION:      passive-only <--this is the standby node

(Also "lsvg -o" will report only those volume groups which are are varied on in active mode.)

During cluster configuration these volume groups are activated in passive mode. When the resource group comes online on a node, the volume group is varied on there in active mode. When the resource group goes offline, the volume group is taken back to passive mode.

This type of VG can be created in "smitty hacmp" under C-SPOC -> Storage -> Volume Groups -> Create a Volume Group and then choose both nodes...
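
For reference, on a standalone test system an enhanced concurrent capable VG could also be created from the command line; a minimal sketch, where bb_vg, the hdisks and major number 100 are made-up examples (this typically requires the bos.clvm.enh fileset). In a cluster, C-SPOC remains the proper way, since it propagates the definition to all nodes:

# mkvg -C -n -V 100 -y bb_vg hdisk8 hdisk9    <--"-C" creates an enhanced concurrent capable VG, "-n" disables auto-varyon
# lsvg bb_vg | grep -i concurrent             <--should report the VG as Enhanced-Capable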

--------------------------

Fast disk takeover

Fast disk takeover reduces total fallover time by providing faster acquisition of the disks without having to break SCSI reserves. It uses enhanced concurrent volume groups, and additional LVM enhancements provided by AIX. When a node fails, the other node changes the volume group state from passive mode to active mode. This change takes approximately 10 seconds and it is at the volume group level. This time impact is minimal compared to the previous method of breaking SCSI reserves.

The active and passive mode flags of the varyonvg command are not documented because they should not be used outside a PowerHA environment. However, you can find them in the hacmp.out log.
Active mode varyon command: varyonvg -n -c -A bb_vg
Passive mode varyon command: varyonvg -n -c -P bb_vg

When a resource group is brought online, PowerHA checks the disk with lqueryvg to determine whether it is an enhanced concurrent volume group or not:
# lqueryvg -p hdisk0 -X
0             <--a value of 0 (zero) indicates a regular non-concurrent volume group (like rootvg)

# lqueryvg -p hdisk8 -X
32           <--a value of 32 indicates an enhanced concurrent volume group

--------------------------

a shared vg should not be set to auto-varyon: chvg -an shared_vg
lvlstmajor: this command lists the available major numbers on a node (run it on each node of the cluster)
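
For example, to find a common free major number for a shared VG (the numbers below are made up, and the exact output format of lvlstmajor may vary):

nodeA # lvlstmajor
43,45..58,60...
nodeB # lvlstmajor
45..58,61...

Here 45 is free on both nodes, so it could be used with mkvg -V 45 or importvg -V 45.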

----------------------------

NodeA has changes and we want NodeB to be aware of them:
(for example, an fs in a shared vg has been increased without HACMP)

YOU CAN TRY THESE:
I. TRY C-SPOC FIRST: smitty hacmp -> C-SPOC -> HACMP Logical Volume Management -> Synchronize a Shared Volume Group Definition


II. MANUAL METHOD (a worked example follows after the steps):

    1. NodeA:
    -check lv ownerships:               ls -l /dev/<lvname>
    -if varyoffvg is possible:          varyoffvg <vgname>
    -if varyoffvg is not possible:      varyonvg -bun <vgname>


    2. NodeB:
    -if the vg was not exported on this node (NodeB), a learning import is possible (this preserves lv ownerships under /dev):
                            importvg -L <vgname> -n <hdiskname>
           
    -if a learning import is not possible and the vg already exists, it should be exported first:
                            exportvg <vgname>

    -import vg from disk:   importvg -y <vgname> -n -V<maj.numb.> <hdisk>
   
    -!!!! if necessary, auto-varyon should be set back to no on the standby node: varyonvg -bun <vg> -> chvg -an <vgname> -> varyoffvg <vgname>
    -if the vg was exported/imported, lv ownerships will be reset to root.system; check under /dev that they are correct

    3. NodeA:
    -if varyonvg -bun ... was used, the vg should be set back to normal use:
                            varyonvg <vgname>
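
    Putting the manual method together, a minimal sketch with made-up names (bb_vg, hdisk8 and the lv bblv are just examples):

    nodeA # ls -l /dev/bblv              <--note the lv ownership, so it can be checked later
    nodeA # varyonvg -bun bb_vg          <--vg stays online, but the disk reservation is broken

    nodeB # importvg -L bb_vg hdisk8     <--learning import: refreshes the definition, keeps /dev ownerships
    nodeB # lsvg -l bb_vg                <--verify that the new lv/fs definitions are visible

    nodeA # varyonvg bb_vg               <--set the vg back to normal use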

III. IMPORT VG DEFINITIONS TO HACMP:
    On the node with the incorrect data (and with the VG NOT varied on), export the VG: exportvg shared_vg
    On the node with the VG varied on:
    smitty hacmp -> Ext. Conf. -> Ext. Res. Conf. -> HACMP Ext. Res. Group -> Change/Show Res. and Attr.:
    set "Automatically Import Volume Groups" to "true", and the VG information will be synchronized immediately.

--------------------------

Lazy update
If a volume group under PowerHA is updated directly (that is, without C-SPOC), the information on the other nodes will be updated only when PowerHA brings the vg online on those nodes, not before.

The volume group time stamp is maintained in the VGDA and in the local ODM. PowerHA updates both of these timestamps when a change is made to the volume group. When PowerHA varies on the vg, it compares the time stamp in the ODM with the one in the VGDA. If the values differ, PowerHA updates the ODM with the information from the VGDA.

--------------------------

FS extension after new storage has been added
1. cfgmgr - on both nodes
2. add a PVID to the new disk on both nodes (chdev -l <hdisk> -a pv=yes) and check that the LUNs/PVIDs are the same on both nodes
3. add LUN to the vg in smitty hacmp (it will do necessary actions on both nodes)
   (smitty hacmp -> system man. -> hacmp log. vol. man. -> shared vol. gr. -> set charact. of a ... -> add a volume ...)
4. lv, fs extension in smitty hacmp:
    1. usual fs extension: hacmp counts in 512-byte blocks, so 1MB = 2048 blocks (see the worked example after this list)

    or

    2. if we want to choose which disks should be used:
        -first increase the lv (choosing the disks)
        -lsfs -q (it shows the new lv size in 512-byte blocks)
        -increase the fs to the size reported by lsfs -q
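
    A quick worked example of the block arithmetic (filesystem name and size are made up): to add 2 GB, 2 x 1024 MB x 2048 blocks/MB = 4194304 blocks of 512 bytes.

    # lsfs -q /bb_fs     <--after increasing the lv, the reported "lv size" (in 512-byte blocks) is the value to enter as the new fs size in smitty hacmp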

--------------------------

If PVIDs are not consistent (on nodeB):
nodeA:    varyoffvg sharedvg

nodeB:    rmdev -dl hdisk3 (this does not delete data on the disk, it only removes the ODM definition)
          repeat rmdev -dl for all inconsistent disks
          cfgmgr
          importvg -V123 -yshared_vg hdisk3  (the ODM will be updated with the new values)
          chvg -an sharedvg
          varyoffvg sharedvg

nodeA:    varyonvg sharedvg


-----------------------------------

Error: the timestamp is different for a VG on the 2 nodes:

This can happen when the timestamp on the disk and the one in the ODM differ for the vg (at least on 1 node).

You can check and compare the timestamps from ODM and disk:
-ODM:
    lsattr -El <vgname>                                    <--there will be a line for the timestamp
    odmget CuAt | grep -p timestamp | grep -p <vgname>     <--there will be a line for the timestamp

-disk:
    lqueryvg -Tp <hdisk>                                   <--it can be any disk from the vg

You can check on both nodes.
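
For example, on a made-up VG (bb_vg, with hdisk8 as one of its disks) all four values should match; typically the two VGDA values agree (the disk is shared) and the stale entry is the ODM value on one node, which a learning import (importvg -L) or a C-SPOC synchronization refreshes:

nodeA # lsattr -El bb_vg | grep -i timestamp      <--ODM value on nodeA
nodeA # lqueryvg -Tp hdisk8                       <--VGDA value read from the shared disk
nodeB # lsattr -El bb_vg | grep -i timestamp      <--ODM value on nodeB
nodeB # lqueryvg -Tp hdisk8                       <--should match the value seen from nodeA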

Correction:
    -increase in smitty hacmp
    or
    -importvg:
        node1: varyonvg -bun ...          <--break the reservation
        node2: importvg -L ...            <--updates the timestamp
        node2: lsattr -El ...             <--verify the values
        node1: varyonvg -n ...            <--restores the reservation

-----------------------------
Importing vgs with ESS:

-if the hdisks have a PVID:
    importvg -y vg_name -V major# hdisk1
    hd2vp vg_name

-if only the vpath has a PVID:
    importvg -y vg_name -V major# vpath0

-if neither the hdisks nor the vpath has a PVID:
    chdev -l vpath0 -a pv=yes
    importvg -y vg_name -V major# vpath0
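
Before the import it is worth verifying that the PVID is really visible on the device to be used (device names are just examples); the second column of lspv shows the PVID, or "none" if it is missing:

# lspv | egrep 'hdisk1|vpath0'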

-----------------------------

NFS with PowerHA:

If NFS exports are defined through PowerHA, all NFS exports must be controlled by PowerHA; AIX and PowerHA NFS exports cannot be mixed. The NFS export information is kept in /usr/es/sbin/cluster/etc/exports, which has the same format as the AIX exports file (/etc/exports).

When configuring NFS through PowerHA, you can control these items:
- the network that PowerHA will use for NFS mounting
- NFS exports and mounts at the directory level
- the field "Filesystems mounted before IP configured" must be set to true (this prevents clients from accessing the filesystems before they are ready)
- by default filesystems are exported rw to the world; this can be restricted in /usr/es/sbin/cluster/etc/exports (a sketch of an entry follows below)
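
A minimal sketch of an entry in /usr/es/sbin/cluster/etc/exports that restricts access (the directory and client names are made-up examples; the syntax is the normal AIX exports format):

/fsa -rw=clientA:clientB,root=clientA:clientB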




NFS cross-mounts

By default, NFS-exported file systems are automatically cross-mounted. This means that the node hosting the resource group mounts the file systems locally, NFS-exports them, and also NFS-mounts them (this node becomes NFS server and NFS client at the same time). All other nodes of the resource group simply NFS-mount the file systems, thus becoming NFS clients. If the resource group is acquired by another node, that node mounts the file systems locally and NFS-exports them, thus becoming the new NFS server.


Syntax for the configuration: /a;/fsa (/a: local NFS mount point; /fsa: exported directory)

For example:
Node1 (with service IP label svc1) will locally mount /fsa and NFS-export it.
Node1 will also NFS-mount svc1:/fsa on /a.
Node2 will NFS-mount svc1:/fsa on /a.
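
The resulting mounts would then look roughly like this (a simplified sketch based on the example above; fsalv is a made-up lv name and the real mount output has more columns):

Node1 # mount
  ...
  /dev/fsalv    /fsa    jfs2      <--local mount (this node is the NFS server)
  svc1:/fsa     /a      nfs3      <--cross-mount (this node is also an NFS client)
Node2 # mount
  ...
  svc1:/fsa     /a      nfs3      <--NFS client only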


-------------------------------------------

28 comments:

  1. Can we increase the PP size of a VG which is in HACMP?
    How do we reflect the same on the two nodes?

  2. If the PP size needs to be changed, you need to recreate the VG (backing up the VG, recreating the VG with the correct PP size, restoring the VG). On HACMP you have shared disks, so when you recreate the VG, you have to do it in "smitty hacmp", so both servers will know about the changes. The restore will happen on the online node only. Hope this helps.

  3. How can we export the filesystems after we add them to the RG?

  4. The NFS filesystems which are exported are attributes of the resource group:
    Extended Conf. --> Ext. Resource Conf. --> HACMP Ext. Res. Group Conf. --> Change/Show Res. and Attr....

  5. How can we export the NFS filesystem to particular nodes in HACMP? My understanding is that first we need to add it to the resource group and manually edit /usr/es/sbin/cluster/etc/exports?

    Replies
    1. If the /usr/es/sbin/cluster/etc/exports file is missing, HACMP will export NFS filesystems with default options (root access for everyone). If special options are needed, you can set them in the /usr/es/sbin/cluster/etc/exports file.

  6. I need to add a new LUN to an existing volume group in HACMP.
    Could you please tell me the step-by-step procedure.

    Thanks in Advance

    Replies
    1. The steps are right on this page, just a little above:

      FS extension after new storage has been added
      1. cfgmgr - on both nodes
      2. add PVID - check LUNs are the same
      3. add LUN to the vg in smitty hacmp (it will do necessary actions on both nodes)
      ...

      PVID is needed on both nodes, otherwise in smitty HACMP there is no option to choose the disk.

  7. If I wish to configure mirrorvg using 2 different SAN storage units of the same model as my shared storage,
    what are the proper steps to implement it?

    Replies
    1. You need to add the LUNs to the nodes and to the vg (see my previous reply on how to configure disks in HACMP: cfgmgr, pvid...), then: smitty hacmp -> system man. -> hacmp log. vol. man. -> shared vol. gr. -> set charact. of a ... -> add a volume ...

      When you have the disks there, you need to mirror it in smitty hacmp:
      smitty hacmp -> system man. -> hacmp log. vol. man. -> shared vol. gr. -> mirror a shared vol. gr.

  8. Thanks for the reply. Let's say one of the storage units goes down, and a disk from the mirror will likely go into a stale state; when my storage recovers and is up again, how do I sync back the disk?

    Replies
    1. Hi, in smitty hacmp you can do almost everything which is cluster related.
      For syncing back: smitty hacmp -> system management -> hacmp logical volume management -> synchronize shared lvm mirrors ...

  9. Hi,
    what's the best option to go with if chvg -t vgname needs to be applied on a concurrent VG in order to increase the max PV limit from 16 to 32?
    1- varyoff the VG and apply the change?
    2- create a new VG with the B option and then import the existing VG?
    how can it be applied in an Active/Active setup?

    Replies
    1. Hi,

      "chvg -t" on a normal VG can be applied online, however I have never used this on Concurrent VG.
      To be on the safe side, I would do it while VG is varied off, and do cluster synchronization as well.
      If you need official answer, probably the best to ask IBM support.

      If you find a good solution and you share that here, that would be great!

      Balazs

  10. Hi,

    I have added a filesystem using OS commands in an enhanced concurrent VG which is managed by HACMP. When I run "lsvg -l vgname" on the passive node, I can see the logical volume in the output, but the mount point is not showing on the passive node. I have tried to sync HACMP and to sync the volume group definition as well, but it made no difference. I tried "varyonvg -bu vgname" and "importvg -L vgname" on the 2nd node and that works fine.

    Is this the only way to get the mount point name reflected on the 2nd node?

    Replies
    1. Hi, I would add the filesystem in HACMP with "smitty hacmp" and not by OS commands. If you use smitty hacmp, it will take care of implementing the changes on the other node as well, and do many other things which are necessary to have a synchronized cluster. (But your workaround looks good as well :-))

  11. Hi,
    What is the purpose of using a major number to create a VG in HACMP? Any specific reason?

    Regards,
    Siva

    Replies
    1. Hi,
      IBM says this:
      "When creating shared volume groups, typically you can leave the Major Number field blank and let the system provide a default. However, NFS uses volume group major numbers to help uniquely identify exported file systems. Therefore, all nodes to be included in a resource group containing an NFS-exported file system must have the same major number for the volume group on which the file system resides."

      Hope this helps,
      Balazs

  12. Hi,
    please tell me
    in which cases do we need to bring the RG offline while synchronizing the cluster?
    and in which cases can the RG stay online while synchronizing?

    thanks,
    sathish.

    Replies
    1. Hi, there are some cluster configuration settings which can be synchronized only if the RG is in offline state, however most of the settings can be synchronized when RG is online. (More info can be found in PowerHA Redbooks.)

  13. I understand this task, but I have a doubt: do we need to update the entry in /usr/es/sbin/cluster/etc/exports, or will it be updated automatically? Kindly reply to me.
    my id: johnsoncls@google.com



    Replies
    1. I guess it will not be updated automatically; you need to add it manually. Otherwise it will be exported with the default options, like rw to everyone.

  14. Hi Sir,
    This is Kishor
    How to find the RG state? And how to know which RG is used in the cluster?

  15. How to extend a filesystem in HACMP using the command line?
    How to extend a VG and LV in HACMP using the command line?

  16. How can I remove a VG from HACMP?

  17. Is PowerHA able to export GPFS filesystems via an NFSv4 server using sec=krb5 (Kerberos 5)?

    I'm working to install and set up this environment:
    GPFS filesystems to be exported by NFSv4 using Kerberos 5 authentication.
    I have an IBM Kerberos 5 NAS system using an ITM LDAP Tivoli Directory Server backend.
    I would like to configure IBM PowerHA to have an NFSv4 server in High Availability for exporting my GPFS filesystems (so not a JFS or JFS2 shared volume group in HACMP, but a GPFS filesystem).

    So right now, my LDAP and Kerberos 5 servers are configured and working well.
    My PowerHA servers and resource group are configured and working "pretty well".

    I do kinit host/hostname, and I verified that my krb5 creds are OK with klist.
    I'm able to mount my nfsv4_service and list the NFSv4 mount point.

    But when I fail over to the 2nd NFSv4 server (from PowerHA), I lose access to the NFSv4 mount point. I'm not able to list the GPFS filesystems anymore. But if I come back to the 1st server, I'm able to list my mount point.

    Has anybody configured something like this? How to make an NFSv4 server using Kerberos 5 authentication in an HA environment, to be able to do maintenance and patching without impact to clients?

    Regards,
    Eric Dubé

  18. Can anyone tell me what cross-mounting a file system is in PowerHA 7.1 and how to do it?
