
Shared Storage Pools:

A shared storage pool is a pool of SAN storage devices that can span multiple Virtual I/O Servers. It is based on a cluster of Virtual I/O Servers and a distributed data object repository. (The repository uses a cluster filesystem that was developed specifically for storage virtualization; on the VIO Server it appears under a path like /var/vio/SSP/bb_cluster/D_E_F_A_U_L_T_061310.)

The Virtual I/O Servers that are part of the shared storage pool are joined together to form a cluster. Only Virtual I/O Server partitions can be part of a cluster. The Virtual I/O Server clustering model is based on Cluster Aware AIX (CAA) and RSCT technology.

The Virtual I/O Servers in the cluster communicate with each other using Ethernet connections. They share the repository disk and the disks for the storage pool through the SAN.
On the Virtual I/O Server, the poold daemon handles group services and runs in user space. The vio_daemon daemon is responsible for monitoring the health of the cluster nodes and the pool, as well as the pool capacity.
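
A quick way to verify that both daemons are running on a cluster node (a minimal sketch; run as root, for example after oem_setup_env):

# ps -ef | grep -E 'poold|vio_daemon' | grep -v grep              <--both daemons should show up in the process list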



When using shared storage pools, the Virtual I/O Server provides storage through logical units that are assigned to client partitions. A logical unit is a file-backed storage device that resides in the cluster filesystem in the shared storage pool. It appears as a virtual SCSI disk in the client partition.

The physical volumes in the shared storage pool are managed as an aggregation of physical blocks, and user data is stored in these blocks. Once physical blocks have been allocated to a logical unit to hold data, they are not released until the logical unit is removed from the shared storage pool. Deleting files, file systems or logical volumes that reside on the virtual disk of a client partition does not increase the free space of the shared storage pool.

The system reserves a small amount of each physical volume in the shared storage pool to record meta-data.



Maximum cluster size by VIOS version:
VIOS version 2.2.0.11, Fix Pack 24, Service Pack 1         <--1 node
VIOS version 2.2.1.3                                       <--4 nodes
VIOS version 2.2.2.0                                       <--16 nodes
VIOS version 2.2.5.0                                       <--24 nodes

------------------------------------------------------------------------------

Thin provisioning

A thin-provisioned device represents a larger image than the actual physical disk space it is using. It is not fully backed by physical storage as long as the blocks are not in use. A thin-provisioned logical unit is defined with a user-specified size when it is created. It appears in the client partition as a virtual SCSI disk with that user-specified size. However, on a thin-provisioned logical unit, blocks on the physical disks in the shared storage pool are only allocated when they are used.

Consider a shared storage pool that has a size of 20 GB. If you create a logical unit with a size of 15 GB, the client partition will see a virtual disk with a size of 15 GB. But as long as the client partition does not write to the disk, only a small portion of that space will initially be used from the shared storage pool. If you create a second logical unit also with a size of 15 GB, the client partition will see two virtual SCSI disks, each with a size of 15 GB. So although the shared storage pool has only 20 GB of physical disk space, the client partition sees 30 GB of disk space in total.

After the client partition starts writing to the disks, physical blocks will be allocated in the shared storage pool and the amount of free space in the shared storage pool will decrease. Deleting files or logical volumes on a client partition does not increase the free space of the shared storage pool.

When the shared storage pool is full, client partitions will see an I/O error on the virtual SCSI disk. Therefore even though the client partition will report free space to be available on a disk, that information might not be accurate if the shared storage pool is full.

To prevent such a situation, the shared storage pool provides a threshold that, if reached, writes an event in the errorlog of the Virtual I/O Server.

(If you use the -thick flag with the mkbdsp command, a thick-provisioned disk is created instead of a thin-provisioned one, so the full disk space is allocated for the client up front.)
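
For example (a sketch using the example cluster and pool names from the command section below; bb_disk_thin and bb_disk_thick are made-up backing device names):

mkbdsp -clustername bb_cluster -sp bb_pool 10G -bd bb_disk_thin                  <--thin provisioned LU (default), blocks are allocated on write
mkbdsp -clustername bb_cluster -sp bb_pool 10G -bd bb_disk_thick -thick          <--thick provisioned LU, all 10G is reserved in the pool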

------------------------------------------------------------------------------

When a cluster is created, you must specify one physical volume for the repository and at least one physical volume for the storage pool. The storage pool physical volumes are used to provide storage to the client partitions. The repository physical volume is used to perform cluster communication and to store the cluster configuration.

If you need to increase the free space in the shared storage pool, you can either add an additional physical volume or you can replace an existing volume with a bigger one. Physical disks cannot be removed from the shared storage pool.
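
For example (a hedged sketch; depending on the VIOS level the replace is done with chsp -replace or the newer pv command, the disk names are examples):

chsp -add -clustername bb_cluster -sp bb_pool hdiskpower3                                  <--grow the pool with an additional LUN
chsp -replace -clustername bb_cluster -sp bb_pool -oldpv hdiskpower2 -newpv hdiskpower4    <--swap a pool disk for a bigger one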


Requirements:
-each VIO Server must correctly resolve the other VIO Servers in the cluster (DNS or /etc/hosts must contain all VIO Servers)
-the hostname command should show the FQDN (with domain.com)
-VLAN tagging interfaces are not supported for cluster communication in earlier VIO versions
-fibre channel adapters should be set to dyntrk=yes, fc_err_recov=fast_fail (see the example below)
-the disk reserve policy should be set to no_reserve and all VIO Servers must have these disks in Available state
-1 disk is needed for the repository (min 10GB) and 1 or more for data (min 10GB) (these should be SAN FC LUNs)
-Active Memory Sharing paging space cannot be on an SSP disk
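
Example of setting the adapter and disk attributes (a sketch; device names and exact attribute names can differ depending on the multipath driver):

$ chdev -dev fscsi0 -attr dyntrk=yes fc_err_recov=fast_fail -perm                <--FC adapter settings (-perm updates the ODM only, takes effect after reboot)
$ chdev -dev hdiskpower2 -attr reserve_policy=no_reserve                         <--no SCSI reserve on the repository and pool disks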

------------------------------------------------------------------------------

General commands:

cluster ...        <--for cluster create, list and status
lssp ...           <--for the pool free space
lu ...             <--for virtual disk create and control
pv ...             <--for controlling the LUNs in the LUN pool
failgrp ...        <--for creating the pool mirror
lscluster ...      <--for a high level view of the hdisk / LUN names
(ignore those ghastly -clustername, -sp and -spname options in the syntax once you have created your SSP, as they are going away)

You ARE allowed to dd SSP LUs for backup and for moving them between different SSPs; look in /var/vio/SSP/spiral/D_E_F_A_U_L_T_061310/VOL1
dd bs=1M ....
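
For example, backing up an LU into a file (a sketch; the LU file name under VOL1 and the /backup target are placeholders, run as root and check ulimit and free space first):

# dd if=/var/vio/SSP/spiral/D_E_F_A_U_L_T_061310/VOL1/<LU file name> of=/backup/bb_disk1.img bs=1M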

------------------------------------------------------------------------------

Commands for create:
cluster -create -clustername bb_cluster -spname bb_pool -repopvs hdiskpower1 -sppvs hdiskpower2 -hostname bb_vio1
        -clustername bb_cluster                                                  <--name of the cluster
        -spname bb_pool                                                          <--storage pool name
        -repopvs hdiskpower1                                                     <--disk of repository
        -sppvs hdiskpower2                                                       <--storage pool disk
        -hostname bb_vio1                                                        <--VIO Server hostname (where to create cluster)
(This command will create cluster, start CAA daemons and create shared storage pool)


cluster -addnode -clustername bb_cluster -hostname bb_vio2                       adding a node to the cluster (see maximum node counts above)
chsp -add -clustername bb_cluster -sp bb_pool hdiskpower2                        adding a disk to a shared storage pool

mkbdsp -clustername bb_cluster -sp bb_pool 10G -bd bb_disk2                      creating a 10G LUN
mkbdsp -clustername bb_cluster -sp bb_pool -bd bb_disk2 -vadapter vhost0         assigning LUN to a vhost adapter (lsmap will show)
mkbdsp -clustername bb_cluster -sp bb_pool -luudid c7ef7a2 -vadapter vhost0      same as above just with LUN ID



Commands for display:
cluster -list                                                      display cluster name and ID
cluster -status -clustername bb_cluster                            display cluster state and pool state on each node

lssp -clustername bb_cluster                                       list storage pool details (pool size, free space...)
lssp -clustername bb_cluster -sp bb_pool -bd                       list created LUNs in the storage pool (backing devices in lsmap -all)

lspv -clustername bb_cluster -sp bb_pool                           list physical volumes of shared storage pool (disk size, id)
lspv -clustername bb_cluster -capable                              list which disk can be added to the cluster
lsmap -clustername SSP_Cluster_1 -all                              list disk mappings to vscsi devices

lscluster -c                                                       list cluster configuration
lscluster -d                                                       list disk details of the cluster
lscluster -m                                                       list info about nodes (interfaces) of the cluster
lscluster -s                                                       list network statistics of the local node (packets sent...)
lscluster -i -n bb_cluster                                         list interface information of the cluster

odmget -q "name=hdiskpower2 and attribute=unique_id" CuAt          checking LUN ID (as root)

lu -list                                                           list lu name, size, ID (lu commands are available since VIOS 2.2.3.1)
lu -list -verbose                                                  lists all details in stanza (thin, tier, snapshot…)
lu -list -fmt : -field LU_SIZE LU_USED_PERCENT LU_USED_SPACE LU_UNUSED_SPACE   lists specific fields (good for scripting, see the sketch below)

lu -list -attr provisioned=true                                    list LUs that are mapped to an LPAR
lu -list -attr provisioned=false                                   list LUs that are not mapped to an LPAR
lu -resize -lu <disk> -size 128G                                   change the size of the logical unit
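
The -fmt output can be fed into a small script, for example to flag LUs above a usage threshold (a sketch; it assumes LU_NAME is also accepted as a field name on your VIOS level, check lu -list -verbose for the exact field names, and run it as root if awk is not available in the restricted shell):

lu -list -fmt : -field LU_NAME LU_USED_PERCENT | awk -F: '$2+0 > 90 {print $1" is "$2"% used"}'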


Commands for remove:
rmbdsp -clustername bb_cluster -sp bb_pool -bd bb_disk1            remove a created LUN (the backing device will be deleted from the vhost adapter)
                                                                   (disks cannot be removed from the cluster, for example hdiskpower...)
cluster -rmnode -clustername bb_cluster -hostname bb_vios1         remove node from cluster
cluster -delete -clustername bb_cluster                            remove cluster completely

lu -remove ...                                                     removes an LU (this is a newer command than rmbdsp, see example below)

Other commands:
cleandisk -r hdiskX                                                clean cluster signature from hdisk
cleandisk -s hdiskX                                                clean storage pool signature (after disk can be added to SSP)
/var/vio/SSP                                                       cluster related directory (and files) will be created in this path

chrepos -n globular -r +hdisk16                                    move/rebuild repo. disk on hdisk16 (repo. issues won't bring down SSP)
------------------------------------------------------------------------------

Create cluster and Shared Storage Pool:

1. create a cluster and pool: cluster -create ...
2. adding additional nodes to the cluster: cluster -addnode
3. checking which physical volume can be added: lspv -clustername clusterX -capable
4. adding physical volume: chsp -add
5. create and map LUNs to clients: mkbdsp -clustername...
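
A combined example with made-up names (adjust cluster, pool, host and disk names to your environment; the commands are the same as in the sections above):

cluster -create -clustername bb_cluster -spname bb_pool -repopvs hdiskpower1 -sppvs hdiskpower2 -hostname bb_vio1
cluster -addnode -clustername bb_cluster -hostname bb_vio2
lspv -clustername bb_cluster -capable
chsp -add -clustername bb_cluster -sp bb_pool hdiskpower3
mkbdsp -clustername bb_cluster -sp bb_pool 10G -bd bb_disk1
mkbdsp -clustername bb_cluster -sp bb_pool -bd bb_disk1 -vadapter vhost0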

------------------------------------------------------------------------------

Add new LUN to SSP and PowerVC (via HMC)

1. request new LUN from Storage team to all VIO servers
2. cfgmgr on VIO servers (you should see new disks)
3. on the HMC: Shared Storage Pool --> click on our SSP --> on the new page select the check box --> Action --> Add Capacity (a CLI alternative is shown below)
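
The same can be done from the VIOS command line (a sketch reusing the lspv and chsp commands from above; the disk name is an example):

$ lspv -clustername SSP_Cluster_1 -capable                        <--check that the new LUN is visible and can be added
$ chsp -add -clustername SSP_Cluster_1 -sp SSP_1 hdisk10          <--add the new LUN to the shared storage pool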

------------------------------------------------------------------------------

Removing LUN:

$ lu -list -attr provisioned=false
POOL_NAME: SSP_1
TIER_NAME: System
LU_NAME                 SIZE(MB)    UNUSED(MB)  UDID
bb-aix-110-NovaLink-d1  51200       0           7d2750b08a772e1dbeb9adbcdc98f76c
volume-bb-aix-pci61m-1~ 153600      0           505efe7dd1b204bf112563d820a44df2
volume-my-openstack--2~ 153600      0           590de5e7109ce9087e2c521904466241

$ lu -remove -clustername SSP_Cluster_1 -lu bb-aix-110-NovaLink-d1
Logical unit bb-aix-110-NovaLink-d1 with udid "7d2750b08a772e1dbeb9adbcdc98f76c" is removed.

$ lu -remove -clustername SSP_Cluster_1 -luudid 590de5e7109ce9087e2c521904466241
Logical unit  with udid "590de5e7109ce9087e2c521904466241" is removed.

------------------------------------------------------------------------------

Managing snapshots:

Snapshots of a LUN can be created, which can later be restored in case of problems.

# snapshot -create bb_disk1_snap -clustername bb_cluster -spname bb_pool -lu bb_disk1    <--create a snapshot

# snapshot -list -clustername bb_cluster -spname bb_pool                                 <--list snapshots of a storage pool
Lu Name          Size(mb)    ProvisionType    Lu Udid
bb_disk1         10240       THIN             4aafb883c949d36a7ac148debc6d4ee7
Snapshot
bb_disk1_snap

# snapshot -rollback bb_disk1_snap -clustername bb_cluster -spname bb_pool -lu bb_disk1  <--rollback a snapshot to a LUN
$ snapshot -delete bb_disk1_snap -clustername bb_cluster -spname bb_pool -lu bb_disk1    <--delete a snapshot

------------------------------------------------------------------------------

Checking if an LU is a clone of another LU (or not)

If the LU_UDID_DERIVED_FROM field has a value, the LU is a clone of the LU with that particular LU_UDID; "N/A" means it is not a clone.

$ lu -list -verbose | grep -p 775f0715fe27ef06914cccb9194607a0
...
LU_PROVISION_TYPE:THIN
LU_UDID_DERIVED_FROM:775f0715fe27ef06914cccb9194607a0         <-- clone of the below
LU_MOVE_STATUS:N/A
LU_SNAPSHOTS:N/A

...
LU_PROVISION_TYPE:THIN
LU_UDID_DERIVED_FROM:N/A                                      <-- master copy
LU_MOVE_STATUS:N/A
LU_SNAPSHOTS:0d932572c08ca5523a43c64a209d7832IMSnap

----------------------------------------------------------------------------

Migrating an LPAR to PowerVC on SSP (with "dd")

We had to move an old Linux LPAR from an old Power Server with vSCSI to a new Power Server with SSP. This new Power server was managed by PowerVC. The old LPAR was using a vSCSI LUN (as a VIO client), and we copied this LUN with "dd" into the SSP.

1. # shutdown -h now                                              <--stop the Linux server we want to migrate (to avoid any io errors)

2. # dd if=/dev/rh5.d1 of=/home/rh5.d1.dd bs=1M                   <--save with dd the disk of Linux server on VIO
                                                                  (as root on the VIO, check ulimit and free space)

3. create a new VM (rh5_new) in PowerVC with an empty boot disk   <--boot disk should have exactly the same size as the original LPAR

4. # mount nim01:/migrate /mnt                                    <--copy or nfs mount the dd file to the VIO server which has the SSP

5. # ls -ltr /var/vio/SSP/SSP_Cluster_1/D_E_F_A_U_L_T_061310/VOL1 <--find the LUN on VIOS with SSP where we need to do the dd again
                                                                  (all SSP data is under this directory)                   
--w-------    1 root     system          253 Apr 24 12:46 .volume-rh5_new.d1-936e9092-fb3c.7e6745bce57e4b5c01452e91f0322feb
-rwx------    1 root     system   75161927680 Apr 24 12:56 volume-rh5_new.d1-936e9092-fb3c.7e6745bce57e4b5c01452e91f0322feb
-rwx------    1 root     system   75161927680 Apr 24 12:56 volume-rh5_new-692aa120-0000007a-boot-0-565121dd-87d3.8fa2e2c8b1c6ca63f059a86f32389544
--w-------    1 root     system          327 Apr 24 12:57 .volume-rh5_new-692aa120-0000007a-boot-0-565121dd-87d3.8fa2e2c8b1c6ca63f059a86f32389544
(the files starting with . are not important; of the large files we only need the one that contains "boot" in its name, this belongs to the RedHat VM, the other is just the volume used in the general image)

6. dd to this file:
# dd if=/mnt/rh5.d1.dd of=/var/vio/SSP/SSP_Cluster_1/D_E_F_A_U_L_T_061310/VOL1/volume-rh5-new-692aa120-0000007a-boot-0-565121dd-87d3.8fa2e2c8b1c6ca63f059a86f32389544 bs=1M
71680+0 records in.
71680+0 records out.

After that the VM can be started. We need to go to the SMS menu to manually choose the boot disk device:
5.   Select Boot Options
2.   Configure Boot Device Order
1.   Select 1st Boot Device
6.   List All Devices
2.        -      SCSI 69 GB Harddisk, part=1 ()
2.   Set Boot Sequence: Configure as 1st Boot Device

After that, during boot the boot device was found, but we got an error: can't allocate kernel memory

IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM IBM
/
Elapsed time since release of system processors: 121069 mins 33 secs

Config file read, 1024 bytes
Welcome
Welcome to yaboot version 1.3.13 (Red Hat 1.3.13-14.el5)
Enter "help" to get some basic usage information
boot: linux
Please wait, loading kernel...
Claim error, can't allocate 900000 at 0xc00000
Claim error, can't allocate kernel memory
boot:

The solution was to change the real-base address below from c00000 to 2000000.
To do this, reboot again, go to the Open Firmware prompt, and do the following steps:
 8 = Open Firmware Prompt           

     Memory      Keyboard     Network     Speaker  ok
0 > printenv real-base
-------------- Partition: common -------- Signature: 0x70 ---------------
real-base                c00000              c00000
 ok
0 > setenv real-base 2000000  ok
0 > printenv real-base
-------------- Partition: common -------- Signature: 0x70 ---------------
real-base                2000000             c00000
 ok
0 > reset-all

After that the reboot was successful and the old Linux server was running nicely on the new hardware with SSP.

----------------------------------------------------------------------------

Setting alerts for Shared Storage Pools:

As thin provisioning is in place, the real free space of the storage cannot be seen exactly from the client. If the storage pool gets 100% full, I/O errors will occur on the client LPARs. To avoid this, alerts can be configured:

$ alert -list -clustername bb_cluster -spname bb_pool
PoolName                 PoolID                             Threshold%
bb_pool                  000000000A8C1517000000005150C18D   35                        <--threshold: an alert is raised when free space drops below this percentage

# alert -set -clustername bb_cluster -spname bb_pool -type threshold -value 25        <--if free space goes below 25% it will alert

# alert -list -clustername bb_cluster -spname bb_pool
PoolName                 PoolID                             Threshold%
bb_pool                  000000000A8C1517000000005150C18D   25                        <--new value can be seen here

$ alert -unset -clustername bb_cluster -spname bb_pool                                <--unset an alert

In the errlog you can see the warning:
0FD4CF1A   0424082818 I O VIOD_POOL      Informational Message

----------------------------------------

9 comments:

  1. Hello can you please update more on thin and thick privisioning please

  2. I'd like to see a topic on troubleshooting a VIO cluster. I'm not yet convinced of the stability of this setup due to the network dependency. Why was this not designed with disk communication rather than ( or in conjunction with ) network like PowerHA.

  3. ABEND FATAL AWK IN MUXPROC

  4. Is there any command to display the usage of a virtual LUN over all VIOs ?
    To see if a LUN is mapped in one VIO before giving it to another LPAR ?

  5. Will it be possible to extend LUN ( thin) which is already presented to client machine in VSSP

  6. How would I reliably relate a disk on the LPAR back to its storage pool backing device on the VIOS please?
    For physical VSCSI assigned disks you can use the PVID but I am not sure how to do so with SSP allocated storage

    Replies
    1. Solution found :

      On LPAR:
      user/> lspv -u | grep hdisk3
      hdisk3 00f68d5a61122458 datavg active 412173194C707447A2AB56DF8EB93F320FB2F103303 NVDISK03IBMvscsi f410292-f9e1-5a24-7503-34d6a282
      user/>
      (Removed excess spaces to make it easier to read)

      Take the number in the 5th column and remove the first 5 and last 6 digits :
      3194C707447A2AB56DF8EB93F320FB2F

      On the VIOS :
      padmin/> lssp -clustername MyClusterName -sp MySSPName -bd | grep -i 3194C707447A2AB56DF8EB93F320FB2F
      v3a_vhost18_dvg00 256000 THIN 78% 55804 3194c707447a2ab56df8eb93f320fb2f
      padmin/>

      The first column is the storage pool lun.
      Note: the number is upper case on LPAR and lower case on VIOS.

      This number is consistent across all VIOS in the storage pool so you can reliably identify the disk from anywhere in the storage pool.

    2. Thanks a lot Michael for the solution :)

    3. lspv -u | grep hdiskXXX | awk '{print tolower($0)}' | awk '{print $5}' | cut -c 6-37
