POWERVC - CLOUD INIT

Cloud Init and PowerVC

When a new server is created, its configuration parameters (for example the hostname) need to be transferred from PowerVC to the AIX server. These are usually applied during the first boot. Older PowerVC versions (before 1.3.0) used the Virtual Solutions Activation Engine (VSAE), the “activation engine” for short, which had to be installed manually on the server before it was captured into a deployable image. VSAE has been deprecated and withdrawn from support. IBM strongly recommends that new images are built with cloud-init, which it describes as its strategic image activation technology. It offers a rich set of system initialization features and a high degree of interoperability.

-------------------------------------------------

Cloud Init

cloud-init handles the early initialization of cloud instances. By default, deploying a cloud image creates identical operating systems. The user data, which is entered manually (or by a script) during deployment, gives every cloud instance its personality. cloud-init is the tool that applies this user data to instances automatically.

Cloud-init is an open-source package that configures newly deployed virtual machines on first boot. Installing the cloud-init package generates the cloud.cfg file, which controls the customization of the virtual machine. It is written in YAML, a data serialization format.


Cloud Init and Linux boot up sequence (sysvinit and systemd)

Originally cloud-init was available for Linux only. It was integrated into the Linux boot up sequence. Linux has been using Sys-V Init (System V initialization) to manage system startups for many years. At boot the kernel started init, the very first process at startup.  sysvinit was slow because it started processes one at a time, performed dependency checks, and waited for daemons to start. As time went on, it became clear that init was getting too slow and inflexible.

Then came systemd. systemd is a system and service manager for Linux, compatible with SysV. It provides aggressive parallelization, offers on-demand starting of daemons, keeps track of processes, maintains mount and automount points, etc. It can work as a drop-in replacement for sysvinit. systemd is the first process started; it reads its targets and starts the processes they specify, which brings the operating system up. When booting under systemd, it determines whether cloud-init.target should be included in the boot goals. By default cloud-init is enabled.

-------------------------------------------------

Cloud-init Installation and initial boot sequence:

1. Install yum from the AIX toolbox (See the yum readme for instructions).
2. Make sure SSH daemon is running.
3. Run yum install cloud-init.

During installation, these links are created in /etc/rc.d/rc2.d:
lrwxrwxrwx    1 root     system           33 Apr 19 12:30 S01cloud-init-local -> /etc/rc.d/init.d/cloud-init-local
lrwxrwxrwx    1 root     system           27 Apr 19 12:30 S02cloud-init -> /etc/rc.d/init.d/cloud-init
lrwxrwxrwx    1 root     system           29 Apr 19 12:30 S03cloud-config -> /etc/rc.d/init.d/cloud-config
lrwxrwxrwx    1 root     system           28 Apr 19 12:30 S04cloud-final -> /etc/rc.d/init.d/cloud-final


cloud-init runs these during start up:
#    File                               Command                           Purpose
S01  /etc/rc.d/init.d/cloud-init-local  cloud-init init --local           load config from CD (ip and pre-steps needed for later)
S02  /etc/rc.d/init.d/cloud-init        cloud-init init                   run all cloud_init_modules (write_files, set_hostname…)
S03  /etc/rc.d/init.d/cloud-config      cloud-init modules --mode config  run all cloud_config_modules (mounts, chef…)
S04  /etc/rc.d/init.d/cloud-final       cloud-init modules --mode final   run all cloud_final_modules (final-message, reboot…)
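
The four stages above can be written down as data, which makes it easy to print the exact command for each stage (for example to run one by hand with --debug). This is only a sketch: the stage names and arguments are copied from the table, the binary path is the AIX location listed in the Files, Commands section, and the function name stage_commands is made up for this example.

```python
# Boot stages as listed in the table above: rc link name -> cloud-init arguments.
STAGES = [
    ("S01cloud-init-local", ["init", "--local"]),
    ("S02cloud-init", ["init"]),
    ("S03cloud-config", ["modules", "--mode", "config"]),
    ("S04cloud-final", ["modules", "--mode", "final"]),
]

# Assumption: cloud-init is installed in the AIX toolbox location.
CLOUD_INIT = "/opt/freeware/bin/cloud-init"


def stage_commands(debug=False):
    """Return one command line per stage, in boot order."""
    prefix = [CLOUD_INIT] + (["--debug"] if debug else [])
    # Sorting by link name reproduces the S01..S04 ordering used in rc2.d.
    return [" ".join(prefix + args) for _name, args in sorted(STAGES)]


if __name__ == "__main__":
    for line in stage_commands(debug=True):
        print(line)
```

The script only prints the commands; it does not execute cloud-init.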


-------------------------------------------------

/opt/freeware/etc/cloud/cloud.cfg

This is the main cloud-init configuration file. It has 4 sections, which follow each other in sequence. Each section contains several modules, which are run during boot up:

- (S01) At the top of the file, there are some parameters which will be needed later: like ssh settings, configsource etc.
- (S02) cloud_init_modules: Modules that run early on in the boot sequence (these don't rely on network): write-files, set_hostname, restore-volume-groups, update-bootlist, reset-rmc etc.
- (S03) cloud_config_modules: Modules that run after network comes up: disk_setup, mounts, locale, set-password, timezone, puppet etc.
- (S04) cloud_final_modules: Modules that run in final stage (after config modules): scripts-per-once/boot/instance, final-message, power-state-change 


cloud-init-local (Data Sources - ConfigDrive):

The first part of cloud-init runs as soon as possible (once / is mounted read-write). The purpose of this first part (called the "local" stage) is to find the datasource and determine the network configuration to be used. Data Sources define how the instance metadata is pulled during boot. The instance metadata consists of instance-specific details such as hostname, ip, network etc. The Data Source tells cloud-init which method to use to get this metadata, which depends on the cloud service in use (OpenStack, AWS, OpenNebula etc.).

An example of a Data Source list:
datasource_list: [ NoCloud, ConfigDrive, OpenNebula, Azure, AltCloud, OVF, MAAS, GCE, OpenStack, CloudSigma, Ec2, CloudStack, None ]

This tells cloud-init which modules to load while trying to download instance metadata. If needed, additional parameters may be passed for a datasource:

datasource:
  OpenStack:
    metadata_urls: [ 'http://169.254.169.254:80' ]
    dsmode: net

This tells cloud-init to use the OpenStack datasource, to download metadata from the URL http://169.254.169.254:80, and to run after network initialization.

PowerVC implementations use "ConfigDrive" as the Data Source. ConfigDrive is basically a configuration disk (or device): OpenStack writes metadata to this special device, which is attached to the instance during first boot. The new instance retrieves the necessary metadata by mounting this disk and reading files from it. When a new server is deployed, PowerVC creates a small cdrom iso image (virtual cdrom), which cloud-init uses as the ConfigDrive. This cdrom provides the ip and hostname of the machine, the activation input script, nameserver information etc.

In the cloud-init config file (/opt/freeware/etc/cloud/cloud.cfg) there is this line:
datasource_list: ['ConfigDrive']

During a new server deployment, this cdrom image can be seen on the VIO server (lsrep, lsmap):

Name                                                  File Size Optical         Access
cfg_AIX_610905_ba_2d176992_000000.iso                         1 None            ro
cfg_AIX_710404_pc_67319f7a_000000.iso                         1 None            ro


If you mount this image and check its content, these files can be found there:
cat /mount/openstack/content/0000             <-- this contains the network configuration
cat /mount/openstack/latest/meta_data.json    <-- this contains the server hostname and other properties
cat /mount/openstack/latest/user_data         <-- this contains scripts and config passed as user data
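
Because meta_data.json is plain JSON, it can also be read with a few lines of Python instead of cat. The sketch below assumes the ISO is mounted somewhere and that the file has a "hostname" key; the helper names and the sample values are invented for illustration, so check the real file on your own config drive.

```python
import json
import os
import tempfile


def read_metadata(mount_point):
    """Load openstack/latest/meta_data.json from a mounted config drive."""
    path = os.path.join(mount_point, "openstack", "latest", "meta_data.json")
    with open(path) as handle:
        return json.load(handle)


def make_sample_drive(root):
    """Build a fake config drive layout for trying this out (sample values only)."""
    latest = os.path.join(root, "openstack", "latest")
    os.makedirs(latest)
    sample = {"hostname": "aixtest01", "uuid": "00000000-0000-0000-0000-000000000000"}
    with open(os.path.join(latest, "meta_data.json"), "w") as handle:
        json.dump(sample, handle)
    return root


if __name__ == "__main__":
    # Replace the sample drive with the real mount point (e.g. /mount) on a VM.
    drive = make_sample_drive(tempfile.mkdtemp())
    print(read_metadata(drive).get("hostname"))
```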


cloud-init-modules:
Once the datasource is available, cloud-init can run the modules in the cloud_init_modules section of the config file. This section contains specific modules, for example to set the hostname (set_hostname) or to reset RMC (reset_rmc). In our case this part automatically changes the hostname and the ip address of the machine to the values provided in PowerVC at deployment time.


cloud-config-modules
At this stage the minimal requirements have already been configured by the previous cloud_init_modules stage (dns, ip address, hostname are ok). This stage runs the disk_setup and mounts modules, which may partition and format disks and configure mount points (such as in /etc/fstab). Those modules cannot run earlier, as they may receive configuration input from sources only available via the network. For example, a user may have provided user-data in a network resource that describes how local mounts should be done. Several different kinds of configuration can happen here: mounts, passwords, timezone etc.


cloud-final-modules
This runs as final part of boot. This stage runs as late in boot as possible. Any scripts that a user is accustomed to running after logging into a system should run correctly here. (Package installations, user scripts, final message, last reboot etc.)

-------------------------------------------------

Cloud-init frequency

Most cloud-init Python modules have a default run frequency: per_always (run on every boot), per_instance (run only if it has never run with this specific instance-id), per_once (run only once, regardless of the instance-id).

The modules can be found in /opt/freeware/lib/python2.7/site-packages/cloudinit/config:

ls -l /opt/freeware/lib/python2.7/site-packages/cloudinit/config
-rw-r--r--    1 root     system         5166 Oct 18 2016  cc_puppet.py
-rw-r--r--    1 root     system         3040 Jun 08 2017  cc_puppet.pyc
-rw-r--r--    1 root     system         4632 Oct 18 2016  cc_reset_rmc.py
-rw-r--r--    1 root     system         3888 Jun 08 2017  cc_reset_rmc.pyc
-rw-r--r--    1 root     system         5749 Oct 18 2016  cc_resizefs.py
-rw-r--r--    1 root     system         4842 Jun 08 2017  cc_resizefs.pyc
-rw-r--r--    1 root     system         4052 Oct 18 2016  cc_resolv_conf.py
-rw-r--r--    1 root     system         2401 Jun 08 2017  cc_resolv_conf.pyc
-rw-r--r--    1 root     system         3331 Oct 18 2016  cc_set_hostname_from_dns.py
-rw-r--r--    1 root     system         2709 Jun 08 2017  cc_set_hostname_from_dns.pyc

for example:
cc_update_hostname.py  --> frequency = PER_ALWAYS
cc_set_hostname_from_dns.py -->  frequency = PER_INSTANCE

(If no frequency is provided, PER_INSTANCE is assumed.)

PER_INSTANCE: The module will be run on the first deploy of the VM but never again.
PER_ALWAYS: The module will be run on every boot of the VM, including the first deploy.
PER_ONCE: The module will be run only a single time (i.e., the first time the image is deployed, but not on any other deploys).

It is possible to specify which modules run at which stage and to override the default module frequency:
#cloud-config
cloud_init_modules:
  - seed_random
  - [bootcmd, once]
  - [write-files, instance]
  - growpart
  - resizefs
  - set_hostname
  - update_hostname

Cloud-init keeps track of whether particular modules have been run by using semaphores. These get placed into the /var/lib/cloud/sem and /var/lib/cloud/instances/<instance-id>/sem directories (on AIX under /opt/freeware/var/lib/cloud).
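
To make the interplay of frequency and semaphores more concrete, here is a simplified model in Python. It is not the cloud-init implementation: the function names are invented, the semaphore file name is just the module name, and only the directory layout (a global sem directory for "once", a per-instance sem directory otherwise) follows the description above.

```python
import os
import tempfile

PER_INSTANCE = "once-per-instance"
PER_ALWAYS = "always"
PER_ONCE = "once"


def semaphore_path(cloud_dir, name, frequency, instance_id):
    """Pick the semaphore location for a module (simplified naming)."""
    if frequency == PER_ONCE:
        # Shared by all instances, so it survives a change of instance-id.
        return os.path.join(cloud_dir, "sem", name)
    return os.path.join(cloud_dir, "instances", instance_id, "sem", name)


def should_run(cloud_dir, name, frequency, instance_id):
    """Return True if the module still has to run for this boot."""
    if frequency == PER_ALWAYS:
        return True
    return not os.path.exists(semaphore_path(cloud_dir, name, frequency, instance_id))


def mark_done(cloud_dir, name, frequency, instance_id):
    """Record that a module has run by creating its semaphore file."""
    if frequency == PER_ALWAYS:
        return  # nothing to remember, it runs on every boot anyway
    path = semaphore_path(cloud_dir, name, frequency, instance_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    open(path, "w").close()


if __name__ == "__main__":
    root = tempfile.mkdtemp()  # scratch directory standing in for /var/lib/cloud
    mark_done(root, "config_foo", PER_INSTANCE, "iid-1")
    print(should_run(root, "config_foo", PER_INSTANCE, "iid-1"))
    print(should_run(root, "config_foo", PER_INSTANCE, "iid-2"))
```

This also shows why deleting a semaphore file, or changing the instance-id, makes a per-instance module run again.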

-------------------------------------------------

cloud-init-per

The cloud-init-per utility is provided by the cloud-init package to create boot up commands (or scripts) that run at a specified frequency.

The cloud-init-per utility syntax is as follows: cloud-init-per <frequency> <name> <cmd> 
location on AIX: /opt/freeware/bin/cloud-init-per 
more info: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/bootstrap_container_instance.html#cloud-init-per

-------------------------------------------------

Files, Commands:

/opt/freeware/etc/cloud/cloud.cfg                      main configuration file
/opt/freeware/var/lib/cloud/instance/sem               "semaphore" files are stored here (to show which modules have already been run)
/opt/freeware/lib/python2.7/site-packages/cloudinit/distros/aix_util.py

/var/log/cloud-init-output.log                         cloud-init output log 
                                                       ("grep ^Cloud /var/log/cloud-init-output.log" shows running times)

/opt/freeware/bin/cloud-init -v                        show cloud-init version

/opt/freeware/bin/cloud-init --debug init              run init in debug mode
/opt/freeware/bin/cloud-init single -n set_hostname    run only one module
/opt/freeware/bin/cloud-init --debug modules -m final  run final modules in debug mode

re-run cloud-init:
rm -rf /opt/freeware/var/lib/cloud/instance/sem


-------------------------------------------------

Running scripts during deployment

If you have a script (a usual shell script) that does not require different parameter values on a per-deploy basis, and you do not want to pass the script in on each deploy, you can put the script in the image and have cloud-init call it on each deploy. To do this, place the script in the /var/lib/cloud/scripts/per-instance directory on Linux or the /opt/freeware/var/lib/cloud/scripts/per-instance directory on AIX. Cloud-init will run the script on the first boot of the VM post-deploy.

If the script is needed at every reboot, place it in the "per-boot" directory.

# ls -l /opt/freeware/var/lib/cloud/scripts
drwxr-xr-x    2 root     system          256 Jul 05 13:37 per-boot
drwxr-xr-x    2 root     system          256 Jul 05 13:37 per-instance
drwxr-xr-x    2 root     system          256 Jul 05 13:37 per-once
drwxr-xr-x    2 root     system          256 Jul 05 13:37 vendor


For example: 
1. copy script to per-instance directory: 
/opt/freeware/var/lib/cloud/scripts/per-instance/aix_init.sh

2. enable cloud-init for the next reboot (for testing purposes)
To test the script at the next reboot, remove the cloud-init semaphores and clear the log, so that cloud-init treats the next boot as the first boot (deployment):

rm -rf /opt/freeware/var/lib/cloud/instance/sem                      <--enable cloud-init to run during the next reboot
> /var/log/cloud-init-output.log                                     <--clean up cloud-init log
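
If this reset is done often during testing, the two commands can be wrapped in a small helper. The following is a sketch only: the function name is invented, and the demo runs against a scratch directory rather than the real /opt/freeware/var/lib/cloud/instance/sem and /var/log/cloud-init-output.log paths, which you would pass in yourself.

```python
import os
import shutil
import tempfile


def reset_for_retest(sem_dir, log_file):
    """Remove the semaphore directory and empty the output log."""
    shutil.rmtree(sem_dir, ignore_errors=True)
    # Opening in "w" mode truncates the file, like "> logfile" in the shell.
    with open(log_file, "w"):
        pass


if __name__ == "__main__":
    # Build a scratch copy of the layout instead of touching the real one.
    root = tempfile.mkdtemp()
    sem = os.path.join(root, "instance", "sem")
    os.makedirs(sem)
    open(os.path.join(sem, "config_scripts_per_instance"), "w").close()
    log = os.path.join(root, "cloud-init-output.log")
    with open(log, "w") as handle:
        handle.write("Cloud-init finished\n")
    reset_for_retest(sem, log)
    print(os.path.exists(sem), os.path.getsize(log))
```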


-------------------------------------------------

Testing Modules

Tips for Testing Modules
In order to efficiently develop and debug your modules, you’ll want to “trick” cloud-init into re-running, even when the VM has already booted up. This way, you don’t have to wait for additional deploys, captures, etc. To do this, you need to make cloud-init think that your module has not been run and then make cloud-init re-run it. Cloud-init has a notion of semaphores for modules to ensure that they are only run the right number of times (i.e., their frequency). 

Two workarounds for this (first one is the preferred one):
1. Delete the semaphore file for your module in the /var/lib/cloud/instance/sem/ directory (e.g., /var/lib/cloud/instance/sem/config_foo). Then run `/usr/bin/cloud-init modules --mode <init,config,final>`, depending on what kind of module you are writing, to have cloud-init re-execute your module. Some versions of cloud-init allow you to run a single module at a time as well.
I usually delete the whole sem directory under the instance directory (which is a link), which works well:
lrwxrwxrwx    1 root     system           74 Apr 12 08:45 instance -> /opt/freeware/var/lib/cloud/instances/3976d990-ab42-4939-911a-21f6149170cd
drwxr-xr-x    3 root     system          256 Apr 12 08:45 instances

deleting the sem directory under instance:  rm -rf /opt/freeware/var/lib/cloud/instance/sem

2. Modify the instance ID found in /var/lib/cloud/data/instance-id and then rename the corresponding directory in /var/lib/cloud/instances/ to the new instance ID, then reboot the VM to have cloud-init re-execute your module. 


Re-running modules runs into issues when using PowerVC on PowerVM. This is because information (such as cloud-config data) is passed to cloud-init via an attached virtual optical device. Roughly an hour after the deploy, this virtual optical device is detached from the VM on PowerVM systems. This issue does not exist on PowerKVM systems, as the virtual optical device is never detached from the VM. If cloud-init is installed with ConfigDrive as its datasource and you reboot manually (so the small PowerVC ISO image is no longer attached), cloud-init fails, so no cloud-init cleanup is necessary; otherwise cleanup is needed after the boot.

There are two methods to work around this on PowerVM systems. The first option is the preferred option:
1. Modify /etc/cloud/cloud.cfg (/opt/freeware/etc/cloud/cloud.cfg on AIX) and change the line “datasource_list: ['ConfigDrive']” to “datasource_list: ['None']”. Remember to revert this change when you are finished developing the module and before you capture the VM to create a deployable image. Note that when you make this change, the semaphore file you must delete to allow your module to re-run will be in the /var/lib/cloud/instances/iid-datasource-none/sem directory.

2. You can disable the removal of virtual optical devices by setting “ibmpowervm_iso_detach_timeout” to a negative value in the /etc/nova/nova*.conf configuration file that corresponds to the host you will deploy the VM to, then restarting the nova services. Remember to revert this change and then restart nova services again once you are done developing your module.
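
For the first option, the datasource_list line has to be switched back and forth while developing. The edit can be scripted; the sketch below works on the text of cloud.cfg and leaves reading and writing the file to the caller. The function name set_datasource is made up for this example, and it assumes the file contains a single-line datasource_list entry as shown earlier.

```python
import re

# Matches a whole "datasource_list: ..." line and captures the key part.
_DATASOURCE_LINE = re.compile(r"^([ \t]*datasource_list:[ \t]*).*$", re.MULTILINE)


def set_datasource(cfg_text, name):
    """Return cfg_text with datasource_list set to a single datasource."""
    new_text, count = _DATASOURCE_LINE.subn(
        lambda match: match.group(1) + "['%s']" % name, cfg_text
    )
    if count == 0:
        raise ValueError("no datasource_list line found")
    return new_text


if __name__ == "__main__":
    original = "disable_root: 0\ndatasource_list: ['ConfigDrive']\n"
    testing = set_datasource(original, "None")
    print(testing)
    # Revert before capturing the VM into a deployable image.
    print(set_datasource(testing, "ConfigDrive"))
```

Keeping the change as a function of the text makes it easy to verify that the revert restores the original content.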

Once you are confident in your module, you should capture your VM with the module in place (i.e., Python cc_ file in config/ directory and the module enabled in cloud.cfg), then deploy the VM and verify that your module worked as intended. Also, you should test passing different cloud-config data if your module accepts user input through that method.


-------------------------------------------------

Hostname tips:

If you want to change the host name after the deployment, remove "- update_hostname" from the list of cloud_init_modules. If you do not remove it, cloud-init resets the host name to the originally deployed value when the system is restarted.

set_hostname_from_interface: Use this module to choose the network interface and IP address to be used for the reverse lookup. The valid values are interface names, such as eth0 and en1. On Linux, the default value is eth0. On AIX, the default value is en0.

If you want cloud-init to set the host name, in the /etc/sysconfig/network/dhcp file, set the DHCLIENT_SET_HOSTNAME option to no. If you want cloud-init to set the default route by using the first static interface, which is standard, set the DHCLIENT_SET_DEFAULT_ROUTE option in the /etc/sysconfig/network/dhcp file to no. If you do not set these settings to no and then deploy with both static and DHCP interfaces, the DHCP client software might overwrite the value that cloud-init sets for the host name and default route, depending on how long it takes to get DHCP leases for each DHCP interface.


-------------------------------------------------
