Hiera

Hiera is a "Hierarchical Database" that stores values for variables (key/value pairs). It gives Puppet a separation between code and data. With Hiera we can write Puppet code that looks up the value of a variable in the Hiera database and uses it if it is found there. Puppet classes can request whatever data they need, and the Hiera data acts like a site-wide config file. Because the code and the data are separated, our code is easy to share and reuse.

Without Hiera our class would look like this:
class puppet::params {
 $puppetserver = "puppet.example.com"
}

Here we have a variable that will be needed somewhere else, and Hiera is a good place to keep this kind of variable. Hiera presents itself to Puppet as a function call and searches special YAML or JSON files for the given variable. As an example, one of our Hiera data files (written in YAML) could look like this:
puppetserver: 'puppet.example.com'

Then we can rewrite our puppet code like this:
class puppet::params {
 $puppetserver = hiera('puppetserver')
}

In this example Puppet will use the hiera function to get the string value 'puppet.example.com' and place it into the $puppetserver variable.

Hiera helps us to separate configuration logic from data. It lets us build modules as interchangeable blocks: the details of the configuration (the data) stay in Hiera data files and the logic of the module stays in the Puppet manifests.

==============================

Hiera installation

Hiera is usually installed on the Puppet Master server; it is optional and unnecessary on agent nodes.
(When I installed the Puppet agent on CentOS (dnf install puppet-agent), I could use hiera as well.)

puppet resource package hiera ensure=installed <--installing hiera with Puppet
gem install hiera                              <--installing hiera using ruby

After installation the main Hiera config file, hiera.yaml, is available. This file lists the backends we want to use to store key-value pairs (YAML or JSON) and all the hierarchy levels, from top to bottom, in which Hiera will search for the requested variables. (Each top-level key in the hash must be a Ruby symbol prefixed with a colon (:).)

An example hiera.yaml file
(it is written in the version 3 format; later versions use a different syntax)

---                                           <--these 3 dashes (---) show the start of the document
:backends:                                    <--lists how we want to store key-value pairs (yaml, json...),
  - yaml
:yaml:                                        <--for yaml we can set some configuration settings (like datadir...)
  :datadir: /etc/puppet/hieradata             <--the directory in which to find yaml data source files
:hierarchy:                                   <--lists how search should happen, in which order hierarchy levels are followed
  - "%{::fqdn}"                               <--these are the names of the files where data is stored (in this case like myhost.domain.com.yaml)
  - "%{::custom_location}"
  - common                                    <--the file name here will be common.yaml (/etc/puppet/hieradata/common.yaml)


These files in the hierarchy are called data sources, and they can be:
Static data source:  A hierarchy element without any variables used there (without any interpolation tokens). A static data source will be the same for every node. In the example above, "common" is a static data source, because a virtual machine named web01 and a physical machine named db01 would both use common.

Dynamic data source: A hierarchy element with at least one interpolation token (variable). If two nodes have different values for the variables it references, a dynamic data source will use two different data sources for those nodes. In the example above: the special $::fqdn Puppet variable has a unique value for every node. A machine named web01.example.com would have a data source named web01.example.com.yaml, while a machine named db01.example.com would have db01.example.com.yaml.
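To make the difference between static and dynamic data sources concrete, here is a small Python sketch that turns the hierarchy from the example hiera.yaml into file names for two nodes. It only models the idea and is not Hiera's own code. The function name resolve_hierarchy and the custom_location value "london" are made up for illustration, and skipping entries that resolve to an empty name is an assumption about how unknown variables are handled.

```python
import re

def resolve_hierarchy(hierarchy, facts, extension=".yaml"):
    """Turn hierarchy entries into data source file names for one node."""
    def substitute(entry):
        # Replace each %{name} token with the node's value; unknown names become "".
        return re.sub(r"%\{([^}]+)\}", lambda m: facts.get(m.group(1), ""), entry)

    resolved = [substitute(entry) for entry in hierarchy]
    # Assumption: an entry that resolves to an empty name has no data source, so it is skipped.
    return [name + extension for name in resolved if name]

hierarchy = ["%{::fqdn}", "%{::custom_location}", "common"]

# web01 has no custom_location, db01 has a hypothetical one.
web01 = resolve_hierarchy(hierarchy, {"::fqdn": "web01.example.com"})
db01 = resolve_hierarchy(
    hierarchy, {"::fqdn": "db01.example.com", "::custom_location": "london"}
)

print(web01)
print(db01)
```

Both nodes end with common.yaml (the static data source), while the entries built from variables differ per node.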

==============================

Backends

A backend is the part of a computer system or application that is not directly accessed by the user. It is typically responsible for storing and manipulating data. In Hiera the backends determine how the actual key-value pairs are stored (for example in YAML or JSON files). Hiera searches these files and returns the data to the user.

The two main types which can be used are yaml and json. (It is possible to use other backends or to write our own.) For each listed backend, the datadir is specified. This is the directory where our yaml (or json) data source files are stored. It is possible to use variables (like %{variable}) in datadir, for example: /etc/puppet/hieradata/%{::environment}, so we can keep our production and development data separate.

------------------
Multiple Backends
We can specify multiple backends as an array in hiera.yaml. Hiera will give priority to the first backend, and will check every level of the hierarchy in it before moving on to the second backend.
For example, in the following hiera.yaml file we use the yaml and json backends (in this order):
---
:backends:
  - yaml
  - json
:yaml:
  :datadir: /etc/puppet/hieradata
:json:
  :datadir: /etc/puppet/hieradata
:hierarchy:
  - one
  - two
  - three

If we search for something in the hierarchy, then hiera will check files in this order:
one.yaml
two.yaml
three.yaml
one.json
two.json
three.json
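
The same ordering can be written as two nested loops, with the backends in the outer loop. This is only a Python sketch of the rule described above; the function name search_order is made up.

```python
def search_order(backends, hierarchy):
    """List data source files in the order they would be checked."""
    # The outer loop is over backends: every hierarchy level of the first
    # backend is checked before the second backend is consulted at all.
    return [f"{level}.{backend}" for backend in backends for level in hierarchy]

order = search_order(["yaml", "json"], ["one", "two", "three"])
print("\n".join(order))
```

Swapping the two list arguments' order in hiera.yaml (json first) would put all the .json files in front.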
------------------
==============================

Hierarchies

Hiera uses an ordered hierarchy to look up data, and this hierarchy is written in the hiera.yaml file. Each element in the hierarchy must be a string, which may or may not include variables (interpolation tokens). Hiera will treat each element in the hierarchy as the name of a data source.

For example:
:hierarchy: 
  - "%{::fqdn}"
  - common

Hiera uses Puppet facts (like fqdn). If we use these facts as variables in the hierarchy definitions, we can create separate yaml files (server1.yaml, server2.yaml... based on fqdn), each with its own configuration values for one server (server1 needs this package, server2 needs another package). Remove Puppet's $ (dollar sign) prefix when using its variables in Hiera. (That is, a variable called $::clientcert in Puppet is called ::clientcert in Hiera.) Puppet variables can be accessed by their short name or qualified name.

Each element in the hierarchy resolves to the name of a data source (myhost.example.com.yaml, common.yaml). Hiera will check these data sources in order, starting with the first. If a data source in the hierarchy doesn't exist (the yaml file was deleted), Hiera will move on to the next data source. If a data source exists but does not have the piece of data Hiera is searching for, it will also move on to the next data source (it first checks myhost.example.com.yaml, and if the data is not found there it checks common.yaml). If a value is found in a normal (priority) lookup, Hiera will stop and return that value. If Hiera goes through the entire hierarchy without finding a value, it will use the default value if one was provided, or fail with an error.
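
The priority lookup rules can be sketched in a few lines of Python. This is a model of the behaviour described above, not Hiera's implementation: the data sources are plain dictionaries instead of files, and the keys ntp_server and timezone with their values are invented for the example.

```python
def priority_lookup(key, sources, data, default=None):
    """Return the first value found for `key`, checking `sources` in order."""
    for name in sources:
        content = data.get(name)
        if content is None:      # data source does not exist: move on
            continue
        if key in content:       # first match wins: stop searching
            return content[key]
    if default is not None:      # nothing found: fall back to the default
        return default
    raise LookupError(f"no value found for {key!r}")

# Hypothetical content of two data sources.
data = {
    "myhost.example.com.yaml": {"ntp_server": "10.0.0.1"},
    "common.yaml": {"ntp_server": "pool.ntp.org", "timezone": "UTC"},
}
sources = ["myhost.example.com.yaml", "common.yaml"]

node_value = priority_lookup("ntp_server", sources, data)           # found in the first source
common_value = priority_lookup("timezone", sources, data)           # falls through to common.yaml
fallback = priority_lookup("unknown_key", sources, data, "1111")    # default is used
print(node_value, common_value, fallback)
```

Calling priority_lookup("unknown_key", sources, data) without a default raises LookupError, which stands in for the error mentioned above.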

For example, the list below shows which data sources (yaml files) are searched, and in which order, for the node db01.example.com.
(The order follows from the hierarchy levels in hiera.yaml and the facts of the node that are used during the search.)


So the final hierarchy in this example:
1. db01.example.com.yaml
2. development.yaml
3. common.yaml


(There are other search mechanisms. For example, Hiera can continue past the first occurrence, search through all the hierarchy levels, and at the end combine the different values into an array. This is called the Array merge lookup method. There is another method called Hash merge, which I never used.)
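
To show how an array merge lookup differs from a priority lookup, here is a small Python sketch. It is a model only: the data sources are dictionaries, the key packages and its values are invented, and the detail that duplicate values are dropped from the result is my assumption about the merge.

```python
def array_lookup(key, sources, data):
    """Collect the values for `key` from every data source into one list."""
    merged = []
    for name in sources:
        value = data.get(name, {}).get(key)
        if value is None:        # missing file or missing key: nothing to add
            continue
        items = value if isinstance(value, list) else [value]
        for item in items:
            if item not in merged:   # assumption: duplicates are kept only once
                merged.append(item)
    return merged

# Hypothetical content for the three data sources of db01.example.com.
data = {
    "db01.example.com.yaml": {"packages": ["postgresql"]},
    "development.yaml": {"packages": ["gdb", "vim"]},
    "common.yaml": {"packages": ["vim", "curl"]},
}
sources = ["db01.example.com.yaml", "development.yaml", "common.yaml"]

packages = array_lookup("packages", sources, data)
print(packages)
```

A priority lookup on the same data would return only the list from db01.example.com.yaml.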

==============================

Data Sources (YAML, JSON)

YAML
The yaml backend looks for data sources on disk, in the directory specified in its :datadir setting. It expects each data source to be a text file containing valid YAML data, with a file extension of .yaml. No other file extension (e.g. .yml) is allowed.

yaml data format examples:

---
# array
apache-packages:
  - apache2
  - apache2-common
  - apache2-utils

# string
apache-service: apache2

# interpolated facter variable
hosts_entry: "sandbox.%{fqdn}"

# hash
sshd_settings:
  root_allowed: "no"
  password_allowed: "yes"

# alternate hash notation
sshd_settings: {root_allowed: "no", password_allowed: "yes"}

# unquoted yes/no are returned as booleans ("true" or "false")
sshd_settings: {root_allowed: no, password_allowed: yes}

-------------------------------

JSON


The json backend looks for data sources on disk, in the directory specified in its :datadir setting. It expects each data source to be a text file containing valid JSON data, with a file extension of .json. No other file extension is allowed.

json data format examples:

{
  "apache-packages" : [
    "apache2",
    "apache2-common",
    "apache2-utils"
  ],

  "hosts_entry" : "sandbox.%{fqdn}",

  "sshd_settings" : {
    "root_allowed" : "no",
    "password_allowed" : "no"
  }
}
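
A quick way to check that a JSON data source is valid, and to see which data types come out of it, is to parse it with any JSON library. The sketch below uses Python's standard json module on the example above. It only demonstrates the data format, it is not how Hiera reads the file, and the %{fqdn} token stays as plain text because only Hiera interpolates it.

```python
import json

# The JSON data source from the example above, as a string.
source_text = """
{
  "apache-packages" : ["apache2", "apache2-common", "apache2-utils"],
  "hosts_entry" : "sandbox.%{fqdn}",
  "sshd_settings" : {
    "root_allowed" : "no",
    "password_allowed" : "no"
  }
}
"""

data = json.loads(source_text)   # raises json.JSONDecodeError if the text is not valid JSON

# Array, string and hash values map to list, str and dict.
for key, value in data.items():
    print(key, type(value).__name__)

# Quoted "no" stays a string; it is not turned into a boolean.
print(repr(data["sshd_settings"]["root_allowed"]))
```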

===============================

Commands:

puppet lookup <variable>                       it will search for the given variable

hiera <variable>                               search for the given variable in hiera
   -c <yaml conf file>                         path to an alternate hiera.yaml file
   -d                                          debug mode

hiera my_var ::fqdn=localhost.localdomain      search for the variable (my_var), supplying a value for ::fqdn, which is used in the hierarchy in the hiera.yaml file
$gccs = hiera('gcc::versions', undef)          in Puppet code a variable can get its value using hiera (if hiera does not find the variable, it will get the undef value)


===============================

My test setup

hiera config file:
# cat hiera.yaml
---
:backends:
  - yaml
:yaml:
  :datadir: /root/hieradata
:hierarchy:
  - "node/%{::fqdn}"
  - "osfamily/%{osfamily}"
  - common

(I checked the current fqdn with facter; it showed localhost.localdomain, so I created a yaml file with that name.)

directory and file structure:
/root/hieradata
├── node
│   └── localhost.localdomain.yaml
├── osfamily
│   ├── Debian.yaml
│   └── RedHat.yaml
└── common.yaml


content of yaml files:
# cat localhost.localdomain.yaml
my_var: node
gcc_version:
 - '6.4.0'
 - '8.3.0'
 - '9.1.0'

# cat Debian.yaml
"tools::working_dir" : "/opt/debian"
my_var: debian

# cat RedHat.yaml
"tools::working_dir" : "/opt/redhat"

# cat common.yaml
my_var: common


Test results searching for variable: my_var

# hiera my_var                                     <--without any specification it will be found in common.yaml
common

# hiera -d my_var ::fqdn=localhost.localdomain     <--with debug mode and specifying where to look
DEBUG: 2020-03-20 17:41:17 +0100: Hiera YAML backend starting
DEBUG: 2020-03-20 17:41:17 +0100: Looking up my_var in YAML backend
DEBUG: 2020-03-20 17:41:17 +0100: Looking for data source node/localhost.localdomain
DEBUG: 2020-03-20 17:41:17 +0100: Found my_var in node/localhost.localdomain
node

# hiera my_var osfamily=Debian                    <--specifying in osfamily which yaml file to check
debian

# hiera my_var osfamily=RedHat                    <--same as above, but in RedHat.yaml my_var is missing, found in common.yaml
common

# hiera tools::working_dir osfamily=Debian        <--checking the value of a variable in a class
/opt/debian

# hiera unknon_var                                <--if variable does not exist
nil

# hiera unknon_var 1111                           <--if variable does not exist, the given default value is returned
1111


================================



