Writing new data backends
Sections
You can extend Hiera to look up values in data sources, for example, a PostgreSQL database table, a custom web app, or a new kind of structured data file.
To teach Hiera how to talk to other data sources, write a custom backend.
Custom backends overview
A backend is a custom Puppet function that accepts a particular set of arguments and whose return value obeys a particular format. The function can do whatever is necessary to locate its data.
A backend function uses the modern Ruby functions API or the Puppet language. This means you can use different versions of a Hiera backend in different environments, and you can distribute Hiera backends in Puppet modules.
Different types of data have different performance characteristics. To make
sure Hiera performs well with every type of
data source, it supports three types of backends: data_hash
, lookup_key
, and data_dig
.
data_hash
data_hash
backend type if: The cache is alive for the duration of one compilation
The data is small
The data can be retrieved all at once
Most of the data gets used
The data is static
For more information, see data_hash backends.
lookup_key
lookup_key
backend
type if:The data set is big, but only a small portion is used
The result can vary during the compilation
hiera-eyaml
backend is
a lookup_key
function, because
decryption tends to affect performance; as a given node uses only a subset of the available
secrets, it makes sense to decrypt only on-demand.For more information, see lookup_key backends.
data_dig
For data sources that can access arbitrary elements of hash or array values before passing anything back to Hiera, like a database.
For more information, see data_dig backends.
The RichDataKey
and RichData
types
To simplify backend
function signatures, you can use two extra data type aliases: RichDataKey
and RichData
. These are only available to backend functions called by Hiera; normal functions and Puppet code can not use them.
For more information, see custom Puppet functions, the modern Ruby functions API.
data_hash
backends
A data_hash
backend function reads an entire data source at once, and returns its
contents as a hash.
data_hash
functions.
You can view their source on GitHub:
Arguments
data_hash
function with two
arguments: -
A hash of options
-
The options hash will contain a
path
when the entry in hiera.yaml is usingpath
,paths
,glob
,globs
, ormapped_paths
, and the backend will receive one call per path to an existing file. When the entry inhiera.yaml
is usinguri
oruris
, the options hash will have auri
key, and the backend function is called once per given uri. Whenuri
oruris
are used, Hiera does not perform an existence check. It is up to the function to type the options parameter as wanted.
-
-
A
Puppet::LookupContext
object
Return type
The function must either call the context object’s not_found
method, or return a hash of lookup
keys and their associated values. The hash can be empty.
function mymodule::hiera_backend(
Hash $options,
Puppet::LookupContext $context,
)
Copied!
Ruby example signature:
dispatch :hiera_backend do
param 'Hash', :options
param 'Puppet::LookupContext', :context
end
Copied!
The returned hash can include the lookup_options
key to configure merge behavior
for other keys. See Configuring merge behavior in Hiera data for more information. Values in the returned
hash can include Hiera interpolation
tokens like %{variable}
or %{lookup('key')}
; Hiera interpolates values as needed.
This is a significant difference between data_hash
and the other two backend types; lookup_key
and data_dig
functions have to explicitly
handle interpolation.
lookup_key
backends
A lookup_key
backend function looks up a single key and returns its value. For
example, the built-in hiera_eyaml
backend is a
lookup_key
function.
You can view its source on Git at eyaml_lookup_key.rb.
Arguments
lookup_key
function with three arguments: A key to look up.
A hash of options.
A
Puppet::LookupContext
object.
Return type
The function must either call the context object’s not_found
method, or return a value for the requested key. It may
return undef as a value.
function mymodule::hiera_backend(
Variant[String, Numeric] $key,
Hash $options,
Puppet::LookupContext $context,
)
Copied!
Ruby example
signature:dispatch :hiera_backend do
param 'Variant[String, Numeric]', :key
param 'Hash', :options
param 'Puppet::LookupContext', :context
end
Copied!
A
lookup_key
function can return a hash for
the the lookup_options
key to configure
merge behavior for other keys. See Configuring merge behavior in Hiera data for more information. To support Hiera interpolation tokens, for example,
%{variable}
or %{lookup('key')}
in your data, call context.interpolate
on your values before returning
them.
data_dig
backends
A data_dig
backend function is similar to a lookup_key
function, but instead of looking up a single key, it looks up a single sequence of keys and
subkeys.
key.subkey
notation. Use
data_dig
types in cases where: Lookups are relatively expensive.
The data source knows how to extract elements from hash and array values.
Users are likely to pass
key.subkey
requests to thelookup
function to access subsets of large data structures.
Arguments
data_dig
function with three arguments: An array of lookup key segments, made by splitting the requested lookup key on the dot (
.
) subkey separator. For example, a lookup forusers.dbadmin.uid
results in['users', 'dbadmin', 'uid']
. Positive base-10 integer subkeys (for accessing array members) are converted to Integer objects, but other number subkeys remain as strings.A hash of options.
A
Puppet::LookupContext
object.
Return type
not_found
method, or return a value for the requested sequence of
key segments. Note that returning undef (nil in Ruby) means that the key was found but that the value for that
key was specified to be undef. Puppet
language example
signature:function mymodule::hiera_backend(
Array[Variant[String, Numeric]] $segments,
Hash $options,
Puppet::LookupContext $context,
)
Copied!
Ruby example
signature:dispatch :hiera_backend do
param 'Array[Variant[String, Numeric]]', :segments
param 'Hash', :options
param 'Puppet::LookupContext', :context
end
Copied!
A data_dig
function can
return a hash for the the lookup_options
key to configure merge behavior for other keys. See Configuring merge behavior in Hiera data for more info.
To support Hiera interpolation
tokens like %{variable}
or %{lookup('key')}
in your data, call context.interpolate
on your values before returning
them.
Hiera calling conventions for backend functions
Hiera uses the following conventions when calling backend functions.
Hiera calls data_hash
one time per data source, calls lookup_key
functions one time per data source for
every unique key lookup, and calls data_dig
functions one time per data source for every unique sequence of
key segments.
However, a given hierarchy level can refer to multiple data sources
with the path
, paths
, uri
, uris
, glob
, and globs
settings. Hiera handles each hierarchy level as follows:
- If the
path
,paths
,glob
, orglobs
settings are used, Hiera determines which files exist and calls the function one time for each. If no files were found, the function is not be called. - If the
uri
oruris
settings are used, Hiera calls the function one time per URI. - If none of those settings are used, Hiera calls the function one time.
Hiera can call a function again for a
given data source, if the inputs change. For example, if hiera.yaml
interpolates a local variable in a file path, Hiera calls the function again for
scopes where that variable has a different value. This has a significant performance
impact, so you must interpolate only facts, trusted facts, and server facts in the
hierarchy.
The options hash
Hierarchy levels are configured in the hiera.yaml
file. When calling a backend function, Hiera passes a modified version of that configuration as
a hash.
path
, glob
,
uri
, ormapped_paths
have been set) the following keys: path
- The absolute path to a file on disk. It is present only ifpath
,paths
,glob
,globs
, ormapped_paths
is present in the hierarchy. Hiera will never call the function unless the file is present.uri
- A uri that your function can use to locate a data source. It is present only ifuri
oruris
is present in the hierarchy. Hiera does not verify the URI before passing it to the function.Every key from the hierarchy level’s
options
setting. List any options your backend requires or accepts. Thepath
anduri
keys are reserved.
cached_file_data
method to read them.For example, the following hierarchy level in hiera.yaml
results in several different options hashes, depending on
such things as the current node’s facts and whether the files exist:
- name: "Secret data: per-node, per-datacenter, common"
lookup_key: eyaml_lookup_key # eyaml backend
datadir: data
paths:
- "secrets/nodes/%{trusted.certname}.eyaml"
- "secrets/location/%{facts.whereami}.eyaml"
- "common.eyaml"
options:
pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
pkcs7_public_key: /etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem
Copied!
The various hashes would all be similar to this:
{
'path' => '/etc/puppetlabs/code/environments/production/data/secrets/nodes/web01.example.com.eyaml',
'pkcs7_private_key' => '/etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem',
'pkcs7_public_key' => '/etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem'
}
Copied!
In your function’s signature, you can validate the options hash by using
the Struct data type to restrict its contents. In particular, note you can disable all of the
path
, paths
, glob
, and globs
settings for your
backend by disallowing the path
key in the
options hash.
For more information, see the Struct data type.
The Puppet::LookupContext
object and methods
To support caching and other backends needs, Hiera provides a Puppet::LookupContext
object.
In Ruby functions, the context object is a normal Ruby object of class Puppet::LookupContext
, and you can call methods with standard Ruby syntax, for example context.not_found
.
In Puppet language functions, the context
object appears as the special data type Puppet::LookupContext
, that has methods attached.You can call the context’s
methods using Puppet’s chained function
call syntax with the method name instead of a normal function call syntax, for example,
$context.not_found
. For methods that take
a block, use Puppet’s lambda syntax
(parameters outside block) instead of Ruby’s block syntax (parameters inside block).
not_
found
()
Tells Hiera to halt this lookup and move on to the next data source. Call this method when your function cannot find a matching key or a given lookup. This method returns no value.
For data_hash
backends, return an empty hash. The empty hash will
result in not_found
, and will prevent
further calls to the provider. Missing data sources are not an issue when using path
, paths
, glob
, or globs
,
but are important for backends that locate their own data sources.
For lookup_key
and data_dig
backends, use not_found
when a requested key is not present in the data source or
the data source does not exist. Do not return undef
or nil
for missing keys,
as these are legal values that can be set in data.
interpolate(value)
Returns the provided value, but with any Hiera interpolation tokens (%{variable}
or %{lookup('key')}
) replaced by their value. This lets you opt-in to allowing Hiera-style interpolation in your backend’s
data sources. It works recursively on arrays and hashes. Hashes can interpolate into both
keys and values.
In data_hash
backends, support for interpolation is built in, and you do not need
to call this method.
In lookup_key
and data_dig
backends, call this method if you want to support interpolation.
environment_name()
Returns the name of the environment, regardless of layer.
module_name()
Returns
the name of the module whose hiera.yaml
called the function. Returns undef
(in Puppet) or nil
(in Ruby) if
the function was called by the global or environment layer.
cache(key, value)
Caches a value, in a per-data-source private cache. It also returns the cached value.
On future lookups in this data source, you can retrieve values by calling
cached_value(key)
. Cached values are
immutable, but you can replace the value for an existing key. Cache keys can be anything
valid as a key for a Ruby hash, including
nil
.
For
example, on its first invocation for a given YAML file, the built-in eyaml_lookup_key
backend reads the whole file and caches
it, and will then decrypt only the specific value that was requested. On subsequent lookups
into that file, it gets the encrypted value from the cache instead of reading the file from
disk again. It also caches decrypted values so that it won’t have to decrypt again if the
same key is looked up repeatedly.
The cache is useful for storing session keys or connection objects for backends that access a network service.
Each Puppet::LookupContext
cache lasts for the duration of the current catalog
compilation. A node can’t access values cached for a previous node.
Hiera creates a separate cache for
each combination of inputs for a function call, including inputs like name
that are configured in hiera.yaml
but not passed to the function. Each hierarchy level has
its own cache, and hierarchy levels that use multiple paths have a separate cache for each
path.
If any inputs to a function change, for example, a path interpolates a local variable whose value changes between lookups, Hiera uses a fresh cache.
cache_all(hash)
Caches
all the key-value pairs from a given hash. Returns undef
(in Puppet) or nil
(in Ruby).
cached_value(key)
Returns a previously cached value from the per-data-source private cache. Returns undef
or nil
if no value with this name has been cached.
cache_has_key(key)
Checks whether the cache has a value for a given key yet. Returns true
or false
.
cached_entries()
Returns everything in the per-data-source
cache as an iterable object. The returned object is not a hash. If you want a hash, use
Hash($context.all_cached())
in the Puppet language or Hash[context.all_cached()]
in Ruby.
cached_file_data(path)
Puppet syntax:
cached_file_data(path) |content| { ...
}
Ruby syntax:
cached_file_data(path) {|content| ...}
For best performance, use this method to read files in Hiera backends.
cached_file_data(path) {|content| ...}
returns the content
of the specified file as a string. If an optional block is provided, it passes the content
to the block and returns the block’s return value. For example, the built-in JSON backend
uses a block to parse JSON and return a
hash:context.cached_file_data(path) do |content|
begin
JSON.parse(content)
rescue JSON::ParserError => ex
# Filename not included in message, so we add it here.
raise Puppet::DataBinding::LookupError, "Unable to parse (#{path}): #{ex.message}"
end
end
Copied!
On repeated access to a given file, Hiera checks whether the file has changed on disk. If it hasn’t, Hiera uses cached data instead of reading and parsing the file again.
This method does not use the
same per-data-source
caches as cache(key, value)
and similar methods. It uses a
separate cache that lasts across multiple catalog compilations, and is tied to Puppet Server’s environment cache.
Since the cache can outlive a given node’s catalog compilation, do not do any
node-specific pre-processing (like calling context.interpolate
) in this method’s block.
explain() { ‘message’ }
Puppet syntax:
explain() || { 'message'
}
Ruby syntax:
explain() { 'message' }
In both Puppet and Ruby, the provided block must take zero arguments.
explain() {
'message' }
adds a message, which appears in debug messages or when using puppet
lookup --explain. The block provided to this function must return a string.
The explain method is useful for complex lookups where a function tries several different things before arriving at the value. The built-in backends do not use the explain method, and they still have relatively verbose explanations. This method is for when you need to provide even more detail.
Hiera never executes the explain block unless explain is enabled.