
Learn About stdlib: A Hidden Puppet Gem

Stdlib is one of the hidden gems of the Puppet module ecosystem. It's a module designed to provide a standard library of resources for developing Puppet modules. In essence, it provides a whole lot of functions to make writing manifests easier.

This blog post will walk through a fictitious Puppet implementation to highlight a few helpful resources that stdlib provides. Let's assume we are running a nuclear power plant and want to use Puppet to automate some of our systems ...


Our reactor module has a parameter called config_file that takes the full path of the configuration file for our reactor:

class { 'reactor':
  config_file => '/etc/app/critical_configuration.conf',
}

We decide that the module itself should make sure /etc/app exists so the file has a place to live. However, we don't know in advance what the configuration file will be called, and we don't want to write clever parsing code to chop the last component off $config_file. Enter dirname, which allows us to write:

$directory = dirname($config_file)
file { $directory:
  ensure => directory,
}

When we pass a full path to dirname(), it'll figure out how to appropriately chop off the filename and let us work with the remainder of the path.
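As a small illustration of what dirname hands back, here is a sketch with hard-coded values. The second path is a hypothetical example that does not appear in our reactor module.

```puppet
# The path from our class declaration above.
$config_file = '/etc/app/critical_configuration.conf'
$directory   = dirname($config_file)    # '/etc/app'

# Hypothetical bare filename: there is no directory part to keep.
$here = dirname('reactor.conf')         # '.'
```

One thing to keep in mind: a file resource with ensure => directory does not create missing parent directories, so this works for /etc/app because /etc already exists.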


Our reactor class from earlier has added a feature that notifies a number of people when a problem occurs. To accomplish this, we pass an array into the class and build the notification list from it.

class { 'reactor':
  notifications => [ 'engineer1', 'engineer2', 'engineer3' ],
}

In the module we have:

file { '/etc/app/notifications.conf':
  ensure  => present,
  content => template('reactor/notifications.conf.erb'),
}

This template takes each element of the array and writes them, one by one, on separate lines. Unfortunately, the reactor application isn't very smart. When it wants to notify users of a problem it starts by notifying the top person, and then, if they don't reply, it slowly works down the list. This means Engineer1 is drowning in alerts while the others, Engineer2 and Engineer3, are carousing at the local bar.

We can fix this! fqdn_rotate rotates an array a random number of times, seeded by the node's FQDN. This means that every node orders the array slightly differently, helping to distribute the alerts across all three engineers.

Because the rotation is derived from the hostname, it stays the same from one run to the next: future Puppet runs keep the same ordering and won't keep rewriting the notifications list.
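A minimal sketch of how this might look inside the module. The variable name $rotated_notifications is our own invention, and we assume the template is changed to iterate over it instead of the raw parameter.

```puppet
# In the article this array arrives as the class parameter.
$notifications = ['engineer1', 'engineer2', 'engineer3']

# Same elements, but the starting point depends on this node's FQDN.
$rotated_notifications = fqdn_rotate($notifications)

# Log the result during compilation so we can see what this node got.
notice($rotated_notifications)
```

The elements are only rotated, never shuffled, so with three engineers there are just three possible orderings to spread across the fleet.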


Our reactor system needs to check in with the monitoring system and let it know things are OK. In the past, one of our coworkers hardcoded the IP of the monitoring system into the application and, when we moved it, things broke and bad things happened. We decide to mitigate this risk by disallowing IP addresses in the module. When called with:

class { 'reactor':
  monitoring_system => '',
}

We protect ourselves with:

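if is_ip_address($monitoring_system) {
  fail("No way! Not after the last chewing out.")
} else {
  class { 'reactor::monitor':
    monitoring_system => $monitoring_system,
  }
}

The function we want here is is_ip_address, which tests whether a string is a valid IP address. The similarly named has_ip_address answers a different question: whether the node itself has a given address on one of its interfaces. The flipside of is_ip_address is also quite useful: is_domain_name.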
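Here is a hedged sketch that combines both checks. It uses stdlib's is_ip_address, which tests whether a string is a valid IP address, together with is_domain_name. The hostname is a made-up value, and we would not rely on is_domain_name alone to reject addresses, since its handling of all-numeric names may differ between stdlib releases.

```puppet
# Hypothetical value; in the article this arrives as a class parameter.
$monitoring_system = 'monitor.reactor-corp.com'

# Refuse anything that is an IP address or does not look like a domain name.
if is_ip_address($monitoring_system) or ! is_domain_name($monitoring_system) {
  fail("monitoring_system must be a hostname, got '${monitoring_system}'")
}
```

Both functions are marked as deprecated in later stdlib releases, so it's worth checking the documentation for the version you run.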


Our development team has released yet another version of the reactor, codenamed "No meltdowns since 2013." We start working on the Puppet updates needed to deploy the new release. They've added a new access control system so monitoring alerts can be viewed in the browser rather than by checking the emailed alerts we set up earlier. Unfortunately for us, consistency wasn't part of this design: the configuration entry goes into critical_configuration.conf and looks like:

monitoring_acl = engineer1,engineer2,engineer3

Luckily, we already have that information in our notifications parameter (['engineer1', 'engineer2', 'engineer3']), but it's in the form of an array. We could write a bunch of code in the template to iterate over the array and write out the entries, but stdlib makes this easy for us with join. We can simply write:

$monitoring_acl = join($notifications, ",")

This statement gives us the format we need for the configuration file by taking each entry of the array and joining them together with a ','. We leave early and join the engineers at the bar.
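To make the result concrete, here is a sketch with the array hard-coded. The $acl_line variable is only there for illustration; in the real module the line would be produced by the template.

```puppet
$notifications  = ['engineer1', 'engineer2', 'engineer3']

# Join the elements with a comma and no surrounding spaces.
$monitoring_acl = join($notifications, ',')   # 'engineer1,engineer2,engineer3'

# The line as it should appear in critical_configuration.conf.
$acl_line = "monitoring_acl = ${monitoring_acl}"
```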


As time passes, even more functionality slips into our reactor. A hotshot developer has created a new configuration file that requires us to pass in a whole bunch of information to populate its template. We realize that most of the information we need is available as a YAML file that we can download each night onto our Puppet master.

Rather than trying to define a large hash in Puppet and manually copying the information in, we can simply borrow the YAML file and write:

$configuration_data = loadyaml('/etc/puppet/data/downloaded_report.yaml')

Doing this will load the enormous nested hash contained in the report and make it natively available in a variable within Puppet. We can then simply access @configuration_data['thing'] in the template and be finished in no time.
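Since the file is downloaded from elsewhere, a little defensive checking may be worthwhile. This is a sketch only: it assumes the report's top level is a hash and reuses the 'thing' key mentioned above.

```puppet
# Functions run during catalog compilation, so the file must live on the master.
$configuration_data = loadyaml('/etc/puppet/data/downloaded_report.yaml')

# Abort compilation early if the report did not parse into a hash.
validate_hash($configuration_data)

# Individual values can also be pulled out in the manifest itself.
$thing = $configuration_data['thing']
```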


(Of all the functions we're looking at in this post, pick is my favorite!)

There is a tricky piece of logic in our reactor that we must model in Puppet. We want to set the maximum amount of disk storage allowed for backups, but due to regulatory requirements we have a number of criteria to consider. We write some facts: $critical_space, $important_space, and $junk_space. Each fact checks the /etc/app/importance.conf file, which contains the word 'critical', 'important', or 'junk', and returns the appropriate amount of disk space only if that word matches its own name. Otherwise the fact is left unset.

We want an easy way to select the correct amount of disk space, as well as leave a backup option if someone sets importance.conf to 'development', not knowing about these facts. We can use pick for all of this.

$diskspace = pick($critical_space, $important_space, $junk_space, '1G')

pick returns the first value that is neither undefined nor an empty string. If every earlier entry is undefined or empty, it returns the last value in pick() as the default.

(pick is commonly used to provide defaults. Puppetlabs-mysql uses it in pick($root_home, '/root') so that pluginsync failures won't blow up your MySQL installation.)
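A short sketch of that selection logic, with the facts replaced by hard-coded stand-ins. The '20G' value is invented for the example.

```puppet
# Stand-ins for the facts: two are unset or empty, one has a value.
$critical_space  = undef
$important_space = ''
$junk_space      = '20G'

# undef and '' are skipped, so the first real value wins.
$diskspace = pick($critical_space, $important_space, $junk_space, '1G')   # '20G'

# With nothing else usable, the trailing default is returned.
$fallback = pick(undef, '', '1G')   # '1G'
```

If every argument is undefined or empty, pick fails the compilation, which is why it is worth always ending the list with a literal default.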


Once again, we need to work with our notifications array. A feature has been added that displays the current on-call user; the output should be engineer1@reactor-corp.com. Unfortunately, our notifications array is ['engineer1', 'engineer2', 'engineer3'] and doesn't include an email address.

Here suffix saves the day by allowing us to use

$oncall_list = suffix($notifications, '@reactor-corp.com')

to populate $oncall_list with ['engineer1@reactor-corp.com']-style entries.
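Spelled out with the array hard-coded, the result might look like this. The $oncall_now variable is a hypothetical name for whoever is first on the list.

```puppet
$notifications = ['engineer1', 'engineer2', 'engineer3']

# Append the mail domain to every element of the array.
$oncall_list = suffix($notifications, '@reactor-corp.com')
# ['engineer1@reactor-corp.com', 'engineer2@reactor-corp.com', 'engineer3@reactor-corp.com']

# The first entry is the person currently on call.
$oncall_now = $oncall_list[0]   # 'engineer1@reactor-corp.com'
```

stdlib also ships a matching prefix function for the opposite end of each string.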


We've added a new piece of functionality that requires us to deploy many small hosts around the world to help monitor our reactor and make sure things are looking good. We have access to the most recent build's hostname in Puppet via a variable called $most_recent_host, but unfortunately we have to list every single hostname in the configuration file.

We don't want to constantly update the template every time a new host is added, and we don't want to have to build a giant array of hostnames. Is there anything we can do? range comes to the rescue and allows us to build our hostlist!

# $most_recent_host contains monitoring-host95
$hostlist = range("monitoring-host01", $most_recent_host)

Now an array with 95 entries is returned, from monitoring-host01 to monitoring-host95.
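Here is the same idea on a smaller scale, assuming the most recent host is number 03 so the whole result fits on one line.

```puppet
# Hypothetical value; normally this comes from the build system.
$most_recent_host = 'monitoring-host03'

$hostlist = range('monitoring-host01', $most_recent_host)
# ['monitoring-host01', 'monitoring-host02', 'monitoring-host03']
```

The numbering relies on both endpoints sharing the same prefix and the same zero-padded width, so it's worth keeping the hostname scheme consistent.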


Another day, another feature at the reactor. This time we've used a trick from earlier (loadyaml) to acquire a huge number of tuning parameters for the reactor, as determined by our development team. The array looks like ['blowup=1.0', 'callthepress=0.5'].

We want to iterate over this array and write each value into an /etc/app/performance.conf file, but we notice that the developers have included duplicate values! The application isn't robust and if it tries to set an already-set value, it segfaults and the alarm klaxons start screaming out the sounds of trouble.

Luckily, unique lets us filter these duplicates out with a simple:

$tuning_params = loadyaml('/app/tuning.yaml')
$unique_params = unique($tuning_params)

We can now use $unique_params safely in our template without any further segfaults!
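A quick sketch of what unique does to an array shaped like the one above, with a duplicate added by hand:

```puppet
# The same setting appears twice in the developers' list.
$tuning = ['blowup=1.0', 'callthepress=0.5', 'blowup=1.0']

# Exact duplicates are dropped; the first occurrence keeps its position.
$deduplicated = unique($tuning)   # ['blowup=1.0', 'callthepress=0.5']
```

Note that only identical strings count as duplicates. Two entries such as 'blowup=1.0' and 'blowup=2.0' would both survive, so conflicting values for the same setting still need to be caught some other way.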


We care deeply about safety at Reactor Corp and the last thing we want is to allow incorrect values into our configuration file. Luckily, in our main reactor class we can use validate, which is actually a whole family of functions. The validate_ functions ensure that the parameters and data passed into our classes match what we expect, enabling us to do things like:

validate_absolute_path($config_file)
validate_array($notifications)
validate_re($oncall_list, '@reactor-corp\.com$')

With these checks we ensure that our config_file parameter is an absolute path, that notifications is a list rather than an accidental string, and that our on-call list contains full email addresses.

Most of the validate_ functions are fairly self-explanatory, from validate_bool to validate_hash, but validate_re is a bit more complex than the others. It checks the value against the regular expression given as the last argument. You have to pass the regular expression as a quoted string rather than surrounding it with slashes, which often trips up unaware users. Internally, the function compiles that string into a real regular expression before matching.
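As far as we know, the stdlib releases that type-check validate_re's input expect the first argument to be a single string. One hedged way to check every entry of an array is therefore to validate the elements one at a time. This sketch assumes Puppet 4 or later for the each lambda and a stdlib release that still ships validate_re. The sample addresses are made up.

```puppet
# Hypothetical list; in the article this is built with suffix().
$oncall_list = ['engineer1@reactor-corp.com', 'engineer2@reactor-corp.com']

# Validate each address on its own; the third argument is a custom error message.
$oncall_list.each |$address| {
  validate_re($address, '@reactor-corp\.com$', "Not a reactor-corp address: ${address}")
}
```

Inside single quotes the backslash reaches the regular expression untouched, so \. matches a literal dot rather than any character.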

And with that, our reactor continues to run smoothly and we continue to clock out and head to the bar with confidence.

Just as stdlib helps us run our fictional reactor with aplomb, we hope you, too, can harness the power of functions to improve your own Puppet implementation!