
Guest Post: A Puffy in the Corporate Aquarium, The Sequel

After I wrote an article for the OpenBSD Journal about my job at M:tier, explaining how we deploy OpenBSD in several production setups, I have been asked several times to expand on the part regarding our usage of Puppet. Most of our solution management is handled by Puppet, from unattended installation and configuration to daily maintenance and security updates. For obvious reasons, I cannot go into the tiny details of our setup, but hopefully this article will shed some light on how Puppet improved our deployment of OpenBSD and assorted applications across several sites around the world from a single central place.

Generic Setup

Before going into more technical details, I'll explain how we deploy Puppet itself as well as the recipes and manifests.

Why Puppet?

Part of our job involves managing several customers with sites in different countries. We provide a generic IT solution (servers and desktops) that we have to tweak according to each customer's wishes. From the start it was obvious that maintaining one setup per site would not be possible, as it would involve a lot of duplicate work. After some brainstorming and testing, Puppet was chosen because it allowed us to maintain a generic solution that could still be easily modified using simple variables and "if" constructs. Going this way, we ended up with 95% of the recipes, manifests, and configurations being shared amongst all of our customers; adding a new site or customer is just a matter of configuring the remaining 5%.

All of our Puppet variables are stored within a redundant LDAP setup. We chose LDAP to allow easy modification by the local administrators through a graphical interface. This was important because these people did not know anything about Unix when we first implemented the setup. For them, it is only a matter of opening the LDAP graphical interface and changing a value to have it propagated wherever it needs to be.
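As an illustration, here is a hedged sketch of how such an LDAP-sourced variable can drive a shared recipe; the variable and class names are hypothetical, not our actual setup:

```puppet
# $use_proxy would come from the node's LDAP entry; one generic recipe
# serves every site, and the LDAP value decides what gets applied.
if $use_proxy == "true" {
        include squid
}
```

Flipping the value in the LDAP GUI is all a local administrator has to do; the next Puppet run picks it up.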

The Quick and Dirty Picture

The way it works is as follows: on all sites, we have a synchronisation job that pulls the latest packages from our central repository. One of these packages is called "mtier-puppet" and contains the puppetmaster configuration, recipes, and manifests. It is automatically installed on the puppetmaster that manages the local site. Note that we are using a fully redundant puppetmaster setup: both servers install the same mtier-puppet package, synchronize Puppet certificates using a Puppet job, and share the same virtual IP over CARP (Common Address Redundancy Protocol).

All the files for all customers are stored within a version control system. When a configuration change is requested, a commit to the repository automatically triggers the build of a new mtier-puppet package. This allows us to keep everything in one place, which not only makes it easy to share files between different installations but also allows for quick comparison of the different Puppet configurations. It also means that when an enhancement or fix is needed on one site, all sites benefit from it by touching only one file. We have also implemented a "plocal" recipe that allows the local staff to override their default setup (for testing and/or quickly changing a configuration without having to wait for a new mtier-puppet package).

The way Puppet is run during installation or production is exactly the same. For bootstrapping machines, we use a pxeboot setup that installs only the base OpenBSD system. When this is done, puppetd runs automatically, installs the necessary packages and configurations, then runs some post-installation commands like:
  • 'squid -z' to initialize the Squid proxy cache
  • 'aide -i -V0' to create a reference checksum of the installation
  • 'update-desktop-database', 'gtk-update-icon-cache', etc. for the desktops
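A post-installation step like the Squid cache initialization above can be expressed as an idempotent Puppet resource. This is a minimal sketch, not our actual recipe; the resource title and the "creates" path are illustrative:

```puppet
# Hypothetical sketch: initialize the Squid cache once, right after the
# squid package has been installed. "creates" keeps the exec idempotent,
# so it only runs when the cache directory has not been populated yet.
exec { "squid-init-cache":
        command => "squid -z",
        path    => "/bin:/usr/bin:/usr/sbin:/usr/local/sbin",
        creates => "/var/squid/cache/00",
        require => Package["squid"],
}
```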

Technical Aspects


We use the autosign feature of Puppet to free the local system administrators from the hassle of handling SSL certificates. This way we just have to clear the certificates if a machine gets reinstalled.
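Autosigning is a one-line configuration on the puppetmaster. A sketch, with a placeholder domain:

```
# /etc/puppet/autosign.conf (hypothetical domain)
*.customer.example.com
```

Clearing the certificate of a reinstalled machine is then just a matter of running `puppetca --clean <hostname>` on the puppetmaster.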

Fileserver Modules

Apart from the default fileserver modules, we use several custom ones to serve data from the server (anti-virus signatures, SSL certificates for https, etc).
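Such custom mount points are declared in fileserver.conf; a minimal sketch with hypothetical paths and ACLs:

```
# /etc/puppet/fileserver.conf (paths and allow patterns are illustrative)
[antivirus]
    path /var/puppet/antivirus
    allow *.customer.example.com

[ssl]
    path /var/puppet/ssl
    allow *.customer.example.com
```

Clients can then reference these files with a source such as `puppet:///antivirus/daily.cvd`.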

Custom Modules

* parser (multi_source_template): a modified version of the publicly available multi_source_template function. Our modification makes it possible to create template files with a .plocal extension (see above). These files are always picked first by this function, so when we need a fast way to override the default templates on a machine or a site, we just have to create such a file. The multi-level walk-tree looks like this (using /etc/fstab as an example):
  • customer/site/machine/etc/fstab.plocal
  • customer/site/machine/etc/fstab
  • customer/site/generic/etc/fstab.plocal
  • customer/site/generic/etc/fstab
  • customer/generic/etc/fstab.plocal
  • customer/generic/etc/fstab
  • generic/etc/fstab.plocal
  • generic/etc/fstab
In addition, if a .plocal change gets merged back into the original file, and the MD5 sum of fstab.plocal equals that of fstab, the .plocal file gets removed automatically. This is an extract of our multi_source_template.rb function:
       sources.each do |file|
           Puppet.debug("Looking for #{file} in #{environment}")
           if FileTest.exists?("#{file}")
               if FileTest.exists?("#{file}.plocal")
                   Puppet.info("Found #{file}.plocal in #{environment}")
                   if Digest::MD5.hexdigest(File.read(file)) == Digest::MD5.hexdigest(File.read("#{file}.plocal"))
                       Puppet.info("#{file}.plocal is identical to #{file} in #{environment}, removing #{file}.plocal")
                       File.unlink("#{file}.plocal")
                   else
                       file = "#{file}.plocal"
                   end
               end
           end
       end
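The walk-tree above can be sketched as a plain Ruby function that builds the candidate list handed to multi_source_template. This is an illustration, not the actual M:tier code; all names are made up:

```ruby
# Sketch: generate the multi-level source list for a template path.
# The .plocal variant of each level is always tried first.
def template_sources(templatedir, customer, site, hostname, path)
  levels = [
    "#{templatedir}/customer/#{customer}/#{site}/#{hostname}",
    "#{templatedir}/customer/#{customer}/#{site}/generic",
    "#{templatedir}/customer/#{customer}/generic",
    "#{templatedir}/generic",
  ]
  levels.flat_map { |dir| ["#{dir}/#{path}.plocal", "#{dir}/#{path}"] }
end
```

The first existing file in that list wins, which is exactly the override order shown in the bullet list above.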
* provider (package): the official support for the OpenBSD pkg tools in Puppet is not fully in shape yet, so we had to extend it to support all of our needs, e.g. using a double '-' in a package name to allow for version-less FLAVORs (an OpenBSD-specific concept that is in some ways similar to the USE flags in Gentoo Linux), or forcing the "update" and "updatedepends" flags every time a package gets installed or updated.
require 'puppet/provider/package'

# This package provider is based on the OpenBSD pkg provider from Puppet, with
# some modifications to handle updates and forced installs.

Puppet::Type.type(:package).provide :mtier, :parent => :openbsd, :source => :openbsd do
   include Puppet::Util::Execution
   desc "OpenBSD's form of ``pkg_add`` support."

   commands :pkginfo => "pkg_info", :pkgadd => "pkg_add", :pkgdelete => "pkg_delete"

   defaultfor :operatingsystem => :openbsd
   confine :operatingsystem => :openbsd

    def self.instances
        packages = []

        begin
            execpipe(listcmd()) do |process|
                # our regex for matching pkg_info output
                regex = %r{^(\S+)-([^-\s]+)\s+(.+)}
                hash = {}

                # now turn each returned line into a package object
                process.each { |line|
                    if match = regex.match(line)
                        pkgname, hash[:description] = line.split(/\s+/, 2)
                        pkgname =~ /^(.*?)-(\d.*)$/
                        stem = $1
                        rest = $2.split('-')

                        hash[:name] = stem
                        hash[:ensure] = rest.shift
                        hash[:flavor] = rest.join('-')
                        if hash[:flavor] != ''
                            hash[:name] = hash[:name] + "--" + hash[:flavor]
                        end
                        hash[:provider] = self.name

                        packages << new(hash)
                        hash = {}
                    else
                        # Print a warning on lines we can't match, but move
                        # on, since it should be non-fatal
                        warning("Failed to match line %s" % line)
                    end
                }
            end

            return packages
        rescue Puppet::ExecutionFailure
            return nil
        end
    end

    def self.listcmd
        [command(:pkginfo), "-a"]
    end

    def install
        should = @resource.should(:ensure)

        unless @resource[:source]
            raise Puppet::Error,
                "You must specify a package source for BSD packages"
        end

        if @resource[:source] =~ /\/$/
            withenv :PKG_PATH => @resource[:source], :http_proxy => nil, :ftp_proxy => nil do
                output = pkgadd "-r", "-D", "update", "-D", "updatedepends", @resource[:name]
                if output =~ /Can't find\s*(.+)/
                    raise Puppet::Error, output.chop!
                end
            end
        else
            pkgadd @resource[:source]
        end
    end

    def query
        hash = {}
        info = pkginfo @resource[:name]

        # Search for the version info
        if info =~ /Information for (inst:)?#{@resource[:name]}(-\S+)?/
            hash[:ensure] = $2 ? $2 : $1
        else
            return nil
        end

        # And the description
        if info =~ /Comment:\s*\n(.+)/
            hash[:description] = $1
        end

        return hash
    end

    def uninstall
        pkgdelete @resource[:name]
    end
end

We basically use generic, machine-specific, and service-specific recipes that are evaluated from LDAP, where we store the machines. Our classes.plocal directory also enables us to override the default classes, just as we do for our template files (like fstab.plocal). There is nothing specific about this; we use dependencies and run-stages for some of the recipes, e.g. our "aide" recipe (AIDE is the Advanced Intrusion Detection Environment):
class aide {
        class { paide: stage => post }
}

class paide {
        package { aide:
                ensure => installed,
                source => "$protocol://$server/pub/OpenBSD/$operatingsystemrelease/packages/$hardwaremodel/",
                require => File["/etc/ssl/pkgca.pem"];
        }

        file { "/etc/aide.conf":
                ensure => file,
                owner => root,
                group => wheel,
                mode => 0644,
                require => Package["aide"],
                content => multi_source_template(
                        "$templatedir/customer/$customer/$site/$hostname/etc/aide.conf",
                        "$templatedir/customer/$customer/$site/generic/etc/aide.conf",
                        "$templatedir/customer/$customer/generic/etc/aide.conf",
                        "$templatedir/generic/etc/aide.conf");
        }

        exec { "aide-init":
                command => "aide -i -V0; mv /var/db/aide.db.new.gz /var/db/aide.db.gz",
                timeout => 300,
                user => "root",
                path => "/bin:/sbin:/usr/sbin:/usr/bin:/usr/local/bin",
                unless => "test -f /var/db/aide.db.gz",
                require => [ File["/etc/aide.conf"], Package["aide"] ];
        }
}

Puppet Modifications

* LDAP connection retries: by default, Puppet only tries to connect to LDAP once, then errors out. Our tests indicated that this was not enough, so we had to bump this to at least 10 retries.

* The defnode facter variable: since we have desktop and server machines enlisted in LDAP, all of the nodes are pulled from there. But we have hundreds of workstations, so there is no point in listing them all in LDAP, and we cannot use the default node because that is used to store default settings applicable to all machines. By specifying the "defnode" facter variable on puppetd invocation, the server will use that to perform a Puppet run for a workstation or a laptop. This is the patch that we use for that feature:
--- lib/puppet/indirector/node/ldap.rb.orig     Thu Sep 23 01:17:21 2010
+++ lib/puppet/indirector/node/ldap.rb  Fri Nov 12 11:15:22 2010
@@ -1,3 +1,4 @@
+require 'facter'
 require 'puppet/node'
 require 'puppet/indirector/ldap'

@@ -29,6 +30,9 @@ class Puppet::Node::Ldap < Puppet::Indirector::Ldap
  def find(request)
    names = [request.key]
    names << request.key.sub(/\..+/, '') if request.key.include?(".") # we assume it's an fqdn
+    defnode = Puppet::Node::Facts.find(request.key).values['defnode']
+    names << defnode if defnode
    names << "default"

    node = nil
@@ -174,6 +178,14 @@ class Puppet::Node::Ldap < Puppet::Indirector::Ldap
    parent_info = name2hash(parent) || raise(Puppet::Error.new("Could not find parent node '#{parent}'"))
    information[:classes] += parent_info[:classes]
    parent_info[:parameters].each do |param, value|
+      if (param =~ /^puppet[^class].*$/)
+          if information[:parameters][param].kind_of? String
+              information[:parameters][param] = information[:parameters][param].split("\n")
+          end
+          if information[:parameters][param]
+              information[:parameters][param] += parent_info[:parameters][param].to_a
+          end
+      end
      # Specifically test for whether it's set, so false values are handled correctly.
      information[:parameters][param] = value unless information[:parameters].include?(param)


Due to the high connection rate to the puppetmaster (especially during the installation phase, where dozens of machines are installed concurrently), we had to start using nginx with Mongrel as a frontend for Puppet.
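The relevant part of such a setup is terminating SSL in nginx and passing the client-certificate verification result on to the Mongrel backends. This is a minimal sketch with placeholder paths and ports, not our production configuration:

```
# Hypothetical nginx frontend for two puppetmasterd Mongrel instances.
upstream puppet_mongrel {
    server 127.0.0.1:18140;
    server 127.0.0.1:18141;
}

server {
    listen 8140;
    ssl on;
    ssl_certificate        /etc/puppet/ssl/certs/puppetmaster.pem;
    ssl_certificate_key    /etc/puppet/ssl/private_keys/puppetmaster.pem;
    ssl_client_certificate /etc/puppet/ssl/ca/ca_crt.pem;
    ssl_verify_client      optional;

    location / {
        proxy_pass http://puppet_mongrel;
        # Let puppetmasterd know whether the client certificate verified.
        proxy_set_header X-Client-Verify $ssl_client_verify;
        proxy_set_header X-SSL-Subject   $ssl_client_s_dn;
    }
}
```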

In the End

Since most of the time we use a generic Puppet deployment, we had to pay attention to the way the recipes were evaluated, to make sure the low-end machines (like the firewalls or routers) did not have to spend too much CPU time processing them (remember that we share a lot of manifests and recipes between, e.g., all "servers", which includes firewalls and routers).

Note that all modifications we make to Puppet itself that make sense upstream are merged back into the official OpenBSD package and are available in the ports tree at: http://www.openbsd.org/cgi-bin/cvsweb/ports/sysutils/ruby-puppet/patches/

All in all, it did take us quite some time to get a Puppet environment that was right for our needs, but it was very much worth the effort, because we now have a base that can easily be extended and updated. Moreover, it is now much easier to check the status of the updates on all the machines we manage. As we got more and more acquainted with Puppet, we became really hooked, and there is no way we will go back anytime soon: as far as we are concerned, choosing Puppet ended up being a total win for us.

This article was written by Antoine Jacoutot (ajacoutot@openbsd.org) and Robert Nagy (robert@openbsd.org).