Published on 18 June 2014 by

Last week, I had the pleasure of attending the first-ever DockerCon in San Francisco, an exciting mix of current users reporting what they have done with Docker, a great overview of where Docker is today, and talks outlining Docker’s plans for the future. As acceptance of Docker moves from the bleeding edge to the realm of early adopters, questions about how to manage Docker — especially with configuration management tools like Puppet — come into focus.

At the Puppet Labs booth, my colleagues and I fielded many questions about how the role of configuration management changes in a containerized world. That question (which was also at the heart of my talk at DockerCon) naturally breaks into two issues. On the one hand, Docker hosts need to be managed, Docker Engine needs to be installed, and containers need to be started. On the other hand, building Docker images and handling the inside of running containers could easily send us back to the unhappy days of golden images.

Basic management of Docker hosts is right in Puppet’s wheelhouse, and Gareth Rushgrove’s Docker module on the Puppet Forge can perform all the basic tasks: It installs Docker Engine, sets up the Docker daemon and needed container images, and makes sure the desired containers are running. In other words, the Docker module does for Docker what Puppet does for many other subsystems and services.
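
As a rough illustration of those three tasks, a manifest using the Docker module might look like the sketch below. The resource types and parameters (`docker`, `docker::image`, `docker::run`, `image`, `command`) are taken from the module's documentation and may differ between module versions. The image and container names are placeholders, not part of the original setup.

```puppet
# Minimal sketch, assuming the garethr/docker module is installed on the host.

# Install Docker Engine and manage the Docker daemon.
class { 'docker': }

# Make sure the image the container needs is present on the host.
# 'ubuntu' is a placeholder image name.
docker::image { 'ubuntu': }

# Make sure a container based on that image is running.
# 'helloworld' and the command are illustrative only.
docker::run { 'helloworld':
  image   => 'ubuntu',
  command => '/bin/sh -c "while true; do echo hello world; sleep 1; done"',
}
```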

An interesting issue is that of building container images and managing the contents of running containers. There is a clear advantage to using containers as a building block for running infrastructure in an immutable way: Once a container is launched, its configuration is never modified, and changes are instead made by replacing the container wholesale with an updated version. But even in a world where all containers are run immutably, you still need a configuration management tool — not at runtime, but at image build time.

One of the big challenges of configuration management is composing the setup of any one system from disparate sources in a way that minimizes duplication while maximizing control over both the elements that systems share and those that differ. This issue crops up once your infrastructure starts to grow, and it happens fairly quickly: usually at tens of machines, not thousands. This is where older configuration management tools burdened the user with too much complexity, leading to brittle, hard-to-understand infrastructure. Resolving these issues of setup at scale is at the heart of Puppet’s design. It provides multiple mechanisms for cleanly expressing both what is different and what is common among all the systems in an infrastructure, such as fine-tuning resources based on facts, parameterized classes, and data injection with Hiera.
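
To make those three mechanisms concrete, here is a small sketch written in the Puppet 3 syntax current when this post was published. The class name `ntp_client` and its `servers` parameter are hypothetical and chosen only for illustration. The sketch assumes Hiera is configured, so that a key named `ntp_client::servers` would override the default through automatic parameter lookup.

```puppet
# Hypothetical class, for illustration only.
# Parameterized class: 'servers' is what differs between systems.
# With Hiera configured, a value stored under the key
# 'ntp_client::servers' replaces this default (data injection).
class ntp_client (
  $servers = ['pool.ntp.org'],
) {
  # Fact-based tuning: the service name depends on the OS family.
  $service_name = $::osfamily ? {
    'Debian' => 'ntp',
    default  => 'ntpd',
  }

  # Common to all systems: the package, the config file, the service.
  package { 'ntp':
    ensure => installed,
  }

  file { '/etc/ntp.conf':
    ensure  => file,
    content => inline_template("<% @servers.each do |s| %>server <%= s %>\n<% end %>"),
    require => Package['ntp'],
    notify  => Service[$service_name],
  }

  service { $service_name:
    ensure => running,
    enable => true,
  }
}

include ntp_client
```

What is common lives in the class body, while what differs arrives through the fact, the parameter, or Hiera data, so no system needs its own copy of the manifest.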

When running immutable infrastructure with containers, this core problem of composing the setup from disparate sources is not eliminated, but merely moved from the time containers are run to the time when container images are built. Puppet’s ability to weave the desired state of a container image from disparate sources greatly increases manageability across the entire infrastructure.

Fundamentally, there are two ways to use Puppet during a Docker build: masterless via puppet apply and with a master using Puppet’s agent. In the masterless scenario, the Dockerfile will look something like this:

…
ADD modules /tmp/modules
RUN yum -y install puppet; \
    puppet apply --modulepath=/tmp/modules \
      -e "class { 'nginx': service_ensure => disable }"; \
    rm -rf /tmp/modules
…

Using puppet apply involves two steps: first copying all the files needed during the Puppet run into the build container, and then running puppet apply to have Puppet make all the necessary modifications to the image.

This approach is an easy way to get started with using Puppet, and is the subject of both James Turnbull’s blog post Building Puppet-based applications inside Docker and Thomas Doran’s presentation Taking control of chaos with Docker and Puppet. It has a few drawbacks as things get more complicated. Since it cuts out the puppet master completely, none of the master’s features are available, in particular centralized data collection, which enables comprehensive reporting and sharing resources between nodes through PuppetDB. More importantly, it is likely that not all infrastructure is run in containers, and that a puppet master is used for the non-container part of the infrastructure. That makes container building a one-off operation, separate from how everything else is managed.

It would be more natural to use Puppet’s agent for building container images against the master. The GitHub repo at https://github.com/lutter/puppet-docker contains the details of setting up such a build. After following those instructions, the important lines in the Dockerfile look almost identical to those for using puppet apply:

…
ADD puppet /tmp/puppet-docker
RUN yum -y install puppet; \
/tmp/puppet-docker/bin/puppet-docker
…

The puppet directory on the build host will contain the SSL certificates that will be used during the build. The puppet-docker script puts the SSL certificates in the right place, sets some facts that make it possible to distinguish a container build from other runs of puppet, and finally runs the puppet agent.
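
On the master side, such a fact lets one manifest serve both image builds and ordinary agent runs. The sketch below assumes a fact named `container_build` with the string value `'true'` during a build. That name is hypothetical, and the facts the puppet-docker script actually sets may be named differently. Facter can pick up facts like this from environment variables prefixed with `FACTER_`, for example.

```puppet
# Minimal sketch; 'container_build' is a hypothetical fact name,
# not necessarily one set by the puppet-docker script.

# Shared by image builds and regular nodes: install the software.
package { 'nginx':
  ensure => installed,
}

# Only manage the running service outside of image builds.
# During a build no service should be started; the container's
# command decides what runs when the image is launched.
if $::container_build != 'true' {
  service { 'nginx':
    ensure  => running,
    enable  => true,
    require => Package['nginx'],
  }
}
```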

Puppet and Docker address different aspects of a comprehensive software deployment strategy. Using the techniques laid out here makes it possible to reap the benefits of both Docker and Puppet: application isolation and a concise binary artifact, courtesy of Docker, and seamless management and tight control over the nuances of each system in the entire infrastructure, courtesy of Puppet.

Have any questions about Puppet and Docker together, or interesting insights from your own experience? Please share in the comments below.

David Lutterkort is a principal engineer at Puppet Labs.
