October 20, 2021

Puppet Directory Guide: What Each Directory Does

Like any toolset, Puppet comes with its own conventions. Beyond the Puppet language itself, there are particular directories whose purpose you need to understand to use Puppet effectively. It's not magic, but it can seem that way to new users.

This article explains what each Puppet directory does, so you can find what you're looking for and place files where they need to be.

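What Is a Puppet Directory?

A Puppet directory is one of the locations on disk where Puppet, by convention, expects to find a specific kind of content, such as configuration, code, data, or logs. Together these directories form a hierarchy that gives structure to your files and subdirectories.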

Overview of Puppet Directories

The /etc/puppetlabs Directory

Puppet installs by default into multiple directories. Some of these directories can be moved; others can't be moved (i.e., there's no support for moving them). I’m going to give a quick summary here.

Prior to Puppet 4, the configuration directory was /etc/puppetlabs/puppet for Puppet Enterprise and /etc/puppet for open source Puppet. Now, with the all-in-one agent, the naming scheme is unified under /etc/puppetlabs.

[root@primary puppetlabs]# pwd
/etc/puppetlabs
[root@primary puppetlabs]# tree -L 1
.
├── activemq                 # ActiveMQ configuration for MCollective
├── client-tools             # Orchestration tools configuration
├── code                     # Puppet code. Don't modify directly
├── code-staging             # Used by Code Manager. Don't touch
├── console-services         # Console services configuration
├── enterprise               # Used by PE during install and upgrades
├── license.key              # Enterprise license key
├── mcollective              # MCollective configuration
├── nginx                    # Nginx configuration for Puppet Server
├── orchestration-services   # Orchestration services configuration
├── puppet                   # Puppet agent configuration
├── puppetdb                 # PuppetDB configuration
├── puppetserver             # Puppet Server configuration
├── pxp-agent                # Orchestration agent configuration
└── r10k                     # Code Manager/r10k configuration

There are some minor differences between platforms, but in most cases the other directories and files used by Puppet are:

/opt/puppetlabs          # Puppet internals: binaries, libraries, etc.
/var/log/messages        # The Puppet agent logs here on *nix systems
/var/log/puppetlabs      # All other logging
/tmp                     # Used by the installer; causes issues if mounted noexec

Some of these can be redirected to a different location if absolutely required, but it’s generally discouraged.

Windows Puppet Directory

Most of the file trees are the same on Windows as they are on Linux, except that they are rooted at the following locations:

C:\Program Files\Puppet Labs        # /opt/puppetlabs equivalent
C:\ProgramData\PuppetLabs       # /etc/puppetlabs equivalent

NOTE: The C:\ProgramData folder is hidden by default.

Additionally, instead of writing to a log file, the Puppet agent logs to the Windows Event Log (the Application log).

Configurable Paths

There are a few configurable paths within Puppet. The vast majority of the time, the default path is fine, but it’s useful to understand what the possible paths are, and how they interact.

Modulepath

Puppet code is distributed in modules. Those modules can exist in multiple directories, although in general, you should use only the /etc/puppetlabs/code/environments/$environment/modules directory and the /etc/puppetlabs/code/environments/$environment/site directory if you are using roles and profiles. To see your current modulepath, run the following command:

[root@primary puppet]# puppet config print modulepath

You'll see this:

/etc/puppetlabs/code/environments/production/modules:/etc/puppetlabs/code/environments/production/site:/etc/puppetlabs/code/modules:/opt/puppetlabs/puppet/modules

This can be modified in the global puppet.conf file and the environment-specific environment.conf file.

When Puppet looks for a module, it goes through each of the above directories, in order. If the same module name exists in two directories, Puppet uses the copy it finds first and ignores the other. (You generally don't want a module name to exist in two directories, and if it does, you can expect unusual behavior.)

The example path above includes four directories:

  • /etc/puppetlabs/code/environments/production/modules - Default module directory. The first listed directory is also the default location for puppet module install, r10k and Code Manager to install modules into. This should be utilized for component modules and Forge modules. We recommend that you do not create any modules in here named role or profile.
  • /etc/puppetlabs/code/environments/production/site - A best practice, but not the default setting. This provides a separate directory in which to create two special modules labeled role and profile. This directory can also contain other internally developed modules that follow the release cadence of the control repository.
  • /etc/puppetlabs/code/modules - A global modules directory that’s a holdover from before directory environments existed. There is really no good reason to use this directory, but it's worth mentioning because it’s still within the modulepath.
  • /opt/puppetlabs/puppet/modules - This is where Puppet puts modules that are essential for Puppet Enterprise. For stability and to ensure smooth upgrades, we ask that you keep these exactly as we installed them. All of these modules are named pe-<name> to prevent namespace collisions with other modules.
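The first-match lookup described above can be sketched in a few lines of Python. This is only an illustration of the search order, not Puppet's actual implementation; the find_module helper and the throwaway demo directories are hypothetical.

```python
import os
import tempfile
from typing import Optional


def find_module(modulepath: str, name: str) -> Optional[str]:
    """Return the first directory on the modulepath that contains `name`."""
    for directory in modulepath.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if os.path.isdir(candidate):
            return candidate  # first match wins; later copies are ignored
    return None


# Demo: temporary directories stand in for two real modulepath entries.
root = tempfile.mkdtemp()
env_modules = os.path.join(root, "environments", "production", "modules")
global_modules = os.path.join(root, "modules")
os.makedirs(os.path.join(env_modules, "ntp"))
os.makedirs(os.path.join(global_modules, "ntp"))   # duplicate name, shadowed
os.makedirs(os.path.join(global_modules, "motd"))

demo_path = os.pathsep.join([env_modules, global_modules])
print(find_module(demo_path, "ntp"))    # resolves inside env_modules
print(find_module(demo_path, "motd"))   # only exists in global_modules
```

The duplicate ntp directory in the second entry is never reached, which is why a module name that exists in two places tends to produce surprising results.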

Hiera datadir

This directory is the location for Hiera data. This is configured for each backend in the Hiera config file at /etc/puppetlabs/puppet/hiera.yaml. The default for the yaml backend with Puppet Enterprise is /etc/puppetlabs/code/environments/%{environment}/hieradata, which should be fine for most use cases. The actual Hiera directory structure is completely customizable.

NOTE: Any changes to hiera.yaml require you to restart Puppet Server. My recommendation is to use the puppet/hiera module to manage the hiera.yaml configuration, which will handle restarting the service.
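To make the %{environment} token in the default datadir concrete, here is a minimal Python sketch of how such a token could be expanded per environment. The interpolate helper is hypothetical and is not how Hiera is implemented; it handles only flat names, ignores Hiera's richer interpolation syntax, and simply substitutes an empty string for names it doesn't know.

```python
import re

DATADIR = "/etc/puppetlabs/code/environments/%{environment}/hieradata"


def interpolate(template: str, scope: dict) -> str:
    """Expand %{name} tokens in `template` using values from `scope`."""
    def replace(match: "re.Match") -> str:
        # Unknown names become an empty string in this sketch.
        return str(scope.get(match.group(1).strip(), ""))

    return re.sub(r"%\{([^}]+)\}", replace, template)


for env in ("production", "development"):
    print(interpolate(DATADIR, {"environment": env}))
```

Each environment ends up with its own data directory, which is what lets Hiera data travel with the branch of code it belongs to.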

Environments

Code Manager (and r10k, on which Code Manager is based) uses Git repositories in a unique and sometimes confusing way. Traditionally, when you use a version control system like Git, branches represent mutually exclusive permutations of the base code, often created for development of features before being merged into a main branch. It's rare for one system to use two different branches of code at the same time.

With Code Manager, however, branches in the control repository represent environments. The code, and the agents that use it, are all separated from each other, and each branch is represented by its own directory, complete with all of its files. You can't use traditional Git tools directly from the environments directory on the primary server to update the control repo.

Let me be clear: This would be a horrible idea anyway, as it pushes untested changes immediately into production. This defeats the point of infrastructure as code. Edit your code elsewhere, and use pull requests to get it into QA and later production.

Now for an additional directory structure oddity. Best practice is that the control repo does not include a modules directory. Instead, you should use a Puppetfile that lists all the modules to be included, and if possible, the version of each module. This allows for easier consumption of Forge modules and a tighter, more streamlined control repo. The exception to this practice is the site directory for modules that have been developed internally. The site directory can follow the same release cadence as the control repository itself, in order to avoid constant double commits. The most common examples of this are the role and profile modules that teams create for their own use.

Module Structure

Here is the bulk of the magic that often causes confusion. Modules have a specific Puppet directory structure, and various components of Puppet expect things to be in certain directories.

Below is an example module. You’ll notice that there are a lot of files related to the testing of code and the pipeline tools used during development (Travis, Rspec, Guard, Rubocop). Not every module will have these exact files or directories; some will have fewer, and others will have more or different files and directories.

[root@primary azure]# pwd
/etc/puppetlabs/code/environments/production/modules/azure
[root@primary azure]# tree -aL 1
.
├── .fixtures.yml       # Dependencies config for unit testing
├── .gitignore          # List of files for git to ignore during commits
├── .rspec          # Rspec configuration
├── .rspec_parallel     # Rspec parallel configuration
├── .rubocop.yml        # Rubocop configuration
├── .travis.yml         # Travis configuration
├── CHANGELOG.md        # Markdown file of changes
├── CONTRIBUTING.md     # Markdown file of how to contribute
├── Gemfile         # List of gem dependencies for testing
├── Guardfile           # Configuration file for Guard
├── LICENSE         # Module license
├── README.md           # Readme
├── Rakefile            # Unit testing configuration
├── data                # Data in code (Hiera 5) directory
├── examples            # Examples directory
├── hiera.yaml          # Data in code (Hiera 5) configuration file
├── lib             # Lib directory
├── manifests           # Manifests directory
├── metadata.json       # Metadata for Puppet Forge
├── rakelib         # Extensions for unit testing
├── spec                # Spec directory
└── templates           # Template directory

The rest of this section dives deeper into some important directories.

Puppet Directory Listing

The manifests Puppet Directory

This is often the first directory people discover. It contains all the Puppet code in .pp files. It is important to realize that Puppet will read and execute all code in every file in these directories, regardless of whether it meets convention. Therefore it's crucial that all Puppet code, including any "include" statements, be wrapped inside a class or defined type. The only exception is for "produces" and "consumes" statements related to orchestration.

Additionally, there should only be one class or defined type per manifest. Finally, there should be a special manifest named init.pp, inside of which you have a single class that matches the name of the module. The remaining classes should each be in their own manifest file, with the class named modulename::classname or modulename::subdirectory::classname to match the file's path. The example below illustrates this naming scheme.

[root@primary manifests]# pwd
/etc/puppetlabs/code/environments/production/modules/firewall/manifests
[root@primary manifests]# tree
.
├── init.pp         # class firewall
├── linux
│   ├── archlinux.pp        # class firewall::linux::archlinux
│   ├── debian.pp       # class firewall::linux::debian
│   ├── gentoo.pp       # class firewall::linux::gentoo
│   └── redhat.pp       # class firewall::linux::redhat
├── linux.pp            # class firewall::linux
└── params.pp           # class firewall::params
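The naming scheme in the listing above can be expressed as a small mapping from class name to manifest path. The Python sketch below is illustrative only; manifest_path is a hypothetical helper, not part of Puppet, and it returns paths relative to whichever modulepath directory holds the module.

```python
from pathlib import PurePosixPath


def manifest_path(class_name: str) -> PurePosixPath:
    """Map a class or defined type name to the manifest expected to define it."""
    module, *rest = class_name.split("::")
    if not rest:
        # The class named after the module itself lives in init.pp.
        return PurePosixPath(module, "manifests", "init.pp")
    # Middle segments become subdirectories; the last one is the file name.
    *subdirs, leaf = rest
    return PurePosixPath(module, "manifests", *subdirs, leaf + ".pp")


for name in ("firewall", "firewall::linux", "firewall::linux::redhat"):
    print(f"{name:26} -> {manifest_path(name)}")
```

Reading the mapping in reverse is a quick way to check a new manifest: if the class name and the file path disagree, the class is unlikely to be found where it is expected.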

The files Puppet Directory

This is the second most commonly encountered directory. It is used by the Puppet file server. Of all of the communication that occurs over TCP port 8140 with the Puppet server, one endpoint is a file server. By default, the file server looks through every module in the modulepath for a files directory, and serves its contents at puppet://<SERVER NAME>/modules/<MODULE NAME>/<FILE PATH>.

Another thing Puppet does to make things easier, but sometimes a bit confusing, is to omit <SERVER NAME>. Puppet assumes that the Puppet server generating the catalog is also the correct server to serve the file. (This is almost always correct.) So the path is puppet:///modules/<MODULE NAME>/<FILE PATH>. It's important to note that the modules portion of the path does not refer to the modules directory, but to the modules mount point of the file server. Even modules in other directories that are still in the modulepath will use the puppet:///modules/<MODULE NAME>/<FILE PATH> path.
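Because the files directory never appears in the URI, it can help to see the translation spelled out. The Python sketch below is an assumption-laden illustration, not Puppet's file server code: module_file_location is a hypothetical helper, puppet.example.com is a made-up host name, and the returned path is relative to whichever modulepath directory contains the module.

```python
from urllib.parse import urlsplit


def module_file_location(uri: str) -> tuple:
    """Split a puppet:// URI into (server, module, path relative to the modulepath)."""
    parts = urlsplit(uri)
    if parts.scheme != "puppet":
        raise ValueError(f"not a puppet URI: {uri}")
    mount, module, *file_path = parts.path.lstrip("/").split("/")
    if mount != "modules" or not file_path:
        raise ValueError(f"expected puppet:///modules/<module>/<file>: {uri}")
    # An empty server part means "the server that compiled the catalog".
    server = parts.netloc or None
    # The files directory is implied by the mount point, so it is added here.
    return server, module, "/".join([module, "files", *file_path])


print(module_file_location("puppet:///modules/ntp/ntp.conf"))
print(module_file_location("puppet://puppet.example.com/modules/ntp/conf.d/a.conf"))
```

Note the three slashes in the first URI: two belong to the scheme separator, and the third starts the path after an empty server name.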

The templates Puppet Directory

After using Puppet to manage configuration files for a while, people run into the need to customize those files. The template function and the templates directory provide this functionality. While the files directory uses the source attribute of the file type, keeping the file data out of the catalog, the template function uses the content attribute and places the entire file contents into the catalog.

Prior to Puppet 4, templates could only be in .erb (embedded Ruby) format. Now a more human readable .epp (embedded Puppet) format is also available. Any variables available within the scope of the class calling the template function will be available inside the template as well. This includes all top scope variables (most usefully, facts), which are loaded automatically inside the class.

The examples / tests Puppet Directory

Previously known as tests, this directory was renamed to examples to better illustrate its purpose. However, many modules still name that directory tests.

This directory is useful for when you are writing component modules, as it provides both a convenient place to include code you can use for testing, as well as a way to communicate how best to write wrapper code that utilizes the module. This directory isn’t used automatically by Puppet.

The spec Puppet Directory

This is a very important directory, as it includes unit and acceptance tests. Testing infrastructure code to ensure predictable outcomes is a core component of DevOps. As I often tell my clients, managing infrastructure as code gives you the amazing ability to quickly roll back changes to a previous state if you suffer an outage in production — but that also means you must write a new test to ensure that specific type of outage doesn’t happen again.

I’ll provide a quick explanation of each directory in-line in this example code:

[root@primary spec]# pwd
/etc/puppetlabs/code/environments/production/modules/ntp/spec
[root@primary spec]# tree
.
├── acceptance              # Directory for beaker acceptance tests
│   ├── class_spec.rb           # Acceptance tests
│   ├── disable_monitoring_spec.rb  # Acceptance tests
│   ├── nodesets            # Node definitions for testing
│   │   ├── centos-7-x64.yml
│   │   ├── debian-8-x64.yml
│   │   ├── default.yml     # Default node for acceptance testing
│   │   └── docker          # BTW we support docker!
│   │       ├── centos-7.yml
│   │       ├── debian-8.yml
│   │       └── ubuntu-14.04.yml
│   ├── ntp_config_spec.rb      # These
│   ├── ntp_install_spec.rb     # are
│   ├── ntp_parameters_spec.rb  # a
│   ├── ntp_service_spec.rb     # bunch
│   ├── preferred_servers_spec.rb   # of
│   ├── restrict_spec.rb        # acceptance
│   └── unsupported_spec.rb     # tests
├── classes
│   └── ntp_spec.rb         # Unit tests
├── fixtures                # Dependencies for unit testing
│   ├── manifests
│   └── modules
│       └── my_ntp
│           └── templates
│               └── ntp.conf.erb
├── spec_helper.rb          # Base definition for unit testing
├── spec_helper_acceptance.rb   # Base definition for acceptance testing
└── spec_helper_local.rb        # Extra definition for unit testing

The lib Puppet Directory

One of Puppet's advantages is that it's not limited to the resources created and released by Puppet Inc. New types and providers, functions, report handlers, and more can be easily created inside the lib directory. Anything that's in the lib directory will be synced to all the agents in the environment via PluginSync, and this will happen for every module in the modulepath. Directory structure is very important here, but it is documented elsewhere.

As I did above, I'm referencing each directory in-line in the example code below.

[root@primary lib]# pwd
/etc/puppetlabs/code/environments/production/modules/sqlserver/lib
[root@primary lib]# tree
.
├── facter                  # Facter extensions
│   ├── sqlserver_features.rb
│   └── sqlserver_instances.rb
├── puppet                  # Puppet extensions
│   ├── parser
│   │   └── functions               # Custom functions
│   │       ├── sqlserver_is_domain_or_local_user.rb
│   │       ├── sqlserver_upcase.rb
│   │       ├── sqlserver_validate_hash_uniq_values.rb
│   │       ├── sqlserver_validate_instance_name.rb
│   │       ├── sqlserver_validate_on_off.rb
│   │       ├── sqlserver_validate_range.rb
│   │       ├── sqlserver_validate_size.rb
│   │       └── sqlserver_validate_svrroles_hash.rb
│   ├── property                # Custom properties for providers
│   │   ├── sqlserver_login.rb
│   │   └── sqlserver_tsql.rb
│   ├── provider                # Custom providers to match types
│   │   ├── sqlserver_features
│   │   │   └── mssql.rb
│   │   ├── sqlserver_instance
│   │   │   └── mssql.rb
│   │   ├── sqlserver.rb
│   │   └── sqlserver_tsql
│   │       └── mssql.rb
│   └── type                    # Custom types
│       ├── sqlserver_features.rb
│       ├── sqlserver_instance.rb
│       └── sqlserver_tsql.rb
└── puppet_x                    # Code that doesn’t fit in puppet
    └── sqlserver
        ├── features.rb
        ├── server_helper.rb
        └── sql_connection.rb

The facts.d Puppet Directory

Not everyone wants or needs to learn enough Ruby to write Puppet facts. Fortunately, Puppet supports the creation of facts in any language that is executable on a target system. This allows facts to be written in Bash, PowerShell, Python and more. Any executable that prints key=value pairs to standard output can be placed in this directory, and it will be synced to all of the clients and executed. Be careful not to place any long-running scripts in here, as they will slow down every Puppet run.
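As a minimal sketch, an executable fact written in Python might look like the following. The fact names py_platform_system and py_platform_machine are made up for illustration, and on *nix systems the file would also need to be marked executable.

```python
#!/usr/bin/env python3
"""Hypothetical external fact: report basic platform details."""
import platform


def collect_facts() -> dict:
    # Keep this cheap: the script is executed on every Puppet run.
    return {
        "py_platform_system": platform.system(),
        "py_platform_machine": platform.machine(),
    }


def render(facts: dict) -> str:
    # One key=value pair per line on standard output.
    return "\n".join(f"{key}={value}" for key, value in facts.items())


if __name__ == "__main__":
    print(render(collect_facts()))
```

Splitting collection from rendering keeps the script easy to test by hand before it is synced to every agent.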

The data Puppet Directory

Hiera 5 added something new: the ability to build Hiera data structures inside of module code, using the module's own hiera.yaml and data directory. This optional practice is quickly beginning to replace the older params.pp structure because it's more flexible.

Utilizing the Puppet Directory Structure

Directories matter in Puppet. Skillful placement is important if you want to get the most out of Puppet. I hope this post has helped to answer your questions and reduce any confusion about how to utilize the Puppet directory structure. If you have any further questions, please ask them in the comments section below.

This blog was originally published on March 29, 2017, and has since been updated for relevance and accuracy.