Infrastructure repair with Bolt

Not to beat our own drum or anything, but Puppet is a great tool for managing and configuring your entire infrastructure. It allows you to ensure trusted, consistent state across all your nodes and lets you deploy changes to your Puppet codebase at the click of a button.

But what happens when the agent cannot reach the master for these configuration updates?

Puppet 5.5.16, 6.4.3, and prior had a long-standing bug that prevented the agent from using a proxy when the HTTP_PROXY environment variable was defined. This issue, along with a number of other HTTP proxy issues, was recently fixed, so the agent now correctly respects those variables and settings. However, in some cases, this means that the Puppet agent may attempt to connect to Puppet Server via the previously ignored proxy.

In many environments, an HTTP proxy is configured to allow connections only from internal hosts to external hosts, and it will reject any attempt to "reflect" off the proxy from one internal host to another. In these environments, Puppet agents may no longer be able to connect to their Puppet Server after upgrading to 5.5.17, 6.4.4, or 6.8.0+. And since the agent can't retrieve a catalog, you can't use Puppet to remedy the issue.

You can, however, use Bolt to remedy the issue! Bolt is well suited to solve a problem like this because it does not rely on agents getting a catalog from Puppet Server. That means that we can use it for out-of-band infrastructure repair.

Configuring agents to not use a proxy

If you're in such an environment and you've upgraded Puppet, it's likely that you've lost Puppet control over your agents: if your proxy rejects internal-to-internal connections, agent runs will fail.

To resolve this issue, the agent should be configured to bypass the proxy and connect directly to the Puppet Server by adding the FQDN of the Puppet Server to the NO_PROXY environment variable. This can also be accomplished using the no_proxy setting in the latest releases; however, that setting is overridden by the HTTP_PROXY environment variable until a fix is released in 6.9.0 (and backported to 5.5.18 & 6.4.5).

This guide will show you how to use Bolt to ensure that both the system environment variable NO_PROXY and the no_proxy puppet setting include your Puppet Server's FQDN. Note that changes to the global NO_PROXY environment variable will affect all child processes that Puppet executes and services that it starts.

Bolt target group setup

Note: We need to run different commands to check and set environment variables for Windows and Linux nodes. To make it easier to do so, we'll configure target groups for each based on PuppetDB queries. If you already have similar groups configured in your infrastructure, then you can skip this section.

If you don't already have Bolt running, you can find instructions on the installation page. To configure the target groups, we'll use the Bolt PuppetDB plugin. Before we can use the plugin, we need to configure Bolt to connect to PuppetDB. For this example, I will authenticate with PuppetDB using a PE RBAC token.

I obtain a token with puppet-access login -l 0 and an SSL CA cert, and save both to a directory called proxy_patch. In my case, I obtained the CA cert by copying /etc/puppetlabs/puppet/ssl/certs/ca.pem from my Puppet master host to my laptop.

Now I create a Bolt configuration file bolt.yaml with the following configuration:
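A minimal bolt.yaml along those lines might look like the following sketch. The PuppetDB server URL is an assumption (substitute your own PE host), and the cacert and token paths should point at the files saved above:

```yaml
# bolt.yaml -- PuppetDB connection for the Bolt PuppetDB plugin
puppetdb:
  server_urls: ["https://pe-console.example.com:8081"]  # assumed hostname; use your PuppetDB host
  cacert: ./ca.pem                                      # the CA cert copied from the master
  token: ~/.puppetlabs/token                            # default location written by puppet-access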

Now that I have Bolt configured to connect to PuppetDB, I can write a Bolt inventory file to organize connection information for the Puppet agent nodes queried from the database. We will use the version 2 inventory format, which adds support for the PuppetDB plugin.
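An inventory file of this shape might look like the sketch below. The group name, PQL query, and key path are illustrative assumptions; adjust them to your environment:

```yaml
# inventory.yaml (version 2 format)
version: 2
groups:
  - name: linux-agents
    targets:
      # Dynamically generate targets from a PuppetDB query
      - _plugin: puppetdb
        query: "inventory[certname] { facts.os.family != 'windows' }"
        target_mapping:
          uri: facts.networking.hostname
    # Static, group-level connection configuration
    config:
      transport: ssh
      ssh:
        user: root
        private-key: ~/.ssh/id_rsa  # assumed key name
```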

In the example inventory file, we configure the PuppetDB plugin to query for targets that are non-Windows. For each fact set returned, a target URI is set to facts.networking.hostname. It is important to note that values set under the plugin entry are dynamically generated for each target that matches the query, while the static information about how to connect to those dynamically generated targets lives in the group-level config section. In this case, I set the transport to ssh and configure the ssh transport to use the login root and a private key stored in my .ssh directory. You can find more information about configuring Bolt transports at https://puppet.com/docs/bolt/latest/bolt_configuration_options.html.

At this point we can verify that we can connect to the agent nodes. We will try running a simple Bolt command to echo the hostname for all the targets generated using the plugin. For example:
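A smoke test along these lines might be (assuming the Linux group is named linux-agents, as in the sketch of the inventory file):

```
bolt command run hostname --targets linux-agents
```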

Now that we have connected to our Linux nodes, let's also add configuration for our Windows nodes.
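One way to express the Windows group, appended under groups in the same inventory file, is sketched below. The query, user, and ssl setting are assumptions to adapt to your environment:

```yaml
  - name: windows-agents
    targets:
      - _plugin: puppetdb
        query: "inventory[certname] { facts.os.family = 'windows' }"
        target_mapping:
          uri: facts.networking.hostname
    config:
      transport: winrm
      winrm:
        user: Administrator  # assumed account
        # Ask the operator for the password once, when the group is loaded
        password:
          _plugin: prompt
          message: Enter WinRM password
```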

Notice that in the static configuration for the Windows agents there is another plugin reference. In this case, we use the prompt plugin to get the WinRM password from the Bolt operator. Because the prompt plugin is referenced at the group level, the user is prompted only once when the windows-agents targets are requested, and that value is used to authenticate with all of the group's targets.

Bolt's puppet_conf task

Bolt ships with some useful modules for managing infrastructure, including the puppet_conf module, which contains a task for getting and setting Puppet configuration. We can examine the task information with the following command:
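For example:

```
bolt task show puppet_conf
```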

Now that we have two target groups, one for our Windows nodes and one for our Linux nodes, we can use Bolt to check the settings and environment variables on all the nodes in these groups.

First, let's use a task to check if nodes have the no_proxy setting with the puppet_conf Bolt task. In this example, we're looking at the Linux nodes, but the task will be the same for Windows nodes:
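An invocation along these lines should work (the group name linux-agents is an assumption from the earlier inventory sketch):

```
bolt task run puppet_conf action=get setting=no_proxy --targets linux-agents
```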

We see that the setting does not include our Puppet Server FQDN.

Similarly, we can check whether the NO_PROXY environment variable is set. Here we are checking our Windows nodes; to do the same on our Linux nodes, we'd use the command echo $NO_PROXY instead:
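Over WinRM, commands run in PowerShell, so something like the following should print the variable (a sketch, assuming the group name windows-agents):

```
bolt command run 'Write-Output $env:NO_PROXY' --targets windows-agents
```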

We see that nothing is printed and thus the environment variable is unset.

Fixing the issue

Now that we have an idea of what we need to accomplish, it is time to use Bolt's most powerful capability: the plan. We want to set global environment variables on both Windows and Linux nodes as well as configure Puppet settings.

Plans live in modules, so let's create a module called proxy_patch under site-modules, with a plan file at proxy_patch/plans/init.pp.

Save the following plan to init.pp:
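Here is one way the plan might look. This is a sketch under assumptions: the fact-based partition, the resource parameters, and the agent service name ('puppet') may need adjusting for your environment, so test before using in production:

```puppet
# site-modules/proxy_patch/plans/init.pp
plan proxy_patch(
  TargetSpec $nodes,
  String     $no_proxy_fqdn_list,
) {
  $targets = get_targets($nodes)

  # Ensure the targets can run apply blocks and gather facts,
  # so we can partition on operating system.
  apply_prep($targets)

  $windows_targets = $targets.filter |$target| {
    facts($target)['os']['family'] == 'windows'
  }
  $linux_targets = $targets - $windows_targets

  # Manage the NO_PROXY system environment variable on Windows targets
  apply($windows_targets) {
    windows_env { 'NO_PROXY':
      ensure    => present,
      value     => $no_proxy_fqdn_list,
      mergemode => clobber,
      notify    => Service['puppet'],
    }
    service { 'puppet': ensure => running }
  }

  # Manage NO_PROXY in /etc/environment on Linux targets
  apply($linux_targets) {
    file_line { 'NO_PROXY':
      path   => '/etc/environment',
      line   => "NO_PROXY=${no_proxy_fqdn_list}",
      match  => '^NO_PROXY=',
      notify => Service['puppet'],
    }
    service { 'puppet': ensure => running }
  }

  # The puppet_conf task is cross-platform, so one invocation covers all targets
  run_task('puppet_conf', $targets,
    action  => 'set',
    setting => 'no_proxy',
    value   => $no_proxy_fqdn_list,
  )
}
```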

The plan accepts two parameters: $nodes and $no_proxy_fqdn_list. The $nodes parameter represents the targets we wish to run the plan against. The FQDN list is a comma-separated list containing the FQDN of your Puppet Server. It is important to note that with this implementation, the NO_PROXY environment variable is always replaced with the $no_proxy_fqdn_list argument. You may consider modifying the plan to query the existing contents of NO_PROXY and append the $no_proxy_fqdn_list argument as you see fit.

The first step of the plan partitions the targets based on operating system. We want to use different modules for managing system environment variables based on target OS. We accomplish environment variable management by applying Puppet manifest code. Specifically, we use the windows_env resource to set the NO_PROXY environment variable on Windows targets and the file_line resource from the stdlib module to manage /etc/environment on our Linux targets. In both cases we notify the Puppet service that environment variables have changed.

Once we have set the environment variable we can use a task from the puppet_conf module to configure the no_proxy Puppet setting. Note that the task is cross-platform, so we do not need different invocations based on target OS!

Before we can run the plan, we need to download the modules puppet-windows_env and puppetlabs-stdlib (the puppet_conf module ships with the Bolt system packages). To do that, save the following to a file called Puppetfile in the same directory as bolt.yaml.
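A minimal Puppetfile might look like this (without version pins, the latest published releases are installed; pin versions if you need reproducibility):

```
mod 'puppet-windows_env'
mod 'puppetlabs-stdlib'
```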

We can use Bolt to install those modules with: bolt puppetfile install.

Now that we have the required modules, we can run the plan against all targets in our inventory (passed to the $nodes plan parameter), supplying the FQDN of our Puppet Server.
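For example, using Bolt's built-in all group and a placeholder FQDN (substitute your Puppet Server's real FQDN):

```
bolt plan run proxy_patch nodes=all no_proxy_fqdn_list=puppet.example.com
```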

Now let's verify the environment variables were set as expected for both the Windows and Linux based targets.
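One way to spot-check both platforms (group names are assumptions from the earlier inventory sketches):

```
bolt command run 'cat /etc/environment' --targets linux-agents
bolt command run 'Write-Output $env:NO_PROXY' --targets windows-agents
```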

We have confirmed with a Bolt command that the environment variables have been updated! Now let's check the Puppet setting with the puppet_conf task:
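For example:

```
bolt task run puppet_conf action=get setting=no_proxy --targets all
```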

Now we've also confirmed that we have updated the no_proxy settings and can breathe easy knowing our Windows and Linux Puppet agents will not be cut off from communicating with our Puppet Server by attempting to use a proxy connection instead of connecting directly.

Cas Donoghue is a software engineer at Puppet.
