Certificate improvements in Puppet 6
Welcome to SSLandia, everyone.
Most Puppet users are aware that Puppet uses x509 certificates under the hood, but the details of the certificate implementation and how to work with it effectively are less well understood. We've made substantial changes to this part of the codebase in Puppet 6 and Puppet Enterprise 2019, so now seems like a good time to get the word out about the changes and improvements, as well as provide some background detail on the implementation in general.
The cycle of life
First off, it's important to understand that Puppet uses certificates in client-authenticated mode. This differs from the SSL/TLS encryption you use when browsing websites. Browsing uses one-way authentication, where your browser checks the server's certificate to make sure it's issued from a trusted CA, matches the hostname of the website you're connecting to (to avoid phishing scams), and isn't expired.
But the web server doesn't perform the same level of checking against your browser because that would require every computer and mobile device also have a valid certificate. Clearly that's infeasible for general-purpose web browsing, but issuing a certificate to every agent node is exactly how Puppet is able to do two-way authentication: the agents check the validity of the master's cert, and the master checks the validity and identity of each agent connection as well.
These identities are embedded in the certificates that Puppet's built-in certificate authority (CA) issues to nodes. The certificates begin their lives as a Certificate Signing Request (CSR), which the agents automatically generate along with a public and private key pair, when they first start up. Each agent submits its CSR to the Puppet master and checks on subsequent runs to see if there's a signed certificate waiting for it; if so it will download the signed certificate and use it to request a catalog and proceed as normal.
Many modern Puppet installations take advantage of policy-based autosigning to help issue certificates to nodes in a scalable, secure manner, but manually issuing certificates is fine too. The master and agent both verify that the other end's certificate:
- hasn't been revoked, by checking the Certificate Revocation List (CRL) (more on this later),
- isn't expired, by comparing the current date and time to the start and end dates enumerated in the certificate's ValidityPeriod field, and
- has been issued by a trusted Certificate Authority.
Additionally, since the agent is initiating the connection, it verifies that the hostname of the server it's connecting to either matches the Subject of the server's certificate or is listed in the SubjectAltNames field, which acts as a list of valid aliases for the server. Contrary to some forum post folklore, the reverse is not true: your agents do not need to have DNS names or /etc/hosts entries which map to their certificate identities in order to be trusted by the puppet masters.
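The hostname rule above can be sketched in a few lines of shell. This is only an illustration of the comparison, not Puppet's actual implementation: the function name server_name_matches is made up for this example, it assumes the Subject and SubjectAltNames values have already been extracted from the certificate, and it ignores wildcard names.

```shell
# Succeeds if $1 (the name the agent connected to) equals any of the
# remaining arguments: the certificate's Subject followed by its
# SubjectAltNames. Simplification: exact, case-insensitive matches only.
server_name_matches() {
    wanted="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
    shift
    for candidate in "$@"; do
        candidate="$(printf '%s' "$candidate" | tr '[:upper:]' '[:lower:]')"
        [ "$candidate" = "$wanted" ] && return 0
    done
    return 1
}
```

For example, server_name_matches puppet.example.com master01.example.com puppet puppet.example.com succeeds because the name appears among the aliases, even though it is not the Subject.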
When an agent or master shuffles off to the Great Data Center in the Sky, its certificate goes too. Usually this is the job of a decommissioning script (or Puppet Task!) which purges the node's data from PuppetDB, cleans up any systems external to Puppet, and revokes the cert. Revocation puts the cert's serial number into a cryptographically signed file enumerating all of the certs which are no longer to be trusted: the aforementioned Certificate Revocation List, or CRL. This both prevents a decommissioned node from participating in the Puppet infrastructure and makes its hostname/identity available for reissuance if the node is simply getting re-provisioned.
So what's the problem?
With all that knowledge fresh in our minds, we have a good starting point to talk about the recent improvements. In this post, I'll talk about Puppet 6 versus earlier versions of Puppet; for Puppet Enterprise users this maps to PE 2019.0 versus everything earlier. Where there are PE-specific features in the implementation, I'll use the PE numbering, but it's safe to assume that everything in "Puppet 6" also applies to Puppet Enterprise.
Over the years, Puppet's certificate systems had accreted a number of strange bugs and behaviors. We introduced a newly-written Clojure-based CA back in Puppet 4, but in order to maintain compatibility it had deliberately preserved some of these quirks. When we were scoping out changes for Puppet 6, our goals were to:
- Build new features that users had requested but which were difficult or impossible to do without breaking compatibility with earlier versions, primarily to support Intermediate certificate authorities as a first-class deployment scenario.
- Unify and clean up the user experience and command line interfaces for working with certificates, so they would support real-world workflows rather than just be a confusing collection of commands.
- Set the stage for future work and make it easier to continue evolving these features.
The feature presentation
Let's talk more about each of these. There's a lot of technical detail here, so buckle in!
Previously, when a Puppet master started up, it would go through a process similar to the agent certificate bootstrap described above: check to see whether it ought to serve as a CA, and if so, generate a private/public key pair. It would use that key pair to generate and sign a root certificate authority and signing certificate, then immediately issue itself a server certificate to answer client requests. As Luke Kanies put it in a recent tweet when we were discussing these changes:
The problem, as he noted, is that above a certain threshold of complexity, this simple system began to run into trouble. As larger organizations have adopted Puppet, its PKI has come under scrutiny from security teams, and one of their primary complaints is that this conflation of "self-signed root" and "signing certificate" violates security policy.
In response, we've reworked the certificate generation process to prefer intermediate certificate authorities by default. This allows you to keep the root key behind, err, lock and key, rather than needing to have it "live" to sign incoming requests.
As a nice side effect, sites which require our PKI to be subservient to an internal corporate root CA have an easy way to drop a pre-generated intermediate CA key and cert in place and skip the bootstrapping process completely. (Previously, this often required administrators to delete all of the auto-generated certs after they'd already been put in place, a common source of frustration.)
Visually, the difference between the old and new signing chain looks like this:
Now with the new intermediate CA and infrastructure CRL, the chain of trust looks like this:
(A big shout-out to Steven Hasegawa on our amazing UX team for these images!)
The puppetserver ca CLI tool provides two commands to aid in setting up an intermediate CA: import and setup. If a user wants to set up an intermediate CA with an external root cert, they can supply a certificate bundle consisting of their root cert plus a CA signing cert signed by that root, a CRL file containing the root’s CRL and the CRL for the new CA cert, and the private key for the intermediate signing cert, and use the import command to drop these files in place and trigger the rest of the CA setup.
The PE installer uses the setup command to create a root cert and an intermediate signing cert for Puppet Server. This means that for new PE 2019 installations, the default CA is always an intermediate CA.
A user can also call setup to create a default intermediate CA for their install. Both setup and import should be run before starting Puppet Server for the first time, as the previous (Puppet 5) behavior of the server creating a self-signed CA cert on startup is still in place. The import command will not overwrite an existing CA. Users can always delete their CA per the instructions for regenerating the CA if they decide to switch to an intermediate CA setup later.
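As a rough sketch of the external-root workflow described above, the wrapper below checks that the three input files are readable and then hands them to puppetserver ca import. The function name, the file roles in the comments, and the PUPPETSERVER override are assumptions made for this example; check the flags against the help output of your installed puppetserver-ca version before relying on them.

```shell
# Hypothetical wrapper around `puppetserver ca import`.
# PUPPETSERVER can be overridden (for example with `echo`) to preview
# the command line without touching a real CA.
import_intermediate_ca() {
    bundle="$1"   # root cert plus intermediate signing cert (PEM)
    crls="$2"     # root CRL plus the CRL for the new CA cert (PEM)
    key="$3"      # private key for the intermediate signing cert

    # Refuse to continue if any input file is missing or unreadable.
    for f in "$bundle" "$crls" "$key"; do
        if [ ! -r "$f" ]; then
            echo "cannot read $f" >&2
            return 1
        fi
    done

    "${PUPPETSERVER:-puppetserver}" ca import \
        --cert-bundle "$bundle" \
        --crl-chain "$crls" \
        --private-key "$key"
}
```

Run it before Puppet Server starts for the first time, since the server would otherwise create its own self-signed CA on startup.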
Improved policy autosigning
In the early days of Puppet, when we still used stone axes and knives to set up our public-key infrastructure, the only options an administrator had for certificate signing were (a) manually sign requests when new agents came online, or (b) use an autosign.conf which matched hostnames or IP ranges that ought to be automatically signed.
As we moved out of the Stone Age and learned to smelt bronze (metaphorically speaking), we developed the concept of policy-based autosigning, which allows administrators to write an external script that the puppetserver calls upon receiving a new CSR. The puppetserver passes the content of the CSR to the script on STDIN, and the script decides (hence the "policy" part of the name), based on some business logic, whether to sign the certificate or not. The most popular policy autosigner in the Puppet community was written by Daniel Dreier and lives at policy signer website.
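To make the mechanism concrete, here is a minimal sketch of a policy executable. Puppet passes the certname as the first argument and the PEM-encoded CSR on STDIN, and an exit status of 0 means "sign". The business rule itself (only sign certnames under a given domain suffix) and the names should_autosign and ALLOWED_SUFFIX are invented for this example; a real policy would usually inspect the CSR contents as well.

```shell
#!/bin/sh
# Hypothetical policy: sign only CSRs whose certname ends in this suffix.
ALLOWED_SUFFIX="${ALLOWED_SUFFIX:-.prod.example.com}"

should_autosign() {
    certname="$1"
    # The PEM-encoded CSR arrives on stdin.
    csr="$(cat)"

    # Reject input that does not look like a PEM certificate request.
    case "$csr" in
        *"-----BEGIN CERTIFICATE REQUEST-----"*) ;;
        *) return 1 ;;
    esac

    # Sign only if the certname has at least one character before the suffix.
    case "$certname" in
        ?*"$ALLOWED_SUFFIX") return 0 ;;
        *) return 1 ;;
    esac
}
```

As a standalone script, the file would end with a line such as should_autosign "$1" so that the function's result becomes the exit status, and the autosign setting in puppet.conf would point at the executable.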
Policy autosigning works very well for most people, but it had a couple of limitations, put in place to prevent security loopholes; namely, signing certificates with subject alternative names or auth extensions was completely disallowed. This limitation blocked a specific use case for people with highly dynamic Puppet infrastructures, because they could not autosign certificates for freshly scaled-up compile masters (which, by definition, had subject alt names for the "VIP" or load-balanced name that clients connected to).
This is still disabled by default for security reasons, but users who need it can now turn it on by setting allow-authorization-extensions to true in the certificate-authority section of Puppet Server’s config (usually located in ca.conf). Once these have been configured, puppetserver ca sign --certname <name> can be used to sign certificates with these additions.
As you may recall from our walkthrough of the certificate lifecycle, once nodes are no longer active, their certificates should be revoked. Previously, the Puppet CA did this by using the standard Certificate Revocation List that is built into the Ruby openssl libraries. One unintended consequence of this simple strategy is that sites with large numbers of nodes, or that churn through a lot of nodes due to frequent re-provisioning, end up with extremely large CRL files, which we found could affect performance both at agent startup (as it read in the file) and when creating network connections (as it validated the remote end's certificate against an extremely large list... containing mostly other agents).
So, the Puppet Server CA may now create a smaller CRL that contains only revocations of those nodes that agents are expected to talk to during normal operations (like compile masters or hosts that agents connect to as part of agent side functions) and distribute that CRL to agents, rather than the CRL that contains all agent revocations.
Once enabled, you'll need to manage a file at $cadir/infra_inventory.txt. It should be a newline-separated list of the certnames that should be added to the Infra CRL when they are revoked. The certnames must match existing certificates issued and maintained by the Puppet Server CA. Setting certificate-authority.enable-infra-crl to true (see Puppet Server's config syntax for details) causes Puppet Server to update both its Full CRL and its Infra CRL with the certs that match those certnames when they are revoked. Agents that check in for the first time then receive a CRL that includes only the revocations of certnames listed in infra_inventory.txt.
This feature is disabled by default, as the definition of what constitutes an "infrastructure" node is site specific and sites with a single master configuration have no need for the additional work. See details in https://tickets.puppetlabs.com/browse/SERVER-2231.
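The inventory file format is simple enough to generate from a script. The helper below is a sketch under assumptions: the function name write_infra_inventory is hypothetical, and the caller is expected to pass the full path to infra_inventory.txt inside their own $cadir, followed by the certnames of their infrastructure nodes.

```shell
# Write one certname per line to the given inventory file.
# Usage: write_infra_inventory /path/to/infra_inventory.txt certname...
write_infra_inventory() {
    if [ "$#" -lt 2 ]; then
        echo "usage: write_infra_inventory FILE CERTNAME..." >&2
        return 1
    fi
    inventory="$1"
    shift
    # sort -u drops accidental duplicates and keeps the file stable
    # across runs, which makes changes easy to review.
    printf '%s\n' "$@" | sort -u > "$inventory"
}
```

Calling it with the compile masters' certnames produces a file that Puppet Server can consult when deciding which revocations belong in the Infra CRL.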
New command line interactions
As I mentioned at the beginning, the command line experience for interacting with the certificate subsystems in Puppet was... not wonderful. There were a number of commands that had accumulated over the years, and it wasn't clear why we had both puppet cert and puppet certificate ... and puppet ca, for that matter; much less clear was when you should use which tool.
Good news, everybody
In Puppet 6, we removed the various puppet subcommands (cert, ca, certificate, certificate_request, and certificate_revocation_list) that used to be used for interacting with the CA and certificates. We replaced these tools with two new tools with clearly-defined roles: puppetserver ca, for CA tasks like signing, listing, and revoking certs, and puppet ssl, for agent-side tasks like requesting and downloading certs.
puppetserver ca replaces all the previous CLI tools for interacting with the CA and SSL artifacts. puppet cert, puppet ca, puppet certificate, puppet certificate_request, and puppet certificate_revocation_list have all been removed in Puppet 6, and replaced with this and puppet ssl (see below).
Because these commands utilize Puppet Server’s API, all except import need the server to be running in order to work.
Since this tool is shipped as a gem alongside Puppet Server, it can be updated out-of-band to pick up improvements and bug fixes. To upgrade it, the user can run this command:
/opt/puppetlabs/puppet/bin/gem install -i /opt/puppetlabs/puppet/lib/ruby/vendor_gems puppetserver-ca
puppet ssl has been streamlined to include just those operations which are applicable for an agent which needs to submit a CSR, download a signed certificate, and verify its chain of trust against the master. Check out the documentation for the subcommand for details on its usage.
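Put together, the division of labor between the two tools might look like the sketch below. The function names and the DRY_RUN switch are inventions for this example (the switch just prints each command so the sequence can be previewed on a machine without Puppet installed); the subcommands are the ones shipped with Puppet 6 as far as I know, but confirm them against your installed version.

```shell
# Print commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the CA server: sign a pending request, or revoke a cert and
# remove its files from the CA.
ca_sign()   { run puppetserver ca sign --certname "$1"; }
ca_retire() { run puppetserver ca clean --certname "$1"; }

# On the agent: submit a CSR, fetch the signed cert, verify the chain.
# Each step runs only if the previous one succeeded.
agent_enroll() {
    run puppet ssl submit_request &&
    run puppet ssl download_cert &&
    run puppet ssl verify
}
```

In practice agent_enroll would be split around the signing step, since the download can only succeed after the CA has signed the request.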
Hopefully all of this sounds great to you and you're wondering "how can I get it?" In the simplest case, the answer is "Upgrade in place!" Your existing CA will work fine with the new versions and you'll be able to immediately start using the new command line tools, enable the infrastructure CRL, and set up a new policy autosigner without making any other changes.
If you want to take advantage of the Intermediate CA feature, you will need to roll out new certificates. We have a how-to guide for regenerating your certificates and a cert regen module to help with the process.
We're really excited about being able to fix these kinds of long-standing issues with deliberately designed experiences. Please give the Puppet 6 / PE 2019 certificate systems a spin and let us know, via #puppet-dev on Slack or the puppet-dev mailing list, how it works for you.
Eric Sorenson is a product manager at Puppet.