Troubleshooting potential issues in Puppet Enterprise: How to learn more
Q: I've run into an issue with my Puppet Enterprise installation that I’d like to learn more about. What can I do to get more information about the issue and potentially resolve it?
A: There are several ways to get more information about issues affecting your Puppet Enterprise installation. Here are some potential areas of investigation and what you can do to resolve any issues you may find:
Check Puppet Enterprise service statuses
Are Puppet Enterprise services up and running? On 2016.5 and later, you can get the status of services by running
puppet infrastructure status on your master or compile masters. Alternatively, you could run
curl -k https://localhost:8140/status/v1/services on the master node to get JSON-formatted output of the current status of Puppet Enterprise services, or check the Puppet Services status in the upper right corner of the console. Be aware that the Puppet Services status indicator in the console can sometimes show false positives when the installation is experiencing a load issue.
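For a quick command-line check, the JSON from the status endpoint can be reduced to a single number. The following is a minimal sketch, not an official tool: count_not_running is a hypothetical helper name, and it assumes that each service entry in the response carries a "state" field whose healthy value is "running".

```shell
# Hypothetical helper: reads the JSON body of /status/v1/services on stdin
# and prints how many services report a "state" other than "running".
# Assumes every service entry contains a "state" field; this is a plain-text
# match, not a real JSON parser.
count_not_running() {
  grep -o '"state" *: *"[^"]*"' | grep -vc '"running"' || true
}

# Example use on the master node (not executed here):
#   curl -k -s https://localhost:8140/status/v1/services | count_not_running
```

A result of 0 only means that no non-running state was found. If the request itself fails and the input is empty, the helper also prints 0, so check that curl returned a response before trusting the number.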
Review the logs
Puppet's logging can be surprisingly descriptive. Most logs are located in a subfolder of
/var/log/puppetlabs/. By default, however, the Puppet agent writes to
/var/log/messages. If you're having an issue with PuppetDB, for instance, take a look at the most recent entries in the PuppetDB log. An “Error 500” message during an agent run means that you should consult the Puppet Server log.
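To pull the most recent problem entries out of a large log, a small filter can help. This is a sketch under stated assumptions: recent_problems is a hypothetical helper name, the ERROR and WARN level names are an assumption about the log format, and the example path is an assumed typical location rather than one given in this post.

```shell
# Hypothetical helper: print the last N lines (default 20) of a log file
# that contain ERROR or WARN. The level names are an assumption about the
# log format; adjust the pattern to match what your logs actually contain.
recent_problems() {
  log_file=$1
  max_lines=${2:-20}
  grep -E 'ERROR|WARN' "$log_file" | tail -n "$max_lines"
}

# Example use (not executed here; the path is an assumed typical location
# under /var/log/puppetlabs/):
#   recent_problems /var/log/puppetlabs/puppetserver/puppetserver.log 50
```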
If you know the issue affects a specific component but the logs aren’t descriptive enough, you can turn up logging. Here’s how to increase the logging level for the agent and turn up logging for various Puppet Master services.
Search through the knowledge base
When reading documentation, pay particular attention to the version of Puppet Enterprise referenced in each document to make sure it's applicable to your installation. When searching the knowledge base, use parts of the error message or symptoms to fine-tune your search. When searching the documentation, focus on the service or setting that seems to be having the problem.
Check the known issues page for your version to see if this is an issue that’s already known. If you’re on an older version, check the resolved issues page for later releases to see if the issue you’re experiencing has been fixed in a later version.
If you don’t find what you’re looking for in the documentation or the knowledge base, please let us know as we are always looking to improve our content. There is also a vibrant community of Puppet users that love answering questions; check out the Community page of our website for more information.
Is the issue performance-based?
Are you seeing out-of-memory (OOM) errors, services going down, or connection timeouts in your logs? Sometimes these issues occur because of a lack of resources. Consult the Installation requirements documentation to confirm that the system meets the minimum requirements.
Support also highly recommends installing the puppet_metrics_collector module and collecting metrics on performance. For some examples of how to use and interpret the data, take a look at Nick's knowledge base article on fixing performance issues. You can also gather a metrics package by running
/opt/puppetlabs/bin/puppet-metrics-collector create-tarball. This tarball can be submitted to Puppet Support, who can then provide recommendations on tuning. In Puppet Enterprise version 2017.1 and newer these metrics are automatically collected in the Support Script if the module is installed.
We’ve got extensive documentation on performance tuning in the configuring Puppet Enterprise section. Another common performance issue is the “thundering herd,” in which agent check-ins bunch up, creating periods of high demand and resource utilization on the master. This shows up as slow agent runs and even agent run timeouts. Take a look at this article on identifying and treating a thundering herd condition.
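One rough way to see whether check-ins are bunching up is to count catalog requests per minute in the master's access log. The sketch below is not the method from the linked article. catalog_requests_per_minute is a hypothetical helper name, and it assumes a common-log-style access log in which the fourth whitespace-separated field is the bracketed timestamp and catalog requests contain /puppet/v3/catalog/ in the request path.

```shell
# Hypothetical helper: reads access-log lines on stdin and prints
# "<dd/Mon/yyyy:HH:mm> <count>" for catalog requests, so that minutes with
# unusually many check-ins stand out.
# Assumes the 4th whitespace-separated field looks like
# "[dd/Mon/yyyy:HH:mm:ss" and that catalog requests contain
# "/puppet/v3/catalog/" in the request path.
catalog_requests_per_minute() {
  grep '/puppet/v3/catalog/' |
    awk '{ print substr($4, 2, 17) }' |
    sort | uniq -c |
    awk '{ print $2, $1 }'
}

# Example use (not executed here; the log path is an assumed typical
# location, so check where your installation writes its access log):
#   catalog_requests_per_minute < /var/log/puppetlabs/puppetserver/puppetserver-access.log
```

If a few minutes in each run interval show far higher counts than the rest, agent check-ins are likely clustered. The output is sorted as text, so it reads chronologically only within a single day.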
Collect the support script
The support script is a useful tool that gathers various system metrics and PE-specific logs and packages them for ease of transfer and readability. For more information about the support script, take a look at this section in the documentation. The command varies in older versions of Puppet, but on versions 2016.2.0 and newer, run
/opt/puppetlabs/bin/puppet enterprise support to collect the support script. In versions 2016.4 and earlier, this command can only be run on monolithic masters or on the master, console, and database components of a split install, though you can copy
/opt/puppetlabs/server/data/enterprise/modules/pe_support_script/files/puppet-enterprise-support to compile masters and MCO hubs and spokes to run the script manually. In versions 2016.5 and later, you can run this command on any Puppet Enterprise node running a Red Hat, Ubuntu, or SLES operating system.
If you open a ticket with Support, they may ask you for a support script, but it can also be a useful tool for you, as it gathers a lot of information about the system into a single location.
Open a ticket
Support is here to help you! If you’ve gone through the troubleshooting steps above and the issue is still present, open a ticket with Support and we’ll work with you to identify and resolve it.
When opening a ticket it is helpful to include:
- What is the issue? What is the impact of the issue on business operations?
- How did you notice the issue was occurring?
- Did this issue suddenly appear or has it been happening for a while? If it suddenly started, what changes were made (if any) around the same time?
- Is this affecting only specific agents? All agents?
- If the issue is related to a module, what module is it? What version is the module?
- Is this an upgrade? If so, what version are you upgrading from? Upgrading to? Do you have a backup?
- Does this issue have to do with code you’ve written yourself or code from an unsupported third party? (Please be aware that troubleshooting custom code is outside the bounds of support, so we will make a best-effort attempt but ultimately may not be able to help you resolve the issue. The Puppet Community can be a great resource for input on coding issues, specifically the Slack channel, a publicly accessible chat room full of Puppet practitioners.)
Paul Schaffer is an associate support engineer at Puppet.