Getting started with Graphite
Puppet Enterprise can export many metrics to Graphite, a third-party monitoring application that stores real-time metrics and provides customizable ways to view them. After Graphite support is enabled, Puppet Server exports a default set of metrics designed to be immediately useful to Puppet administrators.
To use Graphite with Puppet Enterprise, you must:
- Install and configure a Graphite server.
- Enable Puppet Server's Graphite support.
Grafana provides a web-based customizable dashboard that's compatible with Graphite, and the Grafanadash module installs and configures it by default.
Using the Grafanadash module
The Grafanadash module quickly installs and configures a basic test instance of Graphite with the Grafana extension. When installed on a dedicated Puppet agent, this module provides a quick demonstration of how Graphite and Grafana can consume and display Puppet Server metrics.
- SELinux can cause issues with Graphite and Grafana, so the module temporarily disables SELinux. If you reboot the machine after using the module to install Graphite, you must disable SELinux again and restart the Apache service to use Graphite and Grafana.
- The module disables the iptables firewall and enables cross-origin resource sharing on Apache, which are potential security risks.
Installing the Grafanadash module
Install the Grafanadash module on a *nix agent. The module's grafanadash::dev class installs and configures a Graphite server, the Grafana extension, and a default dashboard.
- Install a *nix PE agent to serve as the Graphite server.
- On the Puppet agent node, run:
  sudo puppet module install puppetlabs-grafanadash
- On the Puppet agent node, run:
  sudo puppet apply -e 'include grafanadash::dev'
Running Grafana
Grafana is a dashboard that can interpret and visualize Puppet Server metrics over time, but you must configure it to do so.
Grafana runs as a web dashboard, and the Grafanadash module configures it at port 10000 by default. However, there are no Puppet metrics displayed by default. You must create a metrics dashboard to view Puppet's metrics in Grafana, or edit and import a JSON-based dashboard such as the sample Grafana dashboard that we provide.
- In a web browser on a computer that can reach the Puppet agent node, navigate to http://<AGENT_HOSTNAME>:10000.
- Open the sample_metrics_dashboard.json file in a text editor on the same computer you're using to access Grafana.
- Throughout the file, replace our sample setting of primary.example.com with the hostname of your primary server. This value must be used as the metrics_server_id setting, as configured below.
- Save the file.
- In the Grafana UI, click search (the folder icon), then Import, then Browse.
- Navigate to and select the edited JSON file.
This loads a dashboard with nine graphs that display various metrics exported from the Puppet Server to the Graphite server. However, these graphs remain empty until you enable Puppet Server's Graphite metrics.
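If you prefer to script the hostname replacement rather than edit the file by hand, the following is a minimal sketch in Python. It assumes the sample file uses the literal string primary.example.com everywhere the server ID appears, as described above. The function names, the output file name, and the hostname in the usage note are hypothetical.

```python
import json
from pathlib import Path

# Placeholder server ID used throughout the sample dashboard.
SAMPLE_ID = "primary.example.com"


def retarget_dashboard(text: str, server_id: str) -> str:
    """Return dashboard JSON text with the sample server ID replaced."""
    updated = text.replace(SAMPLE_ID, server_id)
    json.loads(updated)  # raises ValueError if the result is not valid JSON
    return updated


def retarget_file(src: str, dst: str, server_id: str) -> None:
    """Read the dashboard at src, replace the server ID, and write it to dst."""
    Path(dst).write_text(retarget_dashboard(Path(src).read_text(), server_id))
```

For example, `retarget_file("sample_metrics_dashboard.json", "my_dashboard.json", "puppet.mycorp.example")` would write an edited copy that you can then import in the Grafana UI. Writing to a separate file keeps the original sample intact.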
Enabling Puppet Server's Graphite support
Use the PE Master node group in the console to configure Puppet Server's metrics output settings.
- In the console, click Node groups, and in the PE Infrastructure group, select the PE Master group.
- On the Classes tab, in the puppet_enterprise::profile::master class, add these parameters:
  - Set metrics_graphite_enabled to true (default is false).
  - Set metrics_server_id to the primary server hostname.
  - Set metrics_graphite_host to the hostname for the agent node on which you're running Graphite and Grafana.
  - Set metrics_graphite_update_interval_seconds to the Graphite update interval in seconds. This setting is optional, and the default value is 60.
- Verify that these parameters are set to their default values, unless your Graphite server uses a non-standard port:
  - Set metrics_jmx_enabled to true (default value).
  - Set metrics_graphite_port to 2003 (default value) or the Graphite port on your Graphite server.
  - Set profiler_enabled to true (default value).
- Commit changes.
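If the graphs stay empty after you commit the changes, it can help to confirm that the Graphite host accepts connections on the configured port. The following is a rough sketch, not part of Puppet Enterprise. It assumes the Graphite server listens for the standard Graphite plaintext protocol (one line of metric path, value, and Unix timestamp) on the port set in metrics_graphite_port. The metric name test.connectivity and both function names are hypothetical.

```python
import socket
import time


def format_metric(path: str, value: float, timestamp: int) -> bytes:
    """Encode one metric as a Graphite plaintext line: path, value, timestamp."""
    return f"{path} {value} {timestamp}\n".encode("ascii")


def send_test_metric(host: str, port: int = 2003, timeout: float = 5.0) -> None:
    """Send one throwaway metric; raises OSError if the port is unreachable."""
    line = format_metric("test.connectivity", 1, int(time.time()))
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(line)
```

Calling `send_test_metric("<AGENT_HOSTNAME>")` from the primary server exercises the same network path that Puppet Server uses. A connection error here points to a firewall or hostname problem rather than a Puppet Server setting.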
Sample Grafana dashboard graphs
Use the sample Grafana dashboard as your starting point and customize it to suit your needs. You can click on the title of any graph, and then click edit to adjust the graphs as you see fit.
Graph name | Description |
---|---|
Active requests | This graph serves as a "health check" for the Puppet Server. It shows a flat line that represents the number of CPUs you have in your system, a metric that indicates the total number of HTTP requests actively being processed by the server at any moment in time, and a rolling average of the number of active requests. If the number of requests being processed exceeds the number of CPUs for any significant length of time, your server might be receiving more requests than it can efficiently process. |
Request durations | This graph breaks down the average response times for different types of requests made by Puppet agents. This indicates how expensive catalog and report requests are compared to the other types of requests. It also provides a way to see changes in catalog compilation times when you modify your Puppet code. A sharp upward curve for all request types indicates an overloaded server; response times should trend downward after you reduce the load on the server. |
Request ratios | This graph shows how many requests of each type Puppet Server has handled. Under normal circumstances, you should see about the same number of catalog, node, and report requests, because these all happen one time per agent run. The number of file and file metadata requests correlates with how many remote file resources are in the agents' catalogs. |
External HTTP Communications | This graph tracks the amount of time it takes Puppet Server to send data and requests for common operations to, and receive responses from, external HTTP services, such as PuppetDB. |
File Sync | This graph tracks how long Puppet Server spends on File Sync operations, for both its storage and client services. |
JRubies | This graph tracks how many JRubies are in use, how many are free, the mean number of free JRubies, and the mean number of requested JRubies. If the number of free JRubies is often less than one, or the mean number of free JRubies is less than one, Puppet Server is requesting and consuming more JRubies than are available. This overload reduces Puppet Server's performance. While this might simply be a symptom of an under-resourced server, it can also be caused by poorly optimized Puppet code or bottlenecks in the server's communications with PuppetDB if it is in use. If catalog compilation times have increased but PuppetDB performance remains the same, examine your Puppet code for potentially unoptimized code. If PuppetDB communication times have increased, tune PuppetDB for better performance or allocate more resources to it. If neither catalog compilation nor PuppetDB communication times are degraded, the Puppet Server process might be under-resourced on your server. If you have available CPU time and memory, increase the number of JRuby instances to allow it to allocate more JRubies. Otherwise, consider adding additional compilers to distribute the catalog compilation load. |
JRuby Timers | This graph tracks several JRuby pool metrics. |
Memory Usage | This graph tracks how much heap and non-heap memory Puppet Server uses. |
Compilation | This graph breaks catalog compilation down into various phases to show how expensive each phase is on the primary server. |
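The troubleshooting guidance in the JRubies row can be read as a small decision procedure. The following sketch in Python restates it under one assumption the table does not spell out: when both compilation and PuppetDB communication times have increased, it recommends tuning PuppetDB first. The function name, argument names, and message strings are hypothetical.

```python
TUNE_PUPPETDB = "Tune PuppetDB or allocate more resources to it."
REVIEW_CODE = "Examine your Puppet code for unoptimized code."
ADD_JRUBIES = "Increase the number of JRuby instances."
ADD_COMPILERS = "Add compilers to distribute the compilation load."


def jruby_triage(compile_times_up: bool,
                 puppetdb_times_up: bool,
                 spare_cpu_and_memory: bool) -> str:
    """Suggest a next step when free JRubies are often below one."""
    if puppetdb_times_up:
        # Assumption: PuppetDB is checked first when both have degraded.
        return TUNE_PUPPETDB
    if compile_times_up:
        # Compilation is slower but PuppetDB is unchanged.
        return REVIEW_CODE
    # Neither has degraded, so Puppet Server itself is likely under-resourced.
    return ADD_JRUBIES if spare_cpu_and_memory else ADD_COMPILERS


print(jruby_triage(compile_times_up=False,
                   puppetdb_times_up=False,
                   spare_cpu_and_memory=True))
```

The three inputs correspond to what you can read from the Compilation graph, the External HTTP Communications graph, and your server's resource usage.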
Example Grafana dashboard excerpt
The following example shows only the targets parameter of a dashboard to demonstrate the full names of Puppet's exported Graphite metrics (assuming the Puppet Server instance has a domain of primary.example.com) and a way to add targets directly to an exported Grafana dashboard's JSON content.
"panels": [
    {
        "span": 4,
        "editable": true,
        "type": "graphite",
        ...
        "targets": [
            {
                "target": "alias(puppetlabs.primary.example.com.num-cpus,'num cpus')"
            },
            {
                "target": "alias(puppetlabs.primary.example.com.http.active-requests.count,'active requests')"
            },
            {
                "target": "alias(puppetlabs.primary.example.com.http.active-histo.mean,'average')"
            }
        ],
        "aliasColors": {},
        "aliasYAxis": {},
        "title": "Active Requests"
    }
]
See the sample Grafana dashboard for a detailed example of how a Grafana dashboard accesses these exported Graphite metrics.
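To add targets to an exported dashboard programmatically instead of typing them into the JSON, something like the following Python sketch could work. It assumes each target follows the alias(puppetlabs.<server id>.<metric>,'<label>') pattern shown in the excerpt. The helper names are hypothetical, and the panel dictionary here is a stripped-down stand-in for a real exported panel.

```python
import json


def graphite_target(server_id: str, metric: str, label: str) -> dict:
    """Build one "targets" entry in the alias(...) form used by the excerpt."""
    return {"target": f"alias(puppetlabs.{server_id}.{metric},'{label}')"}


def add_target(panel: dict, server_id: str, metric: str, label: str) -> dict:
    """Append a target to the panel in place and return the panel."""
    panel.setdefault("targets", []).append(
        graphite_target(server_id, metric, label)
    )
    return panel


# Stand-in panel; a real one would come from json.load() on the exported file.
panel = {"title": "Active Requests", "targets": []}
add_target(panel, "primary.example.com",
           "http.active-requests.count", "active requests")
print(json.dumps(panel, indent=2))
```

The server ID passed to the helper must match the metrics_server_id value configured for Puppet Server, because that value forms part of every exported metric name.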