Get started with Graphite
Graphite is a third-party monitoring application that stores real-time metrics and provides customizable ways to view them. Puppet Enterprise (PE) can export many metrics to Graphite. After enabling Graphite support, Puppet Server exports a set of metrics by default that is designed to be immediately useful to Puppet administrators.
The grafanadash and puppet-graphite modules are not Puppet-supported. We recommend using another method to view and manage Puppet Server metrics, such as the puppet_operational_dashboards module, our Splunk plugin, or the Metrics API.
- Install and configure a Graphite server.
- Enable Puppet Server's Graphite support.
- (Optional) Use the Grafana dashboard extension for Graphite to visualize metrics. To see a demonstration of this setup, use the grafanadash module.
Use the grafanadash module
Grafana provides a web-based, customizable, Graphite-compatible dashboard. The grafanadash module installs and configures a basic Graphite test instance with the Grafana extension. When installed on a Puppet agent, the module demonstrates how Graphite and Grafana can consume and display Puppet Server metrics.
- SELinux can cause issues with Graphite and Grafana, so the module temporarily disables SELinux. If you reboot the machine after using the grafanadash module to install Graphite, you must disable SELinux again and restart the Apache service before you can use Graphite and Grafana, as shown in the sketch after this list.
- The module disables the iptables firewall and enables cross-origin resource sharing on Apache, which are potential security risks.
For production monitoring, use the puppet_operational_dashboards module or the Metrics API.
Install the grafanadash module
Install the grafanadash module on a dedicated *nix agent. The module's grafanadash::dev class installs and configures a Graphite server, the Grafana extension, and a default dashboard.
- Install a dedicated *nix PE agent to serve as the Graphite server. For instructions, refer to Installing agents.
- As root on the agent node, run:
  sudo puppet module install puppetlabs-grafanadash
- As root on the agent node, run:
  sudo puppet apply -e 'include grafanadash::dev'
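When the puppet apply run finishes, you can confirm that the module is installed and that Grafana is answering on its default port. The following is a minimal sketch, run on the agent node, assuming the module's default Grafana port of 10000:

# Confirm the grafanadash module is installed.
sudo puppet module list | grep grafanadash
# Expect an HTTP response from Grafana on its default port.
curl -sI http://localhost:10000 | head -n 1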
Run Grafana
Grafana runs as a web-based dashboard, and the grafanadash module configures it to use port 10000 by default. To view Puppet Server metrics in Grafana, you must configure a metrics dashboard.
Enable Puppet Server's Graphite support
Use the PE Master node group in the Puppet Enterprise (PE) console to configure Puppet Server's metrics output settings.
- In the PE console, go to the PE Master node group.
- On the Classes tab, locate the puppet_enterprise::profile::master class and add these parameters:
  - Set metrics_graphite_enabled to true (the default is false).
  - Set metrics_server_id to the primary server hostname.
  - Set metrics_graphite_host to the hostname of the agent node where you're running Graphite and Grafana.
  - Set metrics_graphite_update_interval_seconds to an integer representing a number of seconds. This is the frequency at which Graphite updates; the default is 60 seconds.
- Verify that these parameters are set to their default values, unless your Graphite server uses a non-standard port:
  - Confirm metrics_jmx_enabled is set to true.
  - Confirm metrics_graphite_port is set to 2003 or the Graphite port on your Graphite server.
  - Confirm profiler_enabled is set to true.
- Commit changes.
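After you commit the changes and a Puppet run completes on the primary server, you can check that metrics are reaching Graphite. The following is a minimal sketch, assuming a Graphite server at graphite.example.com listening on the default plaintext port 2003 and a metrics_server_id of primary.example.com; substitute your own hostnames, port, and metric path.

# From the primary server, send an arbitrary test metric to Graphite's plaintext listener.
echo "puppetlabs.test.connectivity 1 $(date +%s)" | nc graphite.example.com 2003
# Query Graphite's render API for a Puppet Server metric from the last 10 minutes.
curl 'http://graphite.example.com/render?target=puppetlabs.primary.example.com.num-cpus&format=json&from=-10min'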
Sample Grafana dashboard graphs
In the Run Grafana steps, you used a JSON file to set up a sample Grafana dashboard. You can customize this dashboard by clicking the title of any graph and clicking Edit.
Graph name | Description |
---|---|
Active requests | This graph serves as a "health check" for the Puppet Server. It shows a flat line that represents the number of CPUs in your system, a metric that indicates the total number of HTTP requests actively being processed by the server at any moment in time, and a rolling average of the number of active requests. If the number of requests being processed exceeds the number of CPUs for any significant length of time, your server might be receiving more requests than it can efficiently process. |
Request durations | This graph breaks down the average response times for different types of requests made by Puppet agents. This indicates how expensive catalog and report requests are compared to other types of requests. It also provides a way to see changes in catalog compilation times when you modify your Puppet code. A sharp upward curve for all request types indicates an overloaded server. Expect these to trend downward after the server load is reduced. |
Request ratios | This graph shows how many requests of each type Puppet Server has handled. Under normal circumstances, you'll see about the same number of catalog, node, and report requests, because these all happen once per agent run. The number of file and file metadata requests correlates to how many remote file resources are in the agents' catalogs. |
External HTTP Communications | This graph tracks the amount of time it takes Puppet Server to send data and requests for common operations to, and receive responses from, external HTTP services, such as PuppetDB. |
File Sync | This graph tracks how long Puppet Server spends on File Sync operations, for both its storage and client services. |
JRubies | This graph tracks how many JRubies are in use, how many are free, the mean number of free JRubies, and the mean number of requested JRubies. If the number of free JRubies is often less than one, or the mean number of free JRubies is less than one, Puppet Server is requesting and consuming more JRubies than are available. This overload reduces Puppet Server's performance. While this might simply be a symptom of an under-resourced server, it can also be caused by poorly optimized Puppet code or bottlenecks in the server's communications with PuppetDB, if it is in use. If catalog compilation times have increased but PuppetDB performance remains the same, examine your Puppet code for potentially unoptimized code. If PuppetDB communication times have increased, tune PuppetDB for better performance or allocate more resources to it. If neither catalog compilation nor PuppetDB communication times are degraded, the Puppet Server process might be under-resourced on your server. If you have available CPU time and memory, increase the JRuby max active instances to allow it to allocate more JRubies. Otherwise, consider adding additional compilers to distribute the catalog compilation load. |
JRuby Timers | This graph tracks JRuby pool timer metrics. |
Memory Usage | This graph tracks how much heap and non-heap memory Puppet Server uses. |
Compilation | This graph breaks catalog compilation down into various phases to show how expensive each phase is on the primary server. |
Example Grafana dashboard excerpt
This excerpt shows the targets parameter of a dashboard panel. It demonstrates:
- The full names of Puppet's exported Graphite metrics
- A way to add targets directly to an exported Grafana dashboard's JSON content
In this example, the primary server's hostname is primary.example.com.
."panels": [
{
"span": 4,
"editable": true,
"type": "graphite",
...
"targets": [
{
"target": "alias(puppetlabs.primary.example.com.num-cpus,'num cpus')"
},
{
"target": "alias(puppetlabs.primary.example.com.http.active-requests.count,'active requests')"
},
{
"target": "alias(puppetlabs.primary.example.com.http.active-histo.mean,'average')"
}
],
"aliasColors": {},
"aliasYAxis": {},
"title": "Active Requests"
}
]
Refer to the Grafana dashboard JSON sample file for a complete, detailed example of how a Grafana dashboard accesses these exported Graphite metrics.
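If you want to find the full names of other exported metrics before adding them as targets, you can query Graphite's metric finder API. The following is a minimal sketch, assuming a Graphite server at graphite.example.com and a metrics_server_id of primary.example.com:

# List the metrics exported under your server id.
curl 'http://graphite.example.com/metrics/find?query=puppetlabs.primary.example.com.*'
# Drill into a subtree, for example the HTTP metrics used by the Active Requests graph.
curl 'http://graphite.example.com/metrics/find?query=puppetlabs.primary.example.com.http.*'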