Metrics endpoints

Puppet Server can track advanced metrics that give you additional insight into its performance and health.

The HTTPS metrics endpoints are available on port 8140 of the master server:

curl -k "https://<DNS NAME OF YOUR MASTER>:8140/status/v1/services?level=debug"

Note: These API endpoints are a tech preview. The metrics described here are returned only when passing the level=debug URL parameter, and the structure of the returned data might change in future versions.
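As a rough illustration of how such a request URL is assembled, the following Python sketch builds a status URL that includes the level=debug parameter. The helper name `metrics_url` and the hostname `puppet.example.com` are hypothetical; only the port, the path, and the parameter come from this page.

```python
from urllib.parse import urlencode

def metrics_url(host, service=None, port=8140):
    """Build a status URL that requests debug-level detail.

    `host` stands in for your master's DNS name. `service` is an optional
    service name such as "pe-jruby-metrics"; omit it to query all services.
    """
    path = "/status/v1/services"
    if service:
        path += "/" + service
    query = urlencode({"level": "debug"})
    return "https://{}:{}{}?{}".format(host, port, path, query)

print(metrics_url("puppet.example.com", "pe-jruby-metrics"))
```

The sketch only builds the URL. Sending the request, and deciding how to verify the server's certificate (the curl example uses -k to skip verification), is left to whichever HTTP client you use.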

These metrics fall into three categories:

JRuby metrics (/status/v1/services/pe-jruby-metrics)
HTTP route metrics (/status/v1/services/pe-master)
Catalog compilation profiler metrics (/status/v1/services/pe-puppet-profiler)

All of these metrics reflect data for the lifetime of the current Puppet Server process and reset whenever the service is restarted. Any time-related metrics report milliseconds unless otherwise noted.

Like the standard status endpoints, the metrics endpoints return machine-consumable information about running services. This JSON response includes the same keys returned by a standard status endpoint request (see JSON endpoints). Each endpoint also returns additional keys in an experimental section.

GET /status/v1/services/pe-jruby-metrics

The /status/v1/services/pe-jruby-metrics endpoint returns JSON containing information about the JRuby pools from which Puppet Server fulfills agent requests.

You must query it at port 8140 and append the level=debug URL parameter.

Query parameters

No parameters other than level are supported. If you omit level, the endpoint defaults to the critical status level and does not return the metrics described here.

Response codes

The server uses the following response codes:

200 if and only if all services report a status of running
503 if any service’s status is unknown or error

Response keys

The metrics are returned in two subsections of the experimental section: jruby-pool-lock-status and metrics.

The response's experimental/jruby-pool-lock-status section contains the following keys:


Key	Definition
`current-state`	The state of the JRuby pool lock, which should be either `:not-in-use` (unlocked), `:requested` (waiting for lock), or `:acquired` (locked).
`last-change-time`	The date and time of the last `current-state` update, formatted as an ISO 8601 combined date and time in UTC.
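A minimal Python sketch of reading these two keys follows. It assumes the timestamp always carries fractional seconds and a trailing Z, as in 2015-12-03T18:59:12.157Z, and the function names are hypothetical.

```python
from datetime import datetime, timezone

def parse_last_change_time(value):
    """Parse a timestamp such as "2015-12-03T18:59:12.157Z" into an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)

def is_pool_locked(lock_status):
    """Return True when the JRuby pool lock is currently held."""
    return lock_status["current-state"] == ":acquired"

lock_status = {
    "current-state": ":not-in-use",
    "last-change-time": "2015-12-03T18:59:12.157Z",
}
print(is_pool_locked(lock_status), parse_last_change_time(lock_status["last-change-time"]))
```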

The response's experimental/metrics section contains the following keys:


Key	Definition
`average-borrow-time`	The average amount of time a JRuby instance spends handling requests, calculated by dividing the total duration in milliseconds of the `borrowed-instances` value by the `borrow-count` value.
`average-free-jrubies`	The average number of JRuby instances that are not in use over the Puppet Server process’s lifetime.
`average-lock-held-time`	The average time the JRuby pool held a lock, starting when the value of `jruby-pool-lock-status/current-state` changed to `:acquired`. This time mostly represents the time that file sync spends syncing code into the live codedir, and is calculated by dividing the total length of time that Puppet Server held the lock by the value of `num-pool-locks`.
`average-lock-wait-time`	The average time Puppet Server spent waiting to lock the JRuby pool, starting when the value of `jruby-pool-lock-status/current-state` changed to `:requested`. This time mostly represents how long Puppet Server takes to fulfill agent requests, and is calculated by dividing the total length of time that Puppet Server waits for locks by the value of `num-pool-locks`.
`average-requested-jrubies`	The average number of requests waiting on an available JRuby instance over the Puppet Server process’s lifetime.
`average-wait-time`	The average time Puppet Server spends waiting to reserve an instance from the JRuby pool, calculated by dividing the total duration in milliseconds of `requested-instances` by the `requested-count` value.
`borrow-count`	The total number of times a JRuby instance has been borrowed from the pool.
`borrow-retry-count`	The total number of times that a borrow attempt failed and was retried, such as when the JRuby pool is flushed while a borrow attempt is pending.
`borrow-timeout-count`	The number of requests that were not served because they timed out while waiting for a JRuby instance.
`borrowed-instances`	A list of the JRuby instances currently in use, each reporting: `duration-millis`: The length of time that the instance has been running. `reason/request`: A hash of details about the request being served, containing `request-method` (the HTTP request method, such as POST, GET, PUT, or DELETE), `route-id` (the route being served; for routing metrics, see the HTTP metrics endpoint), and `uri` (the request’s full URI). `time`: The time (in milliseconds since the Unix epoch) when the JRuby instance was borrowed.
`num-free-jrubies`	The number of JRuby instances in the pool that are ready to be used.
`num-jrubies`	The total number of JRuby instances.
`num-pool-locks`	The total number of times the JRuby pools have been locked.
`requested-count`	The number of JRuby instances borrowed, waiting, or that have timed out.
`requested-instances`	A list of the requests waiting to be served, each reporting: `duration-millis`: The length of time the request has waited. `reason/request`: A hash of details about the waiting request, containing `request-method` (the HTTP request method, such as POST, GET, PUT, or DELETE), `route-id` (the route being served; for routing metrics, see the HTTP metrics endpoint), and `uri` (the request’s full URI). `time`: The time (in milliseconds since the Unix epoch) when Puppet Server received the request.
`return-count`	The total number of times a JRuby instance has been returned to the pool after use.
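One way to use the per-entry keys of `borrowed-instances` and `requested-instances` is to flag entries that have been running or waiting for a long time. The sketch below is illustrative only: the function name and the threshold are hypothetical, and it assumes each entry nests its request details under `reason` and then `request`.

```python
from datetime import datetime, timezone

def slow_entries(entries, threshold_millis):
    """Return (uri, duration-millis, start time) for entries over the threshold."""
    slow = []
    for entry in entries:
        if entry["duration-millis"] > threshold_millis:
            # `time` is in milliseconds since the Unix epoch.
            started = datetime.fromtimestamp(entry["time"] / 1000, tz=timezone.utc)
            request = entry["reason"]["request"]
            slow.append((request["uri"], entry["duration-millis"], started))
    return slow

borrowed = [
    {
        "duration-millis": 3972,
        "reason": {"request": {
            "request-method": "post",
            "route-id": "puppet-v3-catalog-/*/",
            "uri": "/puppet/v3/catalog/hostname.example.com",
        }},
        "time": 1448478371406,
    }
]
for uri, millis, started in slow_entries(borrowed, threshold_millis=1000):
    print(uri, millis, started.isoformat())
```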

For example:

"pe-jruby-metrics": {
    "detail_level": "debug",
    "service_status_version": 1,
    "service_version": "2.2.22",
    "state": "running",
    "status": {
        "experimental": {
            "jruby-pool-lock-status": {
                "current-state": ":not-in-use",
                "last-change-time": "2015-12-03T18:59:12.157Z"
            },
            "metrics": {
                "average-borrow-time": 292,
                "average-free-jrubies": 0.4716243097301104,
                "average-lock-held-time": 1451,
                "average-lock-wait-time": 0,
                "average-requested-jrubies": 0.21324752542875958,
                "average-wait-time": 156,
                "borrow-count": 639,
                "borrow-retry-count": 0,
                "borrow-timeout-count": 0,
                "borrowed-instances": [
                    {
                        "duration-millis": 3972,
                        "reason": {
                            "request": {
                                "request-method": "post",
                                "route-id": "puppet-v3-catalog-/*/",
                                "uri": "/puppet/v3/catalog/hostname.example.com"
                            }
                        },
                        "time": 1448478371406
                    }
                ],
                "num-free-jrubies": 0,
                "num-jrubies": 1,
                "num-pool-locks": 2849,
                "requested-count": 640,
                "requested-instances": [
                    {
                        "duration-millis": 3663,
                        "reason": {
                            "request": {
                                "request-method": "put",
                                "route-id": "puppet-v3-report-/*/",
                                "uri": "/puppet/v3/report/hostname.example.com"
                            }
                        },
                        "time": 1448478371715
                    }
                ],
                "return-count": 638
            }
        }
    }
}
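The counters in the metrics section can be combined into a quick picture of pool pressure. The following Python sketch is one possible summary, not an official calculation: the function name is hypothetical, and reading `borrow-count` minus `return-count` as the number of borrows still outstanding is an assumption that happens to match the example values (639 and 638, with one borrowed instance).

```python
def pool_summary(metrics):
    """Summarize JRuby pool pressure from the experimental/metrics section."""
    total = metrics["num-jrubies"]
    in_use = total - metrics["num-free-jrubies"]
    return {
        "in-use": in_use,
        # Fraction of the pool that is busy right now (0.0 if the pool is empty).
        "utilization": in_use / total if total else 0.0,
        # Assumption: borrows not yet returned correspond to instances still in use.
        "outstanding-borrows": metrics["borrow-count"] - metrics["return-count"],
        "waiting-requests": len(metrics["requested-instances"]),
    }

# Values taken from the example response; unrelated keys are omitted.
example_metrics = {
    "num-jrubies": 1,
    "num-free-jrubies": 0,
    "borrow-count": 639,
    "return-count": 638,
    "requested-instances": [{"duration-millis": 3663}],
}
print(pool_summary(example_metrics))
```

A pool that stays fully utilized while requests are waiting suggests that agents are queuing for JRuby instances.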

GET /status/v1/services/pe-master

The /status/v1/services/pe-master endpoint returns JSON containing information about the routes that agents use to connect to this server.

You must query it at port 8140 and append the level=debug URL parameter.

Query parameters

No parameters other than level are supported. If you omit level, the endpoint defaults to the critical status level and does not return the metrics described here.

Response codes

The server uses the following response codes:

200 if and only if all services report a status of running
503 if any service’s status is unknown or error

Response keys

The response's experimental/http-metrics section contains a list of routes, each containing the following keys:


Key	Definition
`aggregate`	The total time Puppet Server spent processing requests for this route.
`count`	The total number of requests Puppet Server processed for this route.
`mean`	The average time Puppet Server spent on each request for this route, calculated by dividing the `aggregate` value by the `count`.
`route-id`	The route being served. The request returns a route with the special `route-id` of "total", which represents the aggregate data for all requests along all routes.

Routes for newer versions of Puppet Enterprise and newer agents are prefixed with puppet-v3, while Puppet Enterprise 3 agents' routes are not. For example, a PE 2017.3 route-id might be puppet-v3-report-/*/, while the equivalent PE 3 agent's route-id is :environment-report-/*/.
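To separate the two naming schemes when processing route metrics, a prefix check is enough. This Python sketch is a simple illustration; the function name and the category labels are hypothetical, and it treats every route-id without the puppet-v3 prefix as "other" rather than assuming it belongs to a PE 3 agent.

```python
def classify_route(route_id):
    """Label a route-id as the aggregate row, a puppet-v3 route, or something else."""
    if route_id == "total":
        return "total"
    if route_id.startswith("puppet-v3"):
        return "puppet-v3"
    return "other"

for route_id in ("total", "puppet-v3-report-/*/", ":environment-report-/*/"):
    print(route_id, "->", classify_route(route_id))
```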

For example:

"pe-master": {
    {...},
    "status": {
        "experimental": {
            "http-metrics": [
                {
                    "aggregate": 70668,
                    "count": 234,
                    "mean": 302,
                    "route-id": "total"
                },
                {
                    "aggregate": 28613,
                    "count": 13,
                    "mean": 2201,
                    "route-id": "puppet-v3-catalog-/*/"
                },
                {...}
            ]
        }
    }
}
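The relationship between `aggregate`, `count`, and `mean` can be checked directly, and the same list can be sorted to find the routes that consume the most time. The Python sketch below uses the two entries from the example; the function names are hypothetical.

```python
def busiest_routes(http_metrics, limit=5):
    """Return per-route entries sorted by total processing time, skipping "total"."""
    routes = [r for r in http_metrics if r["route-id"] != "total"]
    return sorted(routes, key=lambda r: r["aggregate"], reverse=True)[:limit]

def recomputed_mean(route):
    """Recompute the mean as aggregate / count (0.0 when no requests were seen)."""
    return route["aggregate"] / route["count"] if route["count"] else 0.0

http_metrics = [
    {"aggregate": 70668, "count": 234, "mean": 302, "route-id": "total"},
    {"aggregate": 28613, "count": 13, "mean": 2201, "route-id": "puppet-v3-catalog-/*/"},
]
for route in busiest_routes(http_metrics):
    print(route["route-id"], recomputed_mean(route))
```

For the example values, 70668 / 234 is 302 and 28613 / 13 is 2201, matching the reported `mean` values.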

GET /status/v1/services/pe-puppet-profiler

The /status/v1/services/pe-puppet-profiler endpoint returns JSON containing statistics about catalog compilation. You can use this data to discover which functions or resources take the most time to handle or are most frequently used.

You must query it at port 8140 and append the level=debug URL parameter.

The Puppet Server profiler is enabled by default, but if it has been disabled, this endpoint's metrics are not available. Instead, the endpoint returns the same keys returned by a standard status endpoint request and an empty status key.
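A client that consumes this endpoint may want to detect the disabled case before reading any metrics. The following Python sketch shows one defensive check. The function name is hypothetical, and the exact shape of an "empty status key" (shown here as an empty object) is an assumption.

```python
def profiler_metrics_available(service_response):
    """Return True if the response carries profiler metrics.

    Tolerates a missing, null, or empty `status` key, which is how this
    sketch assumes a disabled profiler appears.
    """
    status = service_response.get("status") or {}
    experimental = status.get("experimental") or {}
    return "function-metrics" in experimental or "resource-metrics" in experimental

enabled = {"state": "running", "status": {"experimental": {"function-metrics": [], "resource-metrics": []}}}
disabled = {"state": "running", "status": {}}
print(profiler_metrics_available(enabled), profiler_metrics_available(disabled))
```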

Query parameters

No parameters other than level are supported. If you omit level, the endpoint defaults to the critical status level and does not return the metrics described here.

Response codes

The server uses the following response codes:

200 if and only if all services report a status of running
503 if any service’s status is unknown or error

Response keys

If the profiler is enabled, the response returns two subsections in the experimental section:

experimental/function-metrics, containing statistics about functions evaluated by Puppet Server when compiling catalogs.
experimental/resource-metrics, containing statistics about resources declared in manifests compiled by Puppet Server.

Each function measured in the function-metrics section also has a function key containing the function's name, and each resource measured in the resource-metrics section has a resource key containing the resource's name.

The two sections otherwise share these keys:


Key	Definition
`aggregate`	The total time spent handling this function call or resource during catalog compilation.
`count`	The number of times Puppet Server has called the function or instantiated the resource during catalog compilation.
`mean`	The average time spent handling this function call or resource during catalog compilation, calculated by dividing the `aggregate` value by the `count`.

For example:

"pe-puppet-profiler": {
    {...},
    "status": {
        "experimental": {
            "function-metrics": [
                {
                    "aggregate": 1628,
                    "count": 407,
                    "function": "include",
                    "mean": 4
                },
                {...}
            ],
            "resource-metrics": [
                {
                    "aggregate": 3535,
                    "count": 5,
                    "mean": 707,
                    "resource": "Class[Puppet_enterprise::Profile::Console]"
                },
                {...}
            ]
        }
    }
}
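To find the functions or resources that account for the most compilation time, the entries in either section can be ranked by `aggregate`. The Python sketch below is illustrative: the helper name is hypothetical, and the second function entry (`hiera`) is made-up sample data rather than output from a real server.

```python
def top_by_aggregate(entries, name_key, limit=3):
    """Rank profiler entries by total time; `name_key` is "function" or "resource"."""
    ranked = sorted(entries, key=lambda e: e["aggregate"], reverse=True)
    return [(e[name_key], e["aggregate"], e["count"]) for e in ranked[:limit]]

function_metrics = [
    {"aggregate": 1628, "count": 407, "function": "include", "mean": 4},
    # Hypothetical entry added for illustration only.
    {"aggregate": 2100, "count": 30, "function": "hiera", "mean": 70},
]
resource_metrics = [
    {"aggregate": 3535, "count": 5, "mean": 707,
     "resource": "Class[Puppet_enterprise::Profile::Console]"},
]
print(top_by_aggregate(function_metrics, "function"))
print(top_by_aggregate(resource_metrics, "resource"))
```

Ranking by `aggregate` highlights total cost, while ranking by `mean` would instead highlight calls that are individually expensive but possibly rare.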