Monitoring current infrastructure state
When nodes fetch their configurations from the Puppet master, they send back inventory data and a report of their run. This information is summarized on the Status page in the console.
The Status page displays the most recent run status of each of your nodes so you can quickly find issues and diagnose their causes. You can also use this page to gather essential information about your infrastructure at a glance, such as how many nodes your Puppet master is managing, and whether any nodes are unresponsive.
Node run statuses
The Status page displays the run status of each node following the most recent Puppet run. There are 10 possible run statuses.
Nodes run in enforcement mode
- With failures
- This node’s last Puppet run failed, or Puppet encountered an error that prevented it from making changes.
- With corrective changes
- During the last Puppet run, Puppet found inconsistencies between the last applied catalog and this node’s configuration, and corrected those inconsistencies to match the catalog.
Note: Corrective change reporting is available only on agent nodes running PE 2016.4 and later. Agents running earlier versions report all change events as "with intentional changes."
- With intentional changes
- During the last Puppet run, changes to the catalog were successfully applied to the node.
- Unchanged
- This node's last Puppet run was successful, and it was fully compliant. No changes were necessary.
Nodes run in no-op mode
- With failures
- This node’s last Puppet run in no-op mode failed, or Puppet encountered an error that prevented it from simulating changes.
- Would have corrective changes
- During the last Puppet run, Puppet found inconsistencies between the last applied catalog and this node’s configuration, and would have corrected those inconsistencies to match the catalog.
- Would have intentional changes
- During the last Puppet run, catalog changes would have been applied to the node.
- Would be unchanged
- This node’s last Puppet run was successful, and the node was fully compliant. No changes would have been necessary.
Nodes not reporting
- Unresponsive
- The node hasn't reported to the Puppet master recently. Something might be wrong. The cutoff for considering a node unresponsive defaults to one hour, and can be configured via the puppet_enterprise::console_services::no_longer_reporting_cutoff parameter. See Configure the PE console and console-services for more information.
- Have no reports
- Although Puppet Server is aware of this node's existence, the node has never submitted a Puppet report for one or more of the following reasons: it's a newly commissioned node; it has never come online; or its copy of Puppet is not configured correctly.
Note: Expired or deactivated nodes are displayed on the Status page for seven days. To extend the amount of time that you can view or search for these nodes, change the node-ttl setting in PuppetDB. Changing this setting affects resources and exported resources.
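The two "not reporting" statuses can be sketched as a small classification function. This is an illustration only, not how the console is implemented: the function name, the "reporting" return value, and the choice of a strict "greater than" comparison at the cutoff boundary are assumptions. Only the one-hour default comes from the description above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Default cutoff described above: one hour without a report.
DEFAULT_CUTOFF = timedelta(hours=1)


def reporting_state(last_report: Optional[datetime],
                    now: datetime,
                    cutoff: timedelta = DEFAULT_CUTOFF) -> str:
    """Classify a node by how recently it reported (hypothetical helper)."""
    if last_report is None:
        # The node is known but has never submitted a report.
        return "have no reports"
    if now - last_report > cutoff:
        # Assumption: a node exactly at the cutoff still counts as reporting.
        return "unresponsive"
    return "reporting"


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(reporting_state(now - timedelta(minutes=90), now))
```

Passing a larger `cutoff` mimics raising the no_longer_reporting_cutoff parameter: the same 90-minute-old report would then no longer count as unresponsive.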
Special categories
In addition to reporting the run status of each node, the Status page provides a secondary count of nodes that fall into special categories.
- Intended catalog failed
- During the last Puppet run, the intended catalog for this node failed, so Puppet substituted a cached catalog, as per your configuration settings.
- Enforced resources found
- During the last Puppet run in no-op mode, one or more resources were enforced, as per your use of the noop => false metaparameter setting.
How Puppet determines node run statuses
Puppet uses a hierarchical system to determine a single run status for each node. This system gives higher priority to the activity types most likely to cause problems in your deployment, so you can focus on the nodes and events most in need of attention.
During a Puppet run, several activity types might occur on a single node. A node's run status reflects the activity with the highest alert level, regardless of how many events of each type took place during the run. Failure events receive the highest alert level, and no change events receive the lowest.
Definitely happened | Might also have happened |
---|---|
Failure | Corrective change, intentional change, no change |
Corrective change | Intentional change, no change |
Intentional change | No change |
No change | |
For example, during a Puppet run in enforcement mode, a node with 100 resources receives intentional changes on 30 resources, corrective changes on 10 resources, and no changes on the remaining 60 resources. This node's run status is "with corrective changes."
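The priority rule in this example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Puppet's implementation: the function name and the dictionary of per-activity event counts are hypothetical, and only enforcement mode is modeled.

```python
# Activity types ordered from highest to lowest alert level.
PRIORITY = ["failure", "corrective change", "intentional change", "no change"]

# Enforcement-mode status labels, as listed under "Nodes run in enforcement mode".
STATUS_LABELS = {
    "failure": "with failures",
    "corrective change": "with corrective changes",
    "intentional change": "with intentional changes",
    "no change": "unchanged",
}


def enforcement_run_status(event_counts):
    """Return the status of the highest-priority activity that occurred.

    event_counts is a hypothetical mapping of activity type to the number
    of resources affected; the counts themselves do not affect the result.
    """
    for activity in PRIORITY:
        if event_counts.get(activity, 0) > 0:
            return STATUS_LABELS[activity]
    return STATUS_LABELS["no change"]


# The 100-resource node from the example above.
example = {"intentional change": 30, "corrective change": 10, "no change": 60}
print(enforcement_run_status(example))  # prints "with corrective changes"
```

Note that the 10 corrective changes outrank the 30 intentional ones: the status depends on which activity types occurred, not on how many events of each type there were.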
Node run statuses also prioritize run mode (either enforcement or no-op) over the state of individual resources. This means that a node run in no-op mode is always reported in the Nodes run in no-op column, even if some of its resource changes were enforced. Suppose the no-op flags on a node's resources are all set to false. Changes to the resources are enforced, not simulated. Even so, because it is run in no-op mode, the node's run status is "would have intentional changes."
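As a rough sketch of this rule, the run mode can be treated as the first thing that selects the column and the set of status labels, with enforced resources surfacing only as a special category. The function and field names here are hypothetical; the labels are the ones listed on this page.

```python
from typing import Dict, List

ENFORCEMENT_LABELS = {
    "failure": "with failures",
    "corrective change": "with corrective changes",
    "intentional change": "with intentional changes",
    "no change": "unchanged",
}

NOOP_LABELS = {
    "failure": "with failures",
    "corrective change": "would have corrective changes",
    "intentional change": "would have intentional changes",
    "no change": "would be unchanged",
}


def describe_run(noop_run: bool, highest_activity: str,
                 enforced_in_noop: bool = False) -> Dict[str, object]:
    """Pick the column and status from the run mode first (hypothetical helper)."""
    labels = NOOP_LABELS if noop_run else ENFORCEMENT_LABELS
    special: List[str] = []
    if noop_run and enforced_in_noop:
        # Resources with noop => false were enforced during a no-op run.
        special.append("Enforced resources found")
    return {
        "column": "Nodes run in no-op mode" if noop_run else "Nodes run in enforcement mode",
        "status": labels[highest_activity],
        "special_categories": special,
    }


# A no-op run whose resource changes were all enforced.
print(describe_run(True, "intentional change", enforced_in_noop=True))
```

Even with every change enforced, the sketch reports the node under the no-op column with the status "would have intentional changes", which matches the behavior described above.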
Filtering nodes on the Status page
You can filter the list of nodes displayed on the Status page by run status and by node fact. If you set a run status filter, and also set a node fact filter, the table takes both filters into account, and shows only those nodes matching both filters.
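The way the two filters combine can be sketched as a logical AND. The node records below are a hypothetical structure invented for illustration; the console does not expose this data in this form.

```python
def filter_nodes(nodes, run_status=None, fact_filter=None):
    """Return certnames of nodes matching every filter that is set.

    nodes is a list of hypothetical dicts with "certname", "status",
    and "facts" keys. A filter left as None is not applied.
    """
    matches = []
    for node in nodes:
        if run_status is not None and node["status"] != run_status:
            continue
        if fact_filter and any(node["facts"].get(name) != value
                               for name, value in fact_filter.items()):
            continue
        matches.append(node["certname"])
    return matches


nodes = [
    {"certname": "web01", "status": "with failures", "facts": {"os.name": "CentOS"}},
    {"certname": "web02", "status": "unchanged", "facts": {"os.name": "CentOS"}},
    {"certname": "db01", "status": "with failures", "facts": {"os.name": "Ubuntu"}},
]

# Only web01 has failures and runs CentOS.
print(filter_nodes(nodes, run_status="with failures", fact_filter={"os.name": "CentOS"}))
```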
Clicking Remove filter removes all filters currently in effect.
The filters you set are persistent. If you set run status or fact filters on the Status page, they continue to be applied to the table until they're changed or removed, even if you navigate to other pages in the console or log out. The persistent storage is associated with the browser tab, not your user account, and is cleared when you close the tab.
Filter by node run status
The status counts section at the top of the Status page shows a summary of the number of nodes with each run status as of the last Puppet run. Filter nodes by run status to quickly focus on nodes with failures or change events.
Filter by node fact
You can create a highly specific list of nodes for further investigation by using the fact filter tool.
For example, you can check that nodes you've updated have successfully changed, or find out the operating systems or IP addresses of a set of failed nodes to better understand the failure. You might also filter by facts to fulfill an auditor's request for information, such as the number of nodes running a particular version of software.
Filtering nodes in your node list
Filter your node list by node name or by PQL query to more easily inspect your nodes.
Filter your node list by node name
Filter your node list by node name to inspect the matching nodes as a group.
Filter your nodes by PQL query
Filter your node list using a common PQL query.
Filtering your node list by PQL query enables you to manage nodes by specific factors, such as operating system, report status, or class.
- Enter a query that selects the target you want. See the Puppet Query Language (PQL) reference for more information.
- Click Common queries. Select one of the queries and replace the defaults in the braces ({ }) with values that specify the target you want.

Target | PQL query |
---|---|
All nodes | nodes[certname] { } |
Nodes with a specific resource (example: httpd) | resources[certname] { type = "Service" and title = "httpd" } |
Nodes with a specific fact and value (example: OS name is CentOS) | inventory[certname] { facts.os.name = "<OS>" } |
Nodes with a specific report status (example: last run failed) | reports[certname] { latest_report_status = "failed" } |
Nodes with a specific class (example: Apache) | resources[certname] { type = "Class" and title = "Apache" } |
Nodes assigned to a specific environment (example: production) | nodes[certname] { catalog_environment = "production" } |
Nodes with a specific version of a resource type (example: OpenSSL is v1.0.1e) | resources[certname] { type = "Package" and title = "openssl" and parameters.ensure = "1.0.1e-51.el7_2.7" } |
Nodes with a specific resource and operating system (example: httpd and CentOS) | inventory[certname] { facts.operatingsystem = "CentOS" and resources { type = "Service" and title = "httpd" } } |
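The same PQL strings can also be sent to PuppetDB outside the console. The sketch below only builds a request URL for PuppetDB's root query endpoint (`/pdb/query/v4`), which accepts a PQL string in its `query` parameter. It does not send the request. The host name and port are assumptions for illustration, and a real request typically also needs the appropriate certificates or token.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def pql_query_url(base_url: str, pql: str) -> str:
    """Build a URL for PuppetDB's root query endpoint (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/pdb/query/v4?{urlencode({'query': pql})}"


# Hypothetical PuppetDB host; replace with your own.
pql = 'nodes[certname] { catalog_environment = "production" }'
url = pql_query_url("https://puppetdb.example.com:8081", pql)

# Decoding the query string returns the original PQL unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == pql)
```

URL-encoding matters here because PQL uses spaces, quotes, braces, and brackets, none of which are safe to place in a query string as-is.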
Monitor PE services
PE includes console and command line tools for monitoring the status of core services.
Component or service | Status monitor | Status command |
---|---|---|
Activity service | ✓ | ✓ |
Agentless Catalog Executor (ACE) service | ✓ | |
Bolt service | ✓ | |
Classifier service | ✓ | ✓ |
Code Manager service | ✓ | ✓ |
Orchestrator service | ✓ | ✓ |
Puppet Communications Protocol (PCP) broker | ✓ | |
PostgreSQL | ✓ | |
Puppet Server | ✓ | ✓ |
PuppetDB | ✓ | ✓ |
Role-based access control (RBAC) service | ✓ | ✓ |
View the Puppet Services status monitor
The Puppet Services status monitor provides a visual overview of the current state of core services, and can be used to quickly determine whether an unresponsive or restarting service is causing an issue with your deployment.
- In the console, click Status.
- Click Puppet Services status to open the monitor.
puppet infrastructure status command
The puppet infrastructure status command displays errors and alerts from PE components and services.
The command reports separately on the master and any compilers or replicas in your environment. You must run the command as root.
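If you want to call the command from a script, a thin wrapper might look like the following sketch. It assumes the puppet executable is on the PATH and a Unix-like system. The function name and its `run` and `euid` parameters are hypothetical, added so the wrapper can be exercised without a PE installation. No command flags are used, and the output is returned unparsed because its format is not described here.

```python
import os
import subprocess

STATUS_COMMAND = ["puppet", "infrastructure", "status"]


def infrastructure_status(run=subprocess.run, euid=None):
    """Run the status command and return its raw output (hypothetical wrapper)."""
    if euid is None:
        euid = os.geteuid()
    if euid != 0:
        # The command must be run as root, so fail early with a clear message.
        raise PermissionError("puppet infrastructure status must be run as root")
    result = run(STATUS_COMMAND, capture_output=True, text=True, check=False)
    return result.stdout
```

Calling `infrastructure_status()` as root on a node where PE is installed would return whatever the command prints; any other user gets a `PermissionError` before the command is attempted.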