November 5, 2021

Getting to know the Puppet Query Language (PQL)

You may have already guessed, but PQL stands for Puppet Query Language. Much like SQL, PQL lets you search a database (in this case PuppetDB) and return the raw data stored in it. This article walks through what PQL is and shows a few different ways to use it.

What is Puppet PQL?

PQL, or "Puppet Query Language," is an easy-to-use syntax that acts as a wrapper for making REST API calls to PuppetDB. Without it, you would write your queries in AST syntax and send them to the PuppetDB FQDN on port 8081 with authentication, or to localhost:8080 without authentication, which can get messy and difficult to use... and even more difficult to debug.
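To make the contrast concrete, here is a rough sketch of the same question ("which nodes are in the production environment?") asked both ways. The helper name query_ast is made up for this illustration, and the sketch assumes you are on the PuppetDB host, where the unauthenticated localhost:8080 port is reachable. Nothing is sent until you call the helper yourself.

```shell
# AST form: a JSON array sent to the nodes endpoint of the REST API.
ast_query='["extract", ["certname"], ["=", "catalog_environment", "production"]]'

# PQL form of the same question.
pql_query="nodes[certname]{catalog_environment = 'production'}"

# Hypothetical helper (not part of Puppet): send an AST query to the
# unauthenticated local port. Assumes curl is installed.
query_ast() {
  curl -sS -G "http://localhost:8080/pdb/query/v4/nodes" \
    --data-urlencode "query=$1"
}

# Usage (not run here):
#   query_ast "$ast_query"
#   puppet query "$pql_query"
```

Both forms should return the same certnames; the PQL version is simply easier to read and to type.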

You’re probably asking, how can I use it? Well, you can read the documentation for more information, but for now let’s continue reading this blog.

How Do I Use PQL?

First, you will need a token. If you’re using Code Manager you should already have a valid token. If the existing token doesn’t have PuppetDB query permissions, follow the documentation to create one. 

It’s important to note that the default lifespan of a token is five minutes, so you’ll want to request a longer lifetime when you log in: puppet access login -l 1d (1d = 1 day and 1y = 1 year). Now that we’ve discussed how to get access, let’s show how to actually use it.

Below is the basic syntax of a Puppet query:

puppet query "<endpoint>[return value(s)]{filter criteria}"

This should be run on the MoM (the master of masters, your primary Puppet server). However, you can install PE Client Tools on any Puppet agent to use it there too. You can get all the details in the documentation for Client Tools installation.

Puppet query

The command itself: an easy-to-use wrapper that saves you from curling the API call directly to the DB.

Endpoint

This is where you choose the endpoint that lives under /pdb/query/v4. The available endpoints are listed in the menu bar of the PuppetDB documentation under "Query API Version 4."

Return value(s)

In this section, we select the information we want to return from the endpoint we have selected.

Filter criteria

Here we can narrow down the results with a filter clause. (Check out the Examples section at the bottom of this blog post for guidance on and examples of queries.)
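If it helps to see the three parts slot together, here is a tiny sketch that assembles a query string from an endpoint, the return values and the filter criteria. The build_pql function is purely illustrative (it is not a Puppet command); it only prints the string you would hand to puppet query.

```shell
# Hypothetical helper: glue the three parts into <endpoint>[<fields>]{<filter>}.
build_pql() {
  endpoint="$1"   # e.g. nodes, facts, resources
  fields="$2"     # e.g. certname (may be empty to return everything)
  filter="$3"     # e.g. catalog_environment = 'production'
  printf '%s[%s]{%s}\n' "$endpoint" "$fields" "$filter"
}

query="$(build_pql nodes certname "catalog_environment = 'production'")"
echo "$query"

# Usage (not run here):
#   puppet query "$query"
```

Thinking of a query as these three slots makes the longer examples later in this post much easier to read.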

Why Should I Use PQL?

The console is populated by numerous API calls made directly to PuppetDB and to PostgreSQL, but PQL can be used to expand on this and provide further, more detailed information.

At this point you’re probably thinking “Hey, PQL is pretty cool, but what would I use it for?” Let me answer your question. The beauty of PQL is that it can be used for almost anything PuppetDB related.

You want to limit a task run to a certain subset of nodes? You want to find how many and which nodes have a certain fact value? You want to see how much of your license count you’re currently using? PQL to the rescue.
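As a hedged illustration of the first use case, the sketch below feeds a PQL query to puppet task run through its --query option. It assumes the service task is available in your installation and that your token is allowed to run tasks; the wrapper name restart_on_matches is invented for this example. The query itself is the RedHat fact query from the examples later in this post.

```shell
# The query must return certnames so the orchestrator knows which nodes to target.
target_query="facts[certname]{name='osfamily' and value='RedHat'}"

# Hypothetical wrapper: restart a service on every node the query matches.
# Assumes the "service" task exists and the current token may run tasks.
restart_on_matches() {
  service_name="$1"
  puppet task run service action=restart name="$service_name" \
    --query "$target_query"
}

# Usage (not run here):
#   restart_on_matches sshd
echo "$target_query"
```

Running the query on its own with puppet query first is a cheap way to check which nodes you are about to touch.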

Honestly, we couldn’t list all of the uses of PQL (there are too many!), but below you’ll find some examples that should provide inspiration.

Examples of How to Use PQL

Basic queries

These basic queries hit a single endpoint based on certain criteria specified.

Certnames of nodes (inactive and active)

puppet query "nodes[certname]{node_state = 'inactive' or node_state = 'active'}"

Total number of nodes where an agent ran in the Production environment

puppet query "nodes[count()]{catalog_environment = 'production'}"

The latest report for a node

puppet query "reports{latest_report? = true and certname = '<certname>'}"

Return the certnames of nodes that have not checked into the server within the last hour

puppet query "reports[certname]{ latest_report? = true and receive_time < \"$(date -u -d'1 hours ago' '+%Y-%m-%dT%TZ')\"}"
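The escaped quotes in that one-liner are easy to get wrong, so here is one possible way to split it up: compute the cutoff timestamp first, then drop it into the query. This sketch assumes GNU date (the -d option is not available in every date implementation), just like the command above.

```shell
# Timestamp for one hour ago, in UTC, in the ISO 8601 form PuppetDB expects.
# Assumes GNU date.
cutoff="$(date -u -d '1 hour ago' '+%Y-%m-%dT%H:%M:%SZ')"

# Same query as above, with the timestamp already expanded.
query="reports[certname]{latest_report? = true and receive_time < '${cutoff}'}"
echo "$query"

# Usage (not run here):
#   puppet query "$query"
```

Echoing the query before running it also gives you something concrete to look at when a result seems off.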

Check the file resource that manages the PE license

puppet query "resources{title = '/etc/puppetlabs/license.key'}"

List all resources on a particular node

puppet query "resources[]{certname = '<certname>'}"

Return the certname, timestamp, resource type, resource title, message, and old and new values from events where the status of the event was success

puppet query "events[certname, timestamp, resource_type, resource_title, message, old_value, new_value]{latest_report? = true and status='success'}"

List all events that occurred on a node during runs that started between 9am and 11am Zulu time (UTC) on 14/08/18

puppet query "events[]{certname = '$(puppet config print certname)' and (run_start_time >= '2018-08-14T09:00:00Z' and run_start_time <= '2018-08-14T11:00:00Z')}"

Return the count of nodes where the osfamily is RedHat

puppet query "facts[count()]{name='osfamily' and value='RedHat'}"

Return the certnames of all nodes where the osfamily is RedHat

puppet query "facts[certname]{name='osfamily' and value='RedHat'}"

Return the certname of all nodes that are running Puppet version 5.5.3

puppet query "facts[certname]{name='puppetversion' and value='5.5.3'}"

Return the certnames of nodes that have a specified class applied (the MCO class in this example)

puppet query "resources[certname]{type = 'Class' and title = 'Puppet_enterprise::Mcollective::Service'}"

Nested queries

These are more complex queries that nest and hit more than one endpoint to return more specific data.

This query first returns nodes in the infrastructure, then uses a second query to filter results. The second query hits the fact_contents endpoint to return all nodes that possess a fact called test_fact. These are then negated from the first query using the ‘!’ notation. This will return all nodes that do not have a specified fact present (“test_fact” in this instance).

puppet query 'nodes[certname]{ ! certname in fact_contents[certname]{name ~ "test_fact"}}'

This query returns events that occurred within a report, but also hits the reports endpoint to filter the results to only those where noop = false.

puppet query "events[certname, timestamp, resource_type, resource_title, message, old_value, new_value]{ report in reports[hash]{ noop = false }}"

This again returns all nodes before using a nested query against the resources endpoint to find all nodes with a service resource matching postgresqld. It then negates them using the ‘!’ notation, which leaves only the nodes that do not have that service.

puppet query 'nodes[certname]{ ! certname in resources[certname]{type="Service" and title~"postgresqld"}}'

Learn more

We hope we have answered what PQL is and why you should use it. If you want to learn more, visit our super friendly Community Slack Channel for more advice.

Daniel is a technical support engineer at Puppet, working out of the Belfast office. This blog was originally published on November 5, 2018 and has since been updated for accuracy and relevance.