Reference guide
Experimental Feature: This featureset is experimental and is subject to rapid development and change.
Puppet Query Language (PQL) is a query language designed with PuppetDB and Puppet data in mind. It provides a string-based query language as an alternative to the AST query language PuppetDB has always supported.
Other resources you may also find useful include:
Executing PQL queries using the PuppetDB CLI
See the PuppetDB CLI documentation for more on its usage.
The following examples use the PuppetDB CLI to execute a query:
Without SSL:
puppet query 'nodes { certname = "macbook-pro.local" }' \
--urls http://puppetdb.example.com:8080
This requires that PuppetDB be
configured to accept non-SSL connections. By default, it will
only accept unencrypted traffic from localhost
.
With SSL:
puppet query 'nodes { certname = "macbook-pro.local" }' \
--urls https://puppetdb.example.com:8081 \
--cacert /etc/puppetlabs/puppet/ssl/certs/ca.pem \
--cert /etc/puppetlabs/puppet/ssl/certs/thisnode.pem \
--key /etc/puppetlabs/puppet/ssl/private_keys/thisnode.pem
This requires that you specify a certificate (issued by the same CA PuppetDB trusts), a private key, and a CA certificate.
Note: The PuppetDB CLI can be configured using a config file at
$HOME/.puppetlabs/client-tools/puppetdb.conf
with default values for the server urls and SSL credentials.
Query Structure
A PQL query has the following structure:
<entity> [<projection>] { <filter> <modifiers> }
Which is broken up into the following parts:
entity
: Required. The entity context that this query executes on.projection
: Optional. Restricts the output to a selection of fields or function results.filter
: Optional. The filter to match on for thisentity
.modifiers
: Optional. Contains modifiers for the query.
As an example, if you wanted to query for nodes
related data, but only see the certname
field
for each node, with a regex filter across certname:
nodes[certname] { certname ~ "^web" }
In this case, this would return only the certname field of nodes starting with
web
.
Whitespace
In PQL, whitespace is optional, except around word operators like and
and or
.
Use whitespace to make your queries more human readable. For example the following
two queries are identical and will give you the same results, but the one with spaces
is a much more readable way to write PQL.
nodes[certname,latest_report_status]{report_timestamp<="2016-08-03 00:00:00"}
nodes[certname, latest_report_status]{ report_timestamp <= "2016-08-03 00:00:00" }
Entities
The entity or context of a query (or subquery) defines what results you will get returned when performing a query, and provides the main context for any projections or filters in the query. There are many entities; for a full list see the entities documentation.
For PQL queries, the entity context is the minimal amount of information one must provide, as it defines the results returned. For example, if you wanted to see all node information, you could provide a query as follows:
nodes {}
And it would be enough to return all node data, without filtering or pagination.
The entity context can also be used within a subquery; for more details, see:
The
in
operator, which can take a subquery.
Projection
The projection part of a query provides a mechanism to choose a subset of fields that are returned or to modify the way those fields are displayed with the usage of functions.
The projection lives within the brackets of a query:
nodes[<projection>] {}
And is a comma separated list of fields and functions:
facts[name, count()] { group by name }
If you provide a projection with no fields or functions, then all fields will be displayed. So the following two examples are equivalent:
facts[] {}
facts {}
Fields
Entity field selection is an optional capability to ensure that only certain fields are returned in a response.
For a basic query, if you don't provide any entity fields, all data gets returned. However this can be inefficient for both the database and the network. Providing an entity field reduces the number of fields in the response.
The entity field section of a query can contain a number of field names separated by a comma:
entity[field1, field2, field3] {}
As an example, to return only certname
for all nodes:
nodes[certname] {}
Or to return both the name
and value
of all facts:
facts[name, value] {}
Only fields that are available for the entity type can be returned by PQL today.
Functions
Today, PQL only supports aggregate functions in the projection.
Aggregate functions perform a calculation on a set of values and return a single value. Functions are provided in the projection much like fields are:
entity[function(argument)] {}
As an example, to query how many objects exist that start with a certname of
web
you could use the following filter and function combination:
nodes[count()] { certname ~ "web.*" }
PQL supports the same functions as the AST-based language:
count()
Returns the number of objects returned by the query, instead of returning the actual results.
avg(<field>)
Returns the average value for the values held in the <field>
argument.
sum(<field>)
Returns the sum of values for the values held in the <field>
argument.
min(<field>)
Returns the minimum value for all the values held in the <field>
argument.
max(<field>)
Returns the maximum value for all the values held in the <field>
argument.
Filter
Filtering a query allows you to reduce the number of responses from PuppetDB based on a filter.
In a basic query, a filter is optional, and is provided in the <filter>
area
as a set of boolean and conditional operators that make up a filter. For
example:
entity { field1 = 'mystring' and field2 < 3 }
You can also modify boolean operator precedence by using parentheses:
entity { !(field1 = 'mystring' and field2 < 3) or field3 = 'mars' }
All filters are made up of a series of conditions, combined together with boolean operators.
Conditional operators
Conditions provide the basic tests that are preformed to decide if a filter is true or not. The following operators are available within PQL:
Equality: =
Matches the field value, with the literal value provided.
nodes { certname = "foo" }
Numeric comparison: >=
, <=
, >
, <
These operators allow for numeric comparison, and will return true if the field, the value and the operator combination are true:
>
- greater than>=
- greater than or equal to<
- less than<=
- less than or equal to
Some examples of their usage:
facts { value >= 4 }
facts { value < 4 }
facts { value <= 4 }
facts { value < 4 }
The operator will only work on numbers. Any other types will return errors.
Regexp: ~
For strings you can match using a regular expression pattern by using the ~
operator and a valid regular expression:
nodes { certname ~ "foo.*" }
The regexp must not be surrounded by the slash characters (
/rexegp/
) that delimit regexps in many languages.Every backslash character must be escaped with an additional backslash. Thus, a sequence like
\d
would be represented as\d
, and a literal backslash (represented in a regexp as a double-backslash\
) would be represented as a quadruple-backslash (\\
).
Note: Regular expression matching is performed by the database backend, so the available regexp features are determined by PostgreSQL. For best results, use the simplest and most common features that can accomplish your task.
Array Match: in
The in
operator matches a field or set of fields against either an array or a
subquery.
The in
operator can be used in two ways. The simplest way is to see if a
field contains one of the values provided in a list of literal values:
nodes { certname in ["foo", "bar", "baz"] }
The operator can also be used to ensure the values of a field match the fields returned from a subquery, which has the form of a nested PQL query within the filter:
nodes { certname in facts[certname] { value = "foo" } }
With the subquery form, we can even match on multiple fields, as long as the fields being matched and the subqueries projection fields match:
facts { [certname, name] in fact_contents[certname, name] { value ~ "a" } }
Null detection: is null
, is not null
Null values in PuppetDB are treated differently to other values. So to detect if
a field is a null, instead of doing an exact match comparison, you must use
either the is null
or is not null
operator.
To test if a field contains a null
:
nodes { deactivated is null }
Or conversely, to test if a field does not contain a null
value:
nodes { deactivated is not null }
Regexp array match: ~>
The array matches using the regular expressions provided within in each element. Array indexes are coerced to strings.
For example, the following query would query the path element, matching any ethernet MAC address:
fact_contents { path ~> ["networking","interfaces",".*","mac"] }
The following example will match against the size of any disk on the system:
fact_contents { path ~> ["disks",".*","size"] }
Boolean operators
Boolean operators are used within PQL filters to join conditons together to perform the filtering test within PuppetDB.
There are only 3 boolean operators today, in order of natural precedence:
!
- performs a logicalnot
or negationand
- performs a logicaland
or conjuctionor
- performs a logicalor
or disjunction
Grouping
By default PQL binary operators are evaluated using the following natural order
of precedence: !
, and
, and or
. To override this precedence, you can group
conditions together explicitly using parenetheses:
facts { name ~ "^operating" and ( name ~ "system" or value = "FlowerOS" ) }
In this case the or
gets evaluated before the and
despite the natural order
of precedence.
You can nest as many levels of grouping as required.
Literal types
Each field for an entity supports matching using a conditional against a provided literal value.
Strings
PQL supports legal UTF-8 strings as literals for comparison.
There are two types of literal strings:
single quoted - no escaping, just straight text
double quoted - supports escape characters
Double quoted strings follow the same rules as JSON strings, and can accept escape characters:
\n
- newline\r
- carriage return\b
- backspace\f
- formfeed\t
- tab\uXXXX
- unicode character
For example to match on a string with a newline in it:
facts { value = "first line\nsecondline" }
However if you wanted to match the literal \n
set of characters and not have
it translated to a newline, you could do:
facts { value = 'first line\nstill on first line' }
Booleans
Booleans are represented by using the bare words: true
or false
.
Numbers
Numbers in PQL can be either integers or reals:
4
4.1
-10245
-124.012
For real numbers, scientific notation is expressed using E notation:
4.1E123
-3.2E-123
E notation follows the same rules as JSON, but currently is only accepted for real numbers, not integers.
Lists
Lists are groups of other literal values, and are expressed using brackets with elements separated by commas:
['a', 'b', 'c', 'd']
Currently lists are only supported with the in
operator.
Implicit Subqueries
Implicit subqueries work the same way as the in
operator,
however the relationship between some entities is clear. When an implicit
relationship exists between two entity types, you can avoid the overhead of
having to provide the join columns like with the in
operator by using implicit
subqueries instead.
An implicit subquery looks like a query, embedded within the filter of a PQL query:
nodes {
facts { name = "operatingsystem" and value = "Debian" }
}
In this example, while the query context is set to nodes
, we will only return
nodes
that have a fact name
of operatingsystem
and value
of Debian
(so
only Debian nodes).
This often allows you to avoid having to know which fields are
required, unlike the in
operator, but be aware that only some relationships are well
defined. See the entities documentation for each entity to learn which
implicit subqueries are provided automatically.
Also, implicit subqueries are like any other conditional operator, and therefore
can be combined with basic filters. The following query combines the fact
subquery as before, included with a certname
match on the node itself:
nodes {
facts { name = "operatingsystem" and value = "Debian" } and
certname ~ "^web"
}
They can even be combined with other implict subqueries, to provide more complex
matching capabilities. This query combines everything we've discussed so far,
and adds a resource
subquery for Package[tomcat]
:
nodes {
facts { name = "operatingsystem" and value = "Debian" } and
resources { type = "Package" and title = "tomcat" } and
certname ~ "^web"
}
Group By
A group by
clause will condense into a single row all selected rows that share
the same values for the grouped expressions.
For example, to only show a list of fact names, you can group by the name
field:
facts[name] { group by name }
Combined with aggregate functions, you can effectively roll up results and
aggregate the rolled-up values. For example, to calculate how many facts exist for
each fact name across all facts, where the certame starts with web
:
facts[name, count(value)] { certname ~ "^web.*" group by name }
Grouping on function results is also supported:
reports[count(), to_string(receive_time, "DAY")]{group by to_string(receive_time, "DAY")}
Paging
PQL supports restriction of the result set via the SQL-like paging clauses
limit
, offset
, and order by
.
limit
and offset
Limit and offset clauses are supplied with integer arguments and may appear in any order within the braced section of a PQL query. Offset always takes precedence over limit:
reports {certname = "foo.com" limit 10}
reports {certname = "foo.com" limit 10 offset 10}
reports {certname = "foo.com" offset 10 limit 10}
Note that because there is no default ordering for results returned by PuppetDB,
limit
and offset
are generally only useful in combination with order by
.
order by
An order by
clause will order the result set on a selection of columns in
ascending or descending order. The argument to an order by
is a
comma-separated list of fields, each optionally appended with a keyword asc
or
desc
. If no keyword is supplied, asc
is assumed:
reports {certname = "foo.com" order by receive_time}
reports {certname ~ "web.*" order by receive_time, certname desc}
reports {certname ~ "web.*" order by receive_time desc, certname desc limit 10}