Regular expressions

A regular expression (sometimes shortened to “regex” or “regexp”) is a pattern that can match some set of strings, and optionally capture parts of those strings for further use.

You can use regular expression values with the =~ and !~ match operators, case statements and selectors, node definitions, and functions like regsubst for editing strings, or match for capturing and extracting substrings. Regular expressions act like any other value, and can be assigned to variables and used in function arguments.
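As a rough sketch of the uses listed above, the following manifest fragment exercises each one in turn. The variable names and the host name are hypothetical and chosen only for illustration.

```puppet
# Hypothetical value used throughout this sketch.
$machine_name = 'www12.example.com'

# A regular expression is an ordinary value, so it can be stored in a variable.
$web_pattern = /^www\d+\./

# Match operators.
if $machine_name =~ $web_pattern {
  notice('looks like a web server')
}
if $machine_name !~ /^db/ {
  notice('not a database server')
}

# Case statement with regex cases.
case $machine_name {
  /^www/:  { notice('role: web') }
  /^db/:   { notice('role: database') }
  default: { notice('role: unknown') }
}

# Selector with a regex case.
$role = $machine_name ? {
  /^www/  => 'web',
  default => 'other',
}

# regsubst edits a string: drop everything from the first dot onward.
$short_name = regsubst($machine_name, /\..*$/, '')   # 'www12'

# match returns the whole match followed by any captures.
$parts = $machine_name.match(/^www(\d+)\./)          # ['www12.', '12']
```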

Syntax

Regular expressions are written as patterns enclosed within forward slashes. Unlike in Ruby, you cannot specify options or encodings after the final slash, like /node .*/m.
if $host =~ /^www(\d+)\./ {
  notify { "Welcome web server #$1": }
}
Puppet uses Ruby’s standard regular expression implementation to match patterns. Other forms of regular expression quoting, like Ruby’s %r{^www(\d+)\.}, are not allowed. You cannot interpolate variables or expressions into regex values.

If you are matching against a string that contains newlines, use \A and \z instead of ^ and $. The ^ and $ anchors match the beginning and end of each line, not of the whole string, so relying on them is a common mistake that can cause your regexp to unintentionally match multiline text.
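A minimal illustration of the difference, assuming a hypothetical two-line string:

```puppet
# Hypothetical input: the second line is the one an attacker controls.
$text = "harmless\nroot"

# ^ and $ anchor at line boundaries, so the second line alone satisfies this.
if $text =~ /^root$/ {
  notice('line anchors: matched')
}

# \A and \z anchor at the start and end of the whole string, so this fails.
if $text =~ /\Aroot\z/ {
  notice('string anchors: matched')
} else {
  notice('string anchors: no match')
}
```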

Some places in the language accept both real regex values and stringified regexes — that is, the same pattern quoted as a string instead of surrounded by slashes.
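The regsubst function is one such place. The sketch below, which uses a hypothetical host name, passes the same pattern once as a regex value and once as a single-quoted string:

```puppet
# Hypothetical value for illustration.
$example_fqdn = 'web01.example.com'

# The pattern as a real regex value.
$from_regex = regsubst($example_fqdn, /\.example\.com$/, '')

# The same pattern as a stringified regex. Single quotes keep the
# backslashes intact so they reach the regex engine unchanged.
$from_string = regsubst($example_fqdn, '\.example\.com$', '')

# Both results are 'web01'.
notice($from_regex == $from_string)
```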

Regular expression options

Regular expressions in Puppet cannot have options or encodings appended after the final slash. However, you can turn options on or off for portions of the expression using the (?<ENABLED OPTION>:<SUBPATTERN>) and (?-<DISABLED OPTION>:<SUBPATTERN>) notation. The following example enables the i option while disabling the m and x options:
$packages = $operatingsystem ? {
  /(?i-mx:ubuntu|debian)/        => 'apache2',
  /(?i-mx:centos|fedora|redhat)/ => 'httpd',
}
The following options are available:
  • i: Ignore case.
  • m: Treat a newline as a character matched by the . wildcard.
  • x: Ignore whitespace and comments in the pattern.
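To see what the i and m options change in practice, here is a small sketch that uses hypothetical strings:

```puppet
# Hypothetical two-line string for illustration.
$two_lines = "first\nsecond"

# Without the m option, . does not match the newline between the words.
if $two_lines =~ /first.second/ {
  notice('plain: matched')
} else {
  notice('plain: no match')
}

# With the m option enabled for the subpattern, . also matches the newline.
if $two_lines =~ /(?m:first.second)/ {
  notice('m option: matched')
}

# With the i option, letter case is ignored.
if 'UBUNTU' =~ /(?i:ubuntu)/ {
  notice('i option: matched')
}
```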

Regular expression capture variables

Within conditional statements and node definitions, substrings matched by parenthesized groups () in a regular expression are available as numbered variables inside the associated code section. The first is $1, the second is $2, and so on. The entire match is available as $0.

These are not normal variables, and have some special behaviors:
  • The values of the numbered variables do not persist outside the code block associated with the pattern that set them.

  • You can’t manually assign values to a variable with only digits in its name; they can only be set by pattern matching.

  • In nested conditionals, each conditional has its own set of values for the set of numbered variables. At the end of an interior statement, the numbered variables are reset to their previous values for the remainder of the outside statement. This causes conditional statements to act like local scopes, but only with regard to the numbered variables.
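The scoping behavior described in the list above can be sketched with a nested conditional. The label format and variable name are hypothetical:

```puppet
# Hypothetical label of the form www<number>-<region>.
$node_label = 'www12-eu'

if $node_label =~ /^www(\d+)-(\w+)$/ {
  # $0 is the whole match, $1 and $2 are the captures.
  notice("outer: whole=${0} number=${1} region=${2}")

  if $2 =~ /^(e)u$/ {
    # Inside the inner conditional, $1 refers to the inner capture: 'e'.
    notice("inner: ${1}")
  }

  # Back in the outer conditional, $1 is '12' again.
  notice("outer again: ${1}")
}
```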

The Regexp data type

The data type of regular expressions is Regexp. By default, Regexp matches any regular expression value. If you are looking for a type that matches strings which match arbitrary regular expressions, see the Pattern type. You can use parameters to restrict which values Regexp matches.

Parameters

The full signature for Regexp is:
Regexp[<SPECIFIC REGULAR EXPRESSION>]
The parameter is optional.
Position | Parameter | Data type | Default value | Description
1 | Specific regular expression | Regexp | none | If specified, this results in a data type that only matches one specific regular expression value. Specifying a parameter here doesn’t have a practical use.
Examples:
  • Regexp: Matches any regular expression.
  • Regexp[/<regex>/]: Matches the regular expression /<regex>/ only.

Regexp matches only literal regular expression values. Don't confuse it with the abstract Pattern data type, which uses a regular expression to match a limited set of String values.
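The distinction between Regexp and Pattern can be checked with the =~ operator, which accepts a data type as its right operand. This is a minimal sketch with hypothetical values:

```puppet
# Hypothetical values for illustration.
$re  = /^www/
$str = 'www1.example.com'

# A regex value is an instance of Regexp.
notice($re =~ Regexp)             # true

# A string is not a Regexp, even if it looks like a pattern.
notice('^www' =~ Regexp)          # false

# Pattern matches String values that match the given regex.
notice($str =~ Pattern[/^www/])   # true
```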