# How to Use Bolt Tasks for Better Splunk Compliance Reporting

December 12, 2021 | How to & Use Cases | Security & Compliance | By Dimitri Tischenko

Due to its declarative DSL and resource abstraction, Puppet is an excellent solution for enforcing compliance across any server estate. Puppet modules provide automatic hardening and middleware configurations, and configuration drift is reported as corrective changes in the Puppet Enterprise console. But there's an alternative approach to compliance testing: using Puppet tasks and plans with Bolt, then sending the findings to a Splunk report.

**Table of Contents**

- How Does Splunk Help with Compliance?
- How to Use Bolt Tasks for Better Splunk Compliance Reporting
- Using Compliance Results in a Splunk Report

## How Does Splunk Help with Compliance?

Splunk is a one-stop solution for collecting audit trails and reporting on compliance. It lets you report on data from any machine source and visualize events on those machines. For compliance reporting, it helps you find and address obstacles and errors earlier in the application lifecycle, from coding to staging to testing to deployment.

More and more organizations are required to scan their infrastructure and determine whether it complies with standardized security and hardening benchmarks like CIS, NIST, PCI, or BSI. The "scanning", "testing", or "issue detection" step is often separated from "remediating" or "enforcing". That's where Splunk and Bolt come in.

## How to Use Bolt Tasks for Better Splunk Compliance Reporting

Community-developed Puppet modules offer two basic approaches to testing for compliance: using custom facts or using noop runs (a noop run is a run of the Puppet agent that reports on required changes but does not enforce them). The advantage of these approaches is that testing and enforcing compliance can be bundled in a single compliance module.
There are also disadvantages: using large numbers of facts can lead to scalability issues, and noop runs can overwhelm PuppetDB with a large number of noop change events.

Using Bolt tasks for compliance testing has some additional advantages:

- You can choose a language you are familiar with and which is supported on the target systems
- It's easy to collaborate on benchmark sets; tasks are easy to explain and maintain
- The system is flexible and can be integrated with any reporting platform
- Compliance tests can be run using Bolt or via the Puppet Enterprise orchestrator

Let's take a look at how to use Bolt tasks to send compliance information to Splunk, and how Splunk reports can use that compliance data.

Related: Check out our podcast on Bolt: Uniting Models and Tasks

### Building Compliance Testing Tasks

Before we start experimenting, let's create an empty module to contain our tasks. We will use the Puppet Development Kit, or PDK (see the PDK installation instructions).

```shell
mkdir ~/modules
cd ~/modules
pdk new module bolt_compliance
cd bolt_compliance
```

As an example, let's create a task testing for control 1.1.2 of the CIS Red Hat Enterprise Linux 7 Benchmark. This control checks whether the /tmp directory is a separate filesystem, as in:

```shell
# mount | grep /tmp
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)
```

If the grep fails, /tmp doesn't occur in the list of mounted filesystems and the control fails.

It's easy to create a Bash task to check for this control. To create an empty task, we use pdk:

```shell
pdk new task cis_rhel7_1_1_2
```

This will create an empty task script file and a metadata file. The metadata file should contain a short task description and any parameters the task accepts.
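For reference, a filled-in metadata file for this control could look something like the following. This is a sketch following the Puppet task metadata format; the exact fields PDK generates for you may differ:

```json
{
  "puppet_task_version": 1,
  "supports_noop": false,
  "description": "1.1.2 Ensure separate partition exists for /tmp",
  "parameters": {}
}
```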
We don't have any parameters, but we could put something like "1.1.2 Ensure separate partition exists for /tmp" into the metadata description field.

Modify tasks/cis_rhel7_1_1_2.sh like so:

```bash
#!/bin/bash
#
# CIS Red Hat Enterprise Linux 7 Benchmark Server Level 1
#
# 1.1.2 Ensure separate partition exists for /tmp
#
if mount | grep /tmp > /dev/null ; then
  echo "Control passed: /tmp is a separate filesystem"
else
  echo "Control failed: /tmp is not a separate filesystem"
fi
```

We can now run the task using Bolt:

```shell
$ bolt task run bolt_compliance::cis_rhel7_1_1_2 -n localhost
Started on localhost...
Finished on localhost:
  Control failed: /tmp is not a separate filesystem
  {
  }
Successful on 1 node: localhost
Ran on 1 node in 0.11 seconds
```

Note that the opening and closing braces with nothing in between mean that there is no structured (JSON) output from the task; the task only prints a message, which is provided as text output.

So the basic building block is done. We can write a control-specific task for any control using Bash or any other language available on the systems we want to test for compliance. For example, here is an implementation of CIS control 5.1.1 in Python:

```shell
pdk new task cis_rhel7_5_1_1
```

Note that pdk will create a Bash template file cis_rhel7_5_1_1.sh. We need to rename it to cis_rhel7_5_1_1.py because we are going to use Python for this task.

```python
#!/usr/bin/env python
#
# CIS Red Hat Enterprise Linux 7 Benchmark Server Level 1
#
# 5.1.1 Ensure cron daemon is enabled (Scored)
import subprocess
import json

command = 'systemctl is-enabled crond'
result = {}
try:
    # .decode() keeps the concatenation working on Python 3,
    # where check_output() returns bytes
    output = subprocess.check_output(command, shell=True).decode()
    result['_output'] = "control passed: crond enabled - " + output
    result['compliant'] = True
except subprocess.CalledProcessError as e:
    result['_output'] = "control failed: crond disabled - " + e.output.decode()
    result['compliant'] = False
print(json.dumps(result))
```

Note that this implementation provides structured output instead of only text.
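Each control task repeats the same pattern: run a check command, record a human-readable `_output`, and set a boolean `compliant` key. If you end up writing many controls in Python, that pattern can be factored into a small helper. This is a sketch, not part of the module; `run_control` is a hypothetical name:

```python
import json
import subprocess

def run_control(command, pass_msg, fail_msg):
    """Run a shell check and build the structured result hash a task would print."""
    result = {}
    try:
        # .decode() is needed on Python 3, where check_output() returns bytes
        output = subprocess.check_output(
            command, shell=True, stderr=subprocess.STDOUT).decode()
        result['_output'] = pass_msg + " - " + output
        result['compliant'] = True
    except subprocess.CalledProcessError as e:
        result['_output'] = fail_msg + " - " + e.output.decode()
        result['compliant'] = False
    return result

if __name__ == '__main__':
    # a control task would end by printing the hash as JSON, e.g.:
    print(json.dumps(run_control(
        'systemctl is-enabled crond',
        'control passed: crond enabled',
        'control failed: crond disabled')))
```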
By providing a separate compliant key, we make our lives easier later when searching for compliance issues in Splunk.

After we have created our tasks, we should run pdk validate to check the validity of our metadata files and task naming conventions.

### Writing a Plan to Run Compliance Controls

Our next step is creating a plan to run a series of controls. We call it run.pp and put it in the plans subdirectory of our module. We need to accept two parameters: an array of controls, and an array of nodes to run the controls on. For the array of controls we choose the data type Array[String[1]], and for the nodes we use the built-in TargetSpec data type, which gives us flexibility in how we specify the nodes.

```puppet
plan bolt_compliance::run(
  Array[String[1]] $controls,
  TargetSpec       $nodes,
) {
  notice("Running controls: ${controls}")
  $controls.each | $control | {
    notice("Running control: ${control}")
    $result = run_task("bolt_compliance::cis_rhel7_${control}", $nodes)
    notice("Result for control ${control}: ${result}")
  }
}
```

Let's test our plan:

```shell
$ bolt plan run bolt_compliance::run --params '{"controls": ["1_1_2", "5_1_1"]}' -n localhost
Starting: plan bolt_compliance::run
Running controls: [1_1_2, 5_1_1]
Running control: 1_1_2
Starting: task bolt_compliance::cis_rhel7_1_1_2 on localhost
Finished: task bolt_compliance::cis_rhel7_1_1_2 with 0 failures in 0.01 sec
Result for control 1_1_2: [{"node":"localhost","status":"success","result":{"_output":"Control failed: /tmp is not a separate filesystem\n"}}]
Running control: 5_1_1
Starting: task bolt_compliance::cis_rhel7_5_1_1 on localhost
Finished: task bolt_compliance::cis_rhel7_5_1_1 with 0 failures in 0.05 sec
Result for control 5_1_1: [{"node":"localhost","status":"success","result":{"_output":"control failed: crond disabled; Command 'systemctl is-enabled crond' returned non-zero exit status 127\n"}}]
Finished: plan bolt_compliance::run in 0.09 sec
Plan completed successfully with no result
```

Note that since the controls parameter is an array, we need to specify it using JSON syntax on the command line. Alternatively, we can put the parameters in a JSON file and refer to it:

```shell
$ cat params.json
{
  "controls": ["1_1_2", "5_1_1"]
}
$ bolt plan run bolt_compliance::run --params @params.json -n localhost
```

### Creating a Splunk Report with the HTTP Event Collector

The next step is to figure out how to report our findings to Splunk. We will use Splunk's HTTP Event Collector (HEC) service, documented here: Splunk HEC Service. We create a "compliance_report" index and save the provided token.

Experimenting with Postman to send events to Splunk HEC, we find that we need to send a POST request to https://<splunk-uri>/services/collector with the following headers:

```
Content-type: application/json
Authorization: Splunk <token>
```

The request body should be a JSON object containing a key "event" whose value is another object, like so:

```json
{
  "event": {
    "key1": "value1",
    "key2": "value2"
  }
}
```

We get the response:

```json
{
  "text": "Success",
  "code": 0
}
```

After the event posts successfully, we can verify in Splunk's Search and Reporting app that it has been indexed.

## Using Compliance Results in a Splunk Report

Now we need to get the output from the compliance testing tasks, create Splunk events from that output, and send the events to Splunk to build a compliance report. We could write a custom plan function for this, but in this case we choose to use a task. A task gives us more flexibility: we can determine the node the task runs on, so we can circumvent firewall issues if the workstation running Bolt doesn't have the required connectivity.

A task sending output to Splunk could look something like this.
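As a standalone illustration of the HTTP exchange described above, the HEC envelope can be constructed with nothing but the Python standard library. This is a sketch: the endpoint and token values are placeholders, and `build_hec_request` is a hypothetical helper, not part of the module (the actual task below uses the `requests` library instead):

```python
import json
import urllib.request

# hypothetical placeholder values -- use your real HEC endpoint and token
SPLUNK_ENDPOINT = 'https://my.splunk.endpoint/services/collector'
SPLUNK_TOKEN = 'my-splunk-token'

def build_hec_request(endpoint, token, event):
    """Wrap an event dict in the {"event": ...} envelope Splunk HEC expects
    and return a ready-to-send urllib Request (not yet sent)."""
    body = json.dumps({'event': event}).encode('utf-8')
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            'Content-Type': 'application/json',
            'Authorization': 'Splunk ' + token,
        },
        method='POST',
    )

req = build_hec_request(SPLUNK_ENDPOINT, SPLUNK_TOKEN,
                        {'control': '1_1_2', 'compliant': False})
# urllib.request.urlopen(req) would actually perform the POST
```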
First, the metadata file:

```json
{
  "puppet_task_version": 1,
  "supports_noop": false,
  "description": "Send json data to a Splunk HEC",
  "parameters": {
    "splunk_endpoint": {
      "description": "The Splunk HTTP Event Collector endpoint",
      "type": "String[1]"
    },
    "splunk_token": {
      "description": "The Splunk HTTP Event Collector token",
      "type": "String[1]"
    },
    "data": {
      "description": "The data to be sent to Splunk",
      "type": "Hash"
    }
  }
}
```

The task accepts three parameters: the Splunk endpoint, the Splunk token, and the data we want to send.

An example implementation of the send_to_splunk task in Python:

```python
#!/usr/bin/env python
import sys
import json
import requests

params = json.load(sys.stdin)
splunk_endpoint = params['splunk_endpoint']
splunk_token = params['splunk_token']
data = params['data']

headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Splunk ' + splunk_token
}

# warning: don't use `verify=False` in production!
response = requests.post(
    splunk_endpoint, headers=headers, json=data, verify=False)

result = {}
result['_output'] = response.text
print(json.dumps(result))
```

### Modifying the Plan to Send Data to Splunk

The final step of this adventure is to modify our plan to send data to Splunk, as follows:

```puppet
plan bolt_compliance::run(
  Array[String[1]] $controls,
  TargetSpec       $nodes,
) {
  notice("Running controls: ${controls}")
  $default_task_args = {
    splunk_endpoint => 'https://my.splunk.endpoint/services/collector',
    splunk_token    => 'my-splunk-token',
  }
  $controls.each | $control | {
    notice("Running control: ${control}")
    # run the $control task on the $nodes
    $result = run_task("bolt_compliance::cis_rhel7_${control}", $nodes)
    notice("Result for control ${control}: ${result}")
    $result.each | $result | {
      # take $result.value, which is our task's output, and merge in some extra data
      $result_hash = $result.value + {
        target  => $result.target.name,  # add the host name to the event
        control => $control,             # add the control ID to the event
        message => $result.message       # add the textual message to the event
      }
      # construct the Splunk event
      $task_args = $default_task_args + { data => { event => $result_hash } }
      # send the event to Splunk
      $splunk_result = run_task('bolt_compliance::send_to_splunk', 'localhost', $task_args)
      notice("Result from Splunk: ${splunk_result}")
    }
  }
}
```

Let's test the new plan on two test CentOS 7 nodes defined in our inventory.yaml, so we can refer to them using the symbolic name all:

```shell
$ bolt plan run bolt_compliance::run --params '{"controls": ["1_1_2", "5_1_1"]}' -n all
Starting: plan bolt_compliance::run
Running controls: [1_1_2, 5_1_1]
Running control: 1_1_2
Starting: task bolt_compliance::cis_rhel7_1_1_2 on macs6lesp8kcgfl.delivery.puppetlabs.net, jwpaw4v58f8shqq.delivery.puppetlabs.net
Finished: task bolt_compliance::cis_rhel7_1_1_2 with 0 failures in 5.08 sec
Result for control 1_1_2: [{"node":"macs6lesp8kcgfl.delivery.puppetlabs.net","status":"success","result":{"_output":"Control failed: /tmp is not a separate filesystem"}},{"node":"jwpaw4v58f8shqq.delivery.puppetlabs.net","status":"success","result":{"_output":"Control failed: /tmp is not a separate filesystem"}}]
Starting: task bolt_compliance::send_to_splunk on localhost
Finished: task bolt_compliance::send_to_splunk with 0 failures in 0.84 sec
Result from Splunk: [{"node":"localhost","status":"success","result":{"_output":"{\"text\":\"Success\",\"code\":0}"}}]
Starting: task bolt_compliance::send_to_splunk on localhost
Finished: task bolt_compliance::send_to_splunk with 0 failures in 0.88 sec
Result from Splunk: [{"node":"localhost","status":"success","result":{"_output":"{\"text\":\"Success\",\"code\":0}"}}]
Running control: 5_1_1
Starting: task bolt_compliance::cis_rhel7_5_1_1 on macs6lesp8kcgfl.delivery.puppetlabs.net, jwpaw4v58f8shqq.delivery.puppetlabs.net
Finished: task bolt_compliance::cis_rhel7_5_1_1 with 0 failures in 7.61 sec
Result for control 5_1_1: [{"node":"macs6lesp8kcgfl.delivery.puppetlabs.net","status":"success","result":{"compliant":true,"_output":"control passed: crond enabled - enabled\n"}},{"node":"jwpaw4v58f8shqq.delivery.puppetlabs.net","status":"success","result":{"compliant":false,"_output":"control failed: crond disabled - disabled\n"}}]
Starting: task bolt_compliance::send_to_splunk on localhost
Finished: task bolt_compliance::send_to_splunk with 0 failures in 0.83 sec
Result from Splunk: [{"node":"localhost","status":"success","result":{"_output":"{\"text\":\"Success\",\"code\":0}"}}]
Starting: task bolt_compliance::send_to_splunk on localhost
Finished: task bolt_compliance::send_to_splunk with 0 failures in 0.83 sec
Result from Splunk: [{"node":"localhost","status":"success","result":{"_output":"{\"text\":\"Success\",\"code\":0}"}}]
Finished: plan bolt_compliance::run in 16.12 sec
```

If we search in Splunk now, we see the indexed compliance events.

### Getting Rid of Hard-Wired Splunk Configurations

It's easy to move the Splunk configuration items from the plan code into a separate configuration file. We just need to replace this code:

```puppet
$default_task_args = {
  splunk_endpoint => 'https://my.splunk.endpoint/services/collector',
  splunk_token    => 'my-splunk-token',
}
```

With this code:

```puppet
$default_task_args = loadyaml('splunk-config.yaml')
```

splunk-config.yaml should contain our configuration items like so:

```yaml
splunk_endpoint: "https://my.splunk.endpoint/services/collector"
splunk_token: "my-splunk-token"
```

The loadyaml() function is supplied by the puppetlabs-stdlib module, so we need to install it on the system where we run the plan:

```shell
puppet module install puppetlabs-stdlib
```

We also need to make sure that Bolt knows where to find the puppetlabs-stdlib module. We can supply the --modulepath parameter on every Bolt invocation, but the easier way is to put this configuration into bolt.yaml.
The modulepath should also include the path to the directory containing the bolt_compliance module, ~/modules in this example:

```yaml
modulepath: "~/modules:~/.puppetlabs/etc/code/modules"
```

This blog was originally published on December 12, 2018 and has since been updated for relevance and accuracy.

## Learn More

- Use a hands-on lab for learning how to write tasks in the Puppet ecosystem
- Find out how Puppet and Splunk improve reporting speed and scale
- How to deploy an application with Bolt
- Start automating in a few steps with Bolt
- Read the whitepaper on using Bolt and Splunk to automate manual tasks
- How Puppet policy as code automates compliance and security standards for your organization
**Dimitri Tischenko**, Principal Sales Engineer, Puppet by Perforce

Dimitri Tischenko is a principal sales engineer at Puppet.