Guidelines for running Comply at scale

You can run Puppet Comply on a maximum of 100,000 nodes. Before you run Comply at scale, review the guidelines for configuring the environment and running the scan. The process of running Comply at scale was tested by Puppet in a controlled environment. Because many factors affect performance, results in your system environment might vary.

System requirements and configuration for large-scale environments

To support environments with more than 10,000 nodes, your Comply installation needs a total of at least 16GB of memory and 100GB of storage space available.

Depending on your node count, scan frequency, and desired retention period, you might also need to adjust the "Comply PostgreSQL capacity" and "Comply PostgreSQL memory" values under the "Additional Support" configuration section. The following table lists recommended values for several reference node counts. These values assume one scan per week and the default data retention period of 14 weeks.

Table 1. Recommended PostgreSQL values for 14 weeks of data retention at 1 scan per week
Node count      PostgreSQL capacity   PostgreSQL memory
1,000 nodes     Default               Default
10,000 nodes    25Gi                  Default
50,000 nodes    75Gi                  4Gi
75,000 nodes    105Gi                 8Gi
100,000 nodes   135Gi                 12Gi

For higher node counts, add 6Gi to the PostgreSQL capacity for each additional 5,000 nodes. For longer retention periods, divide the calculated storage requirement by the default retention period (14 weeks) to determine the storage requirement per week, and then multiply the result by the desired retention period in weeks.
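The sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a Puppet tool: the function names are hypothetical, and rounding a partial 5,000-node step up to a full step is an assumption made to err on the side of more storage. The 6Gi increment is applied starting from a row of the table, which matches the step between the 50,000, 75,000, and 100,000 node rows.

```python
import math

DEFAULT_RETENTION_WEEKS = 14  # default retention period stated in this section
GI_PER_STEP = 6               # 6Gi for each additional 5,000 nodes
NODES_PER_STEP = 5_000


def extrapolate_capacity_gi(base_nodes, base_capacity_gi, target_nodes):
    """Add 6Gi per additional 5,000 nodes above a row of the table.

    Partial steps are rounded up (an assumption, not a documented rule).
    """
    extra_nodes = max(0, target_nodes - base_nodes)
    steps = math.ceil(extra_nodes / NODES_PER_STEP)
    return base_capacity_gi + steps * GI_PER_STEP


def scale_for_retention(capacity_gi, retention_weeks):
    """Convert a capacity sized for 14 weeks to another retention period."""
    per_week_gi = capacity_gi / DEFAULT_RETENTION_WEEKS
    return per_week_gi * retention_weeks


# 60,000 nodes, starting from the 50,000-node row (75Gi): two 5,000-node steps.
capacity_14_weeks = extrapolate_capacity_gi(50_000, 75, 60_000)  # 87
# The same node count with a hypothetical 26-week retention period.
capacity_26_weeks = scale_for_retention(capacity_14_weeks, 26)
print(capacity_14_weeks, round(capacity_26_weeks, 1))
```

Because the capacity cannot be changed after installation without contacting Puppet support, it is probably safer to round the final figure up than down.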

Note: Comply PostgreSQL capacity cannot be modified after initial installation without contacting Puppet support.

Configure the scan process

To help optimize the scan process, follow these guidelines:
  • In Puppet orchestrator, set the task_concurrency parameter to a value appropriate for your environment and number of nodes. This value sets the maximum number of task or plan actions that can run concurrently in the orchestrator. For example, if you set the parameter to 250 and run a scan of 5,000 nodes, the orchestrator remains fully occupied until the scans are completed on all 5,000 nodes. (For more information about optimizing performance, see Tune task and plan performance in Puppet Enterprise (PE).)
  • Schedule scans to coincide with periods of low activity to help ensure adequate network throughput.
  • Plan adequate time for the initial inventory ingestion from Puppet Enterprise (PE). In lab testing, the ingestion of 100,000 nodes took 20 minutes.
  • If you have a large number of nodes, consider configuring ad hoc and scheduled scans in smaller batches of up to 10,000 nodes.
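As a rough planning aid for the guidelines above, the following Python sketch splits an inventory into scan batches of up to 10,000 nodes and estimates how many sequential waves a scan needs at a given task_concurrency. The helper names and the example certnames are hypothetical. The wave estimate assumes that each scanned node occupies exactly one concurrency slot, which is a simplification rather than documented orchestrator behavior.

```python
import math


def split_into_batches(nodes, batch_size=10_000):
    """Yield consecutive slices of `nodes` with at most `batch_size` entries each."""
    for start in range(0, len(nodes), batch_size):
        yield nodes[start:start + batch_size]


def minimum_task_waves(node_count, task_concurrency=250):
    """Lower bound on sequential waves, assuming one concurrency slot per node."""
    return math.ceil(node_count / task_concurrency)


# Hypothetical inventory of 25,000 certnames.
inventory = [f"node{i:05d}.example.com" for i in range(25_000)]
batches = list(split_into_batches(inventory))

print([len(batch) for batch in batches])  # [10000, 10000, 5000]
print(minimum_task_waves(5_000))          # 20
```

With task_concurrency set to 250, a 5,000-node scan needs at least 20 waves under this assumption, which is why the orchestrator stays busy for the whole scan.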

Upgrade Comply in a large-scale environment

Before you upgrade Comply in an environment with thousands of nodes, review the limitations and consider the best strategy for your environment.

During the standard upgrade process, a new version of the CIS-CAT Pro Assessor is downloaded to each Puppet-managed node. However, Comply supports a limited number of concurrent downloads of the assessor. In lab testing, a maximum of about 120 concurrent downloads was achieved. Thus, if you initiate an upgrade of thousands of nodes, not all nodes are updated on the first run.

You can resolve the issue in one of the following ways:
  • Run Puppet manually on a maximum of 120 nodes. Repeat the process until all nodes are updated.
  • Configure Comply to host the assessor file on an internal web server and then upgrade Comply. If you choose this option, make sure that you host the correct assessor bundle for your operating system.
To host the assessor file internally and upgrade Comply, complete the following steps:
  1. If you have not already, download the appropriate assessor bundle for your operating system. The assessor bundles are located at:
    • https://<COMPLY_FQDN>/files/assessor/linux
    • https://<COMPLY_FQDN>/files/assessor/mac
    • https://<COMPLY_FQDN>/files/assessor/windows
  2. In the Puppet Enterprise (PE) console, click Node Groups > PE Infrastructure > PE Agent > Classes.
  3. In the Add new class field, select the Comply class.
  4. In the Parameter name field, select scanner_source.
  5. Set the value of the scanner source to the URL where the assessor will be hosted. For example, the URL can have the following structure, where server-hosting-assessor-ip specifies the IP address of the server that will host the assessor and os specifies either mac, linux, or windows:
    http://server-hosting-assessor-ip/assessor/os/assessor.zip
  6. Commit the changes.
  7. In the PE console, click Run > Puppet.
  8. Complete the upgrade process by selecting the relevant nodes and running the job.
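The scanner_source value from step 5 follows a fixed pattern, so it can be generated and checked before you enter it in the console. The Python sketch below is a hypothetical helper, not part of Comply. It assumes the example URL structure shown in step 5, and it uses 192.0.2.10, an address from the range reserved for documentation, as a placeholder for the hosting server.

```python
# Operating system path segments named in step 5.
SUPPORTED_OS = ("linux", "mac", "windows")


def scanner_source_url(server_ip, os_name):
    """Build a scanner_source URL using the example structure from step 5."""
    if os_name not in SUPPORTED_OS:
        raise ValueError(
            f"os_name must be one of {SUPPORTED_OS}, got {os_name!r}"
        )
    return f"http://{server_ip}/assessor/{os_name}/assessor.zip"


print(scanner_source_url("192.0.2.10", "linux"))
# http://192.0.2.10/assessor/linux/assessor.zip
```

Rejecting an unknown operating system name early helps avoid pointing nodes at an assessor bundle that does not exist on the internal server.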

Optimize scanning and reporting at scale

You can compare the results of your scanning and reporting processes against the results obtained in lab testing. If performance is not adequate in your environment, determine the cause of bottlenecks and address the issues.

In testing, Comply processed reports from up to 100,000 nodes in a single scan. Processing this number of reports can take up to 120 minutes, depending on system resources. However, total scan time might be significantly longer, depending on Puppet orchestrator concurrency limits and the amount of time the CIS-CAT Pro Assessor takes to run on individual nodes. If you have a large number of nodes, consider configuring ad hoc and scheduled scans in smaller batches of up to 10,000 nodes.

Raw data exports of node results can take up to 36 minutes for 90,000 nodes, or up to 4 minutes per batch of 10,000 nodes. Allow additional time if you generate several exports of more than 10,000 nodes concurrently.
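The two export figures above are consistent with a simple linear estimate of about 4 minutes per 10,000 nodes. The Python sketch below applies that estimate. The function name is hypothetical, and the linear scaling is an assumption drawn only from the two lab figures, so treat the result as a rough upper bound for a single export rather than a guarantee.

```python
import math

MINUTES_PER_BATCH = 4       # lab figure: up to 4 minutes per 10,000 nodes
EXPORT_BATCH_SIZE = 10_000


def estimated_export_minutes(node_count):
    """Rough upper bound for one export, assuming time scales linearly."""
    batches = math.ceil(node_count / EXPORT_BATCH_SIZE)
    return batches * MINUTES_PER_BATCH


print(estimated_export_minutes(90_000))  # 36
```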

The assessor run times are affected by the host type. In general, scans on Microsoft Windows systems take longer than scans on *nix systems. Run times can vary significantly, depending on many other factors. For example, run times are longer for nodes with many user accounts and for nodes with many types of software installed. Results obtained in the lab represent an optimal use case.

To help understand performance issues, you can analyze log files. For more information, see Access logs.

For more information on configuration, workflow, and best practices, visit https://www.puppet.com/docs/patterns-and-tactics/latest/patterns-and-tactics.html.