PE known issues


These are the known issues in PE 2019.8.

Installation and upgrade known issues

These are the known issues for installation and upgrade in this release.

Initial agent run after upgrade can fail with many environments

In installations with many environments, where file sync can take several minutes, the orchestration service fails to reload during the first post-upgrade Puppet run. As a workaround, re-run the Puppet agent until the orchestration service loads properly. To prevent encountering this error, you can clean up unused environments before upgrading, and wait several minutes after the installer completes to run the agent.
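
As a rough sketch of the workaround (puppet infrastructure status is shown here only as one way to confirm that the orchestration service is back up):
/opt/puppetlabs/bin/puppet agent -t
/opt/puppetlabs/bin/puppet infrastructure status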

Installing Windows agents with the .msi package fails with a non-default INSTALLDIR

When installing Windows agents with the .msi package, if you specify a non-default installation directory, agent files are nonetheless installed at the default location, and the installation command fails when attempting to locate files in the specified INSTALLDIR.

Console install script installs non-FIPS agents on FIPS Windows nodes

The command provided in the console to install Windows nodes installs a non-FIPS agent regardless of the node's FIPS status. As a workaround, download the appropriate FIPS Windows agent or run the command below from the target node — supplying your primary hostname — to fetch the FIPS agent package from your primary server.
# Allow TLS 1.2 and skip certificate validation for this download session
[System.Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12;
[Net.ServicePointManager]::ServerCertificateValidationCallback = {$true};
# Download the FIPS agent package from your primary server
$webClient = New-Object System.Net.WebClient;
$webClient.DownloadFile('https://<PRIMARY_HOSTNAME>:8140/packages/current/windowsfips-x86_64/puppet-agent-x64.msi', 'puppet-agent-x64.msi');

With the FIPS Windows .msi package downloaded to the target node, install the agent with administrative privileges, either by running the installer as an administrator or by running this command from the CLI: msiexec /qn /norestart /i <PACKAGE_NAME>.msi

Upgrade fails with cryptic errors if agent_version is configured for your infrastructure's pe_repo class

If you've configured the agent_version parameter for a pe_repo class that matches your infrastructure nodes, upgrade can fail with a timeout error when the installer attempts to download a non-default agent version. As a workaround, before upgrading, remove any agent_version parameters for pe_repo classes that match your infrastructure nodes from the console (in the PE Master node group), Hiera, or pe.conf.
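
For example, a Hiera entry like the following is the kind of setting to remove before upgrading (the platform class name and version shown here are illustrative):
pe_repo::platform::el_7_x86_64::agent_version: '6.14.3'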

Upgrade fails with Hiera data based on certificate extensions

If your Hiera hierarchy contains levels based on certificate extensions, like trusted.extensions.pp_role, upgrade can fail if that Hiera entry is vital to running services, such as java_args. The failure occurs because the puppet infrastructure recover_configuration command, which runs during upgrade, doesn't recognize the hierarchy level. As a workaround, place any vital settings, like java_args, in your pe.conf file during upgrade.
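
For example, a pe.conf entry along these lines keeps java_args visible to the upgrader regardless of the Hiera hierarchy (the heap sizes shown are placeholders):
"puppet_enterprise::profile::master::java_args": {"Xmx": "4096m", "Xms": "4096m"}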

Upgrade fails when using custom database certificate parameters

When upgrading an infrastructure with an unmanaged PostgreSQL database to this version of PE, the upgrade can fail and result in downtime if the databases use custom certificates not issued by the PE certificate authority. This is the case even if you configured the database_properties parameters in pe.conf to set a custom sslrootcert. The upgrader ignores these custom certificate settings.

To manually override the upgrader's certificate settings, run these commands, replacing <NEW CERT> with the name of your custom certificate.
rpm -iUv /opt/puppetlabs/server/data/packages/public/2019.3.0/el-7-x86_64-6.12.0/pe-modules-2019.3.0.2-1.el7.x86_64.rpm
                    
sed -i 's#/etc/puppetlabs/puppet/ssl/certs/ca.pem#/etc/puppetlabs/puppet/ssl/<NEW CERT>.pem#' /opt/puppetlabs/server/data/environments/enterprise/modules/pe_manager/lib/puppet_x/puppetlabs/meep/configure/postgres.rb

./puppet-enterprise-installer - --force
Note: The first command in the code block above includes paths specific to the PE version you're upgrading to. Replace the version information as needed.

Converting legacy compilers fails with an external certificate authority

If you use an external certificate authority (CA), the puppet infrastructure run convert_legacy_compiler command fails with an error during the certificate-signing step.
Agent_cert_regen: ERROR: Failed to regenerate agent certificate on node <compiler-node.domain.com>
Agent_cert_regen: bolt/run-failure:Plan aborted: run_task 'enterprise_tasks::sign' failed on 1 target
Agent_cert_regen: puppetlabs.sign/sign-cert-failed Could not sign request for host with certname <compiler-node.domain.com> using caserver <master-host.domain.com>
To work around this issue when it appears:
  1. Log on to the CA server and manually sign certificates for the compiler (see the signing example after this list).
  2. On the compiler, run Puppet: puppet agent -t
  3. Unpin the compiler from the PE Master group, either from the console, or from the CLI using the command: /opt/puppetlabs/bin/puppet resource pe_node_group "PE Master" unpinned="<COMPILER_FQDN>"
  4. On your primary server, in the pe.conf file, remove the entry puppet_enterprise::profile::database::private_temp_puppetdb_host
  5. If you have an external PE-PostgreSQL node, run Puppet on that node: puppet agent -t
  6. Run Puppet on your primary server: puppet agent -t
  7. Run Puppet on all compilers: puppet agent -t
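
For step 1, if your CA host is a Puppet-based CA, signing the compiler's certificate typically looks like the following (the certname is a placeholder); for other external CAs, use that CA's own signing procedure.
puppetserver ca list
puppetserver ca sign --certname <COMPILER_FQDN>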

Converted compilers can slow PuppetDB in geo-diverse installations

In configurations that rely on high-latency connections between your primary servers and compilers – for example, in geo-diverse installations – converted compilers running the PuppetDB service might experience significant slowdowns. If your primary server and compilers are distributed among multiple data centers connected by high-latency links or congested network segments, reach out to Support for guidance before converting legacy compilers.

Disaster recovery known issues

These are the known issues for disaster recovery in this release.

Puppet runs can take longer than expected in failover scenarios

In disaster recovery environments with a provisioned replica, if the primary server is unreachable, a Puppet run using data from the replica can take up to three times as long as expected (for example, 6 minutes versus 2 minutes).

FIPS known issues

These are the known issues with FIPS-enabled PE in this release.

Puppet Server FIPS installations don’t support Ruby’s OpenSSL module

FIPS-enabled PE installations don't support extensions or modules that use the standard Ruby OpenSSL library, such as hiera-eyaml or the splunk_hec module. As a workaround, you can use a non-FIPS-enabled primary server with FIPS-enabled agents, which limits the issue to situations where only the agent uses the Ruby library.

Errors when using puppet code and puppet db commands on FIPS-compliant platforms

When the pe-client-tools packages are run on FIPS-compliant platforms, puppet code and puppet db commands fail with SSL handshake errors. To use puppet db commands on a FIPS-compliant platform, install the puppetdb_cli Ruby gem with the following command:
/opt/puppetlabs/puppet/bin/gem install puppetdb_cli --bindir /opt/puppetlabs/bin/
To use puppet code commands on a FIPS-compliant platform, use the Code Manager API. Alternatively, you can use pe-client-tools on a non-FIPS-compliant platform to access a FIPS-enabled primary server.
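
For example, a deploy through the Code Manager API might look like this (the token, environment name, and hostname are placeholders):
curl -k -X POST -H 'Content-Type: application/json' -H 'X-Authentication: <TOKEN>' 'https://<PRIMARY_HOSTNAME>:8170/v1/deploys' -d '{"environments": ["production"], "wait": true}'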

Configuration and maintenance known issues

These are the known issues for configuration and maintenance in this release.

Backups fail if a Puppet run is in progress

The puppet-backup command fails if a Puppet run is in progress. As a workaround, run the command manually only when there's no active Puppet run.
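
One way to avoid the conflict is to disable the agent for the duration of the backup. The backup directory shown here is a placeholder, and you should wait for any in-progress run to finish after disabling the agent.
puppet agent --disable "PE backup in progress"
puppet-backup create --dir=/var/tmp/pe-backups
puppet agent --enable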

Restarting or running Puppet on infrastructure nodes can trigger an illegal reflective access operation warning

When restarting PE services or performing agent runs on infrastructure nodes, you might see the warning Illegal reflective access operation ... All illegal access operations will be denied in a future release in the command-line output or logs. These warnings are internal to PE service components, have no impact on their functionality, and can be safely disregarded.

Orchestration services known issues

These are the known issues for the orchestration services in this release.

Running plans during code deployment can result in failures

If a plan is running during a code deployment, parts of the plan, such as compiling apply block statements or downloading and running tasks, might fail. This is because plans run on a combination of PE services, like the orchestrator and puppetserver, and the code each service is acting on can get temporarily out of sync during a code deployment.

Tasks fail when specifying both as the input method

In task metadata, using both for the input method causes the task run to fail.
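
Until this is resolved, use one of the other input methods in your task metadata. A minimal sketch:
{
  "description": "Example task metadata",
  "input_method": "stdin"
}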

Orchestrator creates an extra JRuby pool

During startup, the orchestrator creates two JRuby pools - one for scheduled jobs and one for everything else. This is because the JRuby pool is not yet available in the configuration passed to the post-migration-fa function, which creates its own JRuby pool in response. These JRuby pools accumulate over time because the stop function doesn't know about them.

Unfinished sync reports as finished when clients share the same identifier

Because the orchestrator and puppetserver file-sync clients share the same identifier, Code Manager reports an unfinished sync as "all-synced": true. Whichever client finishes polling first notifies the storage service that the sync is complete, regardless of the other client's sync status. This might cause attempts to access tasks and plans before the newly-deployed code is available.

Refused connection in orchestrator startup causes PuppetDB migration failure

A condition on startup fails to delete stale scheduled jobs, preventing the orchestrator service from starting.

Console and console services known issues

These are the known issues for the console and console services in this release.

Gateway timeout errors in the console

Using facts to filter nodes might produce either a "502 Bad Gateway" or "Gateway Timeout" error instead of the expected results.

Patching known issues

These are the known issues for patching in this release.

Patch task misreports success when it times out on Windows nodes

If the pe_patch::patch_server task takes longer than the timeout setting to apply patches on a Windows node, the debug output notes the timeout, but the task erroneously reports that it completed successfully.
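
If you expect patching to exceed the default timeout, you can pass a larger value when running the task. The parameter name and value here are assumptions, so check the task's metadata for your version:
puppet task run pe_patch::patch_server timeout=7200 --nodes <WINDOWS_NODE_FQDN>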

Code management known issues

These are the known issues for Code Manager, r10k, and file sync in this release.

Changing a file type in a control repo produces a checkout conflict error

Changing a file type in a control repository – for example, deleting a file and replacing it with a directory of the same name – generates the error JGitInternalException: Checkout conflict with files accompanied by a stack trace in the Puppet Server log. As a workaround, deploy the control repo with the original file deleted, and then deploy again with the replacement file or directory.
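
A rough sketch of the workaround using the puppet code command (the environment name is a placeholder):
# First: remove the original file from the control repo, commit, and push, then deploy
puppet code deploy production --wait
# Then: add the replacement directory with the same name, commit, and push, then deploy again
puppet code deploy production --wait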

Enabling Code Manager and multithreading in Puppet Server deadlocks JRuby

Setting the new environment_timeout parameter to any non-zero value – including the unlimited default when Code Manager is enabled – interferes with multithreading in Puppet Server and can result in JRuby deadlocking after several hours.

Default SSH URL with TFS fails with Rugged error

Using the default SSH URL with Microsoft Team Foundation Server (TFS) with the rugged provider causes an error of "unable to determine current branches for Git source." This is because the rugged provider expects an @ symbol in the URL format.

To work around this error, replace ssh:// in the default URL with git@.

For example, change:
ssh://tfs.puppet.com:22/tfs/DefaultCollection/Puppet/_git/control-repo
to
git@tfs.puppet.com:22/tfs/DefaultCollection/Puppet/_git/control-repo

GitHub security updates might cause errors with shellgit

GitHub has disabled TLSv1, TLSv1.1, and some SSH cipher suites, which can cause automation using older crypto libraries to start failing. If you are using Code Manager or r10k with the shellgit provider enabled, you might see negotiation errors on some platforms when fetching code from GitHub. To resolve these errors, switch your configuration to use the rugged provider, or fix shellgit by updating the OS packages it relies on.
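
For standalone r10k, the provider is selected in r10k.yaml. This is a minimal sketch, and the file location can vary by installation:
# /etc/puppetlabs/r10k/r10k.yaml
git:
  provider: 'rugged'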

Timeouts when using --wait with large deployments or geographically dispersed compilers

Because the --wait flag deploys code to all compilers before returning results, some deployments with a large node count or compilers spread across a large geographic area might experience a timeout. Work around this issue by adjusting the timeouts_sync parameter.
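
For example, raising the sync timeout through Hiera might look like this; the value is a placeholder in seconds, and you should confirm the parameter name for your version:
puppet_enterprise::master::code_manager::timeouts_sync: 600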

r10k with the Rugged provider can develop a bloated cache

If you use the rugged provider for r10k, repository pruning is not supported. As a result, if you use many short-lived branches, over time the local r10k cache can become bloated and take up significant disk space.

If you encounter this issue, run git-gc periodically on any cached repo that is using a large amount of disk space in the cachedir. Alternately, use the shellgit provider, which automatically garbage collects the repos according to the normal Git CLI rules.
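
A minimal sketch of the periodic cleanup; replace <CACHEDIR> with the cachedir configured for r10k or Code Manager in your installation:
for repo in <CACHEDIR>/*; do
  git -C "$repo" gc
done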

Code Manager and r10k do not identify the default branch for module repositories

When you use Code Manager or r10k to deploy modules from a Git source, the default branch of the source repository is always assumed to be master. If the module repository uses a different default branch, an error occurs. To work around this issue, specify the default branch with the ref: key in your Puppetfile.

After an error during the initial run of file sync, Puppet Server won't start

The first time you run Code Manager and file sync on a primary server, an error can occur that prevents Puppet Server from starting. To work around this issue:

  1. Stop the pe-puppetserver service.
  2. Locate the data-dir variable in /etc/puppetlabs/puppetserver/conf.d/file-sync.conf.
  3. Remove the directory specified by data-dir.
  4. Start the pe-puppetserver service.

Repeat these steps on each primary server exhibiting the same symptoms, including any compilers.
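
A command-level sketch of these steps, assuming the default data-dir location (check file-sync.conf for the actual path on your system):
puppet resource service pe-puppetserver ensure=stopped
grep data-dir /etc/puppetlabs/puppetserver/conf.d/file-sync.conf
rm -rf /opt/puppetlabs/server/data/puppetserver/filesync
puppet resource service pe-puppetserver ensure=running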

Puppet Server crashes if file sync can't write to the live code directory

If the live code directory contains content that file sync didn’t expect to find there (for example, someone has made changes directly to the live code directory), Puppet Server crashes.

The following error appears in puppetserver.log:

2016-05-05 11:57:06,042 ERROR [clojure-agent-send-off-pool-0] [p.e.s.f.file-sync-client-core] Fatal error during file sync, requesting shutdown.
org.eclipse.jgit.api.errors.JGitInternalException: Could not delete file /etc/puppetlabs/code/environments/development
        at org.eclipse.jgit.api.CleanCommand.call(CleanCommand.java:138) ~[puppet-server-release.jar:na]

To recover from this error:

  1. Delete the environments in the code directory: find /etc/puppetlabs/code -mindepth 1 -delete
  2. Start the pe-puppetserver service: puppet resource service pe-puppetserver ensure=running
  3. Trigger a Code Manager run by your usual method.

Code Manager can't recover from Puppetfile typos in URL

When your Puppetfile contains a typo in a Git URL, subsequent code deploys continuously fail until you manually delete the deployer caches, even after the Puppetfile error is corrected.
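
As an illustration of the recovery (the cache location is an assumption; check your Code Manager configuration for the actual cachedir), after correcting the Puppetfile:
rm -rf <CODE_MANAGER_CACHEDIR>/*
puppet code deploy --all --wait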

SSL and certificate known issues

These are the known issues for SSL and certificates in this release.
