PE known issues
These are the known issues in PE 2021.0.
Installation and upgrade known issues
These are the known issues for installation and upgrade in this release.
Compiler upgrade fails with no-op configured
Upgrade fails on compilers running in no-op mode. As a workaround, remove the no-op setting on compilers before upgrading.
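As a sketch of the workaround, assuming the no-op setting lives in puppet.conf on the compiler: the commands below run against a mock copy of the file, so on a real compiler substitute /etc/puppetlabs/puppet/puppet.conf (recent agents can also do this in place with `puppet config delete noop --section agent`).

```shell
# Sketch: strip a `noop = true` line from puppet.conf before upgrading.
# Uses a mock file; on a real compiler edit /etc/puppetlabs/puppet/puppet.conf.
conf=$(mktemp)
printf '[agent]\nnoop = true\nserver = puppet.example.com\n' > "$conf"
# Delete any line that sets noop, leaving other settings untouched
sed -i '/^[[:space:]]*noop[[:space:]]*=/d' "$conf"
cat "$conf"
```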
Compiler upgrade fails with client certnames defined
Existing settings for client certnames can cause upgrade to fail on compilers, typically with the error Value does not match schema: {:client-certnames disallowed-key}. As a workaround, manually remove any client-certnames settings for compilers from /etc/puppetlabs/puppetserver/conf.d/file-sync.conf.
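The removal can be sketched as a one-line sed; the block below runs against a mock file-sync.conf with a simplified structure (the real file's HOCON layout may differ), so back up the real file first and restart pe-puppetserver after editing it.

```shell
# Sketch: drop client-certnames entries from a mock file-sync.conf.
# On a real compiler the file is /etc/puppetlabs/puppetserver/conf.d/file-sync.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
file-sync: {
    repos: {
        puppet-code: {
            client-certnames: ["compiler1.example.com"]
        }
    }
}
EOF
sed -i '/client-certnames/d' "$conf"
cat "$conf"
```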
Windows agent installation fails with a manually transferred certificate
Performing a secure installation on Windows nodes by manually transferring the primary server CA certificate fails with the connection error: Could not establish trust relationship for the SSL/TLS secure channel.
Initial agent run after upgrade can fail with many environments
In installations with many environments, where file sync can take several minutes, the orchestration service fails to reload during the first post-upgrade Puppet run. As a workaround, re-run the Puppet agent until the orchestration service loads properly. To prevent encountering this error, you can clean up unused environments before upgrading, and wait several minutes after the installer completes to run the agent.
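The re-run workaround amounts to a retry loop around `puppet agent -t`. In the sketch below a stand-in function that fails until its third call simulates the agent run, so the loop can execute anywhere; the loop's shape is the point, not the stand-in.

```shell
# Sketch: re-run until success, as the workaround suggests.
count_file=$(mktemp); echo 0 > "$count_file"
agent_run() {  # stand-in for: puppet agent -t
  n=$(( $(cat "$count_file") + 1 )); echo "$n" > "$count_file"
  [ "$n" -ge 3 ]   # succeeds on the third attempt
}
until agent_run; do
  sleep 0   # on a real node, wait between runs while services load
done
echo "agent run succeeded on attempt $(cat "$count_file")"
```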
Installing Windows agents with the .msi package fails with a non-default INSTALLDIR
When installing Windows agents with the .msi package, if you specify a non-default installation directory, agent files are nonetheless installed at the default location, and the installation command fails when attempting to locate files in the specified INSTALLDIR.
Upgrade fails with cryptic errors if agent_version is configured for your infrastructure pe_repo class
If you've configured the agent_version parameter for the pe_repo class that matches your infrastructure nodes, upgrade can fail with a timeout error when the installer attempts to download a non-default agent version. As a workaround, before upgrading, remove any agent_version parameters for the pe_repo class that matches your infrastructure nodes from the console (in the PE Master node group), Hiera, or pe.conf.
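In Hiera data, such a pin typically looks like the following; the platform class name and version here are hypothetical examples, not values from this release. This is the kind of entry to remove for infrastructure nodes before upgrading.

```yaml
# Remove entries like this for platforms your infrastructure nodes run on
# (hypothetical platform class and pinned version):
pe_repo::platform::el_7_x86_64::agent_version: "6.27.0"
```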
Converting legacy compilers fails with an external certificate authority
The puppet infrastructure run convert_legacy_compiler command fails with an error during the certificate-signing step:
Agent_cert_regen: ERROR: Failed to regenerate agent certificate on node <compiler-node.domain.com>
Agent_cert_regen: bolt/run-failure:Plan aborted: run_task 'enterprise_tasks::sign' failed on 1 target
Agent_cert_regen: puppetlabs.sign/sign-cert-failed Could not sign request for host with certname <compiler-node.domain.com> using caserver <master-host.domain.com>
As a workaround:
- Log on to the CA server and manually sign certificates for the compiler.
- On the compiler, run Puppet:
puppet agent -t
- Unpin the compiler from the PE Master group, either from the console or from the CLI using the command:
/opt/puppetlabs/bin/puppet resource pe_node_group "PE Master" unpinned="<COMPILER_FQDN>"
- On your primary server, in the pe.conf file, remove the entry puppet_enterprise::profile::database::private_temp_puppetdb_host.
- If you have an external PE-PostgreSQL node, run Puppet on that node:
puppet agent -t
- Run Puppet on your primary server:
puppet agent -t
- Run Puppet on all compilers:
puppet agent -t
Converted compilers can slow PuppetDB in geo-diverse installations
In configurations that rely on high-latency connections between your primary servers and compilers – for example, in geo-diverse installations – converted compilers running the PuppetDB service might experience significant slowdowns. If your primary server and compilers are distributed among multiple data centers connected by high-latency links or congested network segments, reach out to Support for guidance before converting legacy compilers.
Upgrading to a newer version with versioned deploys enabled causes Puppet Server to crash
If versioned_deploys is enabled when upgrading to version 2019.8.6 or 2021.1, Puppet Server crashes.
Disaster recovery known issues
These are the known issues for disaster recovery in this release.
Puppet runs can take longer than expected in failover scenarios
In disaster recovery environments with a provisioned replica, if the primary server is unreachable, a Puppet run using data from the replica can take up to three times as long as expected (for example, 6 minutes versus 2 minutes).
FIPS known issues
These are the known issues with FIPS-enabled PE in this release.
Puppet Server FIPS installations don’t support Ruby’s OpenSSL module
FIPS-enabled PE installations don't support extensions or modules that use the standard Ruby OpenSSL library, such as hiera-eyaml or the splunk_hec module. As a workaround, you can use a non-FIPS-enabled primary server with FIPS-enabled agents, which limits the issue to situations where only the agent uses the Ruby library.
Errors when using puppet code and puppet db commands on FIPS-compliant platforms
When pe-client-tools packages are run on FIPS-compliant platforms, puppet code and puppet db commands fail with SSL handshake errors. To use puppet db commands on a FIPS-compliant platform, install the puppetdb_cli Ruby gem with the following command:
/opt/puppetlabs/puppet/bin/gem install puppetdb_cli --bindir /opt/puppetlabs/bin/
To use puppet code commands on a FIPS-compliant platform, use the Code Manager API. Alternatively, you can use pe-client-tools on a non-FIPS-compliant platform to access a FIPS-enabled primary server.
Configuration and maintenance known issues
These are the known issues for configuration and maintenance in this release.
Restarting or running Puppet on infrastructure nodes can trigger an illegal reflective access operation warning
When restarting PE services or performing agent runs on infrastructure nodes, you might see the warning Illegal reflective access operation ... All illegal access operations will be denied in a future release in the command-line output or logs. These warnings are internal to PE service components, have no impact on their functionality, and can be safely disregarded.
Backup fails with an error about the stockpile directory
The puppet-backup create command fails under certain conditions with an error that the /opt/puppetlabs/server/data/puppetdb/stockpile directory is inaccessible.
Orchestration services known issues
These are the known issues for the orchestration services in this release.
Running plans during code deployment can result in failures
If a plan is running during a code deployment, things like compiling apply block statements or downloading and running tasks that are part of a plan might fail. This is because plans run on a combination of PE services, like orchestrator and puppetserver, and the code each service is acting on might get temporarily out of sync during a code deployment.
Pantomime dependency in the orchestrator
The version of pantomime used in the orchestrator had a third-party vulnerability (tika-core). Because of the vulnerability, pantomime usage was removed from the orchestrator, but pantomime still exists in the orchestration-services build.
Console and console services known issues
These are the known issues for the console and console services in this release.
Console reboot task fails
Rebooting a node using the reboot task in the console fails due to the removal of win32 gems in Puppet 7. As a workaround, update the reboot module to version 4.0.2 or later.
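With Code Manager, updating the module usually means pinning it in your Puppetfile, along these lines:

```ruby
# Puppetfile entry pinning the reboot module at a version with the fix
mod 'puppetlabs-reboot', '4.0.2'
```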
Gateway timeout errors in the console
Using facts to filter nodes might produce either a "502 Bad Gateway" or "Gateway Timeout" error instead of the expected results.
Injection attack vulnerability in csv exports
There's a vulnerability in the console where .csv files could contain malicious user input when exported.
Patching known issues
These are the known issues for patching in this release.
Patching fails on Windows nodes when run during fact generation
If the patching task or plan runs while pe_patch fact generation is in progress, it fails with an error like: Add-LogEntry : Exception caught: Lock file found, it appears PID is another copy of pe_patch_fact_generation or pe_patch_groups. As a workaround, re-run the patching task or plan when fact generation completes.
Patching fails on Windows nodes with non-default agent location
On Windows nodes, if the Puppet agent is installed to a location other than the default C: drive, the patching task or plan fails with the error No such file or directory.
Patching fails with excluded yum packages
In the patching task or plan, using yum_params to pass the --exclude flag in order to exclude certain packages can result in task or plan failure if the only packages requiring updates are excluded. As a workaround, use the versionlock command (which requires installing the yum-plugin-versionlock package) to lock the packages you want to exclude at their current version. Alternatively, you can fix a package at a particular version by specifying the version with a package resource in a manifest that applies to the nodes to be patched.
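The package-resource alternative can be sketched as follows; the package name and version are hypothetical, so substitute the package you want to hold at its current version.

```puppet
# Alternative to --exclude: hold a package at a fixed version so patching
# skips it (hypothetical package name and version).
package { 'kernel-tools':
  ensure => '4.18.0-305.el8',
}
```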
Code management known issues
These are the known issues for Code Manager, r10k, and file sync in this release.
Changing a file type in a control repo produces a checkout conflict error
Changing a file type in a control repository – for example, deleting a file and replacing it with a directory of the same name – generates the error JGitInternalException: Checkout conflict with files accompanied by a stack trace in the Puppet Server log. As a workaround, deploy the control repo with the original file deleted, and then deploy again with the replacement file or directory.
Enabling Code Manager and multithreading in Puppet Server deadlocks JRuby
Setting the new environment_timeout parameter to any non-zero value – including the unlimited default when Code Manager is enabled – interferes with multithreading in Puppet Server and can result in JRuby deadlocking after several hours.
Default SSH URL with TFS fails with Rugged error
Using the default SSH URL with Microsoft Team Foundation Server (TFS) with the rugged provider causes the error "unable to determine current branches for Git source." This is because the rugged provider expects an @ symbol in the URL format. To work around this error, replace ssh:// in the default URL with git@. For example, change ssh://tfs.puppet.com:22/tfs/DefaultCollection/Puppet/_git/control-repo to git@tfs.puppet.com:22/tfs/DefaultCollection/Puppet/_git/control-repo.
GitHub security updates might cause errors with shellgit
GitHub has disabled TLSv1, TLSv1.1, and some SSH cipher suites, which can cause automation using older crypto libraries to start failing. If you are using Code Manager or r10k with the shellgit provider enabled, you might see negotiation errors on some platforms when fetching modules from the Forge. To resolve these errors, switch your configuration to use the rugged provider, or fix shellgit by updating your OS packages.
Timeouts when using --wait with large deployments or geographically dispersed compilers
Because the --wait flag deploys code to all compilers before returning results, some deployments with a large node count or compilers spread across a large geographic area might experience a timeout. Work around this issue by adjusting the timeouts_sync parameter.
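As a sketch, the timeout can be raised in Hiera. The fully qualified key below assumes the parameter is exposed on the PE code_manager profile class and that the value is in seconds; confirm both against the configuration reference for your PE version.

```yaml
# Raise Code Manager's sync timeout for large or dispersed deployments
# (assumed key and units; verify for your PE version)
puppet_enterprise::master::code_manager::timeouts_sync: 600
```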
r10k with the Rugged provider can develop a bloated cache
If you use the rugged provider for r10k, repository pruning is not supported. As a result, if you use many short-lived branches, over time the local r10k cache can become bloated and take up significant disk space. If you encounter this issue, run git gc periodically on any cached repo that is using a large amount of disk space in the cachedir. Alternatively, use the shellgit provider, which automatically garbage collects the repos according to the normal Git CLI rules.
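A minimal sketch of generating the git gc invocations, one per cached repo: it builds against a throwaway mock cachedir so it runs anywhere, so substitute the cachedir from your r10k.yaml and pipe the output to sh (or run the lines manually) to actually collect garbage.

```shell
# Sketch: emit a `git gc` command for each cached repo in the cachedir.
# A mock cachedir with two empty repo directories stands in here.
cachedir=$(mktemp -d)
mkdir -p "$cachedir/control-repo.git" "$cachedir/module-foo.git"
cmds=$(for repo in "$cachedir"/*.git; do
  echo "git -C $repo gc"
done)
echo "$cmds"
```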
Code Manager and r10k do not identify the default branch for module repositories
When you use Code Manager or r10k to deploy modules from a Git source, the default branch of the source repository is always assumed to be main. If the module repository uses a default branch that is not main, an error occurs. To work around this issue, specify the default branch with the ref: key in your Puppetfile.
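A Puppetfile entry using the ref: key looks like this; the module name, Git URL, and branch name are hypothetical.

```ruby
# Puppetfile: declare the module's default branch explicitly
mod 'mymodule',
  git: 'https://git.example.com/org/mymodule.git',
  ref: 'master'
```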
After an error during the initial run of file sync, Puppet Server won't start
The first time you run Code Manager and file sync on a primary server, an error can occur that prevents Puppet Server from starting. To work around this issue:
- Stop the pe-puppetserver service.
- Locate the data-dir variable in /etc/puppetlabs/puppetserver/conf.d/file-sync.conf.
- Remove the directory.
- Start the pe-puppetserver service.
Repeat these steps on each primary server exhibiting the same symptoms, including any compilers.
Puppet Server crashes if file sync can't write to the live code directory
If the live code directory contains content that file sync didn’t expect to find there (for example, someone has made changes directly to the live code directory), Puppet Server crashes.
The following error appears in puppetserver.log
:
2016-05-05 11:57:06,042 ERROR [clojure-agent-send-off-pool-0] [p.e.s.f.file-sync-client-core] Fatal error during file sync, requesting shutdown.
org.eclipse.jgit.api.errors.JGitInternalException: Could not delete file /etc/puppetlabs/code/environments/development
at org.eclipse.jgit.api.CleanCommand.call(CleanCommand.java:138) ~[puppet-server-release.jar:na]
To recover from this error:
- Delete the environments in the code directory:
find /etc/puppetlabs/code -mindepth 1 -delete
- Start the pe-puppetserver service:
puppet resource service pe-puppetserver ensure=running
- Trigger a Code Manager run by your usual method.
Code Manager can't recover from Puppetfile typos in URL
When you have a git typo in your Puppetfile, subsequent code deploys continuously fail until you manually delete deployer caches, even after the Puppetfile error is corrected.
File sync fails to copy symlinks if versioned deploys is enabled
If versioned deploys is enabled, file sync fails to copy symlinks and incorrectly copies the symlinks' targets instead, which crashes Puppet Server.