To configure high availability, you must provision and enable a replica to serve as backup during failovers. If your master is permanently disabled, you can then promote a replica.
You configure and manage high availability with puppet infrastructure commands, many of which require a valid admin RBAC token. For details about a command, run puppet infrastructure help <ACTION> on the command line, for example, puppet infrastructure help provision.
Provision a replica
Provisioning a replica duplicates specific components and services from the master to the replica.
Ensure you have a valid admin RBAC token.
When provisioning is complete, you must enable the replica to complete your HA configuration.
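As a rough sketch, provisioning might look like the following. It assumes that the puppet access login command and the provision replica action are available in your PE version (confirm with puppet infrastructure help provision). The provision_replica helper, the run wrapper, the one-hour token lifetime, and the hostname are hypothetical placeholders. The helper only prints the commands unless EXECUTE=1 is set.

```shell
#!/usr/bin/env bash
# Sketch only: prints each command unless EXECUTE=1 is set in the environment.
run() {
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# provision_replica is a hypothetical helper, not part of PE.
provision_replica() {
  local replica="${1:-}"
  if [ -z "$replica" ]; then
    echo "usage: provision_replica <REPLICA_HOSTNAME>" >&2
    return 1
  fi
  # Obtain an admin RBAC token (the one-hour lifetime is an assumption).
  run puppet access login --lifetime 1h
  # Duplicate components and services from the master to the replica.
  run puppet infrastructure provision replica "$replica"
}
```

To review the commands before running anything, call provision_replica replica.example.com on the master; set EXECUTE=1 only after the printed commands look right.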
Manually copy PuppetDB to speed replication
For large PuppetDB installations, you can speed initial replication by manually copying the database from the master to the replica. If you have already started automatic provisioning, you can manually copy your PuppetDB at any time during sync.
The size of your PuppetDB correlates with the number of nodes and resources in your Puppet catalogs. To optionally examine the size of your database, on the PuppetDB PostgreSQL node, run sudo -u pe-postgres /opt/puppetlabs/server/bin/psql -c '\l+ "<DB_NAME>"'.
After you manually export and restore the pe-puppetdb database, PuppetDB automatically updates the replica with any changes that occurred on the master in the meantime.
Enable a replica
Enabling a replica activates most of its duplicated services and components, and instructs agents and infrastructure nodes how to communicate in a failover scenario.
Back up your classifier hierarchy, because enabling a replica alters classification.
Ensure you have a valid admin RBAC token.
Managing agent communication in geo-diverse installations
Typically, when you enable a replica using puppet infrastructure enable replica, the configuration tool automatically sets the same communication parameters for all agents. In geo-diverse installations, with load balancers or compile masters in multiple locations, you must manually configure agent communication settings so that agents fail over to the appropriate load balancer or compile master.
To skip automatic agent configuration, use the --skip-agent-config flag when you enable a replica, for example: puppet infrastructure enable replica example.puppet.com --skip-agent-config
To manually configure which load balancer or compile master agents communicate with, use one of these options:
- CSR attributes
  - For each node, include a CSR attribute that identifies the location of the node, for example pp_region or pp_datacenter.
  - Create child groups off of the PE Agent node group for each location.
  - In each child node group, include the puppet_enterprise::profile::agent class and set the server_list parameter to the appropriate load balancer or compile master hostname.
  - In each child node group, add a rule that uses the trusted fact created from the CSR attribute.
- Hiera
  For each node or group of nodes, create a key/value pair that sets the puppet_enterprise::profile::agent::server_list parameter to be used by the PE Agent node group.
- Custom method that sets the server_list parameter in puppet.conf.
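The first two options can be illustrated with example files. The sketch below writes them to a scratch directory for review rather than to a live system. The write_agent_config_examples helper, the us-west region value, the file names, and all hostnames are hypothetical. The sketch also assumes that server_list accepts a list of host:port entries, which you should confirm for your PE version.

```shell
#!/usr/bin/env bash
# Sketch only: writes example files into a scratch directory for review.
# Hostnames, the region value, and file names are placeholder assumptions.
write_agent_config_examples() {
  local outdir="${1:-./ha-agent-config-examples}"
  mkdir -p "$outdir"

  # CSR attributes option: record the node's location as a certificate
  # extension. On an agent, this content belongs in csr_attributes.yaml in
  # the Puppet confdir before the node first requests its certificate.
  # A node group rule can then match trusted.extensions.pp_region.
  cat > "$outdir/csr_attributes.yaml" <<'EOF'
---
extension_requests:
  pp_region: "us-west"
EOF

  # Hiera option: data for nodes in that location. The list-of-host:port
  # format is an assumption; the nearest load balancer is listed first.
  cat > "$outdir/us-west.yaml" <<'EOF'
---
puppet_enterprise::profile::agent::server_list:
  - "lb-us-west.example.com:8140"
  - "lb-us-east.example.com:8140"
EOF
}
```

For the custom method, one possibility is to set the value directly on each agent, for example with puppet config set server_list "lb-us-west.example.com:8140" --section agent.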
Promote a replica
If your master can’t be restored, you can promote the replica to master to establish the replica as the new, permanent master.
Enable a new replica using a failed master
After promoting a replica, you can use your old master as a new replica, effectively swapping the roles of your failed master and promoted replica.
- The puppet infrastructure run command leverages built-in Bolt plans to automate certain management tasks. To use this command, you must be able to connect using SSH from your master to any nodes that the command modifies. You can establish an SSH connection using key forwarding, a local key file, or by specifying keys in .ssh/config on your master. For more information, see Bolt OpenSSH configuration options.
- You must have token-based authentication configured.
Run puppet infrastructure run enable_ha_failover, specifying these parameters:
- host: Hostname of the failed master. This node becomes your new replica.
- caserver: Hostname of the promoted replica that's serving as your new master.
- topology: Architecture used in your environment, either mono or mono-with-compile.
- replication_timeout_secs: Optional number of seconds allowed to complete provisioning and enabling of the new replica before the command fails.
For example:
puppet infrastructure run enable_ha_failover host=<FAILED_MASTER_HOSTNAME> caserver=<PROMOTED_REPLICA_HOSTNAME> topology=mono
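Because the command takes several key=value parameters, it can help to assemble and check it before running it. The following is a minimal sketch: build_enable_ha_failover is a hypothetical helper, not part of PE, and the hostnames and timeout in the usage note are placeholders. It only prints the command. It rejects any topology other than the two values listed above.

```shell
#!/usr/bin/env bash
# Sketch only: builds and prints the command; it does not run it.
# build_enable_ha_failover is a hypothetical helper.
build_enable_ha_failover() {
  local host="${1:-}" caserver="${2:-}" topology="${3:-}" timeout="${4:-}"
  if [ -z "$host" ] || [ -z "$caserver" ]; then
    echo "host and caserver are required" >&2
    return 1
  fi
  # Only the two documented topology values are accepted.
  case "$topology" in
    mono|mono-with-compile) ;;
    *) echo "topology must be mono or mono-with-compile" >&2; return 1 ;;
  esac
  local cmd="puppet infrastructure run enable_ha_failover host=${host} caserver=${caserver} topology=${topology}"
  # replication_timeout_secs is optional, so append it only when given.
  if [ -n "$timeout" ]; then
    cmd="${cmd} replication_timeout_secs=${timeout}"
  fi
  echo "$cmd"
}
```

For example, build_enable_ha_failover old-master.example.com new-master.example.com mono-with-compile 3600 prints a command that includes replication_timeout_secs=3600, which you can review and then run.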
Forget a replica
Forgetting a replica cleans up classification and database state, preventing degraded performance over time.
Ensure you have a valid admin RBAC token. See Generate a token using puppet-access.
Run the forget command whenever a replica node is destroyed, even if you plan to replace it with a replica with the same name.
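A minimal sketch of this cleanup step follows. It assumes the forget action takes the replica's node name as its argument (confirm with puppet infrastructure help forget). The forget_replica helper, the run wrapper, and the hostname are hypothetical. The helper only prints the command unless EXECUTE=1 is set.

```shell
#!/usr/bin/env bash
# Sketch only: prints the command unless EXECUTE=1 is set in the environment.
run() {
  if [ "${EXECUTE:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

# forget_replica is a hypothetical helper, not part of PE.
forget_replica() {
  local replica="${1:-}"
  if [ -z "$replica" ]; then
    echo "usage: forget_replica <REPLICA_NODE_NAME>" >&2
    return 1
  fi
  # Clean up classification and database state for the destroyed replica.
  run puppet infrastructure forget "$replica"
}
```

Calling forget_replica old-replica.example.com prints the command for review before you run it on the master.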
Reinitialize a replica
If you encounter certain errors on your replica after provisioning, you can reinitialize the replica. Reinitializing destroys and re-creates the replica databases that you specify.
Reinitialization is not intended to fix slow queries or intermittent failures. Reinitialize your replica only if it is not operational or you see replication errors.