Backing up PAM using snapshots

Snapshots are point-in-time backups of your Puppet Application Manager (PAM) deployment. You can use them to roll back to a previous state or to restore your installation into a new cluster for disaster recovery.

Full and partial snapshots

There are two options available when you create a snapshot of your Puppet Application Manager (PAM) deployment: full snapshots (also known as instance snapshots) and partial (or application) snapshots. For full disaster recovery, make sure you've configured and scheduled regular full snapshots stored on a remote storage solution such as an S3 bucket or NFS share.

Full snapshots offer a comprehensive backup of your PAM deployment, because they include the core PAM application together with the Puppet applications you've installed in your PAM deployment. You can use a full snapshot to restore your PAM deployment and all of your installed Puppet applications to a previous backup. For example, you could use a full snapshot to revert an undesired configuration change or a failed upgrade, or to migrate your PAM deployment to another Puppet-supported cluster.

Partial snapshots are available from the PAM console, but are limited in their usefulness. To restore from a partial snapshot, you must already have an installed and functioning version of PAM. A functioning PAM installation is needed because the option to restore a partial snapshot can only be accessed from the Snapshots section of the PAM admin console.

Partial snapshots only back up the Puppet application you specified when you configured the snapshot, for example, Continuous Delivery for Puppet Enterprise, or Puppet Comply. They do not back up the underlying PAM deployment. Partial snapshots are sometimes useful if you want to roll back to a previous version of a specific Puppet application that you've installed on your PAM deployment, but are far less versatile than full snapshots. To make sure that you have all disaster recovery options available to you, use a full snapshot wherever possible.

Configure snapshots

Before using snapshots, select a storage location, set a snapshot retention period, and indicate whether snapshots are created manually or on a set schedule.

Note: The snapshots feature was accidentally disabled on some application licenses issued prior to March 2021. If you do not see a Snapshots option in your Puppet Application Manager UI, and you would like to use this feature, please contact our Support team using the Zendesk account provided to you.

A beta version of the snapshots feature, which only supported rollback snapshots, was available prior to the 15 April Puppet Application Manager release. Some features or storage locations mentioned on this page are not available on older versions of Puppet Application Manager.

Important: Disaster recovery requires that the storage backend used for backups is accessible from the new cluster. When setting up snapshots in an offline cluster, make sure to record the registry service IP address with the following command:
kubectl -n kurl get svc registry -o jsonpath='{.spec.clusterIP}'

Be sure to record the value returned by this command, as it is required when creating a new cluster to restore to as part of disaster recovery. For more information on restoring from snapshots, see Disaster recovery or migration using a snapshot.
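If it's helpful, you can capture the value in a shell variable and write it to a file for safekeeping. This is a minimal sketch; the variable and file names are illustrative only:
REGISTRY_IP=$(kubectl -n kurl get svc registry -o jsonpath='{.spec.clusterIP}')
echo "$REGISTRY_IP" > pam-registry-ip.txt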

  1. In the upper navigation bar of the Puppet Application Manager UI, click Snapshots > Settings & Schedule.
  2. The snapshots feature uses Velero (https://velero.io), an open source backup and restore tool. Click Check for Velero to determine whether Velero is present on your cluster, and to install it if needed.
  3. Select a destination for your snapshot storage and provide the required configuration information. Supported destinations are listed below. We recommend using an external service or NFS, depending on what is available to you:
    • Internal storage (default)
    • Amazon S3
    • Azure Blob Storage
    • Google Cloud Storage
    • Other S3-compatible storage
    • Network file system (NFS)
    • Host path
    For Amazon S3 storage, provide the following information:
    • Bucket: The name of the AWS bucket where snapshots are stored.
    • Region: The AWS region the bucket is available in.
    • Path: Optional. The path within the bucket where all snapshots are stored.
    • Use IAM instance role?: If selected, an IAM instance role is used instead of an access key ID and secret.
    • Access key ID: Required only if not using an IAM instance role. The AWS IAM access key ID that can read from and write to the bucket.
    • Access key secret: Required only if not using an IAM instance role. The AWS IAM secret access key that is associated with the access key ID.
    For Azure Blob Storage, provide the following information:
    Note: Only connections via service principals are currently supported.
    • Bucket: The name of the Azure Blob Storage container where snapshots are stored.
    • Path: Optional. The path within the container where all snapshots are stored.
    • Subscription ID: Required only for access via service principal or AAD Pod Identity. The subscription ID associated with the target container.
    • Tenant ID: Required only for access via service principal. The tenant ID associated with the Azure account of the target container.
    • Client ID: Required only for access via service principal. The client ID of a service principal with access to the target container.
    • Client secret: Required only for access via service principal. The client secret of a service principal with access to the target container.
    • Cloud name: The Azure cloud for the target storage. Options: AzurePublicCloud, AzureUSGovernmentCloud, AzureChinaCloud, AzureGermanCloud.
    • Resource group: The resource group name of the target container.
    • Storage account: The storage account name of the target container.
    For Google Cloud Storage, provide the following information:
    • Bucket: The name of the GCS bucket where snapshots are stored.
    • Path: Optional. The path within the bucket where all snapshots are stored.
    • Service account: The GCP IAM service account JSON file that has permissions to read from and write to the storage location.

    For other S3-compatible storage, provide the following information:

    • Bucket: The name of the bucket where snapshots are stored.
    • Path: Optional. The path within the bucket where all snapshots are stored.
    • Access key ID: The access key ID that can read from and write to the bucket.
    • Access key secret: The secret access key that is associated with the access key ID.
    • Endpoint: The endpoint to use to connect to the bucket.
    • Region: The region the bucket is available in.
    For a network file system (NFS), complete these important steps before you begin configuration:
    • Make sure that the NFS server is set up and configured to allow access from all the nodes in the cluster.
    • Make sure that all the nodes in the cluster have the necessary NFS client packages installed so that they can communicate with the NFS server.
    • Make sure that any firewalls are properly configured to allow traffic between the NFS server and the nodes in the cluster.
    Then provide the following information:
    • Server: The hostname or IP address of the NFS server.
    • Path: The path that is exported by the NFS server.
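    For reference, an NFS export that allows access from all cluster nodes typically looks something like the following /etc/exports entry. The path and subnet shown here are placeholders; adjust them for your environment and check your NFS server documentation for the exact export options you need:
    /srv/pam-snapshots 10.0.1.0/24(rw,sync,no_subtree_check)
    After editing /etc/exports, you can usually apply the change by running exportfs -ra on the NFS server.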

    For a host path, note that the configured path must be fully accessible by user/group 1001 on your cluster nodes. Host path works best when backed by a shared network file system.
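    For example, to grant user/group 1001 access to a host path, you might run the following on each node, substituting your own path:
    chown -R 1001:1001 /<PATH/TO/HOSTPATH>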

  4. Click Update storage settings to save your storage destination information.
    Depending on your chosen storage provider, saving and configuring the storage destination might take several minutes.
  5. Optional: To automatically create new snapshots on a schedule, select Enable automatic scheduled snapshots on the Full snapshots (instance) tab. (If desired, you can also set up a schedule for capturing partial (application-only) snapshots.)
    You can schedule a new snapshot creation for every hour, day, or week, or you can create a custom schedule by entering a cron expression.
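    For example, a cron expression that creates a snapshot at 02:00 every Saturday looks like this (adjust the schedule to suit your environment):
    0 2 * * 6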
  6. Set the retention period for your snapshots by selecting the time period after which old snapshots are automatically deleted. The default retention period is one month.
    Note: A snapshot's retention period cannot be changed once the snapshot is created. If you update the retention schedule, the new retention period applies only to snapshots created after the update is made.
  7. Click Update schedule to save your changes.
Results
Snapshots are automatically created according to your specified schedule and saved to the storage location you selected. You can also create an unscheduled snapshot at any time by clicking Start a snapshot on the Dashboard or on the Snapshots page.

Roll back changes using a snapshot

When necessary, you can use a snapshot to roll back to a previous version of your Puppet Application Manager setup without changing the underlying cluster infrastructure.

To roll back changes:

  1. In the console menu of the Puppet Application Manager UI, click Snapshots > Full Snapshots (Instance).
  2. From the list of available snapshots, select the snapshot you want to roll back to and click the Restore from this backup icon.
  3. Follow the instructions to complete either a partial restore or a full restore.
    A full restore is useful if you need to stay on an earlier version of an application and want to disable automatic version updates. Otherwise, a partial restore is the quicker option.

Disaster recovery or migration using a snapshot

Full disaster recovery is possible using a snapshot. You must create a new cluster to recover to, then follow the process outlined below to recover your instance from a snapshot.

You can also use this workflow to migrate data from legacy to non-legacy deployments, and between standalone and HA deployments.

  1. On the original system, find the version of kURL your deployment is using. Save the version information for the next step:
    kubectl get configmap -n kurl kurl-current-config -o jsonpath="{.data.kurl-version}"  && echo ""
  2. Set up a new cluster to house the recovered instance following the system requirements for your applications.
    • Install PAM using the version of kURL you retrieved earlier:
      curl -sSL https://k8s.kurl.sh/version/<VERSION STRING>/puppet-application-manager | sudo bash <-s options>
    • When setting up a new offline cluster as part of disaster recovery, add kurl-registry-ip=<IP> to the install options, replacing <IP> with the value you recorded when setting up snapshots.
      Note: If you do not include the kurl-registry-ip=<IP> flag, the registry service will be assigned a new IP address that does not match the IP on the machine where the snapshot was created. You must align the registry service IP address on the new offline cluster to ensure that the restored configuration can pull images from the correct location.
  3. To recover using a snapshot saved to a host path, ensure user/group 1001 has full access on all nodes by running:
    chown -R 1001:1001 /<PATH/TO/HOSTPATH>
  4. Configure the new cluster to connect to your snapshot storage location. Run the following to see the arguments needed to complete this task:
    kubectl kots -n default velero configure-{hostpath,nfs,aws-s3,other-s3,gcp} --help
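    For example, to connect the new cluster to an NFS snapshot storage location, the command takes a form similar to the following (the server and path values are placeholders; confirm the exact flags for your storage type with the --help output above):
    kubectl kots -n default velero configure-nfs --nfs-server <NFS SERVER> --nfs-path <EXPORTED PATH>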
  5. Run kubectl kots get backup and wait for the list of snapshots to become available. This might take several minutes.
  6. Start the restoration process by running kubectl kots restore --from-backup <BACKUP NAME>.
    The restoration process takes several minutes to complete. When the Puppet Application Manager UI is available, use it to monitor the status of the application.
    Note: When restoring, wait for all restores to complete before making any changes. The following command waits for pods to finish restoring data from backup. Other pods may not be ready until updated configuration is deployed in the next step:
    kubectl get pod -o json | jq -r '.items[] | select(.metadata.annotations."backup.velero.io/backup-volumes") | .metadata.name' | xargs kubectl wait --for=condition=Ready pod --timeout=5m
    

    This command requires the jq CLI tool to be installed. It is available in most Linux OS repositories.
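    If jq is not already installed, you can typically add it with your distribution's package manager, for example (package names and repository availability may vary by distribution):
    sudo yum install -y jq
    sudo apt-get install -y jq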

  7. If the new cluster's hostname is different from the old one, in the Puppet Application Manager UI, update the Hostname on the Config tab, click Save Config, and then redeploy the application.
    Note: If you have installed Continuous Delivery for PE and changed the hostname, you need to update the webhooks that connect Continuous Delivery for PE with your source control provider. For information on how to do this, see Update webhooks.