Backing up PAM using snapshots

Snapshots are point-in-time backups of your Puppet Application Manager (PAM) deployment, which can be used to roll back to a previous state or restore your installation into a new cluster for disaster recovery.

Full and partial snapshots

There are two options available when you're creating a snapshot of your Puppet Application Manager (PAM) deployment: full snapshots (also known as instance snapshots) and partial (or application) snapshots. For full disaster recovery, make sure you've configured and scheduled regular full snapshots stored on a remote storage solution such as an S3 bucket or NFS share.

Full snapshots offer a comprehensive backup of your PAM deployment, because they include the core PAM application together with the Puppet applications you've installed in it. You can use a full snapshot to restore your PAM deployment and all of your installed Puppet applications to a previous backup. For example, you could use a full snapshot to revert an undesired configuration change or a failed upgrade, or to migrate your PAM deployment to another Puppet-supported cluster.

Partial snapshots are available from the PAM console, but are limited in their usefulness. To restore from a partial snapshot, you must already have an installed and functioning version of PAM. A functioning PAM installation is needed because the option to restore a partial snapshot can only be accessed from the Snapshots section of the PAM admin console.

Partial snapshots only back up the Puppet application you specified when you configured the snapshot, for example, Continuous Delivery for Puppet Enterprise or Puppet Comply. They do not back up the underlying PAM deployment. Partial snapshots are sometimes useful if you want to roll back to a previous version of a specific Puppet application that you've installed on your PAM deployment, but are far less versatile than full snapshots. To make sure that you have all disaster recovery options available to you, use a full snapshot wherever possible.

Configure snapshots

Before using snapshots, select a storage location, set a snapshot retention period, and indicate whether snapshots are created manually or on a set schedule.

Important: Disaster recovery requires that the storage backend used for backups is accessible from the new cluster. When setting up snapshots in an offline cluster, make sure to record the registry service IP address with the following command:
kubectl -n kurl get svc registry -o jsonpath='{.spec.clusterIP}'
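The command prints a single cluster IP address, for example 10.96.2.15 (this value is illustrative; yours will differ).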

Record the value returned by this command; you need it when creating a new cluster to restore to as part of disaster recovery. For more information on restoring from snapshots, see Disaster recovery or migration using a snapshot.

  1. In the upper navigation bar of the Puppet Application Manager UI, click Snapshots > Settings & Schedule.
  2. The snapshots feature uses Velero (https://velero.io), an open source backup and restore tool. Click Check for Velero to determine whether Velero is present on your cluster, and to install it if needed.
  3. Select a destination for your snapshot storage and provide the required configuration information. You can choose to set up snapshot storage in the PAM UI or on the command line. Supported destinations are listed below. We recommend using an external service or NFS, depending on what is available to you:
    • Internal storage (default)
    • Amazon S3
    • Azure Blob Storage
    • Google Cloud Storage
    • Other S3-compatible storage
    • Network file system (NFS)
    • Host path
    Amazon S3 storage

    If using the PAM UI, provide the following information:

    • Bucket: The name of the AWS bucket where snapshots are stored.
    • Region: The AWS region the bucket is available in.
    • Path: Optional. The path within the bucket where all snapshots are stored.
    • Use IAM instance role? If selected, an IAM instance role is used instead of an access key ID and secret.
    • Access key ID: Required only if not using an IAM instance role. The AWS IAM access key ID that can read from and write to the bucket.
    • Access key secret: Required only if not using an IAM instance role. The AWS IAM secret access key that is associated with the access key ID.

    If using the command line, run the appropriate command:

    Not using an IAM instance role:

    kubectl kots velero configure-aws-s3 access-key --access-key-id <string> --bucket <string> --path <string> --region <string> --secret-access-key <string>
    Using an IAM instance role:
    kubectl kots velero configure-aws-s3 instance-role --bucket <string> --path <string> --region <string>
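    As an illustration, a completed access-key configuration might look similar to the following (the key ID, bucket, path, and region are placeholder values; substitute your own):
    kubectl kots velero configure-aws-s3 access-key --access-key-id AKIAEXAMPLEKEYID --bucket pam-snapshots --path pam/backups --region us-east-1 --secret-access-key <your-secret-access-key>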

    Azure Blob Storage

    If using the PAM UI, provide the following information:

    Note: Only connections via service principals are currently supported.
    • Bucket: The name of the Azure Blob Storage container where snapshots are stored.
    • Path: Optional. The path within the container where all snapshots are stored.
    • Subscription ID: Required only for access via service principal or AAD Pod Identity. The subscription ID associated with the target container.
    • Tenant ID: Required only for access via service principal. The tenant ID associated with the Azure account of the target container.
    • Client ID: Required only for access via service principal. The client ID of a service principal with access to the target container.
    • Client secret: Required only for access via service principal. The client secret of a service principal with access to the target container.
    • Cloud name: The Azure cloud for the target storage. Options: AzurePublicCloud, AzureUSGovernmentCloud, AzureChinaCloud, AzureGermanCloud.
    • Resource group: The resource group name of the target container.
    • Storage account: The storage account name of the target container.
    If using the command line, run the following:
    kubectl kots velero configure-azure service-principle --client-id <string> --client-secret <string> --cloud-name <string> --container <string> --path <string> --resource-group <string> --storage-account <string> --subscription-id <string> --tenant-id <string>

    Google Cloud Storage

    If using the PAM UI, provide the following information:

    • Bucket: The name of the GCS bucket where snapshots are stored.
    • Path: Optional. The path within the bucket where all snapshots are stored.
    • Service account: The GCP IAM service account JSON file that has permissions to read from and write to the storage location.

    If using the command line, run the appropriate command:

    For service account authentication:
    kubectl kots velero configure-gcp service-account --bucket <string> --path <string> --json-file <string>
    For Workload Identity authentication:
    kubectl kots velero configure-gcp workload-identity --bucket <string> --path <string> --json-file <string>

    Other S3-compatible storage

    If using the PAM UI, provide the following information:

    • Bucket: The name of the bucket where snapshots are stored.
    • Path: Optional. The path within the bucket where all snapshots are stored.
    • Access key ID: The access key ID that can read from and write to the bucket.
    • Access key secret: The secret access key that is associated with the access key ID.
    • Endpoint: The endpoint to use to connect to the bucket.
    • Region: The region the bucket is available in.
    If using the command line, run the following:
    kubectl kots velero configure-other-s3 --namespace default --bucket <string> --path <string> --access-key-id <string> --secret-access-key <string> --endpoint <string> --region <string>
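    For example, an S3-compatible object store such as MinIO might be configured with values similar to the following (the bucket, path, endpoint, and region shown are placeholders):
    kubectl kots velero configure-other-s3 --namespace default --bucket pam-snapshots --path pam/backups --access-key-id <access-key-id> --secret-access-key <secret-access-key> --endpoint https://minio.example.com:9000 --region us-east-1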

    Network file system (NFS)

    Take note of these important prerequisites before you begin configuration:
    • Make sure that you have the NFS server set up and configured to allow access from all the nodes in the cluster.
    • Make sure all the nodes in the cluster have the necessary NFS client packages installed to be able to communicate with the NFS server.
    • Make sure that any firewalls are properly configured to allow traffic between the NFS server and nodes in the cluster.

    If using the PAM UI, provide the following information:

    • Server: The hostname or IP address of the NFS server.
    • Path: The path that is exported by the NFS server.
    If using the command line, run the following:
    kubectl kots velero configure-nfs --namespace default --nfs-path <string> --nfs-server <string>
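    For example, with an NFS server reachable at nfs.example.com that exports /srv/pam-snapshots (both values are placeholders), the command would be:
    kubectl kots velero configure-nfs --namespace default --nfs-path /srv/pam-snapshots --nfs-server nfs.example.com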

    Host path

    Note that the configured path must be fully accessible by user/group 1001 on your cluster nodes. Host path works best when backed by a shared network file system.

    On the command line, run the following:
    kubectl kots velero configure-hostpath --namespace default --hostpath <string>
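    For example, to use a hypothetical shared mount at /mnt/pam-snapshots, you might first grant access to user/group 1001 on every node and then register the path:
    sudo mkdir -p /mnt/pam-snapshots
    sudo chown -R 1001:1001 /mnt/pam-snapshots
    kubectl kots velero configure-hostpath --namespace default --hostpath /mnt/pam-snapshots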
  4. Click Update storage settings to save your storage destination information.
    Depending on your chosen storage provider, saving and configuring your storage destination might take several minutes.
  5. Optional: To automatically create new snapshots on a schedule, select Enable automatic scheduled snapshots on the Full snapshots (instance) tab. (If desired, you can also set up a schedule for capturing partial (application-only) snapshots.)
    You can schedule a new snapshot creation for every hour, day, or week, or you can create a custom schedule by entering a cron expression.
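    For example, the cron expression 0 2 * * 6 creates a snapshot every Saturday at 2:00 AM (standard five-field cron syntax: minute, hour, day of month, month, day of week).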
  6. Finally, set the retention schedule for your snapshots by selecting the time period after which old snapshots are automatically deleted. The default retention period is one month.
    Note: A snapshot's retention period cannot be changed once the snapshot is created. If you update the retention schedule, the new retention period applies only to snapshots created after the update is made.
  7. Click Update schedule to save your changes.
Results
Snapshots are automatically created according to your specified schedule and saved to the storage location you selected. You can also create an unscheduled snapshot at any time by clicking Start a snapshot on the Dashboard or on the Snapshots page.

Roll back changes using a snapshot

When necessary, you can use a snapshot to roll back to a previous version of your Puppet Application Manager set-up without changing the underlying cluster infrastructure.

To roll back changes:

  1. In the console menu of the Puppet Application Manager UI, click Snapshots > Full Snapshots (Instance).
  2. Select the snapshot you want to roll back to from the list of available snapshots and click the Restore from this backup icon.
  3. Follow the instructions to complete either a partial restore or a full restore.
    A full restore is useful if you need to stay on an earlier version of an application and want to disable automatic version updates. Otherwise, a partial restore is the quicker option.

Disaster recovery or migration using a snapshot

Full disaster recovery is possible using a snapshot. You must create a new cluster to recover to, then follow the process outlined below to recover your instance from a snapshot.

Before you begin
  • On the original system, Puppet Application Manager (PAM) must be configured to support Full Snapshots (Instance).

  • Velero must be configured to use an external snapshot destination accessible to both the old and new clusters, such as S3 or NFS.

  • Both the old and new clusters must have the same connection status (online/offline). Migrating from offline to online clusters or vice versa is not supported.

  • You must use the 30 June 2021 or newer release of PAM.

  • For offline installs, both the old and new clusters must use the same version of PAM.

Important: If you are migrating from a legacy architecture, go to our Support Knowledge Base instructions for migrating to a supported architecture for your Puppet application.
To perform data migration or disaster recovery between two systems using the same architecture (from standalone to standalone, or from HA to HA):
  1. On the original system, find the version of kURL your deployment is using by running the following command. Save the version for use in step 3.
    kubectl get configmap -n kurl kurl-current-config -o jsonpath="{.data.kurl-version}" && echo
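    The command prints a kURL version string similar to v2022.03.11-0 (your version will differ).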
  2. Get the installer spec section by running the command appropriate to your PAM installation type:
    Tip: See How to determine your version of Puppet Application Manager if you're not sure which installation type you're running.
    • HA installation: kubectl get installers puppet-application-manager -o yaml
    • Standalone installation: kubectl get installers puppet-application-manager-standalone -o yaml
    • Legacy installation: kubectl get installers puppet-application-manager-legacy -o yaml
    The command's output looks similar to the following. The spec section is the block that starts with spec: toward the end of the output. Save your spec section for use in step 3.
    # kubectl get installers puppet-application-manager-standalone -o yaml
    apiVersion: cluster.kurl.sh/v1beta1
    kind: Installer
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: |
    
      {"apiVersion":"cluster.kurl.sh/v1beta1","kind":"Installer","metadata":{"annotations":{},"creationTimestamp":null,"name":"puppet-application-manager-standalone","namespace":"default"},"spec":{"containerd":{"version":"1.4.12"},"contour":{"version":"1.18.0"},"ekco":{"version":"0.16.0"},"kotsadm":{"applicationSlug":"puppet-application-manager","version":"1.64.0"},"kubernetes":{"version":"1.21.8"},"metricsServer":{"version":"0.4.1"},"minio":{"version":"2020-01-25T02-50-51Z"},"openebs":{"isLocalPVEnabled":true,"localPVStorageClassName":"default","version":"2.6.0"},"prometheus":{"version":"0.49.0-17.1.1"},"registry":{"version":"2.7.1"},"velero":{"version":"1.6.2"},"weave":{"podCidrRange":"/22","version":"2.8.1"}},"status":{}}
      creationTimestamp: "2021-06-04T00:05:08Z"
      generation: 4
      labels:
        velero.io/exclude-from-backup: "true"
      name: puppet-application-manager-standalone
      namespace: default
      resourceVersion: "102061068"
      uid: 4e7f1196-5fab-4072-9399-15d18dcc5137
    spec:
      containerd:
        version: 1.4.12
      contour:
        version: 1.18.0
      ekco:
        version: 0.16.0
      kotsadm:
        applicationSlug: puppet-application-manager
        version: 1.64.0
      kubernetes:
        version: 1.21.8
      metricsServer:
        version: 0.4.1
      minio:
        version: 2020-01-25T02-50-51Z
      openebs:
        isLocalPVEnabled: true
        localPVStorageClassName: default
        version: 2.6.0
      prometheus:
        version: 0.49.0-17.1.1
      registry:
        version: 2.7.1
      velero:
        version: 1.6.2
      weave:
        podCidrRange: /22
        version: 2.8.1
    status: {}
    Note: If the command returns Error from server (NotFound), check that you used the correct command for your architecture. You can view all installers by running kubectl get installers. Target the most recent installer.
  3. On a new machine, create a file named installer.yaml with the following contents, replacing <SPEC> and <KURL VERSION> with the information you gathered in the previous steps.
    apiVersion: cluster.kurl.sh/v1beta1
    kind: Installer
    metadata:
    <SPEC>
      kurl:
        installerVersion: "<KURL VERSION>"
    Important: If you are running PAM version 1.68.0 or newer, the kURL installer version might be included in the spec section. If this is the case, omit the kurl: section from the bottom of the installer.yaml file. There must be only one kurl: entry in the file.
    Tip: Spacing is critical in YAML files. Use a YAML file linter to confirm that the format of your file is correct.
    Here is an example of the contents of the installer.yaml file:
    apiVersion: cluster.kurl.sh/v1beta1
    kind: Installer
    metadata:
    spec:
      containerd:
        version: 1.4.12
      contour:
        version: 1.18.0
      ekco:
        version: 0.16.0
      kotsadm:
        applicationSlug: puppet-application-manager
        version: 1.64.0
      kubernetes:
        version: 1.21.8
      metricsServer:
        version: 0.4.1
      minio:
        version: 2020-01-25T02-50-51Z
      openebs:
        isLocalPVEnabled: true
        localPVStorageClassName: default
        version: 2.6.0
      prometheus:
        version: 0.49.0-17.1.1
      registry:
        version: 2.7.1
      velero:
        version: 1.6.2
      weave:
        podCidrRange: /22
        version: 2.8.1
      kurl:
        installerVersion: "v2022.03.11-0"
  4. Build an installer using the installer.yaml file. Run the following command:
    curl -s -X POST -H "Content-Type: text/yaml" --data-binary "@installer.yaml" https://kurl.sh/installer | grep -o "[^/]*$"
    The output is a hash. Carefully save the hash for use in step 5.
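    For example, the command might return a short hash such as 1a2b3c4 (illustrative only); this value identifies your custom installer spec on kurl.sh.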
  5. Install a new cluster. To do so, you can either:
    1. Point your browser to https://kurl.sh/<HASH> (replacing <HASH> with the hash you generated in step 4) to see customized installation scripts and information.
    2. Follow the appropriate PAM documentation.
      • For online installations: Follow the steps in PAM HA online installation or PAM standalone online installation, replacing the installation script with the following:
        curl https://kurl.sh/<HASH> | sudo bash
      • For offline installations: Follow the steps in PAM HA offline installation or PAM standalone offline installation, replacing the installation script with the following:
        curl -LO https://k8s.kurl.sh/bundle/<HASH>.tar.gz
        When setting up a new offline cluster as part of disaster recovery, add kurl-registry-ip=<IP> to the install options, replacing <IP> with the value you recorded when setting up snapshots.
        Note: If you do not include the kurl-registry-ip=<IP> flag, the registry service will be assigned a new IP address that does not match the IP on the machine where the snapshot was created. You must align the registry service IP address on the new offline cluster to ensure that the restored configuration can pull images from the correct location.
    Important: Do not install any Puppet applications after the PAM installation is complete. You'll recover your Puppet applications later in the process.
  6. To recover using a snapshot saved to a host path, ensure user/group 1001 has full access on all nodes by running:
    chown -R 1001:1001 /<PATH/TO/HOSTPATH>
  7. Configure the new cluster to connect to your snapshot storage location. Run the following to see the arguments needed to complete this task:
    kubectl kots -n default velero configure-{hostpath,nfs,aws-s3,other-s3,gcp} --help
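    For example, if your snapshots are stored on the NFS share configured earlier, a command similar to the following (with your own server and export path) connects the new cluster to the same location:
    kubectl kots -n default velero configure-nfs --nfs-path /srv/pam-snapshots --nfs-server nfs.example.com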
  8. Run kubectl kots get backup and wait for the list of snapshots to become available. This might take several minutes.
  9. Start the restoration process by running kubectl kots restore --from-backup <BACKUP NAME>.
    The restoration process takes several minutes to complete. When the PAM UI is available, use it to monitor the status of the application.
    Note: When restoring, wait for all restores to complete before making any changes. The following command waits for pods to finish restoring data from backup. Other pods may not be ready until updated configuration is deployed in the next step:
    kubectl get pod -o json | jq -r '.items[] | select(.metadata.annotations."backup.velero.io/backup-volumes") | .metadata.name' | xargs kubectl wait --for=condition=Ready pod --timeout=20m
    

    This command requires the jq CLI tool to be installed. It is available in most Linux OS repositories.
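    As an illustration of steps 8 and 9, the commands might look similar to the following; the backup name shown is hypothetical, so use a name returned by kubectl kots get backup on your cluster:
    kubectl kots get backup
    kubectl kots restore --from-backup instance-abcd1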

  10. After the restoration process completes, save your config and deploy:
    1. From the PAM UI, click Config.
    2. (Optional) If the new cluster's hostname is different from the old one, update the Hostname.
    3. Click Save Config.
    4. Deploy the application. You must save your config and deploy even if you haven't made any changes.
      Note: If you have installed Continuous Delivery for PE and changed the hostname, you need to update the webhooks that connect Continuous Delivery for PE with your source control provider. For information on how to do this, see Update webhooks.