Disaster recovery with PAM

Prepare your system in advance and capture full snapshots regularly. Snapshots back up your data and make it easier to restore your system if disaster recovery is needed.

Prepare your system to support future disaster recovery

To make sure you can recover from a system failure, you must:
  • Configure Puppet Application Manager (PAM) to support Full Snapshots (Instance). For instructions on configuring snapshots, see Backing up PAM using snapshots.
  • Configure Velero to use an external snapshot destination that is accessible to both your current cluster and future new clusters, such as S3 or NFS.
  • Make sure the storage backend used for backups is accessible from the new cluster. If you set up snapshots in an offline cluster, also run the following command to get the registry service IP address:
    kubectl -n kurl get svc registry -o jsonpath='{.spec.clusterIP}'

    Record the value this command returns. You need it when you create the new cluster to restore to during disaster recovery.

  • Run the latest version of PAM. Disaster recovery is only available on systems running PAM version 1.44.1 or newer.
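
As a minimal sketch of the record-keeping step in the list above, the following bash function runs the same kubectl query and saves the result to a file. The function name and the default file name are hypothetical and not part of PAM or kURL. Keep the file somewhere outside the cluster, because the cluster may not be reachable when you need the value.

```shell
# Sketch only: record_registry_ip and the default file name are hypothetical.
record_registry_ip() {
  local out_file="${1:-registry-cluster-ip.txt}"
  local ip

  # Same query as in the preparation steps: the ClusterIP of the kURL registry service.
  ip="$(kubectl -n kurl get svc registry -o jsonpath='{.spec.clusterIP}')" || return 1

  # Refuse to write an empty value, which would be useless during recovery.
  if [ -z "$ip" ]; then
    echo "The registry service returned no clusterIP." >&2
    return 1
  fi

  printf '%s\n' "$ip" > "$out_file"
  echo "Recorded registry IP $ip in $out_file"
}
```

To use it, call the function with a path of your choice, for example record_registry_ip /some/safe/location/registry-ip.txt, where the path is a placeholder.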

Disaster recovery process

To perform disaster recovery, create a new cluster and then restore your instance to it from a snapshot.
  1. Find the version of kURL your original deployment was using.
    If you have access to the original cluster, you can use this command:
    kubectl get configmap -n kurl kurl-current-config -o jsonpath="{.data.kurl-version}" && echo

    If you can't run the command but you know your PAM version, and that version is 1.68.0 or later, look up the kURL version in the Component versions in PAM releases table.

    If you don't know your PAM version, or it is earlier than 1.68.0, contact your technical account manager or Support for assistance.

  2. If you have access to the original cluster, follow the steps for Migrating data between two systems with the same architecture.

    If your original cluster is completely offline and inaccessible, contact your technical account manager or Support for assistance.

    Restriction:

    Your old and new clusters must have the same connection status (online or offline). Disaster recovery from an offline to an online cluster (or vice versa) is not supported.

    Additionally, for offline installs, both the old and new clusters must use the same PAM version.
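
The decision in step 1 about where to find the kURL version can be sketched as a small bash helper. This is an illustration under assumptions, not a supported tool: both function names are hypothetical, and the version comparison relies on sort -V, which is available in GNU coreutils and may be missing on other systems.

```shell
# Sketch only: both function names are hypothetical.

# Returns 0 when version $1 is greater than or equal to version $2.
# Assumes a sort implementation that supports -V (version sort).
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Prints the kURL version if the original cluster answers,
# otherwise prints which fallback from step 1 applies.
# $1 is your PAM version; leave it empty if you don't know it.
kurl_version_next_step() {
  local pam_version="${1:-}"
  local kurl_version

  if kurl_version="$(kubectl get configmap -n kurl kurl-current-config \
      -o jsonpath='{.data.kurl-version}' 2>/dev/null)" && [ -n "$kurl_version" ]; then
    echo "kURL version: $kurl_version"
  elif [ -n "$pam_version" ] && version_at_least "$pam_version" "1.68.0"; then
    echo "Look up PAM $pam_version in the Component versions in PAM releases table"
  else
    echo "Contact your technical account manager or Support"
  fi
}
```

For example, kurl_version_next_step 1.70.0 (a placeholder version) points you to the release table when the original cluster does not respond.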