Troubleshooting Continuous Delivery for PE

Use this guidance to troubleshoot issues you might encounter with your Continuous Delivery for Puppet Enterprise (PE) installation.

Important: If your PE instance has a replica configured for disaster recovery, Continuous Delivery for PE is not available if a partial failover occurs. Learn more at What happens during failovers in the PE documentation. To restore Continuous Delivery for PE functionality, you must promote the replica to primary server.

Look up a source control webhook

Continuous Delivery for PE creates a webhook and attempts to automatically deliver it to your source control system when you add a new control repo or module to your workspace. You can look up this webhook if you ever need to manually add (or re-add) it to your source control repository.

  1. In the Continuous Delivery for PE web UI, click Control Repos or Modules, and select the control repo or module whose webhook you want to view.
  2. In the Pipelines section, click Manage pipelines.
  3. If you're using pipelines-as-code, click Manage in the web UI. This temporarily converts your pipeline code to the web UI format so you can copy the webhook.
    Tip: Don't save any changes, and make sure you switch back to Manage as code before exiting the page. Your pipeline code isn't affected as long as you don't save.
  4. In the Automation webhook section, copy the full webhook URL. This URL represents the unique webhook that connects this control repo or module in Continuous Delivery for PE with its corresponding repository in your source control system.
What to do next
Add the webhook to the corresponding repository in your source control system, according to the source control provider's documentation. Usually, webhooks are managed in the repository's settings.
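For example, if your repository is hosted on GitHub, you can create the webhook with a single call to GitHub's REST API. This is a sketch, not the only method; <OWNER>, <REPO>, <GITHUB_TOKEN>, and <WEBHOOK_URL> are placeholders for your repository owner, repository name, a GitHub personal access token, and the webhook URL you copied:
# Create a push-event webhook on the repository (GitHub REST API)
curl -X POST https://api.github.com/repos/<OWNER>/<REPO>/hooks \
  -H "Authorization: Bearer <GITHUB_TOKEN>" \
  -H "Accept: application/vnd.github+json" \
  -d '{"name": "web", "active": true, "events": ["push"], "config": {"url": "<WEBHOOK_URL>", "content_type": "json"}}'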

Manually configure a Puppet Enterprise integration

When you add credentials for a Puppet Enterprise (PE) instance, Continuous Delivery for PE attempts to look up the endpoints for PuppetDB, Code Manager, orchestrator, and node classifier, and it attempts to access the primary SSL certificate generated during PE installation. If this information can't be located, such as in cases where your PE instance uses customized service ports, you must enter it manually.

This task assumes you have completed the steps in Add your Puppet Enterprise credentials and have been redirected to the manual configuration page.
  1. In the Name field, enter a unique friendly name for your PE installation.
    Tip: If you need to work with multiple PE installations within Continuous Delivery for PE, the friendly names help you differentiate which installation's resources you're managing.
  2. In the API token field, paste a PE access token for your "Continuous Delivery" user. Generate this token using the puppet-access command (see the example after these steps) or the RBAC v1 API.
    For instructions on generating an access token, see Token-based authentication in the PE documentation.
  3. In the five Service fields, enter the endpoints for your PuppetDB, Puppet Server, Code Manager, orchestrator, and node classifier services:
    1. In the PE console, go to Status and click Puppet Services status.
    2. Copy the endpoints from the Puppet Services status monitor and paste them into the appropriate fields in Continuous Delivery for PE. Omit the https:// prefix for each endpoint, as shown in the table below:
      Service | PE console format | Continuous Delivery for PE format
      PuppetDB service | https://sample.host.puppet:8081 | sample.host.puppet:8081
      Puppet Server service | https://sample.host.puppet:8140 | sample.host.puppet:8140
      Code Manager service | https://sample.host.puppet:8170 | sample.host.puppet:8170
      Orchestrator service | https://sample.host.puppet:8143 | sample.host.puppet:8143
      Classifier service | https://sample.host.puppet:4433 | sample.host.puppet:4433
      Important: Use port 8170 for Code Manager in Continuous Delivery for PE.
      Tip: The Puppet Server service is used for impact analysis, among other processes. You can run impact analysis tasks on a compiler or load balancer instead of the primary server, which is strongly recommended for PE installations that include compilers or load balancers in their architecture. To do this, in the Puppet Server service field, enter the hostname of the compiler or load balancer followed by :8140. For example: loadbalancer.example.com:8140
  4. To locate the primary SSL certificate generated when you installed PE, run:
    curl https://<HOSTNAME>:8140/puppet-ca/v1/certificate/ca --insecure
    Replace <HOSTNAME> with your PE installation's DNS name.
  5. Copy the entire certificate (including the header and footer) and paste it into the CA certificate field in Continuous Delivery for PE.
  6. Click Save Changes.
  7. Optional: Once the main PE integration is configured, go to Configure impact analysis.
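For reference, here's a minimal example of generating and displaying an access token with the puppet-access CLI, run from a machine where the CLI is installed. The one-year lifetime is an illustrative choice, not a requirement; pick a lifetime that fits your security policy:
# Log in as your "Continuous Delivery" user; the token is cached locally
puppet-access login --lifetime 1y
# Print the cached token so you can paste it into the API token field
puppet-access show
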
What to do next

If you want code deployments to skip unavailable compilers, go to Enable compiler maintenance mode.

Restart Continuous Delivery for PE

Continuous Delivery for PE runs in a managed Kubernetes cluster, and restarting the pod is often an appropriate first step when troubleshooting.

To restart the pod, run:
kubectl rollout restart deployment cd4pe
For more information about the kubectl rollout command, refer to the Kubernetes documentation.
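To watch the restart complete and confirm the new pod becomes ready, you can also run:
kubectl rollout status deployment cd4pe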

Stop Continuous Delivery for PE

In rare circumstances, you might need to shut down, or force stop, Continuous Delivery for Puppet Enterprise (PE).

CAUTION:

Force stopping Continuous Delivery for PE can cause errors. Only use these commands under specific circumstances, preferably with guidance from Support.

We recommend that you first try to Restart Continuous Delivery for PE, which is different from, and less disruptive than, a force stop.

  1. To stop Continuous Delivery for PE, run:
    kubectl scale deploy cd4pe --replicas=0
    For more information about the kubectl scale command, refer to the Kubernetes documentation.
  2. To start Continuous Delivery for PE after a force stop, run:
    kubectl scale deploy cd4pe --replicas=1
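To confirm the deployment scaled as expected, check its replica count. The READY column shows 0/0 after a force stop and 1/1 after a successful start:
kubectl get deployment cd4pe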

Logs

Because Continuous Delivery for PE runs in a managed Kubernetes cluster, you must use the kubectl logs command to access the logs.

To access the Continuous Delivery for PE logs, run:
kubectl logs deployment/cd4pe
To access your installation's PostgreSQL logs, run:
kubectl logs statefulset/postgres
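A couple of standard kubectl logs options can also help when troubleshooting, for example to stream new output as it arrives or to retrieve logs from a container that recently crashed and restarted:
# Show the last 200 lines and keep streaming new output
kubectl logs deployment/cd4pe --tail=200 --follow
# Show logs from the previous container instance after a crash
kubectl logs deployment/cd4pe --previous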

Trace-level logging

To enable or disable trace-level logging:
  1. In Puppet Application Manager (PAM), go to the Config page.
  2. Locate the Advanced configuration and tuning section.
  3. Toggle the Enable trace logging setting according to your preference.

PE component errors in logs

The logs include errors for both Continuous Delivery for PE and the numerous PE components used by Continuous Delivery for PE. Sometimes an error in the Continuous Delivery for PE logs might actually indicate an issue with Code Manager, r10k, or another PE component.

For example, this log output indicates a Code Manager issue:
Module Deployment failed for PEModuleDeploymentEnvironment[nodeGroupBranch=cd4pe_lab, 
nodeGroupId=a923c759-3aa3-43ce-968a-f1352691ca02, nodeGroupName=Lab environment, 
peCredentialsId=PuppetEnterpriseCredentialsId[domain=d3, name=lab-MoM], 
pipelineDestinationId=null, targetControlRepo=null, targetControlRepoBranchName=null, 
targetControlRepoHost=null, targetControlRepoId=null]. 
Puppet Code Deploy failure: Errors while deploying environment 'cd4pe_lab' (exit code: 1): 
ERROR -> Unable to determine current branches for Git source 'puppet' (/etc/puppetlabs/code-staging/environments) 
Original exception: malformed URL 'ssh://git@bitbucket.org:mycompany/control_lab.git' 
at /opt/puppetlabs/server/data/code-manager/worker-caches/deploy-pool-3/ssh---git@bitbucket.org-mycompany-control_lab.git

For help resolving issues with PE components, go to the PE Troubleshooting documentation.
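As a quick check, you can often reproduce this class of failure outside Continuous Delivery for PE by running the same deployment through Code Manager's CLI on your PE primary server. The environment name cd4pe_lab is taken from the example log output above; substitute your own:
# Trigger a Code Manager deployment directly and wait for the result
puppet-code deploy cd4pe_lab --wait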

Error IDs in web UI error messages

Occasionally, error messages shown in the Continuous Delivery for PE web UI include an error ID and instructions to contact the site administrator. For example:
Please contact the site administrator for support along with errorID=[md5:359...bfe7 2020-02-06 19:33 1eg...4vu]
(The error ID is truncated for example purposes.)

For security reasons, these errors don't report any additional details. If you have root access to the Continuous Delivery for PE host system, you can search for the error ID in the logs to learn more.
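For example, assuming <ERROR_ID> is the ID shown in the message:
kubectl logs deployment/cd4pe | grep '<ERROR_ID>'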

Duplicate job logs after reinstall

Job logs are housed in object storage after jobs are complete. If you reinstall Continuous Delivery for PE and you reuse the same object storage without clearing it, you might notice logs for multiple jobs with the same job number, or you might notice job logs already present when a new job has just started.

To remove duplicate job logs and prevent creation of duplicate job logs, make sure you clear both the object storage and the database when reinstalling Continuous Delivery for PE.

Name resolution

Continuous Delivery for PE uses CoreDNS for name resolution. In the logs, many "can't reach" and "timeout connecting" errors are actually DNS lookup failures.

To add DNS entries to CoreDNS, run the following command to open the configmap file in a text editor:
kubectl -n kube-system edit configmaps coredns
In the configmap file, add a hosts stanza directly after the kubernetes stanza according to the following format:
hosts /etc/hosts <DOMAIN> {
  <IP ADDRESS> <HOSTNAME> [ALIASES]
  fallthrough
}
Here's an example of a configmap file with the hosts stanza added:
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        hosts /etc/hosts puppetdebug.vlan {
          10.234.4.29 pe-201922-master.puppetdebug.vlan pe-201922-master
          fallthrough
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  creationTimestamp: "2020-08-25T17:34:17Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "10464"
  selfLink: /api/v1/namespaces/kube-system/configmaps/coredns
  uid: ba2907be-0067-4382-b103-fc248974719a
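
The reload plugin shown in the example config causes CoreDNS to pick up the edited configmap automatically. To verify that the new entry resolves from inside the cluster, one option is a throwaway test pod. The hostname below comes from the example above, so substitute your own:
# Run a one-off busybox pod that performs the lookup, then is removed
kubectl run dns-test --rm -it --image=busybox --restart=Never -- nslookup pe-201922-master.puppetdebug.vlan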

Look up information about Continuous Delivery for PE

Use kubectl commands to access information about your Continuous Delivery for PE installation.

Look up the environment variables in use

To list the environment variables in use on your installation, run:
kubectl describe deployments.apps cd4pe
Note: For information on using environment variables to tune your Continuous Delivery for PE installation (such as adjusting HTTP and job timeout periods, changing the size of LDAP server searches, or enabling Git repository caching), refer to Advanced configuration.
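If you want only the variables rather than the full deployment description, a jsonpath query such as this one prints each name=value pair on its own line. Note that variables populated from secrets or config maps appear here with empty values:
kubectl get deployment cd4pe -o jsonpath='{range .spec.template.spec.containers[0].env[*]}{.name}={.value}{"\n"}{end}'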

Look up your Continuous Delivery for PE version

To print the version of Continuous Delivery for PE running in your installation, run:
kubectl get deployment cd4pe -o jsonpath='{.spec.template.spec.containers[0].image}'; printf "\n"

Drain a node

Drain impacted nodes when performing maintenance on your Kubernetes cluster, such as upgrading system packages or rebooting the system.

To drain a node so you can perform system maintenance, run:
/opt/ekco/shutdown.sh
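When maintenance is finished, you can return the node to service with standard kubectl commands; <NODE_NAME> is a placeholder for the node you drained:
# List nodes to find the drained node's name and status
kubectl get nodes
# Mark the node schedulable again
kubectl uncordon <NODE_NAME>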

Resize an existing volume

The Continuous Delivery for PE database stores historical data, and over time the accumulation of this data can exhaust the disk space allocated to container volumes. You can increase the size of existing volumes as needed to accommodate this growth.

Before you begin

A PersistentVolumeClaim (PVC) can be increased in size, but not reduced. Attempting to reduce the size of a PVC results in an error.

The current free space in each PVC can be monitored on the administration console dashboard in the Volume Available Storage (%) Prometheus graph.

Ensure there is sufficient storage available to allocate the newly configured storage amount.

  1. Set the PVC to be increased:
    export PVC="data-minio-0"
  2. Make sure the current size of the PVC is what you are expecting:
    kubectl get pvc ${PVC} -o jsonpath="{.spec.resources.requests.storage}"
  3. Update the PVC size. For example, to update the PVC storage to 20 GiB:
    kubectl patch pvc ${PVC} -p '{"spec":{"resources":{"requests":{"storage": "20Gi"}}}}'
  4. Restart the StatefulSet that uses that PVC:
    kubectl get pod -o json | jq ".items[] | select (.spec.volumes[]?.persistentVolumeClaim.claimName == \"${PVC}\") | .metadata.ownerReferences[] | select (.kind == \"StatefulSet\") | .name" | sort | uniq | xargs -rt kubectl delete sts --cascade=orphan
    If this command produces no output, no StatefulSet was found for that PVC.
  5. Delete the Continuous Delivery for PE migration job:
    kubectl delete job -l app.kubernetes.io/name=cd4pe-migrate
  6. In Puppet Application Manager (PAM), in the Advanced configuration and tuning section of the Config page, update the storage size setting that corresponds to the PVC you updated. The possible target settings are:
    • data-minio-0 = "Object store capacity"
    • data-postgres-0 = "CD4PE PostgreSQL capacity"
    • data-query-postgres-0 = "Estate Reporting PostgreSQL capacity"
  7. Deploy the application. Expect some disruption as pods are restarted during the redeployment.
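After the redeploy completes, you can confirm the volume was expanded by reading the PVC's reported capacity, which reflects the provisioned size rather than the requested size:
kubectl get pvc ${PVC} -o jsonpath="{.status.capacity.storage}"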