Building Application Stacks with Puppet

Managing Google Compute Engine Instances with Puppet

Puppet is an IT automation language that has traditionally been used to configure individual nodes. Puppet's declarative language and dependency model are also suitable for describing entire application stacks on top of public cloud offerings. This post explains how Puppet can model resources through Google Compute Engine's API in order to describe application stacks as reusable and composable configuration files.

Google Compute Engine (GCE) is a service from Google that lets users provision virtual machine instances that run on Google's infrastructure. What really stands out about this service compared to similar offerings is how fast it is: machine instances generally take seconds, not minutes, to spin up.

The GCE API allows users to create all of the resources needed to dynamically model application stacks, including virtual machine instances, networks, firewalls, and persistent disks. It also lets you specify many characteristics of a virtual machine instance, such as the image that should be used and how much memory and CPU to allocate to it.

What this API can't do is tell a machine how it should be configured. There is no way to say: "Use this image as a starting place, and then configure yourself to be a MySQL database." This is where Puppet comes in. It can be used with GCE to configure the roles assigned to created instances, and to perform ongoing management of those instances. This post takes the concept one step further, explaining not only how Puppet can assign roles to compute instances, but also how Puppet can model the management of all of the GCE objects that make up an application stack.

Google Compute Resources

This integration introduces Puppet resources that can be used to manage the creation and destruction of the following GCE resources.
  • gce_instance: Virtual machine instances that are assigned roles.
  • gce_disk: Persistent disks that can be attached to instances.
  • gce_firewall: Firewalls that allow certain kinds of traffic to your instances.
  • gce_network: Networks that route internal traffic between virtual machine instances. Firewalls and instances are associated with networks.
The Puppet DSL is a flexible composition language. It can be used to compose a reusable collection of resources and to create an interface for configuring that collection. The following example demonstrates how the Puppet DSL can compose GCE objects into a defined resource type that describes an application stack.
define gce_application_stack(
  $network_range = '10.0.1.0/24',
) {

  # Resource titles include the stack's title so that several stacks
  # can be declared in one catalog without duplicate declarations.
  $network_name = "${title}_network"

  # create some resource defaults that will be applied to
  # all declared gce_instances
  Gce_instance {
    zone    => 'us-central1-a',
    machine => 'n1-standard-1',
    image   => 'projects/google/images/ubuntu-12-04-v20120621',
    network => $network_name,
  }

  gce_network { $network_name:
    ensure => present,
    range  => $network_range,
  }

  gce_firewall { "${title}_ssh":
    ensure      => present,
    description => 'allows incoming tcp traffic on 22',
    allowed     => 'tcp:22',
  }

  gce_instance { "${title}_instance":
    ensure      => present,
    description => 'test machine',
  }

}
Now users can reliably create multiple instances of this stack of GCE objects using the gce_application_stack defined resource type:

gce_application_stack { 'stack_one':
  network_range => '10.0.3.0/24',
}
gce_application_stack { 'stack_two':
  network_range => '10.0.2.0/24',
}

Classifying instances

The gce_instance type also supports the ability to use Puppet to classify the instances it creates. Classification is supported with a few additional gce_instance parameters:
  • modules: Array of modules to download from the Puppet Forge
  • classes: Hash of classes along with their parameters that should be used to assign roles to the created instances.
For example, the following gce_instance resource not only provisions an instance, but also ensures that the instance is configured as a functional MySQL server.
  gce_instance { "mysql_db":
    ensure      => present,
    zone        => 'us-central1-a',
    machine     => 'n1-standard-1',
    image       => 'projects/google/images/ubuntu-12-04-v20120621',
    description => 'DB instance',
    modules     => ['puppetlabs-mysql'],
    classes     => {
      'mysql::server' => { 'package_ensure' => 'latest' },
    },
  }
The 'modules' parameter indicates that the puppetlabs-mysql module should be installed from the Puppet Forge. The 'classes' parameter indicates that the mysql::server class from that module should be used to configure this GCE instance as a MySQL server.
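To make the effect of the 'classes' hash more concrete, the following is a rough sketch of the class declaration that the hash above corresponds to, as it might be applied on the new instance. It assumes that the instance's bootstrapping script turns each key of the hash into a class declaration with the given parameters; the exact mechanism is internal to the module and is not shown in this post. It also assumes the puppetlabs-mysql module is already installed on the instance.

```puppet
# Approximate equivalent of the 'classes' hash above (an assumption about
# how the hash is interpreted, not code taken from the module).
class { 'mysql::server':
  package_ensure => 'latest',
}
```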

Orchestration

This solution also provides a few basic capabilities for orchestrating the creation of multiple instances. The module that contains these custom resource types includes an example for deploying a two-instance MediaWiki installation composed of one server instance and one database instance.

Installation sequence:

The MediaWiki instance should not be created until the database server that it uses has been configured. In this example, assume a database is created with the following gce_instance resource:
gce_instance { 'database':
 ….
}
The Puppet DSL already supports explicit dependency management between resources using the require and before metaparameters. The following syntax can be used to express that the media_wiki_server should only be installed after the database has been configured.
gce_instance { 'media_wiki_server':
  ...
  require => Gce_instance['database']
}
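The same ordering can also be declared from the database's side with the before metaparameter. The sketch below is a minimal, self-contained variant; the zone, machine, and image values are borrowed from the earlier examples in this post and stand in for whatever the real resources need.

```puppet
# Shared attributes, reused from the earlier examples as placeholders.
Gce_instance {
  ensure  => present,
  zone    => 'us-central1-a',
  machine => 'n1-standard-1',
  image   => 'projects/google/images/ubuntu-12-04-v20120621',
}

# 'before' on the database is equivalent to 'require' on the wiki server.
gce_instance { 'database':
  description => 'DB instance',
  before      => Gce_instance['media_wiki_server'],
}

gce_instance { 'media_wiki_server':
  description => 'MediaWiki instance',
}
```

A chaining arrow, Gce_instance['database'] -> Gce_instance['media_wiki_server'], expresses the same relationship outside of either resource body.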
The media_wiki_server in this example should only be configured after the database has been configured (and not merely once its gce_instance exists). The following attributes should be added to the database gce_instance resource so that it blocks until its bootstrapping script has completed, not just until its associated instance has been created.
  • block_for_startup_script - indicates that the resources should block until the bootstrapping script that runs Puppet completes.
  • startup_script_timeout - how long to wait for this script to complete before timing out.
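As an illustration, a database resource using both attributes might look like the following sketch. The instance attributes are reused from the MySQL example earlier in this post; the timeout value of 300 is purely illustrative and is assumed here to be in seconds, so check the module's README for the actual unit and default.

```puppet
gce_instance { 'database':
  ensure                   => present,
  zone                     => 'us-central1-a',
  machine                  => 'n1-standard-1',
  image                    => 'projects/google/images/ubuntu-12-04-v20120621',
  description              => 'DB instance',
  modules                  => ['puppetlabs-mysql'],
  classes                  => {
    'mysql::server' => { 'package_ensure' => 'latest' },
  },
  # Wait for the bootstrapping script (and its Puppet run) to finish
  # before this resource is considered applied.
  block_for_startup_script => true,
  # Illustrative value; the unit is assumed to be seconds.
  startup_script_timeout   => 300,
}
```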

Data orchestration

The MediaWiki instance also needs access to the IP address of the database instance in order to fully configure itself. The following syntax can be used in the classes parameter of the MediaWiki instance to reference connection information from the database instance.
    "Gce_instance[database][internal_ip_address]"
This syntax supports retrieving two values from another instance:
  • internal_ip_address
  • external_ip_address
It is important to ensure that the resources that are referenced have already been created before the resources that use this syntax. This is typically ensured by specifying an explicit dependency using the require or before metaparameters. The following example can be used to connect the MediaWiki server to a database instance that has been defined in the same Puppet manifest.
  gce_instance { 'database':
   ….
    block_for_startup_script => true,
  }

  gce_instance { 'mediawiki':
    ….
    ensure        => present,
    modules       => ['martasd-mediawiki'],
    classes       => {
      'mediawiki' => {
        'admin_email'      => 'admin_email@domain.com',
        'install_db'       => false,
        'db_root_password' => 'root_password',
        'instances'        => {
          'dans_wiki' =>
            { 'db_password' => 'db_pw',
              'db_server'   => "Gce_instance[database][internal_ip_address]",
            }
        }
      }
    },
    require      => Gce_instance['database'],
  }

Next Steps

If you are interested in creating application stacks using these resources, the next step is to download these custom resource types from the Puppet Forge.
    puppet module install puppetlabs-gce_compute
The README of this project explains how to get started. It also includes an example that demonstrates how these resources can be used to define application stacks that can be applied to reliably create and destroy a two-node deployment of MediaWiki. Let us know if you have any feedback!