For Clojure Nerds: Puppet Labs Application Services
We recently posted an article about our new open-source Clojure application services framework, Trapperkeeper. In that post, I talked a bit about our motivations, alternative frameworks that we considered, and some of the advantages that the new framework will offer to Puppet users.
If you’re curious about the motivations, I definitely recommend reading the earlier blog post, A New Era of Application Services at Puppet Labs. To reiterate a few of the most important points, though, we found that many of our applications needed a lot of the same functionality that has become commonplace in today's application frameworks:
- the ability to configure and control which parts of the system are loaded at run-time
- the ability to compose modular bits of functionality
- a way to cohesively manage the lifecycle of the various components making up an application
- a common logging mechanism that wouldn't need to be set up and configured by each component
- a web server flexible enough to load multiple web apps, and to support a rich set of options for configuring SSL
Now I’ll take a deeper dive into the underlying technology, and give all you Clojure developers an idea why we think you might be interested in using Trapperkeeper for your own projects.
If you’re the kind of person who gets the most bang for your buck by diving right into some code, here are a couple of options for you:
- Have a peek at the Trapperkeeper git repo
- Use the leiningen template to generate a skeleton project:
lein new trapperkeeper my.namespace/myproject
Trapperkeeper is a simple pure-Clojure services framework based on the idea of lightweight, composable services. Of course, the word "service" means very different things in different contexts. In Trapperkeeper, we define a service as:
- a set of functions
- a set of dependencies on other services
- a simple lifecycle
To use these services, Trapperkeeper provides a basic application container. At startup, the container decides which services to load, based on a configuration file. This enables deploy-time control over the list of services that are loaded into the container, without requiring code changes. It also means you can pick and choose the exact set of services you want to deploy with your app; you don’t have to worry about the footprint of your application being bloated by features you aren’t actually using. For example, we provide a Jetty web server service, but you don’t have to ship it with your application unless you’re actually building a web application or service on top of it.
Another feature Trapperkeeper offers is the ability to specify dependencies between services. Trapperkeeper resolves all these dependencies at boot time, which gives us fail-fast behavior in the case of a missing service.
Trapperkeeper services are defined via Clojure protocols (similar to "interfaces" in many other programming languages), which means you have a concrete mechanism for advertising the contract for a service, without coupling it with implementation details. This makes it possible to swap out implementations of services without modifying the code of downstream services.
Trapperkeeper also provides a common configuration mechanism for configuring individual services within the container, allowing you to write services that provide a simple, consistent configuration mechanism for the end user. (We currently support configuration via .ini, .json, and several other file formats.)
Now that we’ve hit some of the high points, let’s get a bit further into the details!
To allow services to specify dependencies on one another, we are using the excellent open-source library Graph, from the nice folks over at Prismatic. This library allows Trapperkeeper to ensure that services are started and shut down in the appropriate order, and injects functions from upstream services into the scope of downstream services where they can be called like any other Clojure function. Here’s a code sample:
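A minimal sketch of such a definition, assuming the `defservice` macro from `puppetlabs.trapperkeeper.core`; the `example.foo` namespace and the body of `init` are made up for illustration:

```clojure
(ns example.foo
  (:require [puppetlabs.trapperkeeper.core :refer [defservice]]))

;; foo-service declares a dependency on the get-config function
;; of Trapperkeeper's built-in ConfigService.
(defservice foo-service
  [[:ConfigService get-config]]
  ;; init is a lifecycle function; it returns the (possibly updated) context map
  (init [this context]
    ;; get-config has been injected, so it is called like a local function
    (println "Loaded configuration:" (get-config))
    context))
```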
This code indicates that our foo-service has a dependency on Trapperkeeper’s ConfigService, and specifically on a function called get-config that is provided by that service. Trapperkeeper (via Prismatic Graph) then ensures that the ConfigService is initialized before foo-service, and injects the get-config function into the scope of the foo-service so that foo-service can call it as part of any of its own function definitions.
This is the simplest example of specifying a dependency; it shows how to reference a single function from another service. There are many other options for how to reference other services / functions; you can specify multiple functions from an upstream service, you can ask for a map containing all of the functions provided by an upstream service, or you can reference a service via its Clojure protocol. We’ll skip over some of those details for now, but if you’re interested, have a look at the documentation on Referencing Services.
One of the goals of Trapperkeeper is to provide a flexible system for determining which services to load into the application container at boot time. This gives us a good deal more freedom as to how we deploy our applications; for small installations that will not be under heavy load, we can load a lot of services into a single container on a single machine. However, for larger installations that might encounter scaling problems, we can decide at deploy/boot time that we’d like to run fewer services in a single container, and spread the suite of services across multiple machines.
To accomplish this, Trapperkeeper uses a very simple configuration file called bootstrap.cfg. Here’s an example of what this file might look like:
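As an illustration only: the first line refers to the Jetty 9 web server service, while the `com.example` namespaces and the name `bert-service` are hypothetical placeholders for your own web applications.

```
puppetlabs.trapperkeeper.services.webserver.jetty9-service/jetty9-service
com.example.bert-service/bert-service
com.example.ernie-service/ernie-service
```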
Each line in the config file simply lists out a Clojure namespace, followed by the name of a service that is defined in that namespace. In this example, the jetty9-service is just a reference to the Trapperkeeper Jetty 9 web server. The other two services are simple web applications that register themselves with the Jetty service to provide web functionality.
At startup, Trapperkeeper would read this file to determine which services it should load, resolve dependencies between them, and then load them all up into a single JVM process. If later you should decide that you want to scale out and separate the ernie-service to a different machine, you would simply remove that line from the bootstrap.cfg file and restart the application. When the app comes back up, it will be running only the Jetty service and the remaining web application.
Using Protocols to Define Services
All right, now that we’ve gotten some of the nitty-gritty out of the way, we can dive into some more interesting Clojure-related topics. As we mentioned earlier, Trapperkeeper services specify their "contract" with other services using Clojure protocols. This means there is a clear place where you can look to see what functionality a service offers, and you can swap out alternate implementations without affecting the code of downstream services.
Here’s a concrete example to illustrate the point:
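A minimal sketch of what this could look like, assuming the `defservice` macro from `puppetlabs.trapperkeeper.core`; the `example.foo-services` namespace, the name `lowercase-foo-service`, and the body of the consumer's `init` function are assumptions made for illustration:

```clojure
(ns example.foo-services
  (:require [puppetlabs.trapperkeeper.core :refer [defservice]]))

;; The contract: a single function, foo, taking only `this`.
(defprotocol FooService
  (foo [this]))

;; Two interchangeable implementations of the contract.
(defservice lowercase-foo-service
  FooService
  []                      ; no dependencies on other services
  (foo [this] "foo"))

(defservice uppercase-foo-service
  FooService
  []
  (foo [this] "FOO"))

;; The consumer depends on the protocol, not on either implementation.
(defservice foo-consumer
  [[:FooService foo]]
  (init [this context]
    (println "foo returned:" (foo))
    context))
```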
In this example, you can see that we’ve defined a Clojure protocol called FooService that provides the contract for our simple service. The service provides a single function, foo, that takes no arguments (other than the standard this argument for the protocol object).
Next up, we provide two Trapperkeeper service definitions that satisfy the FooService protocol: a lower-case variant and uppercase-foo-service. Each simply provides an implementation of the foo function that returns the string "foo" in lower case or upper case, respectively.
Finally, we write a foo-consumer service. It specifies a dependency (via [:FooService foo]) on a foo function provided by some service that satisfies the FooService protocol, but it does not refer explicitly to either of the two actual implementations.
Using bootstrap.cfg, we can swap out the desired implementation of FooService prior to launching Trapperkeeper:
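For instance, assuming the services live in a hypothetical `example.foo-services` namespace and the lower-case implementation is named `lowercase-foo-service` (both names are assumptions), the file might contain:

```
example.foo-services/lowercase-foo-service
example.foo-services/foo-consumer
```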
This version of bootstrap.cfg would give us the lower-case version, but we could swap it out to use the upper-case version without changing any of the code.
This is obviously a trivial example, but you can imagine using bootstrap.cfg for much more interesting tasks. For example, we provide both a Jetty9 and Jetty7 version of our WebserverService, so you can choose which version of Jetty to use by simply changing one line in the bootstrap.cfg file. You could use this approach to support swappable persistence back-ends for your application, alternate implementations of subsystems like message queues, etc.
Configuration and Logging
Configuration and logging are two core features of any application or service which, unfortunately, end up being re-invented on a fairly regular basis. Trapperkeeper ships with some plumbing to make these two tasks much easier to manage, and ensures that your services provide a simple and consistent end-user experience for managing the service and logging configuration.
For logging, we’re not doing anything fancy; Trapperkeeper simply uses the Java logback library. This library is compatible with the popular Clojure tools.logging library, so your Clojure services can be written against that library without having to concern themselves with the underlying implementation details. Also, logback is compatible with almost all of the other existing Java logging frameworks, so if your service has a dependency on a Java library that uses a different framework, it will usually work with logback with little or no effort.
Trapperkeeper includes a built-in service known as the ConfigService. This service is responsible for parsing user configuration data into an in-memory data structure, and then exposing it for consumption by other services. Thus, if you are writing a service that users can configure in some way, you simply express a dependency on the ConfigService and get the data that you need from it. The individual services do not need to worry about where the configuration data is coming from, or how to read it in. (They do have the ability to do their own validation of their specific configuration settings, however.)
Behind the scenes, the Trapperkeeper ConfigService supports reading in configuration data from a single config file, or a directory full of configuration files. The path to the config file/directory is specified via the --config command line argument when you launch Trapperkeeper. The config files may be in any of several formats, including .ini, .json, and .conf (HOCON, from the Typesafe configuration library). The ConfigService will read data from your config file(s) and build up a nested map, which is then accessible to all downstream services.
So let’s see an example of how that might work. Say you have a directory ./conf.d, which contains the following files:
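As a purely hypothetical illustration (the file names, section names, and settings below are all made up), the directory might hold a `foo.ini`:

```ini
[foo]
greeting = hello
```

and a `bar.ini`:

```ini
[bar]
retries = 3
```

With files like these, the nested map would have roughly one top-level key per section (`:foo` and `:bar`), each holding that section's settings.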
Given those config files, you could write a service that looked like this:
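A minimal sketch, assuming the `defservice` macro from `puppetlabs.trapperkeeper.core` and the `get-in-config` function of the ConfigService; the namespace and the `[:foo :greeting]` setting are hypothetical and stand in for whatever your own config files define:

```clojure
(ns example.configured-foo
  (:require [puppetlabs.trapperkeeper.core :refer [defservice]]))

(defprotocol FooService
  (foo [this]))

(defservice foo-service
  FooService
  [[:ConfigService get-in-config]]
  (foo [this]
    ;; look up one nested value: section :foo, setting :greeting
    (println "greeting is:" (get-in-config [:foo :greeting]))))
```

The service never touches a file itself; it only asks the ConfigService for the path into the nested map that it cares about.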
And during development, you might launch Trapperkeeper with a command like this:
lein run -m puppetlabs.trapperkeeper.main --config ./conf.d
Then whenever some other part of your app made a call to the foo function from foo-service, you’d see something like this:
In this example, the FooService isn’t doing anything useful with the configuration data, but in a real app you could use this to configure anything that should be configurable in your service.
Trapperkeeper defines a simple protocol for the life cycle of a service. It looks like this:
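A sketch of the shape of that protocol, reconstructed from the description that follows (the authoritative definition lives in the Trapperkeeper source):

```clojure
;; Each lifecycle function receives the service and its context map,
;; and returns the (possibly updated) context map.
(defprotocol Lifecycle
  (init [this context])
  (start [this context])
  (stop [this context]))
```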
When Trapperkeeper is launched, it will resolve the dependencies between all the services in the container, and then call their lifecycle functions in the appropriate order. All init functions are called first, followed by all start functions. The stop functions are called when Trapperkeeper is shutting down, in the reverse of the order in which the services were started. (Thus, all web services that have a dependency on the WebserverService will be initialized after the WebserverService, and stopped before the WebserverService.)
All lifecycle functions are passed a context map that can be used to store state for the service. For example, the WebserverService stores a reference to the actual web server object in this context map. This can be accessed from other service functions, and is used to shut down the web server during the stop phase.
Multiple Web Applications in a Container
A useful feature of Trapperkeeper’s default web server implementation (not strictly a feature of Trapperkeeper itself, but useful nonetheless) is that it's designed to allow web applications to register themselves with the web server at a specified URL context. This means you can register multiple web applications / services in the same webserver, without the individual apps / services needing to have any knowledge of one another.
This is accomplished via the add-ring-handler function of the WebserverService protocol. Some sample code might look like this:
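A minimal sketch, assuming the `defservice` macro from `puppetlabs.trapperkeeper.core` and that `add-ring-handler` takes a ring handler followed by a URL path; the namespace, handler functions, and paths are made up for illustration:

```clojure
(ns example.web
  (:require [puppetlabs.trapperkeeper.core :refer [defservice]]))

;; Two trivial ring handlers: functions from a request map to a response map.
(defn hello-app [request]
  {:status 200 :headers {"Content-Type" "text/plain"} :body "hello"})

(defn goodbye-app [request]
  {:status 200 :headers {"Content-Type" "text/plain"} :body "goodbye"})

(defservice my-web-service
  [[:WebserverService add-ring-handler]]
  (init [this context]
    ;; register each handler under its own URL context
    (add-ring-handler hello-app "/hello")
    (add-ring-handler goodbye-app "/goodbye")
    context))
```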
In this example, we’re calling add-ring-handler twice in the same service, but you could just as easily call it from two separate services, and thus have two completely isolated web services. You could also use a configuration value from the ConfigService to control the URL prefixes at which the apps were registered!
A pattern that seems to be gaining some traction in the Clojure world is the "reloaded" workflow, originally described by Stuart Sierra in his “My Clojure Workflow, Reloaded” blog post. The idea is to design your app in such a way that the long-lived mutable state can be isolated to a spot that makes it easy to “reload” the entire application from scratch in a REPL session, without having to restart the JVM. Stuart has examples of how to achieve this in his Component library. It’s also a goal of the Jig framework.
We’ve designed Trapperkeeper to make sure it supports this pattern as well. The end result ends up looking a lot like Stuart’s example code. You can see a working version of this pattern in our example ring application.
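A rough sketch of the pattern, assuming `build-app` from `puppetlabs.trapperkeeper.core`, the `init`, `start`, and `stop` functions from `puppetlabs.trapperkeeper.app`, and `refresh` from the tools.namespace library; the `example.repl` namespace, the stand-in `hello-service`, and the empty config map are assumptions, and a real app would list its own services and configuration:

```clojure
(ns example.repl
  (:require [puppetlabs.trapperkeeper.core :as tk :refer [defservice]]
            [puppetlabs.trapperkeeper.app :as tka]
            [clojure.tools.namespace.repl :refer [refresh]]))

;; A stand-in service so the sketch is self-contained.
(defservice hello-service
  []
  (init [this context]
    (println "hello-service initialized")
    context))

;; Holds the Trapperkeeper app; all long-lived state hangs off this var.
(def system nil)

(defn init []
  ;; build the app from a list of services and a config map, then initialize it
  (alter-var-root #'system (fn [_] (tk/build-app [hello-service] {})))
  (alter-var-root #'system tka/init))

(defn start []
  (alter-var-root #'system (fn [s] (when s (tka/start s)))))

(defn stop []
  (alter-var-root #'system (fn [s] (when s (tka/stop s)))))

(defn go []
  (init)
  (start))

(defn reset []
  ;; shut everything down, reload changed namespaces, then start again
  (stop)
  (refresh :after 'example.repl/go))
```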
With that code in place, you can load up that namespace in your REPL session, call go to start your app, do some dev work, call reset to reset it, lather, rinse, repeat. Good times!
Integrating Java libraries into a Trapperkeeper app is simple (obviously, since Clojure already has great Java interop support). However, we’ve done a bit of work to also allow easy interop with the Java servlet API.
The Jetty 9 webserver service provides two additional functions: add-servlet-handler and add-war-handler, which allow you to run existing Java servlets and web apps in the same container with your ring apps.
In addition, we’ve done a bit of proof-of-concept work to illustrate that it’s possible to run Ruby Sinatra apps in JRuby via Trapperkeeper. This code isn’t considered production-ready at this point, but we have been able to successfully run Sinatra apps, and we feel confident that this could be polished up and made a viable option.
Trapperkeeper In Practice
Much of the basis for Trapperkeeper came out of our experiences with PuppetDB, and the realization that many of the problems we’d already solved in that project were going to apply to future projects as well. The master branch of PuppetDB includes a change that will move PuppetDB to Trapperkeeper in the next release. Alongside PuppetDB, we have several other internal projects that are being built on top of Trapperkeeper; stay tuned for more details!
One of the other main goals of the Trapperkeeper project was to simplify the deployment and installation of Puppet Enterprise. The centralized logging should be a big win for our deployments, and the model of composable services should allow us to more easily share resources like an ActiveMQ instance or a Jetty webserver. And of course, the fact that it’s just a small library (as in, just a few megabytes on disk) means that application deployments can be kept small as well.
We are happy to be able to make Trapperkeeper open source; the code is available on GitHub (https://github.com/puppetlabs/trapperkeeper). At Puppet Labs, we use tons of open source software, and we believe strongly in giving back to the open-source community. We hope that Trapperkeeper has value in projects outside of Puppet Labs, and we’d be thrilled to work with the open-source community on any bug fixes and feature requests that may arise.
- Many other examples are available on GitHub: https://github.com/puppetlabs/trapperkeeper.
- Here's a Trapperkeeper service that provides an embedded Jetty webserver: https://github.com/puppetlabs/trapperkeeper-webserver-jetty9.
- Here's a more in-depth example of a few Trapperkeeper services: https://github.com/puppetlabs/trapperkeeper-webserver-jetty9/tree/master/examples/ring_app.
- Check out what’s new in the latest version of PuppetDB: https://puppetlabs.com/blog/whats-new-puppetdb-16