Software Packaging Best Practices
apt-get install puppet or yum install puppet? Yes, it’s a trick question. Because we’ve packaged Puppet for a few distros, you can run apt-get or yum, and you get Ruby, Facter and Hiera coming along for the ride, for free, with no extra work on your part. That’s why it’s so awesome to package your products.
When you deliver packages, people inherently trust you more. If they’re running Debian, they expect to see a deb package. If they’re running Red Hat, they expect an RPM. If they’re on Windows, they expect to see an MSI. If they saw a tarball, they probably wouldn’t know what to do with it, and why should they?
There are other good reasons for packaging software. Packages are more powerful: they describe dependencies, and they pull in the other technologies users need to get going right away. Packages are easier to deploy, and the user experience is consistent: it’s the same on every platform. Packages are also cleaner and more efficient. I don’t want build tools on my laptop. I want them on a build server, cared for by other people.
Packages are awesome. So why isn’t everyone packaging software?
There are a few objections, and there’s an answer to every one of them.
- It’s expensive. If you’re packaging for something like AIX, HP-UX, or Solaris, the hardware and software will cost you, and that could be a barrier. Answer: You can start small, with used hardware. Or start with Solaris on i386, or with an RPM platform like CentOS or Fedora; they’re free.
- Learning these new packaging paradigms is hard. Answer: Learning is also fun, and after the first packaging paradigm you learn, the second, third and fourth are all a lot easier.
- Maintenance. There are always bugs in software, and when you package, you will start to get new bugs opened against your packaging. In addition, when you build for an operating system, that system can change over time. Maybe you’re building for Fedora, and F19 just shipped with a new Ruby stack. We had this experience at Puppet Labs: We had to update Puppet, Facter and Hiera for all the new paradigms, and that cost us time. Answer: Maintenance is absolutely a cost you need to bear. If it’s not a cost you are willing to bear, packaging is probably not worth it. But having packages out there will expand user adoption, so it’s probably worth it in the long run. If you do decide in its favor, be aware that you’ll start seeing bugs against your packaging, and you have to be willing to keep up with the maintenance.
- No demand. Maybe all your customers are fine with a tarball or Git checkouts. Answer: Really? I bet once there are packages out there, there will be demand. If you have packages and don’t update them for a while, users will notice and will ask you for updated packages. Our packages are being downloaded all the time.
- Someone’s doing it for me. Most of the distros already package your thing, so why bother? Answer: If you want Puppet 3.2, where can you find it? Debian wheezy shipped with Puppet 2.7. So if you want Puppet 3.2 or 3.1 or 3.0 and you are on Debian, you have to get it from our repos. If you want to get recent features, bug fixes and new versions of your project out to your users, you really want to control the process and have your own packages available.
Packaging Software: The How-To
There are a few steps to getting a package out there, ready to use. You can probably have these steps done in less than six months (no pressure!).
- First, pick a distro. See which distros your users are running and which ones they really want support for. Research it: why is it there, what’s the community like, what’s their mission, what are their motives?
- Read the packaging guide. Debian and RPM both have one.
- Find the community and mailing lists. They’ll be the best resources you have when you build packages.
- Where does stuff go in the packages? Look for file system standards.
- How does the distro handle services? If you’re on Solaris, you’re probably looking at SMF. If you’re on Ubuntu, you’re probably looking at Upstart. Fedora has systemd; Debian and CentOS have System V init (SysV init).
- What are the packaging tools? Most distros have a lower-level tool with a higher-level tool on top. Debian has dpkg topped by apt; Red Hat has rpm topped by yum.
- Can the packages express dependencies? If you’re on Solaris, you are probably out of luck.
- What does the metadata look like? Is it XML? Do you want to spend your life looking at XML?
- How does versioning work? Where does it go?
- Can the package manager handle uninstalls? Can it handle upgrades? Can it do both cleanly and sanely? Can it recover from failure?
- How does the package manager handle configuration files? Do they get left behind?
- Can it do pre-install actions and post-install actions?
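Several of these questions (dependencies, file locations, configuration files, install scripts) can be answered by pointing the low-level tool at an existing package. The following is a rough sketch, not a complete tool: `inspect_package`, `run` and the `DRY_RUN` switch are names made up for this example, and it only covers .deb and .rpm files.

```shell
#!/bin/sh
# Hypothetical helper: run the low-level queries that answer the checklist
# questions for one package file. With DRY_RUN=1 the commands are only
# printed, so the sketch is safe to try on a machine without dpkg or rpm.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

inspect_package() {
    pkg=$1
    case "$pkg" in
        *.deb)
            # Name, version and declared dependencies from the control file
            run dpkg-deb --field "$pkg" Package Version Depends
            # Where the files will land on disk
            run dpkg-deb --contents "$pkg"
            # Tracked configuration files (fails if the package declares none)
            run dpkg-deb --info "$pkg" conffiles
            ;;
        *.rpm)
            run rpm -qp --requires "$pkg"     # declared dependencies
            run rpm -qpl "$pkg"               # where the files will land
            run rpm -qp --configfiles "$pkg"  # tracked configuration files
            run rpm -qp --scripts "$pkg"      # pre/post install scriptlets
            ;;
        *)
            echo "unsupported package type: $pkg" >&2
            return 1
            ;;
    esac
}
```

Calling `inspect_package` on a package your distro already ships is a quick way to see how its maintainers answered the same questions.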
Practice, Package, Sign
Once you have the answers to all those questions, bring up a VM and play around a bit. Get your feet wet and figure out how that OS works. Take a small pet project and try to package it up. See how it goes, and whether it works. If your distro has source packages, download those, look at the contents and see if there are any small improvements to make. Make those improvements and try to rebuild the package; see if it still works.
Then take your project, the one you actually want to package. Package it, and once it’s built, test it. Install it in a bunch of places, including places where you don’t expect it to get installed. Try to install a Debian package on Windows. Okay, on second thought, you probably don’t want to try that.
But how will people know that package came from you? How do you know if a package came from Puppet Labs? We sign it. We have a GPG key; we sign our packages, and we sign our tarballs. So make a GPG key, publish the public key to key servers, and keep the private key safe. If your package format of choice doesn’t support inline signatures, create a detached signature and leave it next to the package, so users can still verify it.
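As a rough illustration of the detached-signature approach, here is a minimal sketch using gpg. The function names and the `.asc` naming convention are choices made for this example, not a description of the Puppet Labs tooling, and the key ID you pass in is your own.

```shell
#!/bin/sh
# Sketch: keep a detached, ASCII-armored GPG signature next to an artifact.
# Nothing here runs gpg until you call sign_artifact or verify_artifact.

sig_path() {
    # The signature lives beside the artifact: foo.tar.gz -> foo.tar.gz.asc
    echo "$1.asc"
}

sign_artifact() {
    # $1 = file to sign, $2 = key ID or email of your signing key
    gpg --local-user "$2" --armor --output "$(sig_path "$1")" --detach-sign "$1"
}

verify_artifact() {
    # Succeeds only if the signature matches the file and the signer's
    # public key is already in the local keyring
    gpg --verify "$(sig_path "$1")" "$1"
}
```

A release script could call `sign_artifact` on each tarball it produces and upload the `.asc` file alongside it; users who have imported your public key can then run the equivalent of `verify_artifact` before installing.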
Get Your Package Out to the World
You package your software because you want people to use it. You have a few options for getting it out there:
- Become a distro maintainer. Yes, it might be challenging, but it’s rewarding, and you’ll get a lot of users. Try to get it into Debian, Fedora and OpenCSW.
- Host basic downloads yourself. Put up a simple file server or put the downloads on S3.
- Throw up your own package repositories. It’s not very hard, and there are tools to help: createrepo for RPM, reprepro or freight for DEB. It’s very doable.
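To give a feel for the self-hosted repository option, here is a small sketch that routes a freshly built package to createrepo or reprepro. The directory layout under `REPO_ROOT`, the default codename and the `publish_package` function are all assumptions for this example, and the apt side presumes reprepro has already been configured for that base directory.

```shell
#!/bin/sh
# Hypothetical layout: RPMs live in $REPO_ROOT/rpm, and $REPO_ROOT/apt is a
# reprepro base directory that already has its conf/distributions file.
REPO_ROOT=${REPO_ROOT:-/srv/repo}
CODENAME=${CODENAME:-wheezy}

# With DRY_RUN=1, print each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

publish_package() {
    pkg=$1
    case "$pkg" in
        *.rpm)
            run cp "$pkg" "$REPO_ROOT/rpm/"
            # Regenerate the yum metadata for the directory
            run createrepo "$REPO_ROOT/rpm"
            ;;
        *.deb)
            # Add the package to the given distribution of the apt repo
            run reprepro -b "$REPO_ROOT/apt" includedeb "$CODENAME" "$pkg"
            ;;
        *)
            echo "don't know how to publish: $pkg" >&2
            return 1
            ;;
    esac
}
```

Running it once with `DRY_RUN=1` shows exactly which commands would touch the repository before you let it loose on real packages.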
Avoid the Snakes and Dragons Along the Way
Now that you know how to package software, let’s talk about the snakes and dragons and glass, and snakes covered in glass that you need to avoid along the way.
Sprawl. It’s really easy when you’re doing automation work to try to handle all edge cases and all your workflows. Then one day you try to do a software release and you don’t recognize the workflow anymore. You try to run the right rake task, and you get it wrong. Or you look at the list of 30 rake tasks, and you can’t pick it out. That’s a sign of sprawl. We were definitely guilty of this at Puppet Labs, and it’s something to avoid. If the automation has become more of a hindrance than a help, then you’ve gone too far — you should probably take a step back and cut a few things out.
Variation in workflows. When building automation, try to be opinionated and a little bit flexible at the same time. You may think Debian and RPM need different workflows, but when you dig into them, you find they really can be the same workflow. Try to make those workflows similar now; it’s easier than it will be later.
Historically preserved cruft. When reading those maintainers' guides, make sure you are reading the ones that are up to date. If you are out there reading the Debian package maintainers’ guide from 1997, and you aren’t packaging for Woody, you’ll be in a world of pain when you try to follow it. You may have had this experience when trying to find Ruby docs for anything but 1.8.7.
Shipping without testing. This is really dangerous. Packaging is really powerful, and if you do it wrong, you can totally trash someone’s system by mistake. For example, OS X and Solaris have a nice feature: If your packaging path has a symlink in it, and your package thinks it should be a directory, no problem — the package manager will make that symlink a new directory. Whatever the symlink pointed at before is still there — don’t worry. But if that symlink says “var” or “etc,” that system might not be around anymore. Don’t do that.
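One cheap test for the symlink hazard is to check the target system before installing. This is only a sketch under simple assumptions: `symlinked_dirs` is a made-up helper, and you pass it every directory the package expects to own (for example var, var/lib, var/lib/puppet), relative to the install root.

```shell
#!/bin/sh
# Sketch: flag directories from a package payload that already exist on the
# target system as symlinks, so they can be reviewed before installation.
# Usage: symlinked_dirs ROOT dir [dir...]   (dirs are relative to ROOT)

symlinked_dirs() {
    root=$1
    shift
    status=0
    for dir in "$@"; do
        # -L is true when the path itself is a symlink, even if it points
        # at a directory
        if [ -L "$root/$dir" ]; then
            echo "$dir"
            status=1
        fi
    done
    # Non-zero exit status means at least one conflict was found
    return $status
}
```

A test run in a scratch directory or VM can call this with the package’s directory list and refuse to install when it prints anything.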
Over-abstraction. In our workflows, to build a gem, we dynamically build it out of packaged metadata we have available. On the fly we build a gemspec, and then we build a gem from that. It sounds great, but there’s a potential problem. Say you are a developer, and you want to add a simple thing like a gem dependency to a package like Puppet. How do you do it? It might take a seasoned developer five or 10 minutes to figure out where that goes. That’s a problem, because the idea of the abstraction and automation is to make the development easier, not to slow it down. So we definitely went too far in that example. If you find developers needing maps and guidance to update simple things in packaging, you might want to take a look at those and see if you can simplify that workflow.