Barricade | Omnibus – Dependency Isolation Without Docker

I suspect Omnibus is a tool not terribly well known outside a small niche of package maintainers inside the Chef community, inside the Ruby community.

I didn’t hear about it until I was about two years into my stint at Engine Yard, a Ruby company that was a pioneer in Chef automation. Yet Omnibus, despite its low profile, is a powerful tool to have available as a software developer, particularly if you’re a developer who expects to deploy software.

Briefly, Omnibus is for building packages that have all of their important dependencies bundled together into a single unit tailored to the operating system. It does this for Ruby, Python, and in theory anything else you might like. So you input your broad dependencies (like git repos or pip commands), run Omnibus, and get a deb or rpm (or even an OS X pkg or Windows msi) at the end, built against whichever OSes, architectures and distributions you specify.

Deployment hell

Accounting for different operating systems, language versions and dependencies is an awful lot of accounting to do for remote deployments, a pattern which is re-emerging thanks to devops.

Whereas the web was arguably dominated by SaaSification for a number of years, the rise of devops tooling has meant more startups putting software on other people’s computers. Software that can’t be deployed hundreds of times a day. Software that needs to be reliable. None of this is news to industry veterans, but it is a new pattern for many of us who cut our teeth over the last decade or so.

That code has generally remained so awkward to deploy is a modern mystery of software development, but the problem has been tackled with renewed effort in recent years.

From the language side, one of Go’s major achievements has been its focus on cross-compilation and isolation from C libraries. After years spent trying to muddle through compilation options and missing or outdated dependencies, Go almost feels like cheating.

From the system side comes Docker (not containers). Containers allow for isolated execution, but it’s Docker’s build and image management system, built on tarballs and union filesystems (technology around since the early 90s), that ultimately handles the dependencies.

Other than that, Linux/BSD deployment has tended toward the idiosyncratic.

Of course, rewriting your entire software stack in Go may not be the most pragmatic solution (despite how tantalizing it may be). Similarly, Docker comes with its own set of quirks, not the least of which is its developmental status, but also how updates and patches will fit into the picture.

So, in the great tradition of Making Do, Omnibus uses the tools long available to us to hack together a solution that mostly works. It’s a kludge, but it helps deliver more reliable software.

Why we are where we are

Most dynamic languages and runtimes (PHP, Ruby, Python, Node, etc.) build on a strong foundation of C and C++ libraries written over the last 20-30 years to flesh out their standard libraries.

This generally involves wrapper libraries for the language in question, some code which exposes the underlying library as a set of primitives in its dynamic parent.

The trouble starts when these libraries behave differently from version to version (that is, all of them, to greater or lesser degrees). Across deployed operating systems you can come across copies of the same library that differ in age by years. For more mature libraries this may not be an issue, but even then, years can make quite a difference.
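To see this on a given machine, here is a minimal sketch, assuming a standard CPython build with the `ssl`, `zlib` and `sqlite3` modules available. It only reports which versions of three underlying C libraries this particular interpreter is using; the function name is made up for illustration.

```python
import sqlite3
import ssl
import zlib


def linked_library_versions():
    """Return the versions of a few C libraries behind this interpreter."""
    return {
        # OpenSSL build that the ssl wrapper module is linked against
        "openssl": ssl.OPENSSL_VERSION,
        # zlib version the zlib wrapper module was compiled against
        "zlib": zlib.ZLIB_VERSION,
        # SQLite library version in use by the sqlite3 wrapper module
        "sqlite": sqlite3.sqlite_version,
    }


if __name__ == "__main__":
    for library, version in sorted(linked_library_versions().items()):
        print(f"{library}: {version}")
```

Running this on a laptop and on a target server is a quick way to spot the kind of version drift described above.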

Now, seasoned developers often go to great lengths to ensure API compatibility, and the better understood the problem a library solves, the more true this is. However, a lot of useful software development is exploratory, and therefore more likely to need backwards-incompatible revision.

Which is all a roundabout way of saying that just because it works on your laptop, doesn’t mean it’ll work on my server.

How Omnibus works (mostly)

Omnibus knows about this problem. Omnibus cares. Omnibus understands. But Omnibus won’t hold your hand the whole way. It is most decidedly a power tool.

Conceptually, Omnibus can be broken up into five general components: the project scaffolder, dependency recipes, project configuration, the virtual machine system and the project builder.

Project scaffolder

Scaffolding is a pattern familiar in the Ruby world via Rails and Chef, and sporadically appears in other language communities in things like IDEs.

All it does is build a directory layout and drop in some important base files with filled-out parameters. You create a scaffold for your project, then install the requirements for that (via Bundler). If, like me, you’re going to be building Debian and CentOS packages, that’s all Omnibus itself does on the host machine.

Dependency recipes

Dependency recipes are the bread and butter of Omnibus. They’re what you’ll be reading, modifying and creating. When you create your scaffold, you might find some default recipes in the config/software directory. Mostly these are version metadata, download locations, and build instructions.

Omnibus actually handles a bunch of common dependencies by default, and these are contained in the upstream omnibus-software repository. If you’ve ever contributed to Homebrew (or just browsed the source) this will look familiar to you.

Since this is a niche project, it relies heavily on its users to keep things up to date, so it’s always worth checking that the dependencies you need are the latest.

Project configuration

Depending on your project and requirements, this could be relatively straightforward or quite tricky. Running daemons is different from making commands available, but even then, each OS you want to support still has its own way of handling installation, removal, etc.

If you’ve never packaged software before this may be a steep learning curve – I highly recommend looking at some examples of real world applications packaged with Omnibus. Chef itself is one such project, and GitLab is my other go-to reference. There’s also a sample Python project I used for reference since Barricade’s agent is written in Python.

Virtual machine system

You can use Omnibus to boot virtual machines for your target operating system using test-kitchen. Using a testing framework seems an odd choice for the build system, and does cause some problems, but none insurmountable.

This part isn’t really explained in detail in the project README, but is covered better in Chef’s Omnibus instructions. You build each package against a target operating system so you can be reasonably sure the basic dependencies will have a minimum level of stability, and then any more sensitive or uncommon dependencies can be built and bundled into the package.

Project builder

This is the engine of Omnibus, and is what turns the build instructions into a full stack installer (as Chef likes to call them). It downloads dependencies, sets the correct install_dir options in ./configure flags, etc., and invokes the package builder.
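The core idea of pointing every dependency’s build at the package’s own install directory can be sketched in a few lines. This is a toy model under stated assumptions, not Omnibus’s actual code: the recipe dictionary, the function names, the `/opt/example-agent` path and the `embedded` subdirectory are all illustrative.

```python
import shlex


def build_commands(recipe, install_dir):
    """Return the configure/make commands for one dependency.

    The key step is --prefix: the dependency is installed under the
    package's own directory instead of the system's /usr or /usr/local.
    """
    # Assumption: bundled dependencies live in an "embedded" subdirectory.
    prefix = f"{install_dir}/embedded"
    configure = ["./configure", f"--prefix={prefix}"]
    configure += list(recipe.get("configure_flags", []))
    return [configure, ["make"], ["make", "install"]]


def as_shell(commands):
    """Render the commands as shell lines, quoting arguments where needed."""
    return "\n".join(
        " ".join(shlex.quote(arg) for arg in command) for command in commands
    )


if __name__ == "__main__":
    # Hypothetical dependency, for illustration only.
    recipe = {"name": "libexample", "configure_flags": ["--disable-static"]}
    print(as_shell(build_commands(recipe, "/opt/example-agent")))
```

Because every dependency is configured with the same prefix, the finished package carries its own copies and never touches the versions the operating system ships.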

After your builds, you’re left with bundled package artefacts in your project’s newly created pkg directory, conveniently on your host machine.

How Omnibus doesn’t work

Omnibus is not magic. It does an excellent job, but once you start colouring outside the lines and writing your own recipes, you’d better be prepared to roll up your sleeves and start investigating compilation options. It’s worth noting that Ruby is likely the best tested and configured code path, since this is a Chef tool, and Chef is written in Ruby.

Provisioning test-kitchen VMs from scratch takes a torturous 40 minutes each on my very new MacBook Pro, which is why test-kitchen being a test framework is a problem – no built-in option to suspend these images means you’re sort of expected to leave them running or boot them fresh every time.

I’ve written a short helper script I stick in /usr/local/bin that just uses the knowledge of the underlying Vagrant implementation to suspend/resume machines, but it isn’t terribly reliable if you’re booting and destroying other VMs regularly (loses port mappings). Still, it helps.
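A helper along those lines might look roughly like the sketch below. It is not the script described above; it assumes the kitchen-vagrant driver keeps one directory per instance, each containing a generated Vagrantfile, under `.kitchen/kitchen-vagrant/` in the project root, and it simply runs `vagrant suspend` or `vagrant resume` in each. The function names are made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumption: one directory per test-kitchen instance, each holding a
# generated Vagrantfile, lives under this path in the project root.
KITCHEN_VAGRANT_DIR = Path(".kitchen") / "kitchen-vagrant"


def instance_dirs(root=KITCHEN_VAGRANT_DIR):
    """Return every directory under root that contains a Vagrantfile."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(path.parent for path in root.glob("*/Vagrantfile"))


def vagrant_all(action, root=KITCHEN_VAGRANT_DIR, run=subprocess.call):
    """Run `vagrant suspend` or `vagrant resume` for every instance.

    Returns a mapping of instance directory name to exit code.
    """
    if action not in ("suspend", "resume"):
        raise ValueError("action must be 'suspend' or 'resume'")
    results = {}
    for directory in instance_dirs(root):
        results[directory.name] = run(["vagrant", action], cwd=str(directory))
    return results


def main(argv):
    """Entry point: expects exactly one argument, suspend or resume."""
    if len(argv) != 1:
        print("usage: kitchen-vm suspend|resume")
        return 2
    codes = vagrant_all(argv[0])
    return max(codes.values(), default=0)
```

Saved as a script, it would finish with `raise SystemExit(main(sys.argv[1:]))` after an `import sys`, and be run from the Omnibus project directory.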

We GPG sign our RPM builds. Omnibus doesn’t provide any specific helpers here, but what I like to do is run all the CentOS builds, then sign them all from the final VM in one fell swoop. Saves on all the GPG key importing busywork.
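As a rough sketch of that batch-signing step, assuming the finished packages sit in a `pkg` directory visible from the VM and that the signing key is already imported and named by `%_gpg_name` in `~/.rpmmacros`, a single `rpm --addsign` call can cover every build at once. The function names here are illustrative.

```python
import subprocess
from pathlib import Path


def sign_command(pkg_dir="pkg"):
    """Build one `rpm --addsign` command covering every RPM in pkg_dir.

    Returns None when there is nothing to sign.
    """
    rpms = sorted(str(path) for path in Path(pkg_dir).glob("*.rpm"))
    if not rpms:
        return None
    # rpm --addsign signs each listed package with the configured GPG key.
    return ["rpm", "--addsign"] + rpms


def sign_all(pkg_dir="pkg"):
    """Sign every RPM in pkg_dir and return rpm's exit code (0 if none)."""
    command = sign_command(pkg_dir)
    if command is None:
        return 0
    return subprocess.call(command)
```

Calling `sign_all()` from the last CentOS VM, once all builds have finished, means the key only has to be imported on one machine.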

You now own your dependencies

It might sound obvious, but part of isolating your dependencies from the operating system is that they’re now under your control.

This means you own the update cycle. You also own the patch cycle. But most importantly, you own the security cycle. Of particular note is that you’re likely the proud new owner of a separate copy of OpenSSL. If that doesn’t give you the willies I don’t know what will.

Personally, I like to keep a very close eye on the upstream omnibus-software repository as well as CVEs. Barricade’s working on a system to help there, actually.

Further reading

I found both Gemnasium and SysAdvent’s blog posts illuminating when I was refamiliarising myself with Omnibus at the start of the year.

Datadog’s post on Omnibus also serves as a nice introduction.

While I haven’t tried it yet, I find Flapjack’s porting of Omnibus to use containers deeply appealing, and I plan to see if I can get a package building pipeline up and running using this and Circle CI in future.

July 20, 2015
