December 16, 2013

Day 16 - omnibus'ing your way to happiness

Written by: John Vincent (@lusis)
Edited by: Ben Cotton (@funnelfiasco)

We've all been there.

You find this awesome Python library or some cool new utility that you want to run.

You check your distribution's package repository. It's not there or, worse, it's an ancient version.

You check EPEL or a PPA for it. Again, it's either not available or it's an ancient version. Oh and it comes broken out into 20 sub-packages so you can't just grab the package and install it. And you have to add a third-party repository which may have other stuff you don't want pulled in.

You try to compile it yourself and realize that your version of OpenSSL isn't supported or you need a newer version of Python.

You Google furiously to see if someone has posted a spec file or some blog entry on building it. None of it works. Four hours later you still don't have it installed. You look all around and see nothing but yak hair on the ground. Oh and there's a yak with a full coat grinning at you.

My friend, have I got a deal for you. Omnibus!

What is omnibus?

Omnibus is a toolchain of sorts. It was created by Chef (formerly Opscode) as a way to provide a full install of everything needed to run their products in a single native system package. While you may not have a need to build Chef, Omnibus is flexible enough to build pretty much anything you can throw at it.

Omnibus can be confusing to get started with, so consider this your guidebook to getting started and, hopefully, to having more time for the fun parts of your job. Omnibus works by leveraging two tools you may already be using - Vagrant and FPM. While it's written in Ruby, you don't have to have Ruby installed (unless you're creating your own software to bundle) and you don't even have to have FPM installed. All you need is Vagrant and two vagrant plugins.

The omnibus workflow

The general flow of an omnibus build is as follows:
  • Check out an omnibus project (or create your own)
  • Run vagrant up or vagrant up <basebox name>
  • Go get coffee
  • Come back to a shiny new native system package of whatever it was omnibus was building
Under the covers while you're drinking your coffee, omnibus is going through a lot of machinations to get you that package (all inside the vagrant vm):
  • Installing Chef
  • Checking out a few chef cookbooks to run for prepping the system as a build host
  • Building ruby with chef-solo and the rbenv cookbook
  • Installing omnibus and fpm via bundler
  • Using the newly built ruby to run the omnibus tool itself against your project.
From here Omnibus is compiling everything above libc on the distro it's running under from source and installing it into /opt/<project-name>/. This includes basics such as OpenSSL, zlib, libxml2, pcre and pretty much everything else you might need for your final package. Every step of the build process sets your LDFLAGS, CFLAGS and everything else to point to the project directory in /opt.
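
To make the flag handling concrete, here is a minimal shell sketch of the kind of environment a build step might run under. The helper name omnibus_env and the exact flag values are assumptions for illustration, not something Omnibus ships; real software definitions set their own variations. The embedded subdirectory it points at is covered later in this post.

```shell
# Sketch only: point the compiler and linker at the package's own tree.
# omnibus_env is a hypothetical helper, not part of Omnibus.
omnibus_env() {
  prefix="$1/embedded"
  # Headers come from the package tree rather than /usr/include.
  export CFLAGS="-I$prefix/include"
  # Link against the package's libraries and record that path in the binary.
  export LDFLAGS="-L$prefix/lib -Wl,-rpath,$prefix/lib"
  # Prefer tools that were already built into the package.
  export PATH="$prefix/bin:$PATH"
}
```

After calling omnibus_env /opt/sample-python-app, an ordinary configure/make run would pick up headers and libraries from /opt/sample-python-app/embedded instead of the system copies.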

After everything is compiled and it thinks it's done, it runs a sanity check (by running ldd against everything in that directory) to ensure that nothing has linked against anything on the system itself.
If that check passes, it calls FPM to package up that directory under /opt into the actual package and drops it off in a subdirectory of your vagrant shared folder.
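
The idea behind that sanity check can be sketched in a few lines of shell. This is not Omnibus's own implementation; find_foreign_libs and the short allowlist of libc-level libraries are assumptions made for illustration.

```shell
# Sketch only: read `ldd` output on stdin and print every resolved library
# path that lies outside the install directory. A short allowlist covers the
# libc-level libraries the package is expected to borrow from the system.
# find_foreign_libs and the allowlist are assumptions, not Omnibus code.
find_foreign_libs() {
  install_dir="$1"
  awk -v dir="$install_dir/" '
    $2 == "=>" && $3 ~ /^\// {
      # Resolved inside the package tree: fine.
      if (index($3, dir) == 1) next
      # Part of the system libc family: expected.
      if ($1 ~ /^(libc|libm|libdl|libpthread|librt)\.so/) next
      print $3
    }'
}

# Example use (empty output means nothing leaked in from the system):
#   find /opt/sample-python-app -type f | xargs ldd 2>/dev/null \
#     | find_foreign_libs /opt/sample-python-app | sort -u
```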

The added benefit here is that the contents of the libraries included in the system package are almost exactly the same across every distro omnibus builds your project against. Gone are the days of having to worry about what version of Python is installed on your distro. It will be the same one everywhere.

A sample project

For the purposes of this post, we're going to work with a use case that I would consider fairly common - a Python application.

You can check out the repo used for this here: https://github.com/lusis/sample-python-omnibus-app

As I mentioned previously, the only thing you need installed is Vagrant and the following two plugins:
  • vagrant-omnibus
  • vagrant-berkshelf
If you have that done, you can simply change into the project directory and do a vagrant up. I would advise against that, however, as this project is configured to build packages for Ubuntu-10.04 through Ubuntu-12.04 as well as CentOS 5 and CentOS 6. Instead, I would run it against just a specific distribution, such as with vagrant up ubuntu-12.04.
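
Put together, the setup and a single-distribution build look roughly like this. The wrapper function build_box exists only for illustration; the three vagrant commands are what matter, and the box name has to match one defined in the project's Vagrantfile.

```shell
# Sketch only: install the two plugins, then build for a single box.
# build_box is a hypothetical wrapper; ubuntu-12.04 is the box used in this post.
build_box() {
  box="${1:-ubuntu-12.04}"
  vagrant plugin install vagrant-omnibus &&
  vagrant plugin install vagrant-berkshelf &&
  vagrant up "$box"
}
```

Run it from the project directory, for example build_box ubuntu-12.04. The plugin installs only need to happen once per machine.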

Note that on my desktop (Quad-core i7/32GB of memory) this build took 13 minutes.

While it's building, you should also check out the README in the project.

Let's also look at a few key files here.

The project configuration

The project definition file is the container that describes the software you're building. The project file is always located in the config/projects directory and is always named the same as the package you're building. This is important as Omnibus is pretty strict about names used aligning across the board.

Let's look at the definition for this project in config/projects/sample-python-app.rb. Here is an annotated version of that file:
[Figure: annotated project definition]

The thing you will largely be concerned with is the block of dependencies. Each of these corresponds to a file in one of two places (as noted):
  • a ruby file in <project root>/config/software
  • a file in the official omnibus-software repository on github (https://github.com/opscode/omnibus-software)
This dependency resolution issue is important and we'll address it below.

A software definition

The software definition has a structure similar to the project definition. Not every dependency you have listed needs to live in your repository but if it is not there, it will be resolved from the master branch of the opscode repository as mentioned.

This can obviously affect the determinism of your build. It's a best practice to copy any dependencies explicitly into your repository to ensure that Chef doesn't introduce a breaking change upstream. This is as simple as copying the appropriate file from the official repository into your software directory.
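
As a rough sketch of that copying step, assuming you already have a local checkout of the omnibus-software repository: the helper name vendor_software is hypothetical, and the only thing it relies on is that both repositories keep their definitions under config/software.

```shell
# Sketch only: copy named software definitions from a local checkout of
# omnibus-software into your own project so the build no longer depends on
# whatever upstream master looks like. vendor_software is a hypothetical helper.
vendor_software() {
  upstream="$1"   # path to a checkout of omnibus-software
  project="$2"    # path to your omnibus project
  shift 2
  mkdir -p "$project/config/software"
  for name in "$@"; do
    cp "$upstream/config/software/$name.rb" "$project/config/software/" || return 1
  done
}
```

For example, after cloning https://github.com/opscode/omnibus-software somewhere local, vendor_software /path/to/omnibus-software . openssl zlib would pin those two definitions in the current project.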

We're going to gloss over that for this article since we're focusing on writing our own software definitions. If you look at the dependencies we've defined, you'll see a few towards the end that are "custom". Let's focus on pyopenssl first as that's one that is always a pain from distro to distro:

[Figure: annotated software definition]

The reason I chose pyopenssl was not only because it's a common need but because of its sensitivity to the version of OpenSSL it builds against.

This shows the real value of Omnibus. You are not forced to only use a specific version of pyopenssl that matches your distro's OpenSSL library. You can use the same version on ALL distros because they all link against the same version of OpenSSL which Omnibus has kindly built for you.
This also shows how you can take something that has an external library dependency and ensure that it links against the version in your package.

Let's look at another software definition - config/software/virtualenv.rb
Note that this is nothing more than a standard pip install with some custom options passed. We're ensuring we're calling the right version of pip by using the one in our package directory - install_dir matches up with the install_path value in config/projects/sample-python-app.rb which is /opt/sample-python-app.

Some other things to note:
  • the embedded directory: This is a "standard" best practice in omnibus. The idea is that all the bits of your package are installed into /opt/package-name/embedded. This directory contains the normal [bin|sbin|lib|include] directory structure you're familiar with. The intent of this is to signal to end-users that the stuff under embedded is internal and not something you should ever need to touch.
  • passing --install-option="--install-scripts=#{install_dir}/bin" to the python package: This ensures that pip will install the library's executables into /opt/package-name/bin. This is the public-facing side of your package, if you will. The bits/binaries that users actually need to call should either be installed in a top-level hierarchy under /opt/package-name or symlinked from a file in the embedded directory to the top-level directory. You'll see a bit of this in the post-install file your package will call below.
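
Stripped of the Ruby wrapper, the command such a definition runs is roughly the following. This is a sketch based on the description above rather than a copy of the file in the repo; install_python_tool is a hypothetical name, and its first argument plays the role of install_dir.

```shell
# Sketch only: call the pip that lives inside the package, and tell it to put
# console scripts in the public bin directory instead of embedded/bin.
# install_python_tool is a hypothetical helper name.
install_python_tool() {
  install_dir="$1"   # e.g. /opt/sample-python-app
  pkg="$2"           # e.g. virtualenv
  "$install_dir/embedded/bin/pip" install \
    --install-option="--install-scripts=$install_dir/bin" "$pkg"
}
```

Using the full path to pip is the important part: it guarantees the package is installed into the embedded Python rather than whatever Python the build host happens to have.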

postinstall/postrm

The final directory we want to look at is <project-root>/package-scripts/sample-python-app. It contains the files that are passed to fpm as the postinstall and postremove scripts for the package manager.
[Figure: annotated postinstall]

The biggest thing to note here is the chown. This is the thing that bites folks the most. Since fpm simply creates tar files of directories, those files will always be owned by the user that runs fpm. With Omnibus, that's the vagrant user. What ends up happening is that your package will install but the files will all be owned by whatever uid/gid matches the one used to package the files. This isn't what you want. In this case we simply do a quick chown post-install to fix that.

As I mentioned above, we're also symlinking some of the embedded files into the top-level to signal to the user they're intended for use.
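
A postinstall script along those lines might look like the sketch below. The real script lives in the sample repo; fix_install, its arguments, and whichever tool names you pass in are hypothetical and only illustrate the two steps described here.

```shell
# Sketch only: the two jobs of the postinstall step described above.
# fix_install and its arguments are hypothetical names for illustration.
fix_install() {
  install_dir="$1"   # e.g. /opt/sample-python-app
  owner="$2"         # e.g. root:root
  shift 2
  # Files were packaged as the build user, so reset ownership.
  chown -R "$owner" "$install_dir" || return 1
  # Expose selected embedded tools at the top level of the package.
  mkdir -p "$install_dir/bin"
  for tool in "$@"; do
    ln -sf "$install_dir/embedded/bin/$tool" "$install_dir/bin/$tool"
  done
}
```

In a package script this would be called as something like fix_install /opt/sample-python-app root:root followed by the names of the embedded tools you want to expose.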

Installing the final artifact

At this point your build should be done and you should now have a pkg directory under your project root. Note that since this is using vagrant shared folders, you still haven't ever logged on to the actual system where the package is built. You don't even need to scp the file over.

If you want to "test" your package you can do the following:
  • vagrant destroy -f ubuntu-12.04
  • vagrant up ubuntu-12.04 --no-provision
  • vagrant ssh ubuntu-12.04 -c 'PATH=/opt/sample-python-app/bin:$PATH virtualenv --version'

Next steps

Obviously there's not always going to be an omnibus project waiting for everything you might need. The whole point of this is to create system packages of things that you need - maybe even your company's own application codebase.

If you want to make your own omnibus project, you'll need Ruby and the omnibus gem installed. This is only to create the skeleton project. Once you have those installed somewhere/anywhere, just run:
omnibus project <project-name>

This will create a skeleton project called omnibus-<project-name>. As I mentioned earlier, naming is important. The name you pass to the project command will define the package name and all values in the config/projects/ directory.

You'll likely want to customize the Vagrantfile a bit and you'll probably need to write some of your own software definitions. You'll have a few examples created for you but they're only marginally helpful. Your best bet is to remove them and learn from examples online in other repos. Also don't forget that there's a plethora of predefined software in the Opscode omnibus-software repository.
Using the above walk-through, you should be able to easily navigate any omnibus project you come across and leverage it to help you write your own project.

If I could give one word of advice here about the generated skeleton - ignore the generated README file. It's confusing and doesn't provide much information about how you should use omnibus. The vagrant process I've described above is the way to go. This is mentioned in the README but it's not the focus of it.

Going beyond vagrant

Obviously, this workflow is great for your internal development flow. There might come a time when you want to make this part of the official release process as part of a build step in something like Jenkins. You can do that as well but you'll probably not be using vagrant up for that.
You have a few options:
  • Duplicate the steps performed by the Vagrantfile on your build slaves:
      1. install chef from opscode packages
      2. use the omnibus cookbook to prep the system
      3. run bundle install
      4. run omnibus build
    Obviously you can also prep some of this up front in your build slaves using prebuilt images.
  • Use omnibus-omnibus-omnibus. This will build you a system package with steps 1-3 above taken care of. You can then just clone your project and run /opt/omnibus/bin/omnibus build against it.
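
For the first option, the last two steps in the list above could be wrapped in a build step roughly like this. It assumes Chef and the omnibus cookbook have already prepped the host. ci_build is a hypothetical wrapper, running omnibus through bundle exec is my assumption, and any extra arguments are passed straight through because the exact build subcommand depends on the omnibus version in your Gemfile.

```shell
# Sketch only: a CI build step for an already-prepared build host.
# ci_build is a hypothetical wrapper, not part of Omnibus.
ci_build() {
  project_dir="$1"
  shift
  (
    # Work inside the checked-out omnibus project.
    cd "$project_dir" || exit 1
    # Install omnibus, fpm and friends as pinned by the project's Gemfile.
    bundle install || exit 1
    # Remaining arguments go to `omnibus build` unchanged.
    bundle exec omnibus build "$@"
  )
}
```

A Jenkins shell step could call ci_build "$WORKSPACE" with whatever arguments your omnibus version expects, then archive the contents of the pkg directory.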

Getting help

There's not an official community around omnibus as of yet. The best places to get help right now are Twitter (feel free to /cc @lusis), the #chef channel on Freenode IRC, and the chef-users mailing list.
Here are a few links and sample projects that can hopefully also help:
