Building servers with code (using Chef, Vagrant, and Berkshelf)

I’ve been building Linux machines by hand for over a decade and have always found the process time-consuming, error-prone, and hard to replicate. The new hotness in “DevOps” is “infrastructure as code,” i.e., writing code that creates, provisions, updates, and manages our machines (servers/laptops/desktops).

When the logic for building machines lives in code, rather than solely in developers' brains, that logic is obviously easier to modify and reuse. In “the old days,” for example, replication meant disk images. Copying hard drives bit-for-bit worked well for copying one machine’s software to an identical piece of hardware. This was useful for large companies that issued all employees identical computers but useless for upgrading to a more powerful — or just different — machine. Anything other than putting identical software on identical hardware required redoing everything by hand.

Infrastructure code now lets you re-run the steps used to build a machine, and those steps (possibly with tweaks that work around any hardware or software differences between installations) will build a new machine that closely matches the old one, despite underlying hardware differences. Even better, shared public cookbooks (i.e., collections of formulas/recipes for installing and configuring various pieces of software on various operating systems) enable everyone to leverage the work of others, saving huge amounts of time formerly wasted trying to configure software by trial-and-error or Googling for answers.

Two popular tools for building “infrastructure as code” are Chef and Puppet. Though each has its advantages, I chose Chef, partly because it has a strong community and partly because another developer at my firm used it on an earlier project. A third tool, BOSH, looks promising because it actually attempts to do more than Chef/Puppet (which are mainly for building and maintaining individual machines), as this diagram shows; but BOSH currently feels less mature than Chef. (A good talk on BOSH.)

Though Chef is most commonly used to build servers, it can provision desktops and laptops, too, as “It’s Not Just for Servers: Chefing Your Development Environment” explains.

I’ve recently built my first Chef-scripted server, and I’m very happy with the results. I was tasked at work with replicating an existing server (the one we test our code on using Jenkins) so we could get it up and running quickly if it ever fails again (as it did a few months ago). This suggests another huge advantage of “infrastructure as code”: Minimal downtime. When a hand-crafted server goes down, it can take days or weeks to get a new machine back up with full functionality. In fact, unless it has been recently and fully backed up, it can prove impossible to recreate the machine. When a Chef-scripted server goes down, you can re-run the script and build a new machine in minutes or hours, not days or weeks.

I was able to script the entire process of creating and provisioning the new machine (on both a physical box we owned and on a new AWS virtual machine), including the setup of PostgreSQL, MySQL, users, shadow passwords, custom user RVM and Ruby installs, etc. Though I can’t show you the code, I’d like to share the big picture on how to do the same.

I used the following tools:

  • Vagrant (site): Lets you build your machine(s) starting with any of a variety of base “boxes,” using whichever “provisioner” you wish (Puppet standalone, Puppet server, Chef solo, Chef server, Bash scripts, etc.) and runnable on a variety of “providers” (AWS, VirtualBox, HP Cloud, OpenStack, Rackspace, DigitalOcean, LXC, VMware, etc.). You create your machine by typing just vagrant up. You SSH into it with vagrant ssh. You re-provision it with vagrant provision. You stop it with vagrant halt or destroy it with vagrant destroy. You can even take your entire server and serialize and move/clone it with vagrant halt followed by vagrant package (see How to copy / duplicate a Vagrant box). Useful resources: Railscast on Virtual Machines with Vagrant and Automated Development Environments with Vagrant (by Vagrant creator Mitchell Hashimoto).

  • Vagrant boxes (site): Basically disk images you use to start building your machine, like plain pizza bases you can personalize by adding toppings.

  • Vagrant-AWS (site): Vagrant “provider” for pushing from Vagrant to AWS and controlling/provisioning the AWS box through Vagrant. To deploy to AWS, you’ll need an AWS account and need to pick an AMI (“Amazon Machine Image”). Here’s info on AWS AMIs, Amazon’s “Finding a Suitable AMI” guide and a list of Ubuntu 12.04 AMIs. After configuring everything, you start your AWS VM with vagrant up aws --provider=aws. Useful resources: How to Deploy Cloud Foundry v2 to AWS via Vagrant.

  • Berkshelf (Berkshelf & Vagrant-Berkshelf): The glue that makes it easy to use Chef Solo and Vagrant together. Berkshelf says it lets you “Manage a Cookbook or an Application’s Cookbook dependencies.” Useful resources: the Berkshelf Way and Rapid DevOps with Test Kitchen, Berkshelf and Vagrant.

  • Chef (site): The Ruby-based DSL (domain-specific language) for provisioning machines via scripts. I found the “Resources and Providers Reference” so helpful that I printed it out, despite it being book-length. There’s also good documentation of the “Recipe DSL.” And I found a cookbook for most everything I wanted to install/configure on my machine. A good place to start looking is Opscode’s own collection of cookbooks, but there are many cookbooks not created by Opscode (just Google “xyz cookbook,” where “xyz” is MySQL, Jenkins, PostgreSQL, or whatever else you might want). Learn Chef offers tutorial videos. And the free book “Getting started with Chef” looks helpful.

  • Chef Solo (site): A stripped-down Chef tool that lets you provision machines using Chef scripts but without the heavy overhead and multiple machines involved in a full Chef environment. Self-description: “an open source version of the chef-client that allows using cookbooks with nodes without requiring access to a server. chef-solo runs locally and requires that a cookbook (and any of its dependencies) be on the same physical disk as the node. chef-solo is a limited-functionality version of the chef-client.” Useful resources: Railscast on Chef Solo Basics.

  • VirtualBox (site): A tool that lets you create and run virtual machines on your desktop or laptop. Not intended for production usage but very convenient while creating Chef scripts that will eventually run elsewhere. Also used by development teams to create development sandboxes that can run identically on various machines with different operating systems and hardware/software configurations.

  • Vagrant-vbguest (site): Keeps VirtualBox guest additions up to date. Useful resource: Vagrant Tip: Sync VirtualBox Guest Additions.

  • Packer (site): Though I didn’t use Packer, I’m mentioning it here because it looks useful. It “lets you build Virtual Machine Images for different providers from one json file. You can use the same file and commands to build an image on AWS, Digital Ocean or for virtualbox and vagrant” (“Building Vagrant Machines With Packer”).

These three blog posts are useful for seeing how Chef-Solo, Vagrant, and Berkshelf work together: 1, 2, and 3.

I first installed Vagrant and then installed some useful plugins:

vagrant plugin install vagrant-omnibus
vagrant plugin install vagrant-berkshelf
vagrant plugin install vagrant-aws
vagrant plugin install vagrant-vbguest

I picked a Vagrant “box” that matched what we’re running on our current Jenkins server (i.e., a vanilla 64-bit install of Ubuntu 12.04, which we picked because it’s an LTS or “long-term support” release, meaning it will continue receiving security updates for several years).

Using Berkshelf, I installed some cookbooks, like chef-rvm, and configured them in my Vagrantfile.
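
A Vagrantfile that wires these pieces together might look roughly like the following. This is a minimal sketch, not the actual file from this project: the box name, the recipe list, and the user name are placeholders, and the chef-rvm attribute names are written from memory of that cookbook's README, so check them against the version you install. It assumes the vagrant-omnibus and vagrant-berkshelf plugins are installed.

```ruby
# Vagrantfile (sketch; box name, recipes, and attributes are placeholders)
Vagrant.configure('2') do |config|
  # Assumed name of a 64-bit Ubuntu 12.04 base box
  config.vm.box = 'precise64'

  # vagrant-omnibus: install Chef on the guest before provisioning starts
  config.omnibus.chef_version = :latest

  # vagrant-berkshelf: fetch the cookbooks listed in the Berksfile
  config.berkshelf.enabled = true

  config.vm.provision :chef_solo do |chef|
    # Recipes run in the order they are added
    chef.add_recipe 'apt'
    chef.add_recipe 'rvm::user'

    # Attributes handed to the cookbooks (assumed chef-rvm attribute names;
    # the 'jenkins' user is illustrative and must already exist on the guest)
    chef.json = {
      'rvm' => {
        'user_installs' => [
          { 'user' => 'jenkins', 'default_ruby' => '1.9.3' }
        ]
      }
    }
  end
end
```

With a file like this in place, vagrant up builds the machine and vagrant provision re-runs the Chef recipes against it.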

My Gemfile looks like:

source ''
# ruby '1.9.3'

gem 'chef'
gem 'berkshelf'
gem 'knife-solo'
gem 'foodcritic'

Because I was building a replacement Jenkins box, I named my Berkshelf project jenkins_kitchen. My jenkins_kitchen/Berksfile looks like this:

site :opscode

cookbook 'apt'
cookbook 'sudo'
cookbook 'java'
cookbook 'database'
cookbook 'mysql'
cookbook 'postgresql'
cookbook 'tmux', github: 'stevendanna/tmux'
cookbook 'jenkins'
cookbook 'configure_Hedgeye_dbs', path: './cookbooks/configure_Hedgeye_dbs'
cookbook 'rvm', github: 'fnichol/chef-rvm'

I then created my own cookbook with two recipes, one that runs early in the provisioning process to create users, etc. and one that runs near the end of the provisioning process to install and configure additional packages, like ack-grep, emacs (because some developers have poor taste), imagemagick, ghostscript, Node.js, libmagick, and libssl. See Chef’s recipe DSL documentation and this guide to authoring cookbooks for the many things you can do in your recipes. You can set shadow passwords, set environment variables, run arbitrary Bash or Ruby code, install Ruby gems, etc.
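
To give a flavor of what such recipes contain, here is a short sketch using standard Chef resources. The user name, the password hash, the bash resource's name and command, and the choice of the bundler gem are all hypothetical placeholders, not taken from the real cookbook.

```ruby
# Early recipe (sketch): create a login user.
# The password attribute expects a shadow hash, not a plain-text password;
# the value below is a placeholder, not a usable hash.
user 'deploy' do
  home '/home/deploy'
  shell '/bin/bash'
  password '$6$placeholder-salt$placeholder-hash'
  # On Chef 11-era releases this was written: supports manage_home: true
  manage_home true
  action :create
end

# Late recipe (sketch): install additional packages by name.
%w[ack-grep emacs imagemagick ghostscript].each do |pkg|
  package pkg
end

# Install a Ruby gem into the Ruby that Chef itself runs under.
gem_package 'bundler'

# Run arbitrary Bash code (hypothetical command, shown only for shape).
bash 'record_provision_time' do
  code 'date > /etc/provisioned_at'
end
```

Each block declares a resource, and Chef decides at run time whether anything needs to change, which is why a package that is already installed gets skipped on later runs.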

There are even methods available in Chef::Util::FileEdit for manipulating files in various ways, like search-and-replace. Though it’s better to configure your machine’s apps via chef.json settings passed by the Vagrantfile to your Chef cookbooks, you can use Chef::Util::FileEdit to directly modify textual configuration files. I found I had to do so to enable password-based SSH logins.
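
A sketch of that approach, wrapped in a ruby_block so it runs during provisioning rather than when the recipe is compiled. It assumes the guest's /etc/ssh/sshd_config already contains a PasswordAuthentication line (possibly commented out) and that the SSH service is named ssh, as on Ubuntu; the resource name is made up.

```ruby
# Recipe fragment (sketch): allow password-based SSH logins.
ruby_block 'enable_password_ssh_logins' do
  block do
    sshd_config = Chef::Util::FileEdit.new('/etc/ssh/sshd_config')
    # Replace the whole matching line, whether or not it is commented out
    sshd_config.search_file_replace_line(
      /^#?\s*PasswordAuthentication\s/,
      'PasswordAuthentication yes'
    )
    # Nothing is written to disk until write_file is called
    sshd_config.write_file
  end
  # A ruby_block counts as updated whenever it runs, so this restarts
  # sshd on every provisioning run
  notifies :restart, 'service[ssh]'
end

# Declared only so the notification above has a target
service 'ssh' do
  action :nothing
end
```

Chef::Util::FileEdit also offers methods such as search_file_replace (replace only the matched text) and insert_line_if_no_match (append a line when nothing matches).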

I also installed the foodcritic Ruby gem, a linting tool for Chef cookbooks, which I found slightly useful on my small project; I suspect it could be quite helpful on a larger project.

I hit a few issues I want to flag (esp. for my colleagues and future me):

  • Installing RVM: Installing RVM and multiple Ruby versions on different users' accounts was hard. I tried using chef-rvm but couldn’t get it running quickly with Chef-solo. That could have been, in part, because I was still pretty new to Chef. Whatever the reason, I found the answer in this blog post. The key was adding 'vagrant' => { 'system_chef_solo' => '/usr/local/ruby/bin/chef-solo' } to chef.json in my Vagrantfile.

  • Installing AWS CLI tools: I struggled to install the two sets of AWS CLI tools I needed till I discovered aws-cli. Then it was as trivial as running pip install awscli. (I happened to already have pip installed; if you don’t, you can install pip.)

  • AWS security groups: Amazon (AWS) uses “security groups” to manage privileges. You probably will need to explicitly open up various ports (80, 8080, 443, etc.) on your AWS machine via AWS’s web console. You can create multiple “security groups” in the AWS console, assign privileges to security groups, and then (in Vagrant) assign various security groups to each AWS VM you create. And, as vagrant-aws says, “If you have issues with SSH connecting, make sure that the instances are being launched with a security group that allows SSH access.” We also needed to set that in the AWS console before I could SSH into my new virtual machine.

  • AWS elastic IP addresses: Amazon (AWS) uses “elastic IP addresses.” What this means is that Amazon provides your box with TWO IP addresses. The IP address your box believes is its public IP address is actually a private IP address inside Amazon’s network. To access your box from the Internet, you must use the other IP address Amazon provides, which Amazon maps to your private internal IP address. Your box won’t know anything about its true public IP address, but you’ll need to use that to SSH into it (outside of Vagrant) or hit any of its ports for viewing websites, etc. I figured out the actual public IP from the “Connection to [public IP] closed.” message I saw every time I exited my “vagrant ssh aws” session.

  • AWS region in Vagrantfile: The AWS region specified in your Vagrantfile should NOT include the letter suffix. For example, aws.region="us-west-2a" should be aws.region="us-west-2". See this discussion.

  • AWS “RunInstances”: If you’re trying to deploy to AWS and see “UnauthorizedOperation => You are not authorized to perform this operation,” the AWS user whose credentials you’re using lacks permission for the “RunInstances” action. You need to grant that permission in your AWS console.

  • Verbose Vagrant output: To see more verbose Vagrant output, prefix your vagrant commands with VAGRANT_LOG=debug. I found this incredibly useful when I tried to deploy to AWS and the process kept hanging at “[aws] Waiting for SSH to become available…” After adding the environment variable, I suddenly saw rich feedback which was invaluable for diagnosing and fixing the underlying problems.

  • Enabling passwordless sudo: I found it useful to enable passwordless sudo. I did so using the sudo cookbook.

  • One Chef script, multiple Vagrant providers: You can run one set of scripts across multiple Vagrant providers. Initially, I built my scripts on a VirtualBox virtual machine on my laptop, controlled by Vagrant. When I was ready to deploy to a new virtual machine on AWS, most of the script remained the same, but certain things needed to change. To change/override configuration settings for a specific provider, you define a config.vm.provider inside your Vagrantfile. For example, I used config.vm.provider :aws do |aws, override| to specify settings that applied only to my AWS machine. You can also define config.vm.define to name your various instances. I had a config.vm.define "aws" and a config.vm.define "local" and could call vagrant using those names (e.g., vagrant ssh aws or vagrant provision local). I found this page on DRYing up Vagrant files helpful for understanding how to use the same script on multiple machines running on different providers.

  • Broken VirtualBox required re-installation: My VirtualBox installation stopped working, probably after Mac OS upgrades of some kind. I opened the VirtualBox GUI and found the error message “NS_ERROR_FAILURE (0x80004005).” Further investigation revealed that VirtualBox also gave me the error “kernel driver not installed (rc=-1908).” I Googled these and learned that VirtualBox is highly prone to getting corrupt. There were tons of Google hits for these errors. I uninstalled then reinstalled VirtualBox from the same VirtualBox.pkg I had previously installed it with. After doing so, vagrant up and vagrant ssh both worked! Moral of the story: Updates to Mac OS can break VirtualBox, so you may need to delete/reinstall VB.

  • Flawed Chef scripts that appear to succeed: As you’re becoming familiar with Chef and adding to your scripts, you’ll probably find your scripts failing occasionally. Beware “fixing” a failure by rearranging the order of your script, because you may well find that your script now runs to completion but later discover that it didn’t run everything you believed it ran. When you run vagrant provision, it skips resources it considers already provisioned. I recommend occasionally wiping out (vagrant destroy) your virtual machine and re-building your machine from scratch (vagrant up) to guard against false positives.

  • Use apt cookbook but don’t put ‘sudo apt-get upgrade’ in your Chef script: If you’re running Ubuntu on your VM, you’ll want to install the apt cookbook to keep your apt cache up to date. But I still run sudo apt-get update and sudo apt-get upgrade manually because this advice recommends against putting “apt-get upgrade” in a Chef script.
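
The multi-provider setup from the list above can be sketched in a single Vagrantfile. This is an illustration under stated assumptions, not the real file: the box name, AMI ID, keypair name, key path, security group name, and memory size are all placeholders. The aws.* settings are the ones the vagrant-aws plugin documents.

```ruby
# Vagrantfile (sketch): one provisioning script, two named machines
Vagrant.configure('2') do |config|
  # Settings shared by every machine
  config.vm.box = 'precise64' # placeholder box name

  config.vm.provision :chef_solo do |chef|
    chef.add_recipe 'apt'
  end

  # `vagrant up local` builds a VirtualBox VM on the laptop
  config.vm.define 'local' do |local|
    local.vm.provider :virtualbox do |vb|
      vb.customize ['modifyvm', :id, '--memory', '2048']
    end
  end

  # `vagrant up aws --provider=aws` builds an EC2 instance
  config.vm.define 'aws' do |aws_machine|
    aws_machine.vm.provider :aws do |aws, override|
      aws.access_key_id     = ENV['AWS_ACCESS_KEY_ID']
      aws.secret_access_key = ENV['AWS_SECRET_ACCESS_KEY']
      aws.keypair_name      = 'my-keypair'    # placeholder
      aws.ami               = 'ami-xxxxxxxx'  # placeholder
      aws.region            = 'us-west-2'     # region only, no "a"/"b" suffix
      aws.security_groups   = ['ssh-and-web'] # placeholder; must allow SSH

      # Settings that differ from the shared ones on AWS only
      override.vm.box = 'dummy'
      override.ssh.username = 'ubuntu'
      override.ssh.private_key_path = '~/.ssh/my-keypair.pem'
    end
  end
end
```

Reading the AWS credentials from environment variables keeps them out of version control, and prefixing any command with VAGRANT_LOG=debug shows what Vagrant is doing if the AWS machine hangs while waiting for SSH.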

Thanks to my employer, Hedgeye, for giving me time to write this blog post and to my wonderful colleague Scott Smith for helpful comments that make this less painful to read. All remaining errors are completely Scott’s fault, of course.

Posted by James on Thursday, January 09, 2014