Tuesday, April 15, 2014

Cookbook Development Process

UPDATE: We've migrated away from VirtualBox and Minitest in favor of Docker, Chefspec, and Serverspec.  Most of the workflow below is still relevant, but I'll be posting a more detailed update at a future date.

After a few recent conversations I've had, I realized that it may be helpful to share our current Chef cookbook workflow with the community.  It works well for us and I'm pretty proud of how far it's enabled us to come.  That said, if you have any detailed suggestions on how to make it better, let me know.  

Source Code Management

We use Git (Gitolite fronted by GitList) to house all of our cookbooks.  Each cookbook has its own dedicated repository.  We create branches for all new development, merge to master after peer code reviews are completed, and use tags to designate successful codelines (more on this later).

Development Suite

Our cookbook development suite includes CentOS, Test-Kitchen, Vagrant, VirtualBox, Berkshelf, Foodcritic, chef-minitest handler, Veewee to create custom box images, and RVM with dedicated gemsets for each cookbook.

Development Methodology


Integration testing, code promotion
Our testing suite is based on the fail-fast methodology, which means we execute the cheapest tests first (in terms of both time and compute).  This shortens the feedback loop on the easy stuff and frees up our test systems to spend more time in the Test-Kitchen phase, which is the most costly.
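
To make the fail-fast idea concrete, here is a minimal sketch of a runner that executes checks in the order given and stops at the first failure.  This is not our actual Jenkins script; the function name `run_checks` and the example ordering are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Run each check in order and stop at the first failure (fail fast).
# Cheapest checks go first; the expensive Test-Kitchen run goes last.
# run_checks is a hypothetical helper, not part of any tool named in this post.
run_checks() {
  local check
  for check in "$@"; do
    echo "==> ${check}"
    # Word splitting is intentional: each argument is a command plus its options.
    if ! ${check}; then
      echo "FAILED: ${check}" >&2
      return 1
    fi
  done
  echo "All checks passed"
}

# Hypothetical ordering, cheapest first:
# run_checks "foodcritic -f any ." "berks install" "kitchen test"
```

Because the loop returns as soon as one check fails, a lint error never costs you a full Test-Kitchen run.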

We opted to use Jenkins to perform the testing because it's free, and frankly, it's awesome.  Each Jenkins job checks out the code for the target cookbook and executes the same build job.  It does this via a bash script that we wrote.  The script is kept in its own Git repo and is checked out during each Jenkins build. This allows us to change the testing logic across all of the jobs from a single place, and allows us to track changes to the script over time.  Feel free to check out the script and use it if you like.  You can get a copy here.  The script does the following:

Cookbook linting:
  • Cookbook grammar checking, coding standards.  (Spaces vs Tabs, etc)
  • Verify that each cookbook dependency is version locked in metadata.rb
  • README.md formatting (for example: Jenkins build job URI)
  • Berkshelf Berksfile opscode reference
  • Foodcritic validation
  • Execute the "berks" command to verify that all cookbook dependencies are indeed pulled from our local Git repo instead of the Internet
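
As an illustration of the version-lock check in the list above, the following sketch flags any `depends` line in metadata.rb that lacks an exact `= x.y.z` constraint.  It is a rough approximation rather than the logic from our build script: the function name `check_version_locks` is made up, and the pattern assumes one `depends` statement per line.

```shell
#!/usr/bin/env bash
# Print every `depends` line in a metadata.rb that lacks an exact ("= x.y.z")
# version constraint. Returns 0 when every dependency is locked, 1 otherwise.
# check_version_locks is a hypothetical name; the regex assumes one depends per line.
check_version_locks() {
  local metadata="${1:-metadata.rb}"
  local unlocked
  unlocked=$(grep -E "^[[:space:]]*depends[[:space:](]" "$metadata" \
    | grep -vE ",[[:space:]]*['\"]=[[:space:]]*[0-9]+(\.[0-9]+)*['\"]" || true)
  if [ -n "$unlocked" ]; then
    echo "Dependencies without an exact version lock:" >&2
    echo "$unlocked" >&2
    return 1
  fi
  return 0
}
```

A line such as `depends 'apache2', '= 1.2.3'` passes, while `depends 'mysql', '~> 5.0'` or a bare `depends 'ntp'` is reported.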

Test-Kitchen:
  • This is where we see if the cookbook executes cleanly and if the minitest suite passes

This completes the testing phases.  At this point, we know that the cookbook passes.  What we do next is my favorite part.

Code Promotion

We now retrieve the version of the cookbook we are testing from the metadata.rb file.  With that version number, we check to see if there is currently a tag in the related Git repo with that version as the name.  If the answer is yes, then that concludes the Jenkins job.  

However, if the tag does NOT exist, we create a new tag based on the current code, and upload the cookbook and all of its dependencies to the Chef Server (note that when Berkshelf uploads the cookbooks, it only uploads cookbook versions that do not already exist on the Chef Server).  And, since we version lock all of our cookbook dependencies, there is no risk of the new cookbook being accidentally rolled out to nodes.  It's just ready for future use.
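
The promotion logic described above can be sketched roughly as follows.  This is an approximation under stated assumptions, not a copy of our script: the function names are hypothetical, the version is assumed to appear on a line like `version '1.2.3'`, and pushing the tag to `origin` is my assumption about where the tag should end up.

```shell
#!/usr/bin/env bash
# Extract the version string from metadata.rb.
# Assumes a line such as: version '1.2.3'
cookbook_version() {
  sed -n "s/^[[:space:]]*version[[:space:]][[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" \
    "${1:-metadata.rb}" | head -n 1
}

# Tag and upload only when this version has not been promoted before.
# Run from the cookbook directory after all tests have passed.
promote_cookbook() {
  local version
  version=$(cookbook_version metadata.rb)
  if [ -z "$version" ]; then
    echo "No version found in metadata.rb" >&2
    return 1
  fi
  if [ -n "$(git tag -l "$version")" ]; then
    echo "Tag ${version} already exists; nothing to promote"
    return 0
  fi
  # Pushing the tag to origin is an assumption; adjust for your remote layout.
  git tag "$version" && git push origin "$version" && berks upload
}
```

Keying the promotion on the tag means a rebuild of an already-promoted version is a no-op, so the job can run as often as you like.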
All Jenkins cookbook jobs are triggered on code commit, and are also scheduled to build every Tuesday morning.  The scheduled build on Tuesdays helps us catch changes that occur in the Ruby world that could negatively affect our build system.  We implemented the scheduled builds after a day when we needed to make a change in our infrastructure and found out that cookbook builds had been broken for two weeks because of a broken Ruby dependency/environment issue.

Development Process

Below are the specific steps we use to develop cookbooks.  I've included a flowchart at the end to give you a visual on the process.

Identify new cookbook requirements

Review the list of requirements for this cookbook.  If this cookbook is replacing an existing configuration, document each requirement that will need to be migrated.  Verify that all proposed requirements are achievable.


Check community site and local repo for existing cookbook

Visit the Opscode Chef Community website and local Git repo and search for an existing cookbook that may already cover some or all of the documented requirements.  If such a cookbook exists, use that one in conjunction with a wrapper cookbook to apply our company-specific settings, if any.


Run the automated cookbook initialization script

You must pass the cookbook name as the final argument ($COOKBOOK_NAME).  This script performs the following:
  • Accepts a single argument (cookbook name)
  • Creates the cookbook via the berks cookbook command
  • Creates a ruby gemset with the same name as the cookbook (this isolates each cookbook and allows us to experiment with new features without compromising the integrity of the entire development environment)
  • Creates a default Gemfile
  • Executes bundle install
  • Creates a default .kitchen.yml
  • Creates a default README.md, based on our standard format
  • Initializes a new local Git repo, adds all the new cookbook files to the repo, and then performs an initial commit
For more details about what this script does, please read the source.

curl -L https://raw.githubusercontent.com/jmauntel/cookbook-init/master/cookbook-init.sh | bash -s -l $COOKBOOK_NAME


Create the new cookbook repo on the Git server so you have a place to store your code


Update README.md

Include description, dependencies, requirements, test cases, and author sections.

Commit your code and push to the Git server

git add .
git commit -a -m "First pass at README.md"
git remote add origin git@git.acme.com:chef-cookbooks/${COOKBOOK_NAME}.git
git push origin master

If the repo does not yet exist, perform this step as soon as possible.

Create minitest logic to test the first requirement

In your cookbook directory, update `files/default/tests/minitest/default_test.rb`.  Examples can be found here.

Boot the first kitchen instance / validate test failure

In most cases you should use the default-centos-6.3 image as your initial testing system.  The full list of available images can be displayed by executing `kitchen list`.

kitchen test default-centos-6.3 -d never

Update the cookbook recipe to satisfy the first requirement

In your cookbook directory, update the appropriate recipe (default is `recipes/default.rb`).  Resource documentation and examples can be found on the Opscode documentation site. You can also reference other cookbooks in your Git repo.

Retest cookbook with Test Kitchen

In your cookbook directory, execute the following:

kitchen converge default-centos-6.3

Lint cookbook with Foodcritic

In your cookbook directory, execute the following:

foodcritic -f any . && echo PASS || echo FAIL

If any rules fail, you can look up the meaning on the Foodcritic website.

Rebuild & test cookbook from scratch

In your cookbook directory, execute the following:

kitchen destroy && kitchen test default-centos-6.3 -d never

Destroy Test Kitchen instance

If all tests pass, destroy the instance by executing the following:

kitchen destroy

Commit code, push to Git server

git add .
git commit -a -m "New feature passed"
git push origin master

Create a Jenkins job for the cookbook

I can go into this step in more detail, if anyone is interested.

Update the version number in metadata.rb

In your cookbook directory, update `metadata.rb` with a new version number, following Semantic Versioning (SemVer) standards.
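
If you would rather not work out the next version by hand, a small helper along these lines can do it.  This is only a sketch: `bump_version` is a hypothetical function, and it handles plain MAJOR.MINOR.PATCH strings only (no pre-release or build suffixes).

```shell
#!/usr/bin/env bash
# Print the next version for a MAJOR.MINOR.PATCH string.
# Usage: bump_version 1.4.2 minor
# bump_version is a hypothetical helper; it does not edit metadata.rb itself.
bump_version() {
  local major minor patch
  IFS=. read -r major minor patch <<< "$1"
  case "$2" in
    major) echo "$((major + 1)).0.0" ;;
    minor) echo "${major}.$((minor + 1)).0" ;;
    patch) echo "${major}.${minor}.$((patch + 1))" ;;
    *) echo "usage: bump_version X.Y.Z major|minor|patch" >&2; return 1 ;;
  esac
}
```

Bumping the minor or major part resets the lower parts to zero, which matches how SemVer expects versions to be incremented.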

Add all files to local git repo, perform commit, push to Git server

In your cookbook directory, execute the following:

git add .
git commit -a -m 'First functional version of cookbook'
git push origin master

Monitor the Jenkins job

If failures are found, refactor the cookbook.



Thursday, February 27, 2014

Improve Test-kitchen Performance

At my current gig, we've automated our cookbook testing with test-kitchen and integrated it into Jenkins.  This was all fine and good until we started to get concerned with how long our testing was taking, given that many of our .kitchen.yml files include multiple platforms and multiple test suites.  Because of this, some of our tests have as many as 8 systems to build so we can test all of the variations, and a single change would result in a 90-minute feedback loop.  That's WAY too long.

So I started investigating what was taking so long and I discovered that our Vagrant systems were swapping heavily during Chef runs.  That's when I discovered that Vagrant uses 256MB of memory as the default setting for its instances.  That led me to add the following parameter to all of our .kitchen.yml files:

- name: centos-6.5
  driver_config:
    box: centos-6.5
    box_url: http://imagesource.acme.com/centos-6.5.box
    customize:
      memory: 1024

This simple change in the .kitchen.yml file decreased 90-minute test runs to 25 minutes!
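
If you want to check whether your own instances are swapping, one rough way is to read the swap counters from /proc/meminfo inside the guest (for example after a `kitchen login`).  The helper below is a sketch for illustration; `swap_used_kb` is a made-up name and this is not necessarily how I diagnosed the problem at the time.

```shell
#!/usr/bin/env bash
# Report swap in use (in kB) from a /proc/meminfo-style file.
# swap_used_kb is a hypothetical helper; pass a file path to read something
# other than the live /proc/meminfo.
swap_used_kb() {
  awk '/^SwapTotal:/ {total = $2}
       /^SwapFree:/  {free = $2}
       END           {print total - free}' "${1:-/proc/meminfo}"
}
```

A value that keeps climbing during a Chef run suggests the instance needs more memory.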

Command-line Kung Fu

I recently stumbled on an article on Lifehacker that suggested a way to learn Linux commands in-line with your daily systems administration.  Basically, it calls for adding a string of commands to your .bashrc file that randomly selects a command from various bin directories and executes a whatis on them.

I thought it was brilliant, but when I went to try it I was bummed to find out that the shuf command is not shipped with the base distribution of CentOS, which is the primary OS I support.  Because of this, I decided to spin my own version of the hack that will work on CentOS.

# RANDOM COMMAND-LINE KNOWLEDGE

# List of bin directories that include commands you'd like to learn
binDirs='/bin /usr/bin /usr/sbin /sbin'

# List every command in $binDirs, one per line
# (sed drops the blank lines and "dir:" headers that ls prints for multiple directories)
cmdList=$(ls $binDirs 2>/dev/null | sed -e '/^$/d' -e '/:$/d')

# Count the commands and randomly select a line number between 1 and that count
randomNum=$(( RANDOM % $(echo "$cmdList" | wc -l) + 1 ))

# Based on the selected $randomNum, select the corresponding command
randomCmd=$(echo "$cmdList" | head -n "$randomNum" | tail -n 1)

# Look up the command with whatis and display the results
echo "Did you know that:"
whatis "$randomCmd" | sed 's/^/\t/'

The hack isn't perfect because not all commands have an entry in the whatis database.  When this happens, you will see a response similar to this:

Did you know that:

        numastat: nothing appropriate

But, this doesn't happen all that often, and with this hack, I've discovered some awesome commands like watch.

Did you know that:
        watch                (1)  - execute a program periodically, showing output fullscreen
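
One possible refinement for the "nothing appropriate" case is to keep picking commands until whatis recognizes one.  The sketch below is untested on CentOS and rests on an assumption: that whatis exits non-zero when it finds no entry, which is true for man-db but worth verifying on your system.  The function name `random_whatis` and the retry limit of five are arbitrary.

```shell
#!/usr/bin/env bash
# Pick random commands until whatis knows one, giving up after five tries.
# Assumes whatis exits non-zero when it finds no entry (verify on your system).
# random_whatis is a hypothetical name; pass a directory list to override the default.
random_whatis() {
  local binDirs="${1:-/bin /usr/bin /usr/sbin /sbin}"
  local cmdList count cmd desc attempt
  # Drop the blank lines and "dir:" headers that ls prints for multiple directories.
  cmdList=$(ls $binDirs 2>/dev/null | sed -e '/^$/d' -e '/:$/d')
  count=$(echo "$cmdList" | wc -l)
  for attempt in 1 2 3 4 5; do
    # Select a random line between 1 and $count.
    cmd=$(echo "$cmdList" | sed -n "$(( RANDOM % count + 1 ))p")
    if desc=$(whatis "$cmd" 2>/dev/null); then
      echo "Did you know that:"
      echo "$desc" | sed 's/^/\t/'
      return 0
    fi
  done
  return 1
}
```

Calling `random_whatis` from .bashrc in place of the inline version keeps the login banner quiet when no described command turns up within five tries.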