Jenkins Delivery

At Sky Betting and Gaming, we use Chef and Jenkins extensively to manage our infrastructure and build our software. 

We are starting to use public cloud services where they can either help us with scale or increase the speed of our service delivery. We wanted a common toolchain for our software delivery and operational developers across both our on-premises VMware estate and the public cloud.

Enter Jenkins Delivery, our take on continuous delivery using Jenkins 2, Chef, Test Kitchen and Docker. It is a continuous delivery platform with deep Chef and Chef Supermarket integration. It encourages and enables building and testing software right alongside configuration management, compliance and the deployment pipeline itself. It makes heavy use of the Pipeline features shipped with Jenkins 2.0, and includes custom extensions to the Pipeline DSL that integrate Jenkins with Chef.

What follows is a walkthrough of standing the system up in a VPC, and an example of using it to build and deploy software on an EC2 instance. The repositories for what follows are all publicly available, so feel free to try this out and create your own CD platform!

Pre-Requisites

  • AWS Account

First you will need an Amazon Web Services account. Your API key should be in ~/.aws/credentials in Amazon's shared credential format.
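If you haven't set this up before, the file looks like the sketch below; the key ID and secret are placeholders, so substitute your own.

# A minimal ~/.aws/credentials in Amazon's shared credential format.
# The access key ID and secret below are placeholders.
mkdir -p ~/.aws
cat > ~/.aws/credentials <<'EOF'
[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
EOF
chmod 600 ~/.aws/credentials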

  • The ChefDK installed

The most recent version of the ChefDK for your platform should work. This has been tested using ChefDK 0.14. Download and install the ChefDK from here: https://downloads.chef.io/chef-dk/

  • IAM Role

You will need an IAM role configured in AWS that has permission to modify Route53 DNS records and to create EC2 instances.

Creating the IAM Role

  1. Select 'Identity and Access Management (IAM)'
  2. Select Roles
  3. Create new role
  4. Call it 'jenkins-delivery'
  5. Select a role type of 'Amazon EC2'
  6. Attach policies 'AmazonRoute53FullAccess' and 'AmazonEC2FullAccess'
  7. Create Role
IAM role setup
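If you prefer the command line, roughly the same role can be created with the AWS CLI. This is only a sketch; the trust policy file name is our own choice, and the console handles the instance profile for you.

# Sketch: create the jenkins-delivery role from the CLI instead of the console.
cat > trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow", "Principal": { "Service": "ec2.amazonaws.com" }, "Action": "sts:AssumeRole" }
  ]
}
EOF
aws iam create-role --role-name jenkins-delivery --assume-role-policy-document file://trust.json
aws iam attach-role-policy --role-name jenkins-delivery \
  --policy-arn arn:aws:iam::aws:policy/AmazonRoute53FullAccess
aws iam attach-role-policy --role-name jenkins-delivery \
  --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess
# The console creates a matching instance profile automatically; from the CLI you need:
aws iam create-instance-profile --instance-profile-name jenkins-delivery
aws iam add-role-to-instance-profile --instance-profile-name jenkins-delivery --role-name jenkins-delivery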

  • AWS Virtual Private Cloud (VPC)

You will need an AWS VPC to create the system in.

Creating the VPC

  1. Select the destination Amazon region, and select Virtual Private Cloud (VPC)
  2. Start the VPC configuration wizard and choose 'VPC with Single Public Subnet'
  3. Choose a VPC name. Make sure 'Enable DNS hostnames' is enabled.
  4. Go to 'Subnets' and select the public subnet you just created with the wizard. Select 'Subnet Actions', choose 'Modify Auto-Assign Public IP' and make sure it is enabled.
Auto-assign public IP

  • EC2 Security Groups

You will need two EC2 security groups for the delivery system: one to allow inbound access from your administrative location (access to the Jenkins, Chef and Supermarket UIs/APIs), and one to allow access between the delivery system and the instances you create using the system.

Creating the Security Groups

  1. Select the region in which you are deploying the system, and EC2
  2. Security Groups
  3. Create, name 'admin-access' and description 'Administrative access to Jenkins Delivery'. Make sure the group is created in the correct VPC. Create.
  4. Select the new group, allow inbound access on ports 22 and 443 from your administrative address range.
  5. Create, name 'delivery-system' and description 'Access internal to Jenkins Delivery'. Make sure the group is created in the correct VPC. Create.
  6. Select the new group. Add a new inbound rule allowing ALL TCP, and set the source to the security group created in the previous step.
  7. Add a second rule to 'delivery-system'. Allow HTTPS from 0.0.0.0/0.

This is for annoying OAuth-related reasons: the Supermarket server needs to be contactable on its public IP address from the other instances, and we don't know what the public IP addresses will be yet.

Once the system is built, you should tighten this rule by restricting HTTPS access to just your administrative location and the three delivery instance public IP addresses.

delivery-system security group

admin-access security group
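The same two groups can also be created from the AWS CLI; a sketch, with placeholder VPC ID, group IDs and administrative CIDR:

# Sketch: create the admin-access and delivery-system groups from the CLI.
aws ec2 create-security-group --group-name admin-access \
  --description "Administrative access to Jenkins Delivery" --vpc-id vpc-xxxxxxxx
aws ec2 authorize-security-group-ingress --group-id sg-aaaaaaaa \
  --protocol tcp --port 22 --cidr 203.0.113.0/24
aws ec2 authorize-security-group-ingress --group-id sg-aaaaaaaa \
  --protocol tcp --port 443 --cidr 203.0.113.0/24

aws ec2 create-security-group --group-name delivery-system \
  --description "Access internal to Jenkins Delivery" --vpc-id vpc-xxxxxxxx
# ALL TCP between members of the delivery-system group itself
aws ec2 authorize-security-group-ingress --group-id sg-bbbbbbbb \
  --protocol tcp --port 0-65535 --source-group sg-bbbbbbbb
# HTTPS from anywhere (tighten this once the instance IPs are known)
aws ec2 authorize-security-group-ingress --group-id sg-bbbbbbbb \
  --protocol tcp --port 443 --cidr 0.0.0.0/0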

  • Delegated Route53 Domain

You will need a DNS zone that is delegated to Amazon's Route53 service. In this example we are using the subdomain test.gharris.uk.

Delegating the Domain

  1. In the AWS console, select Route53
  2. Select 'Hosted Zones'
  3. Create Hosted Zone
  4. Choose your domain name and add a comment. The type should be 'Public Hosted Zone'
  5. Click Create.
  6. Save the Route53 Zone ID for later on
  7. Make sure the name servers for the newly delegated subdomain are correctly set to Amazon's
Route53 delegated subdomain
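The equivalent with the AWS CLI looks something like this; the caller reference just needs to be unique, and the domain is the example subdomain above.

# Sketch: create the hosted zone from the CLI and note its zone ID for later.
aws route53 create-hosted-zone --name test.gharris.uk \
  --caller-reference "jenkins-delivery-$(date +%s)" \
  --hosted-zone-config Comment="Jenkins Delivery"
# The zone ID and the NS records to delegate to are in the output; or look them up later:
aws route53 list-hosted-zones-by-name --dns-name test.gharris.uk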

These are the pre-requisites for building the system. Once these are in place, you can continue.

Building the System

Fork the Jenkins Delivery Repository

First, you will need to fork the Jenkins Delivery git repository. You can use the git server of your choice; in this example we're using public Bitbucket. The repository is available here:

https://bitbucket.org/harriga/jenkins_delivery

Fork this and then clone your repository to your local system. Then:

cd jenkins_delivery

Create SSH Keys

The system will need some SSH keys in order to operate. Keys are needed for pushing updates back to your git repositories, SSH access to newly created EC2 instances and SSH orchestration within the environment once it is running. Create these SSH keys:

  • An EC2 SSH Keypair
  1. Select the destination Amazon region, and select EC2
  2. Select Key Pairs
  3. Create a Key Pair and call it 'jenkins-delivery'
  4. Save the downloaded key to:
jenkins_delivery/keys/aws
  • An SSH key for pulling and pushing updates to git repositories
ssh-keygen -f keys/git
  1. Give this key access to clone your jenkins_delivery repository
  2. Instructions for Bitbucket: https://confluence.atlassian.com/display/BITBUCKET/Use+deployment+keys
  3. Instructions for Github: https://developer.github.com/guides/managing-deploy-keys/
  • An SSH key for orchestration inside AWS post provisioning
ssh-keygen -f keys/knife

Populate Data Bags

Included in the jenkins_delivery repository is a convenience script that will convert the three SSH keys stored in jenkins_delivery/keys into JSON format and update the corresponding data bags in data_bags/keys/. Run this script to update the supplied (empty) data bags:

Updating data bag content
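If you're curious what the script is doing (or want to do it by hand), converting a key file into a single-item JSON data bag is roughly the following. The item names and 'key' field here are illustrative; the real layout is whatever the supplied script and data bags use, so prefer the script.

# Sketch only: turn each SSH private key into a JSON data bag item.
# The actual item structure is defined by the jenkins_delivery script/data bags.
jq -Rs '{id: "git", key: .}' keys/git > data_bags/keys/git.json
jq -Rs '{id: "aws", key: .}' keys/aws > data_bags/keys/aws.json
jq -Rs '{id: "knife", key: .}' keys/knife > data_bags/keys/knife.json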

Configure Delivery Environment

Before installing the system, the install needs to be customised. Edit the management environment configuration to match your environment:

vi envs/management.json

"default_attributes": {
  "resolver": {
    "nameservers": ["10.0.0.2"]
  },
  "route53_domain": "your.route53.domain",
  "route53_zoneid": "XXXXXXXXXXXXXXX",
  "delivery_ssh_uri": "git@bitbucket.org:<user>/jenkins_delivery.git"
},

The Amazon-provided name server generally lives at .2 in the VPC (10.0.0.2 here), so you shouldn't need to change this unless you changed the default IP allocation when creating the VPC with the wizard.

Set Up Local Test Kitchen Configuration

Test Kitchen needs to know some specifics about your Amazon setup before installing. Configure .kitchen.local.yml to match your environment; see .kitchen.local.yml.example in your jenkins_delivery repository.
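For a rough idea of what goes in it, something along these lines; these are standard kitchen-ec2 driver and transport options rather than values lifted from the repository, so defer to the example file for the exact keys.

# Sketch: typical kitchen-ec2 settings for this walkthrough.
cat > .kitchen.local.yml <<'EOF'
driver:
  region: eu-central-1
  subnet_id: subnet-xxxxxxxx
  security_group_ids: ["sg-aaaaaaaa", "sg-bbbbbbbb"]
  aws_ssh_key_id: jenkins-delivery
  instance_type: m4.large
transport:
  ssh_key: keys/aws
  username: centos
EOF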

 

Setup ChefDK

Run the following to set up your ChefDK environment (paths to gem, bundler, Test Kitchen, chef, etc.):

eval $(chef shell-init sh)

Install the kitchen-sync rubygem, which gives Test Kitchen rsync transport support. Also install the Test Kitchen EC2 driver:

gem install kitchen-sync kitchen-ec2

Setup SSH Agent

Set up a local SSH agent and add the AWS EC2 SSH key to it:

chmod 600 ./keys/aws
eval $(ssh-agent)
ssh-add ./keys/aws

Create the System

We are now ready to install the system. Running the Test Kitchen converge will create three EC2 instances: one Chef server, one Chef Supermarket server and one Jenkins instance to tie everything together. Although in this example we're creating new Chef and Supermarket servers to integrate with, it is possible to use Jenkins Delivery with existing Chef and Supermarket servers. This is left as an exercise for the reader, though!

kitchen converge

Note: If you see something similar to the output below, then either a) you forgot to add your EC2 SSH key to your ssh-agent, or b) you were unlucky and Kitchen tried to authenticate with the instance before cloud-init had sorted out the SSH keys on Amazon's side. tl;dr: press enter and Kitchen will re-authenticate.

EC2 instance <i-0f27541d512486357> ready.
centos@ec2-52-58-222-128.eu-central-1.compute.amazonaws.com's password: 

Time passes. The amount of time is variable depending on a few factors. I'm using an instance type of m4.large and creating the three instances usually takes around 30 minutes. When it has finished, it should look something like this:

 

A lot has happened with the Test Kitchen converge.

It has created Chef, Chef Supermarket and Jenkins 2.0 instances, configured them and started the Jenkins pipeline called InitialSetup. This pipeline configures the first Chef organization called 'management', bootstraps all 3 instances and sets them up as Chef Server clients in the management organization. This means that the delivery system itself is managed in the same way as any other environment that is managed by Jenkins Delivery. 

kitchen list

Login to Jenkins

Log in to the Jenkins console at https://jenkins-delivery.<your route53 domain> using the 'admin' user. The password can be obtained by logging into the Jenkins instance:
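For example, something like the following, assuming the stock Jenkins 2 initialAdminPassword location (the Chef setup may store the credential somewhere else):

# Sketch: SSH to the Jenkins instance and read the generated admin password.
ssh -i keys/aws centos@jenkins-delivery.<your route53 domain> \
  sudo cat /var/lib/jenkins/secrets/initialAdminPassword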

Once logged in, you should see that the InitialSetup pipeline has begun. This pipeline sets up the system ready to start building custom software. The Jenkins console should look something like this when InitialSetup has completed successfully:

Login to Supermarket

Log in to Chef Supermarket at https://chef-supermarket.<your route53 domain> with the 'chefadmin' user and authorise Supermarket to use the chefadmin account. The default password for the chefadmin user can be set in the .kitchen.local.yml configuration; check .kitchen.yml for the default.

Building and Deploying Software

Now we can demonstrate using the system to deliver software. In this (fairly trivial) example, we are using our public engineering blog as 'software' that we will be deploying onto an EC2 instance. In order to do this, there are two more git repositories you will need to fork. 

  1. skybet_public. This repository is an example Jenkins Delivery repository. It contains a Chef repository, along with both build and deployment pipelines for the project. Fork the repository from https://bitbucket.org/harriga/skybet_public and give the Jenkins Delivery git SSH key access to pull/push to your fork.
  2. sbg_engineering: This repository is a Chef cookbook and associated Jenkins Delivery Pipeline used to build the cookbook. Fork the repository from https://bitbucket.org/harriga/engineering and give the Jenkins Delivery git SSH key access to pull/push to your fork.

This example only uses one Chef cookbook, but could just as easily use many more to build complex environments composed of many different services. There is no real limit except your imagination.

Create an Organization

Next, we use the Jenkins console and the 'CreateChefOrg' job to create a container Organization for our code to be uploaded to, and our EC2 instances to be driven from. CreateChefOrg will have already been run once for the setup of the 'management' organization during the InitialSetup pipeline. Run it again for our new 'skybet_public' organization:

The pipeline will run, create the new organization and make it available in the Jenkins menus for use with other jobs.
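For reference, creating an organization by hand on the Chef server amounts to something like the sketch below; whether the Jenkins job shells out to exactly this is an implementation detail, and the full name and validator filename here are illustrative.

# Sketch: what creating an organization amounts to on the Chef server itself.
sudo chef-server-ctl org-create skybet_public "SkyBet Public" \
  --association_user chefadmin --filename skybet_public-validator.pem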

Build a Cookbook Version

Before starting this step, make sure you have logged into Supermarket once and allowed the chefadmin account to be linked and used for Chef Supermarket (above). Jenkins uploads to Chef Supermarket using this account so it must already be authorised to do so.

Run the ChefCookbook job and fill in the parameters appropriately. Make sure the Jenkins git SSH key you configured earlier has access to clone and push to the cookbook repository.

This job will check out the cookbook repository using the supplied git reference, then execute pipeline/build.groovy directly from the cookbook repository. In this example, our cookbook pipeline performs the following tasks:

  • Check out the Cookbook repository using the supplied git reference
  • Run Test Kitchen (Chef recipe both builds and tests the software, as well as building and testing the configuration management)
  • Tag the repository according to the supplied $VERSION_TYPE. This is semantic versioning, so major, minor or patch revisions
  • Upload the built and tagged cookbook to Chef Supermarket (a rough equivalent is sketched below). The Supermarket cookbook is versioned and contains a bundled, pre-built version of the application
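The upload stage boils down to sharing the cookbook to the private Supermarket, something along these lines; the exact invocation lives in the cookbook's pipeline/build.groovy, so treat this as a sketch (the category and cookbook path are illustrative):

# Sketch: publish the built cookbook to the private Supermarket.
# Assumes a knife configuration holding the chefadmin credentials.
knife supermarket share sbg_engineering "Other" \
  --supermarket-site https://chef-supermarket.<your route53 domain> \
  --cookbook-path chef/cookbooks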

The completed pipeline

The built cookbook will now be available on Chef Supermarket. Log in to Chef Supermarket and verify that you can now see a version of sbg_engineering hosted:

The cookbook as it appears in Chef Supermarket

Build a Jenkins Delivery Repository

Next it is time to build and test a Jenkins Delivery repository. This repository contains the Berksfile, Chef roles, environments, data bags and the Jenkins pipelines used for build and deployment of the service.

Run the 'BuildIntegration' job. This will execute pipeline/build.groovy from the source repository. In this example, our test is a simple 'berks update'.

In this example, our integration build pipeline performs the following tasks:

  • Check out the repository using the supplied git reference
  • Run 'berks update', verifying that all required cookbook versions are available
  • Tag the repository according to the supplied $VERSION_TYPE. This is semantic versioning, so major, minor or patch revisions
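For reference, the test here is not much more than resolving the Berksfile and checking the result, roughly:

# Sketch: what the 'berks update' test stage boils down to.
berks update          # resolve every cookbook version the Berksfile asks for
cat Berksfile.lock    # the pinned versions a deploy would use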

Once finished, the integration build will look something like this:

This has built version 0.0.2 of the integration, which is ready to be deployed.

Deploy the Integration

The 'DeployIntegration' job will check out a Jenkins Delivery repository, then execute pipeline/deploy.groovy. In the supplied example, this uploads all Chef code (roles, environments, data bags) to the Chef server and then resolves and uploads the cookbooks from the Berksfile (which include the sbg_engineering cookbook we built earlier). Execute the 'DeployIntegration' job:
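In knife and berks terms, the deploy stage is doing something along these lines; the authoritative version is pipeline/deploy.groovy in the skybet_public repository, so this is only a sketch.

# Sketch: push the Chef repository objects, then the cookbooks the Berksfile resolves.
knife upload roles environments data_bags   # roles, environments and data bags
berks install                               # resolve cookbook versions (including sbg_engineering)
berks upload                                # upload the resolved cookbooks to the Chef server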

 

 

 

Delivery Pipelines with Test Kitchen, Chef and Docker

We have been using Chef for configuration management at SB&G for a good while now, making heavy use of Test Kitchen for testing our cookbooks with Jenkins, Docker containers and ServerSpec.

These cookbooks also serve as local development environments. By distributing Vagrant box images containing all the necessary tools for Chef development to our teams, it is usually the case that simply cloning a cookbook in the Workstation VM and running Test Kitchen is enough to get a development version of an application running, in Docker containers, locally. Test Kitchen configures ports forwarded to the application, allowing developers to access locally hosted services from their desktop.

It is a nice workflow: it gives a high level of confidence that changes deployed in integration environments are the same changes that get released to production environments, and that they work in the same way. Because developers are free to experiment locally, innovation is easier and problems are caught earlier in the development process, saving time further down the line. This is increasingly common practice with configuration management tools like Chef and Puppet.

Jenkins, Test Kitchen, Chef and Docker put together are much more than just a configuration management platform, though.

When we started to think about how to increase our speed of delivery as we grew, we realised that small is good: small teams, providing microservice-like services to each other with established APIs and SLAs. These teams need to be independent, with as little dependency on other teams as possible. Because these teams are small, they don't want to have to manage complex build, test and development environments, but they do want complex CI pipelines for all kinds of software, from PHP to NodeJS or Java. How can Chef help?

Chef recipes don't just have to be used to write system configuration or install packages. With Test Kitchen and Docker, we can use the Chef DSL to perform and test any action inside the container. Replacing the CI integration bash scripts usually run by Jenkins with Chef DSL run by Test Kitchen makes these scripts testable and version controlled in the same way as Chef cookbooks. Developers and operations use the same SDK to orchestrate their workflows, which means greater collaboration.

This means we can write Test Kitchen suites that do things such as check out git repositories, execute Mocha tests, run eslint for NodeJS, install a compiler and build a binary, or do something with Maven. Endless possibilities!

This is good for a few reasons. First, your CI pipeline itself is now testable, version-controlled code. Second, that CI pipeline runs inside a Docker container using the same software versions that will be deployed onto the production platform, since they're using the same Chef recipes. This means your tests are representative, and you don't have problems where, for example, one team needs Java 1.7 on the CI slave but another team needs Java 1.8; it is all in containers, so everyone can get along. Third, developers and operations are now talking the same language.

Testing of the application and the infrastructure code is part of the same delivery pipeline. At all stages of the development workflow, platform and applications are tested together, even on the developer's local machine.

The final piece of the puzzle is Jenkins Pipeline. This is a plugin for Jenkins maintained by CloudBees that allows you to configure your jobs as a Groovy-based DSL. The plugin allows job definitions to be stored and run directly from source control, which means the Jenkins pipeline can also be stored in the same git repository as the application and infrastructure code. We create ‘stub’ Jenkins jobs for each of our services, and these jobs run Pipeline DSL from the git repository maintained by the service owning team.

That makes it very easy for a team to make changes to their CI workflows, while being able to make use of a centrally maintained Jenkins instance that has deep integration with Chef and other orchestration flows. Complex flows can be built that define the entire software delivery pipeline, with a very small cost of starting up a new project.

An example might help at this point, so let's look at some code for an example Chef integration. This example is a single git repository containing both the application code (a NodeJS application) and the infrastructure code. It also contains the CI pipeline as a Jenkins Pipeline definition. The NodeJS application requires a connection to one of our MySQL databases in order to function. The layout of the repository:

event-service/
│   .kitchen.yml                <- Test Kitchen configuration. Container setup, Chef run lists
│   Berksfile                   <- Berkshelf for Chef cookbook version management, pulling in common functionality
│   workflow.groovy             <- Jenkins Pipeline job definition
├───event-service/              <- NodeJS application
├───dockerfiles/                <- Dockerfiles for creating basic containers from images
└───chef/
    ├───cookbooks/
    │   ├───event-service/
    │   │   ├───recipes/
    │   │   │   lint.rb         <- CI Lint stage definition
    │   │   │   test.rb         <- CI Test stage definition
    │   │   │   build.rb        <- CI Build stage definition
    │   │   │   vendor.rb       <- CI Vendor stage definition
    │   │   │   deploy.rb       <- Application release recipe

First, the Test Kitchen configuration. Kitchen uses YAML for configuration, and supports Ruby ERB fragments in-line. This is useful because it allows us to pass environment variables via Test Kitchen through to Chef recipes.

The driver configuration comes first; there are many drivers available, and Docker provides the features we need. Chef Zero is the provisioner, which will be used to configure the container after it has been created from the dockerfile.

---
driver:
  name: docker
provisioner:
  name: chef_zero
  cookbooks_path: ./chef/cookbooks
  client_rb:
    environment: DEV
platforms:
  - name: centos7
    driver_config:
      dockerfile: ./dockerfiles/centos7
      volume: <%=ENV['PWD']%>:/tmp/workspace # Make the working directory available inside the container
    attributes:
      ci_build: <%=ENV["CI"]%>
      workspace: /tmp/workspace

Important to note here is the volume mount. When run in CI, this means that the docker container, and therefore Chef, have access to the Jenkins workspace. This makes it simple to write Chef recipes that output to the Jenkins workspace from inside a docker container. This can be used to write test results or to create build artefacts for later analysis by Jenkins or use in later Pipeline stages.

Next, the suites are defined. Each suite is a container with its own Chef run list, and containers can be linked together. Here we create a fixtured MySQL server which is linked to our NodeJS application container:

suites:
  - name: db-server # Fixtured MySQL container
    run_list:
      - recipe[sbg_mysql::install]
      - recipe[sbg_event-service::db]
      - recipe[sbg_event-service::db-fixtures]
    driver:
      instance_name: db-server
      publish: 3306
  - name: app-server  # Application test and build container, linked to fixtured DB
    run_list:
      - recipe[sbg_event-service::lint]
      - recipe[sbg_event-service::test]
      - recipe[sbg_event-service::build]
      - recipe[sbg_event-service::vendor]
    driver:
      instance_name: app-server
      forward:
        - 1700:1700
      links: "db-server:db-server"
    attributes:
      sbg_event-service:
        db-host: db-server

With this configuration, running kitchen converge from the root of the repository will launch two docker containers, with port 1700 forwarded to the running application that has been built from source.

The Chef recipes themselves are fairly simple.

lint.rb Installs the eslint utility and runs it, outputting the result to the shared volume

#run eslint inside the container, output the results to the shared volume mount
execute "Install eslint" do
  command "/opt/node/bin/npm i -g eslint"
  creates "/opt/node/bin/eslint"
  action :run
  not_if {File.exists?("/opt/node/bin/eslint")}
end

execute "event-service eslint report" do
  command "eslint --ext .js,.jsx -f checkstyle . | /usr/bin/tee #{node['workspace']}/build/lint-eslint.xml"
  cwd "#{node['workspace']}/event-service"
  action :run
end

test.rb Runs npm test and outputs the test result to the shared volume

#run npm test and copy the resulting Mocha test report to the Jenkins workspace for analysis by Pipeline
execute "run npm test" do
  command "npm install && npm run test"
  cwd "#{node['workspace']}/event-service"
  action :run
end

execute "Copy Mocha test report to workspace" do
  command "cp build/test-mocha.xml #{node['sbg_event-service']['workspace']}/build/test-mocha.xml"
  cwd node['workspace']
  user "root"
  group "root"
  action :run
end

build.rb Runs npm install and installs the production dependencies

#prune the installation and install npm production dependencies
execute "run npm install" do
  command "npm prune && npm install --production"
  cwd "#{node['workspace']}/event-service"
  action :run
end

vendor.rb Creates a deployable artefact of the node application in the shared volume mount

#Create a .tbz2 containing the node application and all its production dependencies
execute "build artefact" do
  command "/bin/tar -cvjf #{node['workspace']}/build/event-service-v#{node['new_tag_version']}.tbz2 event-service/"
  cwd node['workspace']
  action :run
end

A fairly simple set of steps to build an application. When run by Jenkins, the test results are output into the Jenkins workspace for later analysis. What ties this together is Jenkins Pipeline. The workflow.groovy for this example is described below. This DSL is run by Jenkins when a new tagged version of the event-service is needed:

import groovy.json.JsonOutput

def rubyPath     = '/opt/chefdk/embedded/bin/ruby --external-encoding=UTF-8 --internal-encoding=UTF-8 '

def env = "event-service"
repo = "ssh://git@git-server/${env}.git"


def notifySlack(text, channel) {
    def slackURL = 'https://hooks.slack.com/services/XXXXXX/XXXXXXXX'
    def payload = JsonOutput.toJson([text      : text,
                                     channel   : channel,
                                     username  : "Jenkins",
                                     icon_emoji: ":jenkins:"])
    sh "curl -X POST --data-urlencode \'payload=${payload}\' ${slackURL}"
}

def kitchen = '''#!/bin/bash
foodcritic chef/cookbooks
CI=true kitchen converge
CI=true kitchen verify
'''

Some definitions. Slack is used to notify teams of completed builds at the end of the workflow. The Test Kitchen commands run the Kitchen configuration defined above.

node("slave-docker") {
    currentBuild.setDisplayName("${env} #${currentBuild.number}")
    branch = "release"

Select a Jenkins slave and set the current display name of the build. This is useful for providing developer feedback during a workflow.

    stage name: "checkout-${env}", concurrency: 1
    checkout([$class: 'GitSCM', branches: [[name: branch]], doGenerateSubmoduleConfigurations: false, extensions: [[$class: 'CleanBeforeCheckout']], submoduleCfg: [], userRemoteConfigs: [[url: repo]]])

The first CI stage. This will check out the event-service git repository to the Jenkins workspace.

    stage name: "get-new-tag-${env}", concurrency: 1
    sh '''#!/usr/bin/bash
    LASTTAG=`git describe --abbrev=0 --tags`;
    VERSION=${LASTTAG/event-service-v};
    NEWVERSION=$(( VERSION + 1 ));
    NEWTAG="${NEWVERSION}";
    echo -n $NEWTAG > .gitver'''
    def newtag = readFile('.gitver').trim()
    echo "New version will be ${newtag}"

This is a utility stage that determines the next tag version for the repository based off the previous tag version. This is a required workaround since Jenkins Pipeline sh steps currently don’t have any return values. There is an issue raised for this.

    stage name: "test-kitchen-${env}", concurrency: 1
    wrap([$class: 'AnsiColorSimpleBuildWrapper', colorMapName: "xterm"]) {
        sh kitchen
    }

The main stage. Runs Test Kitchen, which builds and verifies the application in docker containers using Chef. The Chef recipes output test results and a build artefact to the Jenkins workspace.

    stage name: "warnings-${env}", concurrency: 1
    step([$class: 'WarningsPublisher', canComputeNew: false, canResolveRelativePaths: false, consoleParsers: [[parserName: 'Foodcritic']], defaultEncoding: '', excludePattern: '', healthy: '', includePattern: '', parserConfigurations: [[parserName: 'JSLint', pattern: 'build/lint-*.xml']], unHealthy: ''])

The lint output is parsed by Jenkins: the JSLint parser for the NodeJS code (the eslint checkstyle report) and Foodcritic for the Chef recipes.

    stage name: "junit-${env}", concurrency: 1
    step([$class: 'JUnitResultArchiver', keepLongStdio: true, testResults: 'build/test-*.xml'])

The test output from npm test is analysed by Jenkins. Failed tests here result in a failed build.

    stage name: "archive-${env}", concurrency: 1
    step([$class: 'ArtifactArchiver', artifacts: 'build/*.tbz2', excludes: ''])

This stage tells Jenkins to archive the artefact produced by the vendor.rb Chef recipe.

    stage name: "push-tag-${env}", concurrency: 1
    sh "git tag -a event-service-v${newtag} -m \"event-service-v${newtag} pushed by Jenkins\""
    sh "git push --tags"

    currentBuild.setDisplayName("event-service-v${newtag}")

    notifySlack("Build ${currentBuild.number} completed, tagged with event-service-v${newtag}","#event-service")
}

Finally, push a new tagged version of the application and infrastructure code, set the build name to that version and notify our team's Slack channel that a new build has been successfully completed.

Further Jenkins jobs can then be used to push that tag to integration and production environments.

Speeding Up Chef Search

At my day job, we make extensive use of Chef searches throughout our recipes. Chef search can be used to find out almost anything about a Chef node, but after writing cookbooks for a few different parts of our stack we found most of the searches were pretty similar. We need to know the hostnames where software is running and their IP addresses: fairly simple information. Most of our searches were queries like "give me an array of hostnames that run role y" or "give me an array of IP addresses that run role z".

The traditional way to do this looks like this:

result = search(:node, 'role:common')

This executes a Chef search during the compile phase of the Chef run, which involves an API call to the Chef server, dragging back what can be a fairly large JSON node object containing every attribute Chef stores for the node. This gets slow very quickly when you need to search in lots of places in your recipes: both the Chef server and the client suffer, resulting in long converge times on the nodes.

Chef introduced partial search to help with this, which allows you to specify a filter server-side so that you're not throwing huge json blobs over the network the whole time. This looks like so:

filter = {
  :rows => 1000,
  :filter_result => {
    :ipaddress => [ 'ipaddress' ]
  },
}
result = Chef::Search::Query.new.search( :node, 'role:common', filter )

This is better, but it still requires a relatively expensive API call to the Chef server, and still involves JSON serialisation and deserialisation for every search we want to do. There isn't much point in doing this over and over in the various places we need to use search, especially since all our searches are so similar.

This is why we use global_search. Node JSON objects are loaded and examined during the first search query of the Chef run using partial search, then any attributes we are interested in are cached under node.run_state for each node. Subsequent searches during compile or execution are filled from the node.run_state cache, which means there is only one search API call to the Chef server per Chef run. Because node.run_state is held in memory in the chef-client process, it speeds things along nicely.

The same query with global_search looks like this:

chef-shell> include_recipe "sbg_global_search"
chef-shell> result = get_role_member_hostnames('common')
['host-a','host-b','host-c']
chef-shell>

We can also search across Organizations. To do this, you will need to add a client in the target Organization called 'searchclient'. The client needn't have any more permission than read access to the API.

To search another Organization:

node.default['sbg_global_search']['search']['myorg']['endpoint'] = 'http://yourchefserver/organizations/myorg'
node.default['sbg_global_search']['search']['myorg']['search_key'] = 'Client key content'
chef-shell> include_recipe "sbg_global_search"
chef-shell> result = get_role_member_hostnames('common', 'myorg')
['host-d','host-e','host-f']
chef-shell>

The cookbook is available on GitHub.

Functions

get_environment_nodes(env=node.chef_environment.downcase)
Returns a hash of node FQDNs and attributes from the node.run_state cache. Optional, return nodes from alternate Organization env

get_role_member_hostnames(role, env=node.chef_environment.downcase)
Returns an array of node names where node has role on the run_list. Optional, return nodes from alternate Organization env

get_role_member_ips(role, env=node.chef_environment.downcase)
Returns an array of ipaddresses for each node that has role on the run_list. Optional, return nodes from alternate Organization env

get_role_member_fqdns(role, env=node.chef_environment.downcase)
Returns an array of fqdns for each node that has role on the run_list. Optional, return nodes from alternate Organization env

What is XRAID2?

Recently I bought myself some NAS storage and, because it was a pretty good offer, picked up a Netgear RND4000 ReadyNAS NV+ v2 from eBuyer. It's a 4-bay NAS enclosure that has a nice web based control panel for creating shares and managing the unit.

I got curious about what XRAID2 actually is, so I had a poke around the command line. Under the hood it's using the Linux md device driver from kernel 2.6.31, which is where the "XRAID2" expansion capabilities come from. When you're using the device in XRAID2 mode and insert a new disk, it gets formatted with a GPT partition table with 3 partitions:

root@nas:~# gdisk -l /dev/sda
GPT fdisk (gdisk) version 0.7.0

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 5860533168 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): C97568EC-FABC-45C1-B703-96B61603C693
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5860533134
Partitions will be aligned on 64-sector boundaries
Total free space is 4092 sectors (2.0 MiB)

Number  Start (sector)   End (sector)  Size       Code  Name
   1                64        8388671  4.0 GiB    FD00
   2           8388672        9437247  512.0 MiB  FD00
   3           9437248     5860529072  2.7 TiB    FD00

The first partition will join the RAID 1 device /dev/md0, which is the actual system install. As far as I can tell, it's always 4GB in size. 

The second partition, again always 512MB in size, joins /dev/md1 which is also RAID 1. This is the Linux swap partition.

The 3rd partition will take up whatever space is left on the drive and joins /dev/md2. This is a RAID 5 device and is where all the NAS storage space is. The md devices on my system:

root@nas:~# cat /proc/mdstat 
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md2 : active raid5 sdc3[2] sda3[0] sdb3[1]
5851089408 blocks super 1.2 level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md1 : active raid1 sdc2[2] sda2[0] sdb2[1]
524276 blocks super 1.2 [3/3] [UUU]

md0 : active raid1 sdc1[2] sda1[0] sdb1[1]
4193268 blocks super 1.2 [3/3] [UUU]

unused devices: <none>

The device that actually gets mounted as a filesystem is /dev/c/c with a Linux ext4 file system. This is because on top of the md driver, /dev/md2 is itself a Linux LVM physical volume, with one logical volume on top of it, like so:

root@nas:~# pvdisplay 
--- Physical volume ---
PV Name /dev/md2
VG Name c
PV Size 5.45 TiB / not usable 30.75 MiB
Allocatable yes
PE Size 64.00 MiB
Total PE 89280
Free PE 160
Allocated PE 89120
PV UUID ww3P2w-Nj8Z-yemx-4gSQ-VhE4-C0Ri-pZW1hI

root@nas:~# lvdisplay
--- Logical volume ---
LV Name /dev/c/c
VG Name c
LV UUID 9kFN0b-bNgZ-RhWs-aiQQ-kQjJ-vKms-Bfsb1a
LV Write Access read/write
LV Status available
# open 1
LV Size 5.44 TiB
Current LE 89120
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:0

It's this logical volume that is formatted as ext4 and mounted at /c like so:

root@nas:~# mount
/dev/md0 on / type ext3 (rw,noatime)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw)
tmpfs on /ramfs type ramfs (rw)
tmpfs on /USB type tmpfs (rw,size=16k)
usbfs on /proc/bus/usb type usbfs (rw)
/dev/c/c on /c type ext4 (rw,noatime,acl,user_xattr,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv0)
nfsd on /proc/fs/nfsd type nfsd (rw)

This is good news as far as I'm concerned. It means that should the ReadyNAS hardware fail out of warranty for whatever reason, it's trivial to plug the disks into a generic Linux machine and rebuild the RAID array without losing any data.