December 1, 2015

Day 1 - Using Automation to build an OpenStack Cloud

Written by: JJ Asghar, @jjasghar

Edited by: Klynton Jessup, @klyntonj

I wrote this as a narrative of what I hope a typical engineer would experience while trying to resolve an issue. The story, while fiction, is drawn from personal experience and inspired by what I'd like to see happen. I hope you enjoy it.

As usual, I came to my stand-up unexcited about the daily grind. My boss Steve came in, sat down at the conference table, and put his notebook down. Yes, stand-up happened in a conference room and yes, we actually sat at the conference room table, so, honestly, I never really understood why we called it a stand-up. I guess it was a “rebranding” of “Daily Status,” or maybe it was a holdover from those days we tried to do “agile.” Who knows?

Anyway, Steve opened his notebook and looked around at my team. “So, we have a problem. The local development for our cookbooks is great, but we need to start testing on multiple platforms. There’s a chance we might be spinning up a new application and it only runs on CentOS.” (We’re an Ubuntu shop.) There were some sighs and groans as we looked around at each other.

“Today y’all have your normal responsibilities but we need to think of a way to parallelize our cookbook development. I dunno if you’ve ever tried it but running kitchen test -p 2 on your laptops brings everything to a grinding halt. Let’s skip stand-up and spend the first half of the day doing some research and let’s try and come up with some ideas. We’ll get back together after lunch.” With that Steve closed his notebook, got up, and walked out of the room.

“Interesting,” I said under my breath. “I guess it’s time to start Googling.”

I walked back over to my laptop and typed “parallelize cookbook development” into Google. Wow, there was nothing about test-kitchen in the results until I got to a post on the second page! It’s 2015, so this post from 2013 has to be out of date. Right? But didn’t Steve mention the -p in stand-up? I read through the page and found out that there was a kitchen test --parallel option; that must have been what he was talking about. Sweet. OK, he isn’t confused and just saying words again.

I hopped over to the test-kitchen GitHub org and noticed all the drivers for test-kitchen. There were a ton, ranging from AWS, to Hyper-V, to OpenStack, and Digital Ocean (DO). This is great, I can run test-kitchen on any cloud I want. I started to play around with the different options and settled on kitchen-digitalocean for my test. I’ve always enjoyed using Digital Ocean to do my development and it seemed reasonable. I opened my .kitchen.yml and looked at the configuration.

driver:
  name: vagrant
  network:
    - ["forwarded_port", {guest: 80, host: 8080, auto_correct: true}]
  customize:
    cpus: 4
    memory: 8096

I made the changes to the driver name and added in the Droplet size I wanted:

driver:
  name: digitalocean
  size: 8gb

And then installed the gem and exported my SSH key IDs:

gem install kitchen-digitalocean
export DIGITALOCEAN_SSH_KEY_IDS="8675, 30900"
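
For reference, a full .kitchen.yml along these lines might look roughly like the sketch below. It is a sketch rather than the exact file from that day: the suite and cookbook names are stand-ins, and the platform names assume the Ubuntu 14.04 and CentOS 7 images the driver offered at the time.

---
driver:
  name: digitalocean
  size: 8gb

provisioner:
  name: chef_zero

platforms:
  - name: ubuntu-14.04
  - name: centos-7

suites:
  - name: default
    run_list:
      - recipe[my_cookbook::default]   # stand-in for the real cookbook

With something like that in place, the parallel run Steve hinted at is just:

chef exec kitchen test --parallel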

Then I ran test-kitchen. Nice! It worked. This is great. I spun up both of my test boxes on Digital Ocean, one Ubuntu 14.04 and one CentOS 7 at the same time and verified them both. I got up and walked over to Steve’s office. “Hey Steve, I think I got an answer for you from stand-up,” I said.

“Oh yeah? That was fast,” he said looking up from his laptop. “What is it?”

“I hooked up test-kitchen to Digital Ocean and ran a test command on both systems at the same time,” I said, with confidence.

“Ha! That’s awesome. I had no idea you could use other drivers with test-kitchen, I thought it was all local development.”

“Yeah I learned that after looking at the GitHub test-kitchen org, turns out it was pretty easy to set up.”

“Hang on, Digital Ocean costs money though right?”

“Yeah, it’s only pennies though, and I charged my own personal account.”

“Ah well if it’s ‘only pennies’ I guess you won’t need to expense it.”

“Ouch, fine. I guess I deserve that. So you think it might be too expensive to run our test suite for every commit?”

“You read my mind. We need something like DO but local. How about OpenStack?”

“HAHAHAAHAHAAHAHAHAHAHAH.” I actually started crying from laughing so hard. “Yeah right, I don’t have enough time to set up an OpenStack cloud. OpenStack has only gotten worse since they started the project.”

Steve took a moment to let me compose myself, then said plainly, “You should give it a shot, I heard JJ, the OpenStack Chef guy, mention something called an ‘OpenStack-model-t.’ It’s a project Chef was working on to help people build basic OpenStack clouds. Any chance there’s a kitchen-openstack driver for it like there was for Digital Ocean?”

“Actually, yeah there is. Ah, OK I see where you’re going with this, I’ll report back what I find out about these two projects.” I turned around and walked out of his office.

I sat down at my laptop, opened up Chrome, and typed in: openstack-model-t. The first hit was the project’s repo. One of the first things that caught my eye was: “The Customer Can Have Any Color He Wants So Long As It’s Black” - Henry Ford. Funny, real funny, Chef. I started looking around. It seemed that the cookbook was pretty straightforward. There was even a .kitchen.yml file in the repo, so I figured I’d check out the repo and give it a shot.

cd ~
mkdir openstack-stuff
cd openstack-stuff
git clone
cd openstack-model-t
chef exec kitchen verify

My laptop fans started to spin and my MacBook Pro started to heat up; yep, I was building an OpenStack cloud on my laptop. After about 20 minutes, I came back to my laptop and saw:

         Process "nova-scheduler"
           should be running
             should eq "nova"
         Process "neutron-l3-agent"
           should be running
             should eq "neutron"
         Process "neutron-dhcp-agent"
           should be running
             should eq "neutron"
         Process "neutron-metadata-agent"
           should be running
             should eq "neutron"
         Process "neutron-linuxbridge-agent"
           should be running
             should eq "neutron"

       Finished in 1.52 seconds (files took 0.41367 seconds to load)
       67 examples, 0 failures

       Finished verifying <default-ubuntu-1404> (0m18.47s).
-----> Kitchen is finished. (16m24.13s)

Wow, a fully tested and verified All-in-One OpenStack cloud in one command! Let’s see if I can spin up a VM. Reading the README, it seems I have to run a script when I first ssh into the box to make sure I have an initial network and “CirrOS” image on the machine. OK, so be it.

chef exec kitchen login
sudo su -

Now the README tells me I can open the URL in my web browser and I should see the OpenStack login. I opened up Chrome again and typed it in. Sweet, it worked. Using the demo username and the password mypass, I was able to log in without a hitch. I clicked the Instances button on the left-hand side and then “Launch Instance”. Heavens to Betsy, it actually worked!

I looked around, I didn’t believe it. A successful OpenStack build out of the box? What kind of black magic was this? This is bonkers. I stood up from my laptop and took a walk. This opened up a huge opportunity for my team and I needed to clear my head.

As I walked around my office I saw one of our IT team members carrying a new laptop that he had provisioned for someone on another team. He was going the same direction I was, so I figured I’d ask, “Hey Billy, quick question: what do you do with the old desktops that these laptops are going to replace?”

I guess I caught him off guard because he turned and looked at me with confused and concerned eyes, “Oh, hey, honestly, nothing. They have already depreciated in value so they just sit in the IT room until we get Goodwill to come ’round and we write them off as a donation. Nothing too exciting.”

“Interesting, any chance I can get…three of them, I’d like to try something out.”

“Sure, no problem, come ’round in a bit and I’ll get you what you need.”

“Awesome, see you then.” And I continued walking.

About twenty or thirty minutes later I walked up to the IT room and Billy was looking at Reddit. No real surprise there, 60% of IT work is done or is linked to from Reddit and yes you can quote me on that. “Hey Billy, I’m here, can you hook me up?”

“Sure, no problem, take what you need; just write it down on that clipboard.”

“Can’t I email it to you?”

“Meh, you could, but I’d prefer to see you write down what you take when you take it.”

“Fair enough,” I said, as I picked up one of the desktops. “Any chance you have a spare switch or two?”

“Probably, just be sure to put it down on that clipboard if you find one.”

“Cool, thanks again.”

“No problem”, he said as he turned back to looking at his laptop.

I pulled together what I needed and took it all back to my desk. The README for the model-t said the controller node needed 3 NICs and the compute nodes only needed 2 if I wasn’t going to use a storage network. I thought about it for a few moments and realized that nope, I don’t need a storage network for running a test-kitchen OpenStack cloud, so 2 NICs would be fine.

I powered on the controller node; it had Windows 7 on it, which meant I had to re-image these machines. I downloaded Ubuntu 14.04, created a USB boot disk, and started the installations. I tried doing them in parallel but ended up confusing myself, so I did them serially. I rebooted the controller node and did a basic install, naming the machine Controller, original I know, and connected it to my lab network. I was able to ping out with only my management network plugged in, so I felt like I was making some progress. I repeated the process with the two compute machines, naming them Compute1 and, you guessed it, Compute2, and then decided to break for lunch. Over lunch I talked to some of my coworkers about what I was doing and, all in all, there was a pretty positive response. I did get a couple giggles and shakes of their heads about OpenStack, but I expected that.

I walked back to my desk, looked at the beginning of what I was hoping to be a real OpenStack cloud and was pretty proud of my work. I remembered then that Steve wanted to have a sync up after lunch, so I headed over to the dreaded stand-up conference room. As I walked in Steve was just starting.

“So it seems we have a couple options on the table. Most of you figured out that you could run test-kitchen with different drivers giving you access to more compute resources. That’s good. Some of you didn’t come to me with anything so I have to assume you either got caught up with something else, or didn’t bother looking.”

After that inspiring talk, I spoke up, “Hey Steve, I’ve made some pretty impressive progress with the OpenStack-model-t cookbook. I’m going to go heads-down on it for the rest of the day to see if I can get this done by EOB.”

Steve looked at me, smiled, then looked at the rest of the room and said, “See that’s what I want, someone to run with my assignments. OK, let’s get back to work and let’s see where we are at stand-up tomorrow.” We all started filing out of the conference room, all of us realizing at that moment how much of a waste of our time our sync meetings were.

I sat down in front of my laptop and started to think of what I had to do next. I put a small plan together:

  1. Get a hosted Chef instance, the 5 free nodes they give me ought to be enough to get this proof of concept built.
  2. Get the model-t cookbook up on the hosted instance.
  3. Figure out what I need in the run_list for each of the machines.
  4. Converge and see my cloud come to life.

First is always first, right? So I went and signed up for an instance of hosted Chef. It was pretty easy: I just created a new org, baller-model-t, and then pulled down the getting_started kit. I did a chef exec knife status to confirm it was working, and it was. Awesome, step 1 complete. Second, I uploaded the openstack-model-t cookbook to my hosted instance. It complained about dependencies because I had forgotten to do the Berkshelf stuff. So I did the following:

cd openstack-model-t
chef exec berks install
chef exec berks upload
cd ..
chef exec knife cookbook upload openstack-model-t -o .

Sweet, success. OK, from the looks of the cookbook, the All-in-One test was driven by the default.rb recipe. That’s good, that’s my controller node. Now the question is: what’s in my compute node run_list? Well, it looks like compute_node.rb is the wrapper for just a compute node. Great, I’m almost done. Now to get the converge to happen. I still need to get these boxes to check in to the hosted Chef instance, so I decided to use knife bootstrap to do it in one shot. These were the commands I used:

chef exec knife bootstrap controller -x ubuntu --sudo -r 'openstack-model-t::default'
chef exec knife bootstrap compute1 -x ubuntu --sudo -r 'openstack-model-t::compute_node'
chef exec knife bootstrap compute2 -x ubuntu --sudo -r 'openstack-model-t::compute_node'

I went to https://controller/horizon and I couldn’t believe what I saw. I logged in with admin/mypass and saw 3 hypervisors and an empty OpenStack cloud ready to go. I ssh’d into the controller node, ran the bash script from the README, and saw it come to life. (NOTE FROM MY FUTURE SELF: If I remember correctly, I had to change some of the floating-ip options around on my second build of this, but still, this was amazing.)

I spun up a CirrOS image, then a second, and a third. I ssh’d into each of them and made sure I could ping out. It worked! I had a running multi-node, horizontally scalable cloud on my desk. I looked at the clock and it was only about an hour ’til quitting time. I searched for some OpenStack cloud images, found Ubuntu’s and CentOS’, and injected them into my cloud. I leaned back for a moment and thought about what I had accomplished. I had to smile. As someone who has always been interested in OpenStack but scared to try building it out, this seemed like a dream come true.

For a couple moments, I thought about going home, then I realized that I could finish the project off if I just got kitchen-openstack running. So I went through the same steps that I did with Digital Ocean with the kitchen-openstack driver. I ran my parallel tests and successfully spun up a test-kitchen run.
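
After a gem install kitchen-openstack, a driver section along those lines might look roughly like the sketch below. Treat it as a sketch rather than a copy of the real file: the environment variables, image name, and flavor name are stand-ins, and the exact option names should be checked against the kitchen-openstack README.

driver:
  name: openstack
  openstack_username: <%= ENV['OS_USERNAME'] %>
  openstack_api_key: <%= ENV['OS_PASSWORD'] %>
  openstack_auth_url: <%= ENV['OS_AUTH_URL'] %>
  image_ref: ubuntu-14.04       # image name as loaded into the cloud
  flavor_ref: m1.small          # flavor to boot the test instances with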

I couldn’t be more proud of my accomplishments for today. For the first time in a while I’m going to be excited for stand-up tomorrow when I can show this off to everyone.

December 25, 2014

Day 25 - Windows has Configuration Management?!?

Written by: Steven Murawski (@stevenmurawski)
Edited by: William Shipway (@shipw)

Windows Server administration has long been the domain of “admins” mousing their way through a number of Microsoft and third party management UIs (and I was one of them for a while). There have always been a stalwart few who, by hook or by crook, found a way to automate the almost unautomateable. But this group remained on the fringes of Windows administration. They were labeled as heretics and shunned, until someone needed to do something not easily accomplished by a swipe of the mouse.

The sea winds have shifted and over the past seven or eight years, Microsoft released PowerShell and began focusing on providing a first class experience to the tool makers and automation-minded. The earlier group of tool makers and automators gained traction and began to develop a larger following, as more Microsoft and third party products added support for PowerShell. That intrepid group of early automators formed the core of the PowerShell community and began welcoming new converts - whether they were true believers or forced into acceptance by the lack of some capability in their comfortable management UIs. Now, most Windows Server administrators have delved into the command line and have begun to succumb to the siren call of automation.

Just as the PowerShell community’s evangelism was reaching a fever pitch, Microsoft added another management tool - Desired State Configuration. The tool-makers and automators were stunned. Cries of “what about my deployment scripts?” and “but, I already built my VM templates!” echoed through the halls. Early adopters of PowerShell v3 lamented “isn’t this what workflows were for?”. Some had already begun to explore the dark arts of configuration management using tools like Chef and Puppet to bring order to their infrastructure management. With the help of those in the community who blazed a trail in implementing configuration management on Windows, those cries of dismay began to turn into rabid curiosity and even envy. The administrators began to read books like The Phoenix Project and hear stories from companies like Stack Exchange, Etsy, Facebook, and Amazon about this cult of DevOps. They wanted access to this new realm of possibilities, where production deployments don’t mean a week of late nights in the office and requests for new servers don’t go to the bottom of the pile to sit for a month to “percolate”.

Read on, dear reader, to understand the full story of Desired State Configuration and its place in the new DevOps world in which Windows Server administrators find themselves.

An Introduction to Desired State Configuration

With the release of Windows Server 2012 R2 and Windows Management Framework 4, Microsoft introduced Desired State Configuration (DSC). DSC consists of three main components: the Local Configuration Manager, a configuration Domain Specific Language (DSL), and resources (with a pattern for building more). DSC is available on Windows Server 2012 R2 and Windows 8.1 64 bit out of the box and can be installed on Windows Server 2012, Windows Server 2008 R2, and Windows 7 64 bit with Windows Management Framework 4. There is an evolving ecosystem around Desired State Configuration, including support for a number of systems management and deployment projects. To me, one of the most important benefits of the introduction of Desired State Configuration is the awakening of the Windows administration community to configuration management concepts.

A Platform Play

The inclusion of Desired State Configuration may seem like a slap in the face to existing configuration management vendors, but that is not the case. Desired State Configuration is a platform level capability similar to PerfMon or Event Tracing for Windows. DSC is not intended to wholesale replace other configuration management platforms, but to be a base which other platforms can build on in a consistent manner.

The Evolution of DSC

One of the major knocks against administering Windows servers in the past has been the horrendous story around automation. Command-line tools were either lacking coverage or just plain missing. The shell was in a sorry state.

Then, shortly before Windows Server 2008 shipped, PowerShell came about. Initially, PowerShell had relatively poor native coverage for managing Windows, but it worked with .NET, WMI, and COM, so it could do just about anything you needed.

More coverage was introduced with each release of Windows Server. Windows Server 2012 had an explosion of coverage via native PowerShell commands for just about everything on the platform.

PowerShell appeared to be the management API for configuring Windows servers. The downside of a straight PowerShell interface is that PowerShell commands aren’t necessarily idempotent. Some like Add-WindowsFeature are, and do the right thing if the command is run repeatedly. Others are not, like New-Website, which will throw errors if the site already exists.
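
A rough illustration of that difference, borrowing the site name from the configuration example later in this post:

# Idempotent: running this a second time leaves the feature installed and reports no change
Add-WindowsFeature -Name Web-Server

# Not idempotent: the second run throws because a site named FourthCoffee already exists
New-Website -Name 'FourthCoffee' -PhysicalPath 'c:\websites\fourthcoffee'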

DSC was introduced to provide a common management API that offers consistent behavior. Under the covers, it is mostly PowerShell that is running, but the patterns the resources follow ensure that only the work that needs to be done is done, and when a resource is in the proper state, that it is left alone.

Being a platform feature means that there is a consistent, supported mechanism for customers and vendors to manage and evolve the configured state of Windows servers.

Standards Based

Desired State Configuration was built using standards already supported on the Windows platform - CIM and WSMAN.

CIM, Common Information Model, is the DMTF standard that WMI is based upon and provides structure and schema for DSC.

WSMAN, WS-Management, is a web services protocol and DMTF standard for management traffic. WinRM and PowerShell remoting are built on this transport as well.

While these might not be the greatest standards in the world, they do provide a consistent manner for interacting with the Desired State Configuration service.

An Evolving API

Though Windows Management Framework (WMF) 4 was only recently introduced (it has been out for just over a year), WMF 5 development is well under way and includes many enhancements and bug fixes. One major change is making the DSC engine’s API friendlier for third-party configuration management systems to use.

There was also a recent rollup patch for Server 2012 R2 (KB3000850) that contains a number of bugfixes and some tweaks for ensuring compatibility with changes coming in WMF 5.

Diving In

Now that we’ve got a bit of history and rationale for existence out of the way, we can dig in to the substance of Desired State Configuration.

The Local Configuration Manager

The engine that manages the consistency of a Windows server is the Local Configuration Manager (LCM). The LCM is exposed as a WMI (CIM) class (MSFT_DscLocalConfigurationManager) in the Root/Microsoft/Windows/DesiredStateConfiguration namespace.

The LCM is responsible for periodically checking the state of resources in a configuration document. This agent controls things like the following (a sketch of applying these settings comes after the list):

  • whether resources are allowed to reboot the node as part of a configuration cycle
  • how the agent should treat deviance from the configuration state (apply and never check, apply and report deviance, apply and autocorrect problems)
  • how often consistency checks should be run
  • and more…
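
A minimal sketch of tuning those settings via a meta-configuration is shown below; the node name and values are made up, while the LocalConfigurationManager block and Set-DscLocalConfigurationManager are the standard WMF 4 mechanism.

configuration LcmSettings
{
    node 'Server1'
    {
        LocalConfigurationManager
        {
            # reapply and autocorrect drift instead of only reporting it
            ConfigurationMode              = 'ApplyAndAutoCorrect'
            # how often (in minutes) the consistency check runs
            ConfigurationModeFrequencyMins = 30
            # allow resources to reboot the node when they need to
            RebootNodeIfNeeded             = $true
        }
    }
}

LcmSettings
Set-DscLocalConfigurationManager -Path .\LcmSettings -Verbose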

It has a plugin/extension point with the concept of Download Managers. Download Managers are used for Pull mode configurations. There are two download managers that ship in the box, one using a simple REST endpoint to retrieve configurations and one using an SMB file share. As it currently stands, these are not open for replacement by third parties (but it could be made so - please weigh in with the PowerShell team about that before WMF 5 is done!).

A Quick Note - Push vs. Pull

DSC configurations can be imperatively pushed to a node (via the Start-DscConfiguration cmdlet or directly to the WMI API), or if a Download Manager is configured it can pull a configuration and resources from a central repository (currently either SMB file share or REST-based pull server). If a node is in PULL mode, when a new configuration is retrieved, it is parsed to find the various modules required for the configuration to be applied. If any of the requisite modules and versions are not present on the local node, the pull server can supply those.
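
Pushing the configuration generated later in this post, for example, would look something like this (the computer name is assumed):

Start-DscConfiguration -Path .\SysAdvent -ComputerName Server1 -Wait -Verbose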

DSC Resources

Resources are the second major component of the DSC ecosystem, and are what make things happen in the context of DSC. There are three ways of creating DSC resources: they can be written in PowerShell, as WMI classes, or (in Windows Management Framework 5) as PowerShell classes. As PowerShell class-based resources are still an experimental feature and the level of effort to create WMI-based resources is pretty high, we’ll focus on PowerShell-based resources here.

DSC resources are implemented as PowerShell modules. They are hosted inside another PowerShell module under a DSCResources folder. The host module needs to have a module metadata file and have a module version defined in order for it to host DSC resources.

The resources themselves are PowerShell modules that expose three functions or cmdlets:

  • Get-TargetResource
  • Test-TargetResource
  • Set-TargetResource

Get-TargetResource returns the currently configured state (or lack thereof) of the resource. The function returns a hashtable that the LCM converts to an object at a later stage.

Test-TargetResource is used to determine if the resource is in the desired state or not. It returns a boolean.

Set-TargetResource is responsible for getting the resource into the desired state. Set-TargetResource is only executed after Test-TargetResource reports that the resource is not in the desired state.
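
A bare-bones sketch of those three functions, for a hypothetical resource that ensures a directory exists (not a shipping resource, just an illustration of the pattern):

# Hypothetical example resource: ensures a directory exists at $Path
function Get-TargetResource
{
    param ([Parameter(Mandatory)] [string] $Path)
    # return the current state as a hashtable
    @{
        Path   = $Path
        Ensure = $(if (Test-Path -Path $Path) { 'Present' } else { 'Absent' })
    }
}

function Test-TargetResource
{
    param ([Parameter(Mandatory)] [string] $Path)
    # return $true when the resource is already in the desired state
    [bool](Test-Path -Path $Path)
}

function Set-TargetResource
{
    param ([Parameter(Mandatory)] [string] $Path)
    # bring the resource into the desired state
    New-Item -ItemType Directory -Path $Path | Out-Null
}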

The Configuration DSL

Also introduced with Desired State Configuration are some domain specific language extensions on top of PowerShell. Actually, Windows Management Framework 4 added some public extension points in PowerShell for creating new keywords, which is what DSC uses.

Stick with me here, as it may get a bit confusing - I’ll be using “configuration” in two contexts. First is the configuration script. This is defined in PowerShell and can be defined in a script file, a module, or an ad hoc entry at the command line. The second use of “configuration” is in the context of the configuration document. This is the final serialized representation of the configuration for a particular machine or class of machines. This document is in Managed Object Format (MOF) and is how CIM classes are serialized.

The first keyword defined is configuration. The configuration keyword indicates that the subsequent scriptblock will be a configuration document and should be parsed differently. All your standard PowerShell constructs and commands are valid inside of a configuration, as are a few new keywords. There are two static keywords and a series of dynamic keywords in a configuration document.

The first two static keywords are node and Import-DscResource. I’ll deal with the latter first, since it seems very oddly named. Import-DscResource looks, by its name, like a cmdlet or function, but it is a keyword that is valid only in a configuration document and only outside of the context of a node. Import-DscResource identifies custom and third-party modules to make available in a configuration document. By default, only DSC resources in modules located at $pshome/modules (usually c:\windows\system32\windowspowershell\v1.0\modules) can be used; anything else requires Import-DscResource and specifying which modules to make resources available from. The second static keyword is the node keyword. Node is used to identify the machine or class of machines that the configuration is targeted at. Resources are generally assigned inside node declarations.

The configuration also includes a number of potential dynamic keywords which represent the DSC resources available for the configuration.

An example configuration script looks something like:

configuration SysAdvent
{
    Import-DscResource -ModuleName cWebAdministration

    node $AllNodes.where({$_.role -like 'web'}).NodeName
    {
        windowsfeature IIS
        {
            Name = 'web-server'
        }

        cWebsite FourthCoffee
        {
            Name            = 'FourthCoffee'
            State           = 'Started'
            ApplicationPool = 'FourthCoffeeAppPool'
            PhysicalPath    = 'c:\websites\fourthcoffee'
            DependsOn       = '[windowsfeature]IIS'
        }
    }
}


The above configuration script, when run, creates a command in the current PowerShell session called SysAdvent. Running that command will generate a configuration document for every server in a collection that has the role of a web server. The configuration command has a common parameter of ConfigurationData which is where AllNodes comes from (more on that in a bit). The result of this command will be a MOF document describing the desired configuration for every node identified as a web server.

MOF documents created by the command are written in a folder (of the same name as the configuration) created in the current working directory. Files are named for the node they represent (e.g. server1.mof). You can specify a custom output location. Here is our newly created MOF document:

/*
@GenerationDate=12/22/2014 04:12:56
*/

instance of MSFT_RoleResource as $MSFT_RoleResource1ref
{
 SourceInfo = "::7::7::windowsfeature";
 ModuleName = "PSDesiredStateConfiguration";
 ModuleVersion = "1.0";
 ResourceID = "[WindowsFeature]IIS";
 Name = "web-server";
 ConfigurationName = "SysAdvent";
};

instance of PSHOrg_cWebsite as $PSHOrg_cWebsite1ref
{
 ResourceID = "[cWebsite]FourthCoffee";
 PhysicalPath = "c:\\websites\\fourthcoffee";
 State = "Started";
 ApplicationPool = "FourthCoffeeAppPool";
 SourceInfo = "::12::7::cWebsite";
 Name = "FourthCoffee";
 ModuleName = "cWebAdministration";
 ModuleVersion = "1.1.1";
 DependsOn = {
    "[WindowsFeature]IIS"
 };
 ConfigurationName = "SysAdvent";
};

instance of OMI_ConfigurationDocument
{
 GenerationDate="12/22/2014 04:12:56";
};

Other Tidbits

There are a few other things one should know in preparation for digging into DSC.

ConfigurationData and AllNodes

Configurations have support for a convention-based approach to separating environmental data from the structural configuration. The configuration script represents the structure or model for the machine, and the environmental data (via ConfigurationData) fleshes out the details.

ConfigurationData is represented by a hashtable with at least one key - AllNodes. AllNodes is an array of hashtables representing the nodes that should have configurations generated, and it becomes an automatic variable ($AllNodes) that can be referenced in the configuration (like in the example above). The full hashtable is also available inside the configuration as $ConfigurationData, and you can create custom keys and reference those in your configuration document. The PowerShell team reserves the right to use any key in the ConfigurationData hashtable that is prefixed with PS.


$ConfigurationData = @{
    AllNodes = @(
        @{ NodeName = '*';       InterestingData = 'Every node can reference me.' },
        @{ NodeName = 'Server1'; Role = 'Web' },
        @{ NodeName = 'Server2'; Role = 'SQL' }
    )
}

SysAdvent -ConfigurationData $ConfigurationData

Resources in DSC are not ordered by default and there is no guarantee of ordering. The current WMF 4 implementation and the previews of WMF 5 all seem to serially process resources, but there is NO guarantee that will stay that way. If you need things to happen in a certain order, you need to use DependsOn to tell a resource what needs to happen first before that one can execute.

Node Names

In PUSH mode, the node name is either the server name, FQDN, or IP address (any valid way you can address that node via PowerShell remoting).

In PULL mode, the node name is not the server name. Servers are assigned a GUID and they use that to identify which configuration to retrieve from a pull server. Where this GUID comes from is up to you - you can generate them on the fly, pull one from AD, or use one from another system. Since the GUID is the identifier, you can use one GUID to represent an individual server or a class of servers.

WMF 5 - In Production

If you are running Windows Server 2012 R2, you can stay on the bleeding edge AND get production support. The PowerShell team recently announced that if you are using WMF 5, you can get production support for what they call “stable” designs - those features that either existed in previous versions of the Management Framework or have reached a level that the team is ready to provide support. Other features, which are more in flux, are labeled experimental and don’t carry the same support level. With this change, you can safely deploy WMF 5 and begin to test new features and get the bug fixes faster than waiting for the full release. WMF previews are released roughly quarterly.

With WMF 5, you can dig into new and advanced features like Debug mode, partial configurations, and separate pull servers for different resource types.

Building an Ecosystem

No tooling is complete without a community around it and Desired State Configuration is no different.

PowerShellGet and OneGet

OneGet and PowerShellGet are coming onto the scene with WMF 5 (although after they release they should be available somewhat downlevel too). OneGet is a package manager manager and provides an abstraction layer on top of things like nuget, chocolatey, and PowerShellGet, and eventually tools like npm, RubyGems, and more. PowerShellGet provides a way to publish and consume external modules, including those that contain DSC resources.

Finding new resources becomes as easy as:

Find-Module -Includes DscResource

Third Parties


Back in July 2014, Chef made a preview of our DSC integration available (video, cookbook) and in September shipped our first production-supported integration (the dsc_script resource) and have more coming. DSC offers Chef increased coverage on the Windows platform.


The guys at ScriptRock (full disclosure - they are friends of mine) have done a pretty interesting thing by taking a configuration visualization and testing tool and offering an export of the configuration as a DSC script. Very cool.


There is a Puppet module on the Forge showing some DSC integration. I’m not too familiar with the state of that project, but it’s great to see it!


Brewmaster from Aditi is a deployment tool and can leverage DSC to get a server in shape to host a particular application, allowing you to distribute a DSC configuration with an application.


PowerShell.Org hosts a DSC Hub containing forums, blog posts, podcasts, videos and a free e-book on DSC.

So, What Are You Waiting For?

Start digging in! There’s a ton of content out there. Shout at me on Twitter (@stevenmurawski) or via my blog if you have any questions.

December 24, 2014

Day 24 - 12 days of SecDevOps

Written by: Jen Andre (@fun_cuddles)
Edited by: Ben Cotton (@funnelfiasco)

Ah, the holidays. The time of year when we want to be throwing back the eggnogs, chilling in front of our fake fireplaces, maybe catching a funny Christmas day movie… but oh no, we can’t, because guess what, a certain entertainment company was held hostage by a security breach the likes of which corporate America has never seen before… and no more movie for you.

It’s an interesting time to be a security defender. The recent Sony breach has just put a period on the worst-of-the-worst scenarios that we tinfoil-hat, paranoid security people have been ranting about all along: one bad breach could be business shattering.

But let’s step back, and look at the theme of this blog: the 12 days of SecDevOps. Besides being a ridiculous title that I’m 90% sure my ops director chose specifically as a troll for me (thanks, Pete), it underlines an important concept. Whether `security` is in your job title or not, operations is increasingly becoming the front-line for implementing security defenses.

Given that reality, and the fact that security breaches are NOT going away, and that most of us don’t have yacht-sized security budgets, I thought it would be interesting to come up with 12 practical, high-impact things that small organizations could be doing to shore up their security posture.

Day 1: Fear and Loathing and Risk Assessment and Hipsters

Risk assessment. It’s not just some big words auditors love to use. It’s simply weighing the probability of bad things happening against the cost to mitigate the risk of that bad thing happening. And using that to make good security decisions as you make day-to-day architecture and ops choices:

risk = (threat) x (probability) x (business impact)*

*whoever told you there would be no math lied to you

You may not be aware of it, but as an ops person you are likely doing risk assessment already, except more likely around things like uptime and reliability. Consider this scenario:

  • John, the web guy, proposes replacing PostgreSQL with SomeNewHipsterDB.
  • You ask yourself, ‘huh, what are the chances that I’m going to get paged at 3am because writes stop happening and my web site starts screaming in pain?’ You are probably not having warm-fuzzy feelings about this plan.
  • Your development and ops team evaluates the benefits to the engineering team and the business of switching to SomeNewHipsterDB, weighs them against the probability that you are going to get woken up all of the time and the impact that will have on your sunny disposition, and decides that yeah… maybe not gonna do it.
  • Or, you do, except you mitigate this risk by saying ‘John, you will be forever paged for all SomeNewHipsterDB issues. Done.’

Cool. Now do this for security. Every time you are making architecture choices, or changing configuration of your infrastructure, or considering some new third-party service SaaS you’ll be sending data to, you should be asking yourself: what’s the impact if that service or system gets hacked? How will you mitigate the risks?

This doesn’t have to be a formal or fancy report. It can be a running text file or spreadsheet with all of the possible points of failure. Get everyone involved with thinking of ways pieces of the infrastructure or organization can be hacked, and ways you are protected against those worst-case scenarios. It can be like ‘ANYONE WHO OWNS OUR CHEF SERVER COULD DESTROY EVERYTHING [but we have uber-monitoring and Jane over there reviews audit logs daily]’. Start with the scenario (what if… ?) and have conversations with engineers and business owners to defend why what you’re doing is good enough. Make security a fundamentally collaborative process.

Day 2: Shared Secrets: Figure it Out Now

There’s 3 things in life that are inevitable: death, taxes… and the fact that a sales guy left to his own devices will always put all of his passwords in a plain text file (or if fancy, an Excel spreadsheet).

The lesson is this: password management isn’t something that just the technical team decides on and manages for itself. We should be advocating organization-wide education on managing credentials because, guess what? Salesforce, Gmail, and all of these SaaS services with sensitive business data are being used by people who are not engineers.

Solution? As part of every employee’s onboarding process, install password management on an employee’s workstation, and show them how to use it (e.g. 1Password or LastPass, or whatever your tool of choice is). Start doing this from the outset, as it’s best to figure this out on Day 1 rather than 200 employees in.

Day 3: Shared Secrets for Infrastructure, Too

When it comes to infrastructure secrets, there are extra concerns because in most cases, systems need to be able to access these secrets in a non-interactive, automated way (e.g. I need to be able to spin up an app server that knows how to authenticate to my database).

If all of your infra passwords start unencrypted somewhere in a git repo, You Are Going To Have A Bad Time. Noah has a good article on various options for managing shared secrets in your infrastructure.

Day 4: Config Management On All Of The Things (So You Aren’t Sweating from Shell Shocks)

This should be obvious to everyone who drinks from the DevOps Koolaid, but CM has done beautiful things for patch management. It may be tempting to deploy a one-off box used for dev manually, without config management installed, but guess what? In the case of BrowserStack, that turned out to be a massive Achilles heel.

Making the process easy for devs to get access to the infrastructure they need (while giving you the ability to manage systems) is key. Do this right away.

Day 5: Secure your Development Environments (Because No One Else Will)

If left to their own devices, development environments tend to veer toward chaos. This isn’t just because developers are lazy (and as a developer, I mean this in the nicest possible way) but because of the nature of the prototyping and testing process.

From a security perspective, this all means bad juju (see the BrowserStack example above). I can assure you that if you start building your prototype or dev infrastructure exposed to the public internet, deploying it without even basic config management, it will stay that way forever.

So: if you are using AWS, start with an Amazon VPC with strict perimeter security, and require VPN access for any development infrastructure. Get some config management on everything, even if it’s just for system patches.

Put some bounds around the chaos early on, and this will make it easy to mature the security controls as the product and organization mature.

Day 6: 2-Factor all of the things (well, the important things)

Require 2-factor wherever you can. Google Apps has made enforcing this super easy, and technologies like DuoSecurity and YubiKey make adding 2-factor to your critical infrastructure (e.g., your VPN accounts) far, far less annoying than it used to be.

Day 7: Encrypt your Emails (and other communications)

Encrypt your emails. It’s annoying to set up, but guess what? Hackers just love to post juicy stuff on pastebin. Again, from Day 1, help every single employee configure PGP or SMIME encryption as part of the onboarding process. Once installed, it’s relatively painless to use (as long as you don’t mind archaic mail clients from 1999).

This is especially important to drill into executives because they tend to have more sensitive emails (e.g. their private boardroom chatter), and are particularly susceptible to phishing-style attacks. With the recent Sony email leaks, you now have some leverage. You can throw the ‘Angelina Jolie’ emails in front of them and ask: how much do you think the business and their reputations would suffer if their entire email archives were publicly disclosed via a breach?

For many of us, chat is as crucial as email in terms of the type of reputation-critical information we put there. It may not be reasonable to switch to a self-hosted chat solution, but in that case, ensure you are picking a service that helps YOU mitigate your risk. E.g., do you need all of the history? Do you need private history for user chats?

Day 8: Security Monitoring: Start Small, Plan Big

Put the infrastructure in place to collect as much security data as possible, then start slowly making potential security issues visible by adding reports and alerts that deal with threat scenarios you are most worried about.

Start small. Remember that risk assessment list you made? Identify what you are most afraid of (um, that PHP CMS that has hundreds of vulnerabilities reported per year? Your VPN server?) and tackle monitoring for those items first.

Instrumenting your infrastructure from day 1 for security monitoring (even if it’s just collecting all of the system and application logs) puts you in a good position later on to start sophisticated reporting and intrusion detection on that data.

Day 9: Code/Design Reviews

Although there have been a lot of advancements in static and dynamic source code analysis tools (which you can integrate right into your CI process), a good old-fashioned code review by a human being goes a long way. If you’re using GitHub, just make it part of the development workflow and testing pipeline. Whenever changes are made to authentication or authorization, have someone look for automated tests that deal with those cases.

Day 10: Test Your Users

Phish yourself regularly. It’s really easy to do, and can be illuminating for the rest of the business, which may not be as technical as the operations/engineering side and may not really understand the impact of opening an attachment in an email or of not checking URLs before logging into a website. You can use some open source tools, but there are also many services now that you can pay to do this for you.

Day 11: Make an Incident Response Plan Now

So, you see something odd in your logs. Like, Bob your DBA ran a Postgres backup on the production DB, tar’d it up, and sent it to an FTP server in Singapore. Bob lives in Reston, VA, and this is definitely not normal. You start seeing evidence of other weird stuff ‘bob’ is doing that he shouldn’t be.

What now? Do you email Bob and say ‘something weird is happening?’ Do you call the Director of Ops? Do you put a message in a lonely chat room?

Figure out a plan for escalating possible critical security issues. It doesn’t have to be fancy or use specialized ITIL incident-response workflow tools. Make a group in PagerDuty. Have an out-of-band channel for communicating details, in case your normal network goes the way of Sony and is totally compromised or just plain isn’t working. Maybe it’s as simple as an email list that doesn’t use the corporate email accounts, or a conference bridge everyone can hop on.

Day 12: Don’t be the Security ‘A**hole’

You. Yes, you. Don’t be the security a**hole that gets in everyone’s way and loses sight of the real reason for everyone’s existence: to run a business. You can be the security champion without being the blocker. In fact, that’s the only way to be effective. If a user is coming to you and saying ‘this is really really annoying, I don’t want to do it’ - listen to them. Too many security personnel disregard the usability issue of security controls for the sake of security theater, which leads to (unsurprisingly) abandonment, cynicism, and apathy when it comes to real security concerns.

DevOps is really a philosophy: it’s not a job title or a set of tools, it’s the concept of using modern tools and processes to facilitate collaboration between the engineers who deliver the code and those who must maintain it. Um, that was a lot of words, but the key word is collaboration. It’s no longer acceptable to throw security ‘over the wall’ and expect your users and ops people to just do what you say.

The best security cultures are not prescriptive, they are collaborative. They understand that business needs to get done. They are intellectually honest and admit ‘yeah, we could get hacked’ - but what can we do about this in a way that doesn’t bring everything to a halt? Zane Lackey has a great talk on building a modern security engineering organization that expounds many of these ideas, and more.

December 23, 2014

Day 23 - The Importance of Pluralism, or The Danger of the Letter "S"

Written by: Mike Fiedler (@mikefiedler)
Edited by: Hugh Brown (@saintaardvark)

Prologue: A Concept

One aspect of Chef that’s confusing to people comes up when searching for nodes that have some attribute: just what is the difference between a node’s reported ‘role’ attribute and its ‘roles’ attribute? It seems like it could almost be taken for a typo – but underlying it are some very deep statements about pluralism, pluralization, and the differences between them.

One definition of the term ‘pluralism’ is “a condition or system in which two or more states, groups, principles, sources of authority, etc., coexist.” And while pluralism is common in descriptions of politics, religion and culture, it also has a place in computing: to describe situations in which many systems are in more than one desired state.

Once a desired state is determined, it’s enforced. But then time passes – days, minutes, seconds or even nanoseconds – and every moment has the potential to change the server’s actual state. Files are edited, hardware degrades, new data is pulled from external sources; anyone who has run a production service can attest to this.

Act I: Terms

Businesses commonly offer products. These products may be composed of multiple systems, where each system could be a collection of services, which run on any number of servers, which run on some number of hosts. Each host, in turn, provides another set of services to the server that makes up part of the system, which then makes up part of the product, which the business sells.

An example to illustrate: MyFace offers a social web site (the product), which may need a web portal, a user authentication system, index and search systems, long-term photo storage systems, and many more. The web portal system may need servers like Apache or Nginx, running on any number of instances. A given server-instance will need to use any number of host services, such as I/O, CPU, memory and more.

So what we loosely have is: products => systems => services => servers => hosts => services. (Turtles, turtles, turtles.)

In Days of Yore, when a Company ran a ‘Web Site’, they may have had a single System, maybe some web content Service, made up of a web Server, a database Server (maybe even on the same host) - both consuming host services (CPU, Memory, Disk, Network) - to provide the Service the Company then sells, hopefully at a profit (right!?).

Back then, if you wanted to enact a change on the web and database at the same time (maybe release a new feature), it was relatively simple, as you could control both things in one place, at roughly the same time.


In English, to pluralize something, we generally add a suffix of “s” to the word. For instance, to convey more than one instance, “instance” becomes “instances”, “server” becomes “servers”, “system” becomes “systems”, “turtle” becomes “turtles”.

We commonly use pluralization to describe the concept of a collection of similar items, like “apples”, “oranges”, “users”, “web pages”, “databases”, “servers”, “hosts”, “turtles”. I think you see the pattern.

This extends even into programming languages and idiomatic use in development frameworks. For example, a Rails application will typically pluralize the table name for a model named Apple to apples.

This emphasizes that the table in question does not store a singular Apple; rather, many Apple instances will be located in a table named apples.

This is not pluralism, this is pluralization - don’t get them confused. Let’s move on to the next act.

Act II: Progress

We’ve evolved quite a bit since the Days of Yore. Now, a given business product can span hundreds or even thousands of systems of servers running on hosts all over the world.

As systems grow, it becomes more difficult to enact a desired change at a deterministic point in time across a fleet of servers and hosts.

In the realm of systems deployment, many solutions perform what has become known as “test-and-repair” operations - meaning that, when provided a “map” of the desired state (which typically manifests as human-written, readable code), they will, when executed, “test” the current state of a given host and perform “repair” operations to bring the host to the desired state - whether that means installing packages, writing files, or the like.

Each system calls this map something different - cfengine:policies, bcfg2:specifications, puppet:modules, chef:recipes, ansible:playbooks, and so on. While they don’t always map 1:1, they all have some sort of concept for ‘things that are similar, but not the same.’ Such hosts will have unique IP addresses and hostnames, while sharing enough common features to be termed something like “web heads” or the like.

Act III: Change

In the previous sections, I laid the groundwork to understand one of the more subtle features in Chef. This feature may be available in other services, but I’ll describe the one I know.

Using Chef, there is a common deployment model where Chef Clients check in with a Chef Server to ask “What is the desired state I should have?” The Chef terminology is ‘a node asks the server for its run list’.

A run list can contain a list of recipes and/or roles. A recipe tells Chef how to accomplish a particular set of tasks, like installing a package or editing a file. A role is typically a collection of recipes, and maybe some role-specific metadata (‘attributes’ in Chef lingo).
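
As a concrete sketch, a role on disk might look something like the following; the role and recipe names are borrowed from the node report a few paragraphs below, and the attribute is made up purely for illustration.

# roles/webhead.rb - a minimal sketch of a Chef role
name 'webhead'
description 'Web front-end nodes'

# the collection of recipes this role carries
run_list 'recipe[base::packages]', 'recipe[nginx]', 'recipe[webapp]'

# role-specific metadata ('attributes' in Chef lingo) - made-up example
default_attributes 'nginx' => { 'worker_processes' => 4 }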

The node may be in any state at this point. Chef will test for each desired state, and take action to enforce it: install this package, write that file, etc. The end result should either be “this node now conforms to the desired state” or “this node was unable to comply”.

When the node completes successfully, it will report back to Chef Server that “I am node ‘XYZZY’, and my roles are ‘base’ and ‘webhead’, my recipes are ‘base::packages’, ‘nginx’, ‘webapp’” along with a lot of node-specific metadata (IP addresses, CPU, Memory, Disk, and much more).

This information is then indexed and available for others to search for. A common use case we have is where a load balancing node will perform a search for all nodes holding the webhead role, and add these to the balancing list.

Pièce de résistance, or Searching for Servers

In a world where we continue to scale and deploy systems rapidly and repeatedly, we often choose to reduce the need for strong consistency amongst a cluster of hosts. This means we cannot expect to change all hosts at the precise same moment. Rather we opt for eventual consistency: either all my nodes will eventually be correct, or failures will occur and I’ll be notified that something is wrong.

This changes how we think about deployments and, more importantly, how we use our tools to find other nodes.

Using Chef’s search feature, a search like this:

webheads = search(:node, 'role:webheads')

will use the node index (a collection of node data) to look for nodes with the webheads role in the node’s run list - this will also return nodes that have not yet completed an initial Chef run and reported the complete run list back to Chef Server.

This means that my load balancer could find a node that is still mid-provisioning, and potentially begin to send traffic to a node that’s not ready to receive yet, based on the role assignment alone.

A better search, in this case might be:

webheads = search(:node, 'roles:webheads')

One letter, and all the difference.

This search now looks for an “expanded list” that the node has reported back. Any node with the role webheads that has completed a Chef run would be included. If the mandate is that only webhead nodes get the webhead role assigned to them, then I can safely use this search to include nodes that have completed their provisioning cycle.

Another way to use this search to our benefit is to search one axis and compare with another to find nodes that never completed provisioning:

badnodes = search(:node, 'role:webheads AND NOT roles:webheads')
# Or, with knife command line:
$ knife search node 'role:webheads AND NOT roles:webheads'

This will grab any nodes with an assignment but not a completion – very helpful when launching large amounts of nodes.

Note: This is not restricted to roles; this also applies to recipe/recipes. I’ve used roles here, as we use them heavily in our organization, but the same search patterns apply for using recipes directly in a run list.


This little tidbit of role vs roles has proven time and again to be a confusing point when someone tries to pick up more of Chef’s searching abilities. But having both attributes describe the state of the node is helpful in determining what state the node is in, and whether it should be included in some other node’s list (such as in the load balancer/webhead example from before).

Now, you may argue against the use of roles entirely, or the use of Chef Server and search, and use something else for service discovery. This is a valid argument - but be careful you’re not tethering a racehorse to a city carriage. If you don’t fully understand its abilities, someday it might run away on you.


A surgeon spends a lot of time learning how to use a sharpened bit of metal to fix the human body. While there are many instruments he or she will go on to master, the scalpel remains the fundamental tool, available when all else is gone.

While we don’t have the same risks involved as a surgeon, the tools we use can be more complex, and provide us with a large amount of power at our fingertips.

It behooves us to learn how they work, and when and how to use their features to provide better systems and services for our businesses.

Chef’s ability to discern between what a node has been told about itself and what it reports about itself can make all the difference when using Chef to accomplish complex deployment scenarios and maintain flexible infrastructure as code. This not only lets you accomplish the fundamentals of service discovery with less hard-coded configuration, but also lets you avoid the uncertainty of bringing in yet another outside tool.

On that note, Happy Holiday(s)!

December 22, 2014

Day 22 - Largely Unappreciated Applicability

Written by: John Vincent (@lusis)
Edited by: Joseph Kern (@josephkern)

I have had the privilege of writing a post for SysAdvent for the past several years. In general these posts have been focused on broader cultural issues. This year I wanted to do something more technical and this topic gave me a chance to do that. It’s also just REALLY cool so there’s that.


I’m sure most people are familiar with Nginx but I’m going to provide a short history anyway. Nginx is a webserver created by Igor Sysoev around 2002 to address the C10K problem. The C10K problem isn’t really a “problem” anymore in the sense that it was originally. It’s morphed into the C10M problem. With the rise of sensors, we may be dealing with a Cgazillion problem before we know it.

Nginx addressed this largely with an event loop (I know some folks think the event loop was invented in 2009). The dominant webserver at the time (and still), Apache, used a model of spawning a new process or a new thread for each connection. Nginx did it differently in that it spawned a master process (which handles configuration, launching workers and the like) and a pool of worker processes, each with its own event loop. Workers share no state with each other and select on a shared socket to process requests. This particular model works and scales very well. There’s more on the history of Nginx in its section in the second edition of AOSA. It’s a good read and while not 100% current, the basics are unchanged.


Lua is a programming language invented in 1993. The title of this article is a shout out to how underappreciated Lua is not only as a language but in its myriad uses. Most people who have heard of Lua know it as the language used for World of Warcraft plugins.

Lua is an interesting language. Anyone with experience in Ruby will likely find themselves picking it up very quickly. It has a very small core language, first-class functions, and coroutines. It is dynamically typed and has one native data structure - the table. When you work in Lua, you will learn to love and appreciate the power of tables. They feel a lot like a Ruby hash and are the foundation of most advanced Lua code.

It has no classes, but they can be implemented after a fashion using tables. Since Lua has first-class functions, you can create a “class” by lumping data and functions into a table. There’s no inheritance, but instead you have prototypes. (There’s a bit of sugar to help you out when working with these ‘objects’ - e.g. calling foo:somefunc() to imply self as the first argument as opposed to foo.somefunc(self).)
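Here is a tiny sketch (my own, not from any particular project) of a table doing hash duty and a prototype-style “class” using that colon sugar:

-- a table used like a hash; keys can be added at any time
local point = { x = 1, y = 2 }
point.label = "home"

-- a prototype-style "class" built from a table
local Dog = {}
Dog.__index = Dog

function          -- constructor returns a table wired to the Dog prototype
  return setmetatable({ name = name }, Dog)
end

function Dog:bark()                      -- the colon implicitly passes self
  print( .. " says woof")
end

local rex ="Rex")
rex:bark()                               -- same as Dog.bark(rex)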

For a good read on the language’s history, see the Wikipedia article and the Lua website.

For some basics on the language itself, the Wikipedia article has code samples, as does the official documentation. There is also a section on Lua in the newest edition of the Seven Languages series - Seven More Languages in Seven Weeks.

I’ve also written a couple of modules (primarily for use with OpenResty):

If you want to see an example of how the “classes” work with Lua, take a look at the github example and compare the usage described in the README with the module itself.

Combining the two

As I mentioned, Lua is an easily embeddable language. I’ve been unable to find a date on when Lua support was added to Nginx but it was a very early version (~ 0.5).

One of the pain points of Nginx is that it doesn’t support dynamically loaded modules. All extended functionality outside the core must be compiled in. Lua support in Nginx made it so that you could add some advanced functionality to Nginx via Lua that would normally require a C module and a recompile.

Much of the Nginx API itself is exposed to Lua directly, and Lua can be used at multiple places in the Nginx workflow. You can:

All of these are documented on the Nginx website.

For example, if I wanted to have the response body be entirely created by Lua, I could do the following in Nginx:

location /foo {
  content_by_lua '
    ngx.header.content_type = "text/plain"
    local username = "bob"
    ngx.say("hello ", username)
  ';
}

This example would return hello bob as plain text to your browser when you requested /foo from Nginx.

Obviously escaping could get to be a headache here, so most of the *_by_lua directives (which are for inline Lua code in the Nginx config files) can be replaced with a *_by_lua_file variant where the Lua code is stored in an external file.
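For example, the inline block above could be pulled out into a file (the path here is just for illustration):

location /foo {
  # same behavior as the inline version, but the Lua lives in its own file
  content_by_lua_file /etc/nginx/lua/foo.lua;
}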

Another neat trick available is the cosocket API, which lets you open arbitrary non-blocking network connections via Lua from inside an Nginx worker.
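Here’s a rough sketch of the cosocket API (the host, port, and protocol exchange are invented for illustration); it would live in something like a content_by_lua_file handler:

-- open a non-blocking TCP connection from inside an Nginx worker
local sock = ngx.socket.tcp()
sock:settimeout(1000)                          -- milliseconds
local ok, err = sock:connect("127.0.0.1", 6379)
if not ok then
  ngx.log(ngx.ERR, "connect failed: ", err)
  return ngx.exit(500)
end
sock:send("PING\r\n")
local line, rerr = sock:receive()              -- reads a single line by default
ngx.say("backend said: ", line or rerr)
sock:close()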

As you can see, this is pretty powerful. Additionally, the Lua support in Nginx runs on LuaJIT, which offers amazing speed and predictable resource usage. By default, Lua code is cached by Nginx, but this can be disabled at run-time to help speed up the development process.
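That cache toggle is a single directive (valid in the http, server, and location blocks), handy while you’re iterating on Lua files:

# development only: re-read and recompile Lua files on every request
lua_code_cache off;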

Enter the OpenResty

If it wasn’t clear yet, the combination of Nginx and Lua basically gives you an application server right in the Nginx core. Others have created Lua modules specifically for use within Nginx and a few years ago an enterprising soul started bundling them up into something called OpenResty.

OpenResty combines checkpointed versions of Nginx, modified versions of the Lua module (largely maintained by the OpenResty folks anyway), curated versions of LuaJIT and a boatload of Nginx-specific Lua modules into a single distribution. OpenResty builds of Nginx can be used anywhere out-of-the-box that you would use a non-Lua version of Nginx. Currently OpenResty is sponsored by CloudFlare where the primary author, Yichun Zhang (who prefers to go by “agentzh” everywhere) is employed.

OpenResty is a pretty straightforward “configure/make/make install” beast. There is a slightly dated omnibus project on Github from my friend Brian Akins that we’ve contributed to in the past (and will be contributing our current changes back to in the future). Much of my appreciation and knowledge of Lua and OpenResty comes directly from Brian and his omnibus packages are how I got started.

But nobody builds system packages anymore

Obviously system packages are the domain of greyhaired BOFHs who think servers are for serving. Since we’re all refined and there are buzzword quotas to be maintained, you should probably just use Docker (but you have to say it like Benny says “Spaceship”).

Seriously though, Docker as a packaging format is pretty neato and for what I wanted to do, Docker was the best route. To that end I give you an OpenResty tutorial in a box (well, a container).

The purpose of this repo is to help you get your feet wet with some examples of using Lua with Nginx via the latest OpenResty build. It ships with a Makefile to wrap up all the Docker invocations and hopefully make things dead simple. It works its way up from the basics I’ve described all the way to communicating between workers via a shared dictionary, making remote API calls to Github, two Slack chat websocket “clients” and the skeleton of a dynamic load balancer in Nginx backed by etcd:

In addition, because I know how difficult it can be to develop and troubleshoot code running inside Nginx, I’ve created a web-based REPL for testing out and experimenting with the Nginx Lua API:

To use the basic examples in the container, you can simply clone the repo and run make all. This will build the container and then start OpenResty listening on port 3131 (and etcd on 5001 for one of the demos). The directory var_nginx will be mounted inside the container as /var/nginx and contains all the necessary config files and Lua code for you to poke/prod/experiment with. Logs will be written to var_nginx/logs so you can tail them if you’d like. As you can see, it also uses Bootstrap for the UI so we’ve pretty much rounded out the “what the hell have you built” graph.

Please note that while the repo presents some neat tricks, the code inside is not optimized by any stretch. The etcd code especially may have some blocking implications but I’ve not yet confirmed that. The purpose is to teach and inspire more than “take it and run it in prod”.

Advanced Examples

If you’d like to work with the Slack examples, you’ll need to generate a slack “bot” integration token for use. The Makefile includes support for running an etcd container appropriate for use with the tutorial container. If you aren’t a Slack user then here’s a screenshot so you can see what it WOULD look like:

Wrap up

Maybe this post has inspired you to at least take a look at OpenResty. Lua is a really neat language and very easy to pick up and add to your toolbelt. We use OpenResty builds of Nginx in many places internally, from proxy servers to powering our own internal SSO system based on GitHub OAuth and group memberships. While most people simply use Nginx as a proxy and static content server, we treat it like an application server and leverage the flexibility of not requiring another microservice to handle certain tasks (in addition to using it as a proxy and static content server).

The combination of Nginx and Lua won’t replace all your use cases but by learning the system better, you can better leverage the use of Nginx across the board.

December 21, 2014

Day 21 - Baking Delicious Resources with Chef

Written by: Jennifer Davis (@sigje)
Edited by: Nathen Harvey (@nathenharvey)

Growing up, every Christmas time included the sweet smells of fresh baked cookies. The kitchen would get incredibly messy as we prepped a wide assortment from carefully frosted sugar cookies to peanut butter cookies. Holiday tins would be packed to the brim to share with neighbors and visiting friends.

Sugar Cookies

My earliest memories of this tradition are of my grandmother showing me how to carefully imprint each peanut butter cookie with a crosshatch. We’d dip the fork into sugar to prevent the dough from sticking and then carefully press into the cookie dough. Carrying on the cookie tradition, I am introducing the concepts necessary to extend your Chef knowledge and bake up cookies using LWRPs.

To follow the walkthrough example as written you will need to have the Chef Development Kit (Chef DK), Vagrant, and Virtual Box installed (or use the Chef DK with a modified .kitchen.yml configuration to use a cloud compute provider such as Amazon).

Resource and Provider Review

Resources are the fundamental building blocks of Chef. There are many available resources included with Chef. Resources are declarative interfaces, meaning that we describe the state we want the resource to be in, rather than the steps required to reach that state. Resources have a type, name, one or more parameters, actions, and notifications.

Let’s take a look at one sample resource, Route.

route "NAME" do
  gateway ""
  action :delete
end

The route resource describes the system routing table. The type of resource is route. The name of the resource is the string that follows the type. The route resource includes optional parameters of device, gateway, netmask, provider, and target. In this specific example, we are only declaring the gateway parameter. In the above example we are using the delete action and there are no notifications.

Each Chef resource includes one or more providers responsible for actually bringing the resource to the desired state. It is usually not necessary to select a provider when using the Chef-provided resources, Chef will select the best provider for the job at hand. We can look at the underlying Chef code to examine the provider. For example here is the Route provider code and rubydoc for the class.

While there are ready-made resources and providers, they may not be sufficient to meet our needs to programmatically describe our infrastructure with small clear recipes. We reach that point where we want to reduce repetition, reduce complexity, or improve readability. Chef gives us the ability to extend functionality with Definitions, Heavy Weight Resources and Providers (HWRP), and Light Weight Resources and Providers (LWRP).

Definitions are essentially recipe macros. They are stored within a definitions directory within a specific cookbook. They cannot receive notifications.

HWRPs are pure ruby stored in the libraries directory within a specific cookbook. They cannot use core resources from the Chef DSL by default.

LWRPs, the main subject of this article, are a combination of Chef DSL and ruby. They are useful to abstract repeated patterns. They are parsed at runtime and compile into ruby classes.


Extending resources requires us to revisit the elements of a resource: type, name, parameters, actions, and notifications.

Idempotence and convergence must also be considered.

Idempotence means that the provider ensures that the state of a resource is only changed if a change is required to bring that resource into compliance with our desired state or policy.

Convergence means that the provider brings the current resource state closer to the desired resource state.

Resources have a type. The LWRP’s resource type is defined by the name of the file within the cookbook. This implicit name follows the formula of: cookbook_resource. If the default.rb file is used the new resource will be named cookbook.

File names should match for the LWRP’s resource and provider within the resources and providers directories. The chef generators will ensure that the files are created appropriately.

The resource and its available actions are described in the LWRP’s resource file.

The steps required to bring the piece of the system to the desired state are described in the LWRP’s provider file. Both idempotence and convergence must also be considered when writing the provider.

Resource DSL

The LWRP resource file defines the characteristics of the new resource we want to provide using the Chef Resource DSL. The Resource DSL has multiple methods: actions, attribute, and default_action.

Resources have a name. The Resource DSL allows us to tag a specific parameter as the name of the resource with :name_attribute.

Resources have actions. The Resource DSL uses the actions method to define a set of supported actions with a comma separated list of symbols. The Resource DSL uses the default_action method to define the action used when no action is specified in the recipe.

Note: It is recommended to always define a default_action.

Resources have parameters. The Resource DSL uses the attribute method to define a new parameter for the resource. We can provide a set of validation parameters associated with each parameter.

Let’s take a look at an example of a LWRP resource from existing cookbooks.

djbdns includes the djbdns_rr resource.

actions :add
default_action :add

attribute :fqdn,     :kind_of => String, :name_attribute => true
attribute :ip,       :kind_of => String, :required => true
attribute :type,     :kind_of => String, :default => "host"
attribute :cwd,      :kind_of => String

The rr resource as defined here will have one action: add, and 4 attributes: fqdn, ip, type, and cwd. The validation parameters for the attribute show that all of these attributes are expected to be of the String class. Additionally ip is the only required attribute when using this resource in our recipes.

Provider DSL

The LWRP provider file defines the “how” of our new resource using the Chef Provider DSL.

In order to ensure that our new resource functionality is idempotent and convergent we need the:

  • desired state of the resource
  • current state of the resource
  • end state of the resource after the run
Requirement      Chef DSL Provider Method
Desired State    new_resource
Current State    load_current_resource
End State        updated_by_last_action

Let’s take a look at an example of a LWRP provider from an existing cookbook to illustrate the Chef DSL provider methods.

djbdns includes the djbdns_rr provider.

action :add do
  type = new_resource.type
  fqdn = new_resource.fqdn
  ip = new_resource.ip
  cwd = new_resource.cwd ? new_resource.cwd : "#{node['djbdns']['tinydns_internal_dir']}/root"

  unless IO.readlines("#{cwd}/data").grep(/^[\.\+=]#{fqdn}:#{ip}/).length >= 1
    execute "./add-#{type} #{fqdn} #{ip}" do
      cwd cwd
      ignore_failure true
    end
    new_resource.updated_by_last_action(true)
  end
end

new_resource returns an object that represents the desired state of the resource. We can access all attributes as methods of that object. This allows us to know programmatically our desired end state of the resource.

type = new_resource.type assigns the value of the type attribute of the new_resource object that is created when we use the rr resource in a recipe with a type parameter.


load_current_resource is an empty method by default. We need to define this method such that it returns an object that represents the current state of the resource. This method is responsible for loading the current state of the resource into @current_resource.

In our example above we are not using load_current_resource.
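For the curious, a hypothetical load_current_resource for this provider might look something like the sketch below. It is not part of the djbdns cookbook, and it assumes the rr resource also declares an exists attribute to hang the discovered state on:

def load_current_resource
  # Chef::Resource::DjbdnsRr is the class Chef builds for the djbdns_rr LWRP
  @current_resource =
  cwd = new_resource.cwd || "#{node['djbdns']['tinydns_internal_dir']}/root"
  # record whether the entry is already present in the tinydns data file
  found = IO.readlines("#{cwd}/data").grep(/^[\.\+=]#{new_resource.fqdn}:#{new_resource.ip}/).any?
  @current_resource.exists(found)   # assumes `attribute :exists` in resources/rr.rb
  @current_resource
end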


updated_by_last_action notifies Chef that a change happened to converge our resource to its desired state.

As part of the unless block executing new_resource.updated_by_last_action(true) will notify Chef that a change happened to converge our resource.


We need to define a method in the provider for each action supported by the LWRP resource file. This method should handle doing whatever is needed to configure the resource to be in the desired state.

We see that the one action defined is :add which matches our LWRP resource defined actions.

Cooking up a cookies_cookie resource

Preparing our kitchen

First, we need to set up our kitchen for some holiday baking! Test Kitchen is part of the suite of tools that come with the Chef DK. This omnibus package includes a lot of tools that can be used to personalize and optimize your workflow. For now, it’s back to the kitchen.

Kitchen Utensils

Note: On Windows you need to verify your PATH is set correctly to include the installed packages. See this article for guidance.

Download and install both Vagrant, and Virtual Box if you don’t already have them. You can also modify your .kitchen.yml to use AWS instead.

We’re going to create a “cookies” cookbook that will hold all of our cookie recipes. First we will use the chef cli to generate a cookbook using the default generator. You can customize default cookbook creation for your own environments.

chef generate cookbook cookies
Compiling Cookbooks...
Recipe: code_generator::cookbook

followed by more output.

We’ll be working within our cookies cookbook so go ahead and switch into the cookbook’s directory.

$ cd cookies

By running chef generate cookbook we get a number of preconfigured items. One of these is a default Test Kitchen configuration file. We can examine our kitchen configuration by looking at the .kitchen.yml file:

$ cat .kitchen.yml

driver:
  name: vagrant

provisioner:
  name: chef_zero

platforms:
  - name: ubuntu-12.04
  - name: centos-6.5

suites:
  - name: default
    run_list:
      - recipe[cookies::default]

The driver section is the component that configures the behavior of Test Kitchen. In this case we will be using the kitchen-vagrant driver that comes with Chef DK. We could easily configure this to use AWS or any other cloud compute provisioner.

The provisioner is chef_zero which allows us to use most of the functionality of integrating with a Chef Server without any of the overhead of having to install and manage one.

The platforms define the operating systems that we want to test against. Today we will only work with the CentOS platform as defined in this file. You can delete or comment out the Ubuntu line.

The suites section is where we define what we want to test. This includes a run_list with the cookies::default recipe.

Next, we will spin up the CentOS instance.

Preheat Oven

Note: Test Kitchen will automatically download the Vagrant box file if it’s not already available on your workstation. Make sure you’re connected to a sufficiently speedy network!

$ kitchen create

Let’s verify that our instance has been created.

$ kitchen list

➜  cookies git:(master) ✗ kitchen list
Instance             Driver   Provisioner  Last Action
default-centos-65    Vagrant  ChefZero     Created

This confirms that a local virtualized node has been created.

Let’s go ahead and converge our node, which will install Chef on the virtual node.

$ kitchen converge

Cookie LWRP prep

We need to create a LWRP resource and provider file and update our default recipe.

We create the LWRP base files using the chef cli included in the Chef DK. This will create the two files resources/cookie.rb and providers/cookie.rb

chef generate lwrp cookie

Let’s edit our cookie LWRP resource file and add a single supported action of create.

Edit the resources/cookie.rb file with the following content:

actions :create

Next edit our cookie LWRP provider file and define the supported create action. Our create method will log a message that includes the name of our new_resource to STDOUT.

Edit the providers/cookie.rb file with the following content:


use_inline_resources

action :create do
  log " My name is #{}"
end

Note: use_inline_resources was introduced in Chef version 11. This modifies how LWRP resources are handled to enable the inline evaluation of resources. This changes how notifications work, so read carefully before modifying LWRPs in use!

Note: The Chef Resource DSL method is actions because we are defining multiple actions that will be defined individually within the providers file.

We will now test out our new resource functionality by writing a recipe that uses it. Edit the cookies cookbook default recipe. The new resource follows the naming format of #{cookbookname}_#{resource}.

cookies_cookie "peanutbutter" do
  action :create
end

Converge the image again.

$ kitchen converge

Within the output:

Converging 1 resources
Recipe: cookies::default
  * cookies_cookie[peanutbutter] action create[2014-12-19T02:17:39+00:00] INFO: Processing cookies_cookie[peanutbutter] action create (cookies::default line 1)
 (up to date)
  * log[ My name is peanutbutter] action write[2014-12-19T02:17:39+00:00] INFO: Processing log[ My name is peanutbutter] action write (/tmp/kitchen/cache/cookbooks/cookies/providers/cookie.rb line 2)
[2014-12-19T02:17:39+00:00] INFO:  My name is peanutbutter

Our cookies_cookie resource is successfully logging a message!

Improving the Cookie LWRP

We want to improve our cookies_cookie resource. We are going to add some parameters. To determine the appropriate parameters of a LWRP resource we need to think about the components of the resource we want to modify.

Delicious delicious ingredients parameter

There are some basic common components of cookies. The essential components are fat, binder, sweetener, leavening agent, flour, and additions like chocolate chips or peanut butter. The fat provides flavor, texture, and spread of a cookie. The binder helps “glue” the ingredients together. The sweetener affects the color, flavor, texture, and tenderness of a cookie. The leavening agent adds air to our cookie, changing the texture and height of the cookie. The flour provides texture as well as the bulk of the cookie structure. All of the additional ingredients differentiate our cookies’ flavors.

A generic recipe would involve combining all the wet ingredients and dry ingredients separately and then blending them together adding the additional ingredients last. For now, we’ll lump all of our ingredients into a single parameter.

Other than ingredients, we need to know the temperature at which we are going to bake our cookies, and for how long.

When we add parameters to our LWRP resource, it will start with the keyword attribute, followed by an attribute name with zero or more validation parameters.

Edit the resources/cookie.rb file:

actions :create  

attribute :name, :name_attribute => true
attribute :bake_time
attribute :temperature
attribute :ingredients

We’ll update our recipe to incorporate these attributes.

cookies_cookie "peanutbutter" do
  bake_time 10
  temperature 350
  action :create
end

Using a Data Bag

While we could add the ingredients in a string or array, in this case we will separate them away from our code. One way to do this is with data bags.

We’ll use a data_bag to hold our cookie ingredients. Production data_bags normally exist outside of our cookbook, within our organization’s policy_repo. We are developing and using chef_zero, so we’ll include our data bag within our cookbook in the test/integration/data_bags directory.

To do this in our development environment we update our .kitchen.yml so that chef_zero finds our data_bags.

For testing our new resource functionality, add the following to the default suite section of your .kitchen.yml:

data_bags_path: "test/integration/data_bags"

At this point your .kitchen.yml should look like this.
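If you’re following along, the default suite in your .kitchen.yml would now read roughly like this (only the data_bags_path line is new; the rest of the file is unchanged):

suites:
  - name: default
    run_list:
      - recipe[cookies::default]
    data_bags_path: "test/integration/data_bags"

Next, create a directory to hold the items for our cookies_ingredients data_bag: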

$ mkdir -p test/integration/data_bags/cookies_ingredients

Create a peanutbutter item in our cookies_ingredients data_bag by creating a file named peanutbutter.json in the directory we just created:

  "id" : "peanutbutter",
  "ingredients" :
      "1 cup peanut butter",
      "1 cup sugar",
      "1 egg"

We’ll update our recipe to actually use the cookies_ingredients data_bag:

search('cookies_ingredients', '*:*').each do |cookie_type|
  cookies_cookie cookie_type['id'] do
    ingredients cookie_type['ingredients']
    bake_time 10
    temperature 350
    action :create
  end
end

Now, we’ll update our LWRP resource to actually validate input parameters, and update our provider to create a file on our node, and use the attributes. We’ll also create an ‘eat’ action for our resource.

Edit the resources/cookie.rb file with the following content:

actions :create, :eat

attribute :name, :name_attribute => true
# bake time in minutes
attribute :bake_time, :kind_of => Integer
# temperature in F
attribute :temperature, :kind_of => Integer
attribute :ingredients, :kind_of => Array

We’ll update our provider so that we create a file on our node rather than just logging to STDOUT. We’ll use a template resource in our provider, so we will create the required template.

Create a template file:

$ chef generate template basic_recipe

Edit the templates/default/basic_recipe.erb to have the following content:

Recipe: <%= @name %> cookies

<% @ingredients.each do |ingredient| %>
<%= ingredient %>
<% end %>

Combine wet ingredients.
Combine dry ingredients.

Bake at <%= @temperature %>F for <%= @bake_time %> minutes.

Now we will update our cookie provider to use the template, and pass the attributes over to our template. We will also define our new eat action, that will delete the file we create with create.

Edit the providers/cookie.rb file with the following content:


use_inline_resources

action :create do

  template "/tmp/#{}" do
    source "basic_recipe.erb"
    mode "0644"
    variables(
      :ingredients => new_resource.ingredients,
      :bake_time   => new_resource.bake_time,
      :temperature => new_resource.temperature,
      :name        =>
    )
  end
end

action :eat do

  file "/tmp/#{}" do
    action :delete
  end
end

Try out our updated LWRP by converging your Test Kitchen.

kitchen converge

Let’s confirm the creation of our peanutbutter resource by logging into our node.

kitchen login

Our new file was created at /tmp/peanutbutter. Check it out:

[vagrant@default-centos-65 ~]$ cat /tmp/peanutbutter
Recipe: peanutbutter cookies

1 cup peanut butter
1 cup sugar
1 egg

Combine wet ingredients.
Combine dry ingredients.

Bake at 350F for 10 minutes.

Peanut Butter Cookie Time

Let’s try out our eat action. Update our recipe with

search("cookies_ingredients", "*:*").each do |cookie_type|
  cookies_cookie cookie_type['id'] do
    action :eat

Converge our node, login and verify that the file doesn’t exist anymore.

$ kitchen converge
$ kitchen login
Last login: Fri Dec 19 05:45:23 2014 from
[vagrant@default-centos-65 ~]$ cat /tmp/peanutbutter
cat: /tmp/peanutbutter: No such file or directory

To add additional cookie types we can just create new data_bag items.
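For example, a hypothetical sugarcookie item (the ingredient list below is purely illustrative) saved as test/integration/data_bags/cookies_ingredients/sugarcookie.json would be picked up by the search in our recipe on the next converge:

{
  "id" : "sugarcookie",
  "ingredients" : [
      "1 cup butter",
      "1.5 cups sugar",
      "2 cups flour",
      "1 egg"
  ]
}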

Cleaning up the kitchen

Messy Kitchen

Finally once we are done testing in our kitchen today, we can go ahead and clean up our virtualized instance with kitchen destroy.

kitchen destroy

Next Steps

We have successfully made up a batch of peanut butter cookies yet barely scratched the surface of extending Chef with LWRPs. Check out Chapter 8 in Jon Cowie’s book Customizing Chef and Doug Ireton’s helpful 3-part article on creating LWRPs. You should examine and extend this example to use load_current_resource and updated_by_last_action. Try to figure out how to add why_run functionality. I look forward to seeing you share your LWRPs with the Chef community!

Feedback and suggestions are welcome

Additional Resources

Thank you

Thank you to my awesome editors who helped me ensure that these cookies were tasty!