When production differs from development. A tale of two web apps.

I've been meaning to blog about Docker for a while. When blogging about technology I like to cover pros and cons (spoiler: the pros outweigh the cons).

One of the best "pros" is that the same Docker image used in development makes it all the way to production with no changes. You might use the image slightly differently in production than in development, but you are using the same Docker image.

I had an experience recently that really drove that home, and that experience is what this post is about.

Some background information. Last year we did a one-day training bootcamp at Sequoia. The bootcamp theme was Amazon Web Services (AWS). One of my presentations covered AWS CloudFormation, where I discussed my experience migrating my personal website to AWS. (The length of my presentations tends to be directly proportional to the amount of effort it takes to create them. You can check out the more concise blog.)

My personal website "www.robhughes.net" is a simple Ruby on Rails (RoR) app. In my development environment I use WEBrick as the web server and SQLite as the backing database.

For the production environment I like Apache httpd for the web server, Phusion Passenger to make httpd "ruby aware" and Amazon RDS (MySQL flavor) as the backing database.

And right there is where I think I made a mistake. Thinking about production differently than development. It's too much of a pain to have a separate installation of httpd and a separate database for each web app that I might work on. Docker can help with that.

CloudFormation allows you to take a configuration-driven approach to deploying an application environment. The configuration is captured in a "JSON template" which can then be used to create "cookie cutter" copies of your application. I love CloudFormation. Part of the CloudFormation template for my web app details how to provision the virtual servers, called EC2 instances, that make up the web tier of the application.

Some Amazon EC2 instances come with a tool called "cloud-init" installed. As EC2 instances spin up, cloud-init takes metadata passed to it from the CloudFormation service and uses it to install the required software and start the services that run on the server. For my web app that involves installing things like Apache httpd, ruby 2.1 from the Amazon YUM package repos and the Phusion Passenger gem from RubyForge. The Phusion Passenger gem requires other ruby gems. Ruby gems often build native extensions when they are installed, which requires additional tools like compilers, make/automake, scripting languages, etc. (I encountered a gem the other day that required the "patch" executable to be installed.)

At this point you can see that my production environment is significantly different from my development environment. That makes the production environment more complex, which makes it more susceptible to breakage and leads to "but it works in development".

The main mistake I made was in not locking down the ruby gem dependencies for Phusion Passenger. Over time new versions of gems are released. As EC2 instances die off and new ones are created, those new gems get pulled in. Gem dependencies change over time too: either different versions of gems are required or new gems are required. It leads to a condition I call "ruby gem version madness". A couple of times over the past six months I checked in on my website only to find it non-responsive. Investigating, I would find entries in various logs indicating that "this and such" required gem was not installed. After days of trying to find a stable set of ruby gems I found myself thinking "why is this so hard?" There must be a better way.
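To make "locking down" a little more concrete, here is a minimal sketch of how a version constraint narrows what a fresh install is allowed to pull in. It uses the Gem::Requirement and Gem::Version classes that ship with RubyGems. The version numbers and the `allowed?` helper are made up for illustration; they are not the versions from my template.

```ruby
require "rubygems"

# Hypothetical constraint: accept 5.0.15 and later 5.0.x releases,
# but never 5.1 or anything newer.
PINNED = Gem::Requirement.new("~> 5.0.15")

# Returns true when the candidate version string satisfies the constraint.
def allowed?(requirement, candidate)
  requirement.satisfied_by?(Gem::Version.new(candidate))
end

# Show which of a few made-up releases a new server would be allowed to install.
%w[5.0.14 5.0.15 5.0.30 5.1.0].each do |version|
  puts "#{version}: #{allowed?(PINNED, version)}"
end
```

With a constraint like this, a new EC2 instance can still pick up patch releases but not a release that changes the dependency list out from under you. A Bundler Gemfile.lock goes one step further and records exact versions.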

It turns out the folks who created Phusion Passenger also created Docker images containing Phusion Passenger. Smart. The Docker image I found allows you to select from various Ruby versions and to enable/disable Nginx as the web server. The Phusion Passenger website contains excellent documentation, and they didn't skimp on the documentation for the Phusion Passenger Docker image. Between the documentation and other Docker community contributors, I soon had a Docker image running my RoR app with Nginx and Phusion Passenger as the front end.

Getting that Docker image into the AWS environment required a few tweaks to my existing CloudFormation template. I was pleasantly surprised to find I had significantly reduced the amount of cloud-init logic needed to provision a new EC2 instance. I'd estimate the number of executable lines of configuration logic dropped from around 90 to four. When things like that happen I usually feel like I am going in the right direction.

Those four lines of logic are basically:

  1. Install the Docker package
  2. Start the Docker service
  3. Start the rhdndocker (aka Rob Hughes Dot Net Docker) container. This also pulls the rhdndocker image from Docker Hub to the EC2 instance.
  4. Return status to CloudFormation to signal success or failure.
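
The four steps above can be sketched roughly as the shell commands a boot script would run. This is an illustration only, not a copy of my template. The `boot_commands` helper, the image repository name, the port mapping, and the stack, resource and region values are all assumptions. It also assumes an Amazon Linux instance, where Docker is installed with yum and the CloudFormation helper script cfn-signal lives under /opt/aws/bin.

```ruby
require "shellwords"

# Builds the four boot-time steps as an array of shell command strings.
# All argument values are hypothetical placeholders supplied by the caller.
def boot_commands(image:, stack:, resource:, region:)
  [
    # 1. Install the Docker package.
    "yum install -y docker",
    # 2. Start the Docker service.
    "service docker start",
    # 3. Start the container; docker pulls the image if it is not present.
    #    Publishing port 80 is an assumption about how the web server is exposed.
    "docker run -d -p 80:80 #{image.shellescape}",
    # 4. Report the exit status of the previous command to CloudFormation.
    "/opt/aws/bin/cfn-signal -e $? --stack #{stack.shellescape} " \
      "--resource #{resource.shellescape} --region #{region.shellescape}"
  ]
end

puts boot_commands(
  image: "example/rhdndocker",   # hypothetical Docker Hub repository name
  stack: "my-stack",             # hypothetical stack name
  resource: "WebServerGroup",    # hypothetical logical resource ID
  region: "us-east-1"            # hypothetical region
)
```

The script run at boot would execute these lines in order, so the `$?` in the last step carries the exit status of `docker run`.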

Also, CloudFormation passes the Amazon RDS (MySQL) database connection parameters to cloud-init, which in turn passes them into the Docker container as environment variables so the container can connect to its backing database.
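
On the receiving end, the app inside the container only has to read those environment variables. Here is a minimal sketch of what that could look like. The variable names (DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD), the `database_settings` helper and the "mysql2" adapter are assumptions for illustration, not the names my container actually uses.

```ruby
# Builds database connection settings from environment variables.
# The variable names are hypothetical; fetch raises KeyError when a
# required variable is missing, so a misconfigured container fails fast.
def database_settings(env = ENV)
  {
    "adapter"  => "mysql2",                                # assumed MySQL adapter
    "host"     => env.fetch("DB_HOST"),
    "port"     => Integer(env.fetch("DB_PORT", "3306")),   # default MySQL port
    "database" => env.fetch("DB_NAME"),
    "username" => env.fetch("DB_USER"),
    "password" => env.fetch("DB_PASSWORD")
  }
end

# Example with made-up values, passed as a plain hash instead of the real ENV.
example = database_settings(
  "DB_HOST" => "db.example.internal",
  "DB_NAME" => "blog",
  "DB_USER" => "app",
  "DB_PASSWORD" => "secret"
)
puts example["host"]
```

Because the settings come from the environment and not from the image, the same image can be pointed at a different database just by starting the container with different values.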

Now the production environment is basically Docker running my Nginx/Phusion Passenger/Rails container and talking to Amazon RDS (MySQL).

But what does my development environment look like? You got it. Virtually the same. Docker running my Nginx/Phusion Passenger/Rails container and talking to a linked Docker MySQL container. It also allows me to run as many Nginx/Phusion Passenger/MySQL instances as I have ideas for new web apps without polluting my laptop with lots of software.

For more information check out these links: