Your Dockerfile for Rails

In the previous article, we deployed a typical Rails application using Docker. Once Docker was set up, the install was quite straightforward: just retrieve an image from the Docker Index and run it! Now, you may wonder how that image was built. So let’s watch the making of the previous episode!

For those who can’t wait, the “making of” is already published on GitHub as rails-meets-docker. So you can have a look at the Dockerfile straight away. But I assume you want to know more!

More Than One Way To Do It

There are a few ways to create a Docker image:

  • create a new image from a tarball using docker import
  • build an image on top of an existing one using docker build
  • save a customized container as an image with docker commit

The import command is useful for creating a Docker image from a vanilla Linux distribution, but that is about its only purpose. So let’s ignore import for now.

Most of the time, we leverage an existing image to create a new one. And it’s easy to do that “by hand”:

  1. spawn a container from the base image using docker run
  2. do the setup, install the packages, and so on
  3. docker commit the container as a new image
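
The three steps above can be sketched with the Docker CLI; the image name and container id below are hypothetical:

```shell
# 1. spawn a container from the base image and get a shell
docker run -i -t ubuntu /bin/bash
# 2. ... inside the container: apt-get install, edit files, then exit ...

# find the id of the stopped container
docker ps -a
# 3. commit that container as a new image
docker commit 1a2b3c4d5e6f myname/my-new-image
```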

Many Docker examples are based on this “manual build”. And this is fine for prototyping, really. But it lacks two things, at least:

  • we can’t easily diverge and make small changes to the build process
  • we can’t be sure about the exact steps required for the build

We are developers, so we prefer to write code whenever possible: we know the benefits of making things reproducible, easy to edit, and self-documenting. So let’s write some code to script the image build!

From the previous episode

Before going further, it’s best to recall a few things from the previous episode:

  • an image is immutable: it cannot be modified
  • a container is a living thing
  • a container is spawned from an image

The build process may involve some shell-script commands such as apt-get install to update the image and install new programs. But nothing can be executed inside an image because it’s frozen, right? This means that docker build may proceed like this:

  1. spawn a container from an image
  2. run shell scripts inside the container
  3. save the result: commit the container as an intermediate image
  4. proceed to next build step

This is just like a manual build, except that it’s automated!

Dockerfile

The Dockerfile is a script that describes the docker build process. Its syntax is inspired by the Unix Shell and is easy to learn:

  • there is one statement per line: a Docker instruction and its arguments
  • shell-like comments and empty lines are ignored

The Dockerfile syntax supports about a dozen instructions. Here is a short overview:

  • FROM sets the name of the base image
  • RUN runs a shell command inside the container
  • ADD imports files from the local filesystem into the container
  • the other instructions are about how to run the image

Remember that RUN and ADD both operate on a temporary container that Docker will create for the purpose of the build. It then commits the container into an image and proceeds to the next step.

Now, we are ready to dive into real Dockerfile action!

FROM fcat/ubuntu-universe:12.04

RUN apt-get -qy install git vim tmux
RUN apt-get -qy install ruby1.9.1 ruby1.9.1-dev build-essential libpq-dev libv8-dev libsqlite3-dev
RUN gem install bundler
RUN adduser --disabled-password --home=/rails --gecos "" rails

ADD docrails/guides/code/getting_started /rails
ADD scripts/start /start
ADD scripts/setup /setup
RUN su rails -c /setup

EXPOSE 3000
USER rails
CMD /start

I’ve rearranged the content for easy reading, but this is almost exactly what I used to build the fcat/rails-getting-started Docker image. It proceeds in 3 main steps:

  1. set the base image
  2. add files and run commands
  3. set the specs

The base image

The base image given to the FROM instruction can be either local or remote. And Docker is kind enough to automatically fetch (and cache) the remote image when it’s missing locally!

The meat of the build

As you can see, the RUN instruction simply passes the commands verbatim to the shell. We tune the commands so that they don’t try to interact with us. For instance, apt-get -qy simply means: be quiet and assume “yes” for any question.

The ADD command copies a file or an entire directory from the current directory (where you build) to the container (what you build). Warning! Don’t put quotes in the destination path!
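
For instance, both forms below are valid; the paths are hypothetical:

```dockerfile
# copy a single file
ADD config/database.yml /rails/config/database.yml
# copy a whole directory, recursively
ADD vendor/cache /rails/vendor/cache
```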

Everything happens as if we were logged in as root typing commands inside the container:

  • files we ADD belong to root
  • scripts are run with root privileges

There is no problem with that, but you may want to change the owner of the files you copy (using chown) or switch to another user before running a command. If you are not too familiar with Unix tools, here is how to run “touch /tmp/hereiam” as user “nobody”:

su nobody -c "touch /tmp/hereiam"

The Presets

The last part of the Dockerfile contains the specs of the image:

EXPOSE 3000
USER rails
CMD /start

They behave like defaults for the docker run command. They describe:

  • the command to start when spawning the container
  • the user who starts that final command
  • the network ports to expose

It’s possible to override all these values with docker run, but it’s best practice to embed the presets in the image. See also ENV and VOLUME.
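
To illustrate, here is how the presets can be overridden at docker run time (a sketch; adapt the image name to yours):

```shell
# override USER and CMD to get a root shell inside the image
docker run -i -t -u root fcat/rails-getting-started /bin/bash
# override the network preset: map host port 8080 to container port 3000
docker run -d -p 8080:3000 fcat/rails-getting-started
```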

Please note that “preset” is not the official term in Docker documentation; this is just a quick reminder you may find helpful.

Scripts to make your life easier

As you may have noticed, the Dockerfile adds 2 shell scripts to our image:

ADD scripts/start /start
ADD scripts/setup /setup

Their purpose is to extract logic out of the Dockerfile, which helps both maintenance and build speed. And these two save us a significant amount of time!

The /start script is the command to run when the image “starts”. It will be something like rails server in our case. It matches the CMD instruction in the Dockerfile:

CMD /start

This is just a convention of mine, but it helps when entering the container in console mode: you can mimic a normal docker run simply by typing /start.
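
To make this concrete, a minimal /start script could look like this; it’s only a sketch, and the exact server command depends on your application:

```shell
#!/bin/bash
set -e

cd /rails
# exec replaces the shell, so the server becomes the container's main process
exec bundle exec rails server -p 3000
```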

The setup script is all about running Bundler, tuning configuration files, and running database migrations. Nothing special here, but it means the resulting Docker image is ready to use!

Here is a basic /setup script:

#!/bin/bash
set -e   # abort on the first failing command

cd /rails
bundle install   # install the gems listed in the Gemfile
rake db:migrate  # bring the database schema up to date
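
A word about set -e: it aborts the script on the first failing command, so a broken bundle install fails the whole docker build step instead of being silently ignored. The behavior is easy to check with plain bash:

```shell
# set -e makes the inner script abort at `false`;
# the echo after it is never reached, and the failure propagates.
bash -c 'set -e; false; echo "not reached"' || echo "the setup failed"
# prints "the setup failed"
```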

One last thing: having these scripts makes prototyping easier, as you can check the “setup” and the “start” without having to fire a new build.

Ready to build

By now, we have our Dockerfile and the files to add to the image:

$ ls -1 Dockerfile scripts/*
Dockerfile
scripts/setup
scripts/start

Building the image is now trivial:

$ docker build .

Step 1 : FROM fcat/ubuntu-universe:12.04
 ---> 3ce111668a02
...
Step 12 : CMD /start
 ---> Running in d73ab04860c1
 ---> 3248af6376ee
Successfully built 3248af6376ee

The command returns the unique id of the new image. This is OK, but it would be easier to give a name to the image straight away. So I suggest you register on the Docker Index and give a pretty name to your image, based on your account name.

My login is “fcat” and I came up with this “rails-getting-started” name for my image, so here is the full build command:

$ docker build -t fcat/rails-getting-started .

This is it really. And, by the way, I’m ready to push!

$ docker push fcat/rails-getting-started

The push refers to a repository [fcat/rails-getting-started] (len: 1)
Sending image list

Please login prior to push:
...

Pushing repository fcat/rails-getting-started (1 tags)
Pushing 8dbd9e392a964056420e5d58ca5cc376ef18e2de93b5cc90e868a1bbc8318c1c
...

As we said before, Docker creates an incremental set of images, from the base image (see the FROM instruction) to the resulting one. And the final one is given our pretty name.

By the way, this implies that docker build may create a lot of intermediate images! The previous build features 12 steps, so I now have 12 local images. Run docker images -a to get the full image list.

About the cache

Was your first build successful? Great! But it took a while, didn’t it? And you may want to make more adjustments to your Dockerfile. At the end of the day, it will probably take a lot of time, right?

There is good news for you: Docker build has a cache where it stores every single image it creates. This means that Docker should be able to reuse many intermediate images the next time you ask for a build.

Here is what we get if we rerun the exact same build:

$ docker build -t fcat/rails-getting-started .

Step 1 : FROM fcat/ubuntu-universe:12.04
 ---> 3ce111668a02
Step 2 : RUN apt-get -qy install git vim tmux
 ---> Using cache
 ---> bdf910ca1d22
Step 3 : RUN apt-get -qy install ruby1.9.1 ruby1.9.1-dev build-essential libpq-dev libv8-dev libsqlite3-dev
 ---> Using cache
 ---> 0430324fb5b3
Step 4 : RUN gem install bundler
 ---> Using cache
 ---> 8a3096a60ec7
Step 5 : RUN adduser --disabled-password --home=/rails --gecos "" rails
 ---> Using cache
 ---> d01a7d5a984b
Step 6 : ADD docrails/guides/code/getting_started /rails
 ---> 808cfff0f433
Step 7 : ADD scripts/setup /setup
 ---> 941accc5712e
Step 8 : RUN su rails -c /setup
 ---> Running in 54aec83841a2
Fetching gem metadata from https://rubygems.org/..........
Fetching gem metadata from https://rubygems.org/..
Resolving dependencies...
Installing rake (10.1.0)
Installing i18n (0.6.4)
...
Successfully built 3248af6376ee

The very first build took about 30 minutes on my Intel® Core™ i5 CPU, but the second one took less than 4 minutes. That’s a lot better!

So Docker has a smart cache system. But this is not the silver bullet you are looking for. As you can see, Docker was not able to reuse the result of the setup script. This makes sense, as this script calls bundle install, and the result may vary over time.

We have a similar issue with apt-get update: when running this command, the resulting image will never be reused. Fortunately, the cache works fine with apt-get install.

The making of fcat/rails-getting-started taught me a few tips to make the build faster:

  • group related commands in shell scripts, like the setup one
  • run these scripts at the end of the Dockerfile (we know the result will never be reused)
  • split your build and your Dockerfile as needed
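
The first tip deserves a quick illustration: order your instructions from the most stable to the most volatile, so a change to the application code only invalidates the cache from that point on. A hypothetical sketch:

```dockerfile
# stable steps first: these hit the cache on every rebuild
RUN apt-get -qy install ruby1.9.1 build-essential
RUN gem install bundler

# volatile steps last: only these are re-run when the application changes
ADD docrails/guides/code/getting_started /rails
RUN su rails -c /setup
```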

Going to production

Now let’s practice a little bit. Let’s pretend you want to adjust rails-getting-started to bring it closer to your production requirements. The minimal modification would be to switch from WEBrick to a real Ruby application server.

So let’s move to thin, to keep things easy. The resulting image will be fcat/rails-getting-started-thin, to match my “fcat” account.

The first step is to define the base image:

FROM fcat/rails-getting-started

Then, create a new Gemfile with ‘thin’ enabled. Add it to the container:

ADD Gemfile /rails/Gemfile

Update the setup script to compile the assets:

ADD setup /setup

The file is owned by root. We have to fix that:

RUN chown rails:rails /rails/Gemfile

The setup script from the base image will work just fine. But we have to run it again to update the gem bundle:

RUN su rails -c /setup

The content of the new image is now good enough for us. But the image still needs some presets to make it ready for use:

ENV RAILS_ENV production
EXPOSE 3000
USER rails
CMD /start

We set the RAILS_ENV environment variable using Docker. From now on, the default Rails environment will be “production” whenever a new container is spawned from the image.
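
You can verify the preset from a throwaway container; this is a sketch, assuming the image has been built under that name:

```shell
# print the environment of a fresh container;
# RAILS_ENV=production should show up in the output
docker run fcat/rails-getting-started-thin env
```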

The Dockerfile and the Gemfile are ready. Let’s build!

$ docker build -t fcat/rails-getting-started-thin .

Now, we can share the new image on the Docker Index again:

$ docker push fcat/rails-getting-started-thin

You should be able to run this Rails application easily:

$ docker run -d -p 5000:3000 fcat/rails-getting-started-thin

Track your build

It may take some time to get familiar with the Docker build process, but in the end it is really worth it: with a little practice, creating a new Dockerfile is a piece of cake, and it’s very easy to share your images and reuse other people’s! Dockerfiles make your build trackable, easy to edit, and self-documenting.