
Dock Tales: Docker Authoring, with Special Guest Mule ESB

Posted by Sohrab Hosseini on 30 March 2015

docker, tech, mule, docktales

If you followed my previous rambling, you might feel like we are due for some hands-on Dockering. So let’s get technical!

There are plenty of Docker “tutorials” out there. They show you step-by-step what to do, but I believe that, more often than not, they fail to tell you why. And by the time you finish these tutorials, you end up with something very hacky that you would not feel comfortable deploying to production; simplicity seems to often mean sacrificing best practices.

Well not here; here I will pick up where all those other articles left off. We will concentrate on the thought process that goes behind authoring Docker images and the workflow that leads to a container that you feel comfortable putting your name against. You know what they say: Build a man a fire, and he’ll be warm for a day. Set a man on fire, and he’ll be warm for the rest of his life. (RIP, Terry Pratchett.)

Our goal today is a productionised Docker image of a Mule ESB application, running on Mule ESB Community Edition. It is worthwhile to mention that I am using Mule ESB as an example and the principles outlined here should assist you with your other Dockering endeavours as well.

I have broken this down into rounds: we will start with a hack and improve it each round. It goes without saying that some basic knowledge of Docker is assumed. To get the most out of this article, you should know Dockerfile syntax, docker commands and shell scripting. (Turns out that, for better or worse, 70% of Dockering is shell scripting.) These concepts are covered by many people in many articles; you will have a hard time not finding some suitable pre-reading.

Workspace Setup

I have spoken at length about Docker's low barrier to entry, and nowhere is this more apparent than in the steps required to get a Docker authoring environment up:

  1. Install Docker

I highly recommend a Linux operating system as your Docker authoring environment. After all, this is a Linux technology, so the workflow is very natural to that operating system. For the less fortunate, there is boot2docker. This application allows you to run Docker on non-Linux operating systems by using a hypervisor to run a small, in-memory Linux VM that hosts the Docker daemon.

Your mileage with boot2docker will vary by OS. On Mac OS X, the docker client is available in Terminal and talks to the hypervisor transparently. On Windows, however, things are less integrated and you have to SSH into the hypervisor to perform any docker commands.

If you really have to use Windows, firstly you have my sympathies. But you may want to look into using PuTTY to SSH into boot2docker, as opposed to the cumbersome Windows Command Prompt.

Also please note that a common gotcha with boot2docker is that people forget about the hypervisor. For example, when you bind an exposed port, it will not be bound to localhost but instead to whatever IP the boot2docker VM has been assigned (boot2docker ip will tell you).
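
For example, once we have built the sample-app image later in this article, you would reach its mapped port via the VM's address rather than localhost:

# curl http://$(boot2docker ip 2>/dev/null):8080

(The 2>/dev/null is there because some boot2docker versions print a descriptive message to stderr alongside the IP.)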

Round 1: Conceptualise!

Authoring a Docker image is basically installing and configuring all the necessary software to run the application. These steps do not stray too far from how you would do them on a physical machine. In our example, if I were given a fresh Linux installation, I would need to:

  1. Install Oracle JRE,
  2. Install Mule Server CE, and
  3. Install Mule application in Mule server

Before I can figure out these steps though, I need to choose a target Linux distribution. Containers are usually created on top of base images. The Docker official repository offers a wide range of base images to choose from. Best practices recommend Debian as the base image since it is small enough (~90 MB) while still being a full distribution. And who are we to argue with best practices?

If I were to run a bash script that performs the above installation steps on a Debian distro, it would look something like this:

#!/bin/bash

cd /tmp

# install Oracle JRE
wget --no-check-certificate --no-cookies \
     --header "Cookie: oraclelicense=accept-securebackup-cookie" \
    http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jre-7u75-linux-x64.tar.gz
sudo tar -zxf jre-7u75-linux-x64.tar.gz -C /opt
sudo ln -s /opt/jre1.7.0_75 /opt/jre
sudo update-alternatives --install /usr/bin/java java /opt/jre/bin/java 100

export JAVA_HOME=/opt/jre
export JRE_HOME=$JAVA_HOME

# install Mule CE
wget --no-check-certificate \
    https://repository-master.mulesoft.org/nexus/content/repositories/releases/org/mule/distributions/mule-standalone/3.6.1/mule-standalone-3.6.1.tar.gz
sudo tar -zxf mule-standalone-3.6.1.tar.gz -C /opt
sudo ln -s /opt/mule-standalone-3.6.1 /opt/mule

export MULE_HOME=/opt/mule
export PATH=$PATH:$MULE_HOME/bin

# install Mule app
sudo cp sample-app.zip /opt/mule/apps/

# run server
mule

This may look like a lot of code but we are basically downloading and extracting Oracle JRE and Mule CE into the /opt folder, creating symbolic links for ease of use and setting up some environment variables. Then we copy our Mule app to the Mule server's hot-deployment folder, assuming the app was already in /tmp. Finally, we start the Mule server.

Round 2: Dockerise!

We use a Dockerfile to describe how a Docker image is configured. Dockerfile has a very simple syntax and you can even guess what most of the commands do without consulting the reference.

So let us convert our shell script to a Dockerfile:

FROM debian:wheezy
MAINTAINER sohrab <sohrab@dpe.com.au>

# install supporting tools
RUN apt-get update
RUN apt-get install -y procps wget

WORKDIR /tmp

# install Oracle JRE
RUN wget --no-check-certificate --no-cookies \
         --header "Cookie: oraclelicense=accept-securebackup-cookie" \
         http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jre-7u75-linux-x64.tar.gz
RUN tar -zxf jre-7u75-linux-x64.tar.gz -C /opt
RUN ln -s /opt/jre1.7.0_75 /opt/jre
RUN update-alternatives --install /usr/bin/java java /opt/jre/bin/java 100

ENV JAVA_HOME /opt/jre
ENV JRE_HOME $JAVA_HOME

# install Mule CE
RUN wget --no-check-certificate \
         https://repository-master.mulesoft.org/nexus/content/repositories/releases/org/mule/distributions/mule-standalone/3.6.1/mule-standalone-3.6.1.tar.gz
RUN tar -zxf mule-standalone-3.6.1.tar.gz -C /opt
RUN ln -s /opt/mule-standalone-3.6.1 /opt/mule

ENV MULE_HOME /opt/mule
ENV PATH $PATH:$MULE_HOME/bin

# install Mule app
COPY sample-app.zip /opt/mule/apps/

# run server
CMD ["mule"]

The first line defines the base image that we are using, and MAINTAINER is just a documentation command. We also need to install a couple of supporting tools, like wget and ps (used by the Mule start-up script), since the Debian base image is a bit bare-bones to keep its size down. The rest of the commands are similar in functionality to their bash counterparts.

You may note we no longer need to use sudo to perform these actions, as Docker runs everything inside the container as root by default. This can be controlled using the USER command in the Dockerfile.
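
You can quickly confirm this default for yourself:

# docker run --rm debian:wheezy whoami
root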

The rule of thumb for containerisation is to limit each container to a single main process. In our example, this is the Mule server. While it is possible to run and monitor multiple processes, using tools like supervisord, it is much cleaner to decouple each concern into its own container.

Assuming you have sample-app.zip in the same folder as this Dockerfile, you can now build and run your container. To do so, navigate to that folder and execute:

# docker build --tag sample-app .
# docker run -it sample-app

Compare: Round 1 to Round 2

Round 3: Optimise!

A Docker image is essentially a multi-layer file system. Once the container is running, these layers are flattened to create one cohesive file system. Almost every line of our Dockerfile creates a layer that is stacked on top of the layer created by the previous line.

The runtime performance of a container may suffer when there are too many layers, especially if the application needs to modify a file stored in a much lower layer. This is because all lower layers are read-only and require copy-on-write to be modified. As such, a good practice is to keep image layers to a minimum, for example by merging neighbouring RUN commands into a single RUN command.
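
For example, these two commands produce two layers:

RUN apt-get update
RUN apt-get install -y wget

while chaining them produces just one:

RUN apt-get update && apt-get install -y wget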

My rule of thumb is to let each layer provide a specific function, e.g. a layer to install Oracle JRE, and another layer to install Mule CE. This may also help with layer reusability if you have similar commands, in similar order, in other Dockerfiles.

Before I show you the updated Dockerfile, there is another optimisation we can make. If you run the docker images command, you will realise that the Debian image is, at the time of writing, 84.98 MB, while your sample-app image is clocking in at 541.90 MB.

It is to your benefit to make the image as small as possible. Not only does it improve the runtime performance of the container, it also speeds up pushing and pulling images to and from registries. Looking at our Dockerfile, a quick way to reduce the size is to clean up after each piece of software we install. Ideally, you want to perform the installation and clean-up in the same layer, to avoid creating a large intermediate layer, which would defeat the purpose of cleaning up.

In my examples, you can see that I am in the habit of cleaning up as the last chain in a RUN command, for example, invoking apt-get clean following an apt-get install.

FROM debian:wheezy
MAINTAINER sohrab <sohrab@dpe.com.au>

# install supporting tools
RUN apt-get update && \
    apt-get install -y procps wget && \
    apt-get clean && \
    apt-get purge

WORKDIR /tmp

# install Oracle JRE
RUN wget --no-check-certificate --no-cookies \
         --header "Cookie: oraclelicense=accept-securebackup-cookie" \
         http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jre-7u75-linux-x64.tar.gz && \
    tar -zxf jre-7u75-linux-x64.tar.gz -C /opt && \
    ln -s /opt/jre1.7.0_75 /opt/jre && \
    update-alternatives --install /usr/bin/java java /opt/jre/bin/java 100 && \
    rm -rf jre-7u75-linux-x64.tar.gz

ENV JAVA_HOME /opt/jre
ENV JRE_HOME $JAVA_HOME

# install Mule CE
RUN wget --no-check-certificate \
         https://repository-master.mulesoft.org/nexus/content/repositories/releases/org/mule/distributions/mule-standalone/3.6.1/mule-standalone-3.6.1.tar.gz && \
    tar -zxf mule-standalone-3.6.1.tar.gz -C /opt && \
    ln -s /opt/mule-standalone-3.6.1 /opt/mule && \
    rm -rf mule-standalone-3.6.1.tar.gz /opt/mule/apps/default /opt/mule/src

...

You will be interested to know that, following these changes, the size of the image drops to 368.90 MB. You can further analyse the size of each intermediate layer using the docker history command and optimise as needed.
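
For example:

# docker history sample-app

This lists every layer in the image, along with its size and the command that created it.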

Compare: Chain RUN commands

Compare: Add clean-up steps

Compare: Round 2 to Round 3

Round 4: Productionise!

Once you start building more and more images, some best practices tend to emerge. Here I will share some of the ones we have come across during our engagements.

Allow Complex CMD

Seldom, in our experience, does the CMD command end up being a single action. As a result, we have developed the habit of encapsulating all the start-up actions in a standard script to allow for future enhancements.

We do this by introducing a start.sh file to the root folder. Currently, the script is very simple, but even before the end of this article, we will have used this mechanism to perform more complex actions.

#!/bin/bash

echo "Starting Mule CE Server"
exec mule

I must also draw your attention to the exec command. This standard shell built-in replaces the shell process with the application process, rather than running the application as a child of the shell. This means that any Unix signal sent to the container is received by your application, rather than being captured by the shell. So I recommend always ending your script with an exec.
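
A quick way to verify this once the container is up (assuming docker exec is available in your Docker version, and sample is your running container's name):

# docker exec sample ps -p 1 -o args=

Without the exec, PID 1 would be /start.sh; with it, you should see the Mule launcher instead.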

Obviously, we need to also modify the Dockerfile to add start.sh into the container and assign it as the CMD command.

...

# run server
COPY start.sh /start.sh
RUN chmod +x /start.sh
CMD ["/start.sh"]

Compare: Externalise Docker command to start.sh

Refactor Constants

Similar to programming, you want to factor out constants that may change at a later date into the preamble of your Dockerfile. For example, factoring out the version of the Mule runtime allows another developer to upgrade the version without getting tangled in all your Dockerfile logic.

Variables in a Dockerfile are represented by environment variables, similar to shell scripts, with the difference being that they are exported by default.

...

ENV MULE_VERSION 3.6.1

...

# install Mule CE
RUN wget --no-check-certificate \
         https://repository-master.mulesoft.org/nexus/content/repositories/releases/org/mule/distributions/mule-standalone/${MULE_VERSION}/mule-standalone-${MULE_VERSION}.tar.gz && \
    tar -zxf mule-standalone-${MULE_VERSION}.tar.gz -C /opt && \
    ln -s /opt/mule-standalone-${MULE_VERSION} /opt/mule && \
    rm -rf mule-standalone-${MULE_VERSION}.tar.gz /opt/mule/apps/default /opt/mule/src

...

I have also attempted a similar refactoring for Oracle JRE but it is not nearly as elegant, because consistency is a lost art to some people.

Compare: Factor out versions into environment variables

Drop Privileges

I have seen a lot of Dockerers (Dockerites? – still working on it) run their applications inside the container as root, since that is what Docker defaults to. Running applications in a container does not suddenly absolve one of common-sense security practices. There is another article in the pipeline about Docker security but for now, just assume that if it was a bad idea outside the container, it is a bad idea inside too.

To this end, I prefer the Mule server to run as its own user, which has permissions for the Mule installation and nothing else.

...

# run Mule server as non-root
RUN useradd mule && \
    chown -RL mule /opt/mule
ENV RUN_AS_USER mule

...

Here, the Mule start-up script gives us a mechanism to run as another user by simply setting an environment variable. Obviously, the approach will differ based on what software is being containerised, but ensure that you are observing the principle of least privilege where possible.
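
For software that offers no RUN_AS_USER-style switch, the Dockerfile's own USER command achieves a similar result; everything after it, including the container's main process, runs as that user. A minimal sketch (the user and path here are illustrative):

...

RUN useradd app && \
    chown -R app /opt/app
USER app

...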

Compare: Run Mule server as non-root

Persist State Outside the Container

Containers must be ephemeral and must not hold state. Unfortunately, we do not live in a dream-like fairyland with stateless containers as far as the eye can see. In real life, eventually something has to hold state.

If you are committed to containerising everything (something I have often been accused of taking too far), you need to mount your physical disk into the container so your application can write to it.

For example, I want all the Mule and application logs to be persisted outside the container so they are preserved even if the container is destroyed or replaced by a new instance. The VOLUME command does just that.

...

VOLUME /opt/mule/logs

...

We can now modify our docker run command to use this volume:

# docker run -it -v /data/mule-app:/opt/mule/logs sample-app

We tend to pick a common location, like /data or /volumes, to store all mounted volumes for running containers. This means that migrating a server boils down to moving this directory to the new server and running the containers again.
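
For example, migrating all container state to a new server could be as simple as (a sketch, assuming the /data convention above; newserver is illustrative):

# tar -czf volumes.tar.gz /data
# scp volumes.tar.gz newserver:/tmp/

and, on the new server:

# tar -xzf /tmp/volumes.tar.gz -C /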

One aspect that tends to confuse users is that the -v argument mounts a physical directory or file into the container, and not the other way around. In our example, Docker will blow away whatever is stored in /opt/mule/logs inside the container and replace its content with the content of /data/mule-app.
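
A quick way to convince yourself of the direction (the paths are illustrative):

# mkdir -p /data/mule-app && touch /data/mule-app/marker.log
# docker run --rm -v /data/mule-app:/opt/mule/logs debian:wheezy ls /opt/mule/logs
marker.log

The host's file shows up inside the container, not the other way around.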

Compare: Mount log folder as a volume

Let the Applications Out

The sample-app Mule application is actually an HTTP service, listening on 0.0.0.0, port 9000. For this application to be of any use, we need to expose this port outside the container.

...

EXPOSE 9000

...

The EXPOSEd port can be mapped to any physical port on the host. For example, the following run command maps it to 8080:

# docker run -it -p 8080:9000 -v /data/mule-app:/opt/mule/logs sample-app

Compare: Expose HTTP port of the Mule app

Compare: Round 3 to Round 4

Bonus Round: Parameterise!

Another use case we encounter regularly is promoting containers between different environments, e.g. from development to test or from UAT to production. Each environment has specific configurations that need to be applied to the application inside the container, for example different HTTP endpoints for prod and non-prod.

Obviously, re-building the image each time we want to promote the application defeats the purpose of using containers; in the context of Continuous Delivery, it is akin to blasphemy. The solution is to build parameterised Docker images. There are two common ways to configure a container at runtime:

  • Mounting configuration files as volumes
  • Leveraging environment variables

I tend to use the former for one-off applications. For example, I would install Go CD server only once per client project so it makes sense to externalise its configuration to the real file system. On the other hand, if the containers are changed often, e.g. each code commit spins up a new Mule app container, I prefer the latter approach since no file system clean-up is required once the container is decommissioned.
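
As a sketch of the former approach (the host path here is illustrative), an environment-specific properties file can be bind-mounted straight over the one expected inside the container:

# docker run -it -v /data/sample-app/sample-app.properties:/opt/mule/conf/sample-app.properties sample-app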

The parameterisation is usually achieved through configuration-file templating. While not the most light-weight approach, we tend to use erb templates for this purpose. erb templates are almost the de facto standard in the DevOps world, and using them reduces migration overhead when moving templates from the likes of Chef or Puppet.

Everyone has a variation on this, but my technique is to place the erb files in the /build directory of the container and compile them on container start-up as needed.

In our example, let us assume that the Mule application is expecting sample-app.properties to be present on the classpath. We start with the template:

sample.user.name=<%= ENV['SAMPLE_USER_NAME'] %>

Which is placed inside the container:

...

COPY sample-app.properties.erb /build/sample-app.properties.erb

...

And compiled at container start-up (the start.sh approach paying for itself already):

...

# compile the templated properties file, if not already present
PROPERTIES_FILE=/opt/mule/conf/sample-app.properties
if [ ! -f "$PROPERTIES_FILE" ] ; then
    echo "Generating $PROPERTIES_FILE"
    erb /build/sample-app.properties.erb > "$PROPERTIES_FILE"
    chown mule "$PROPERTIES_FILE"
fi

...
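
One prerequisite worth calling out: the erb binary ships with Ruby, so Ruby must be present in the image. If your base image does not already include it, something along these lines in the Dockerfile covers it (a sketch):

# install Ruby so erb is available at container start-up
RUN apt-get update && \
    apt-get install -y ruby && \
    apt-get clean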

Now we can run the container in each environment with a different value for SAMPLE_USER_NAME:

# docker run -it -p 8080:9000 -v /data/mule-app:/opt/mule/logs -e "SAMPLE_USER_NAME=Sohrab" sample-app

Compare: Round 4 to Bonus Round

Final Round

There is no final round. You are already done. Congrats.

But that is not to say that there are no more improvements to be made. For example, you may want to consider creating intermediate images to improve their reusability. For a recent client, I created the following hierarchy of Docker images:

Example of Docker Image Hierarchy

You can even go a step further and integrate Docker into your build system, such as Maven or Gradle. This way Docker images are created as part of the build lifecycle, freeing developers to concentrate on creating decent Mule applications. We will show you an example of this in a future article.

As you may have already noticed, the code for this blog is available on GitHub. I have aligned the commits to the sections in this article, so you could have skipped reading this article and just looked through the commit history. But that way, you would have only known the how and not the why.

Compare: The quality of your Docker images before this article and after

Attribution: Git Compare logo from GitHub, released under MIT License