Running R in Docker - Part 2 - Creating Images

Posted: July 22, 2018

This post is the second in a series of three blog posts that cover the basics of running R in Docker. The three parts are:

  • Part 1: Running R in Docker containers interactively and as server processes
  • Part 2: Extending Rocker images to install packages
  • Part 3: Knitting RMarkdown in containers for reproducible research

In this post, we will discuss how to install packages in an image using rocker/tidyverse as the base image. The basic steps are the same for extending rocker/rstudio or rocker/r-ver...or any Docker image, for that matter.

Installing Packages in a Running Container

In Part 1 we saw that it was quite easy to run an instance of RStudio Server with the tidyverse installed:

$: docker run -d --name rstudio -p 80:8787 rocker/tidyverse

What do we do if we find we really need the ggthemes package consistently in our work? Since ggthemes is not part of the tidyverse, it will not be available in the RStudio Server container that we just started. One option would be to simply ask users to install it if they need it, by running install.packages("ggthemes") from the console prompt in RStudio. This would, of course, work...to a point. But remember that the package would only be available in the container in which the package was installed. If you remove the container, and start up a new one, you would need to repeat the package install.

It would be much better to create an image with the environment you want, so that the package install happens once for all the future containers that need it, without any additional effort required at container runtime.
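
The difference is easy to see from the shell. The sketch below wraps the experiment in a function so that nothing runs until you call it: it installs ggthemes in one container, removes that container, and then tries to load the package in a fresh one. The function name, the container name rstudio-demo, and the PASSWORD value are placeholders invented for this example. A password is set only as a precaution, since some versions of the Rocker images refuse to start RStudio Server without one.

```shell
# Hypothetical helper: shows that a package installed in a running
# container disappears when that container is removed.
demo_lost_install() {
  # Start a container in the background (name and password are placeholders).
  docker run -d --name rstudio-demo -e PASSWORD=change-me rocker/tidyverse

  # Install ggthemes inside the running container, then confirm it loads.
  docker exec rstudio-demo R -e 'install.packages("ggthemes")'
  docker exec rstudio-demo R -e 'library(ggthemes)'

  # Remove the container and start a fresh one from the same image.
  docker rm -f rstudio-demo
  docker run -d --name rstudio-demo -e PASSWORD=change-me rocker/tidyverse

  # This load is expected to fail: the new container starts from the
  # unmodified image, which does not include ggthemes.
  docker exec rstudio-demo R -e 'library(ggthemes)'

  # Clean up.
  docker rm -f rstudio-demo
}
```

Calling demo_lost_install on a machine with Docker should show the second library(ggthemes) call failing, which is exactly the repeated work a custom image avoids.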

Creating an Image that Installs R Packages

Creating our own image requires writing a Dockerfile, which is a sort of script that defines the image filesystem and environment. There are many useful things that you can do in a Dockerfile to control the image environment. Here, we only need to know two Dockerfile instructions: FROM, which identifies the image to serve as the base for our new image, and RUN, which runs a command in the image assembly process. The following Dockerfile customizes the rocker/tidyverse image to include ggthemes:

FROM rocker/tidyverse

RUN R -e 'install.packages("ggthemes")'
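
If you prefer to create the file from the shell, the following sketch makes a fresh directory and writes the same two-line Dockerfile into it. The directory name tidyverse-ggthemes is an arbitrary choice for this example; any empty directory works.

```shell
# Create a build directory (the name is arbitrary; it should be empty).
mkdir -p tidyverse-ggthemes

# Write the Dockerfile. Quoting 'EOF' stops the shell from interpreting
# anything inside the here-document.
cat > tidyverse-ggthemes/Dockerfile <<'EOF'
FROM rocker/tidyverse

RUN R -e 'install.packages("ggthemes")'
EOF

# Print the file to confirm its contents.
cat tidyverse-ggthemes/Dockerfile
```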

To build the image (which we'll call tidyverse-ggthemes), save the Dockerfile in an empty directory, and be sure to name the file "Dockerfile". This directory usually lives inside a source-controlled repository, but for now any empty directory will do. Then, from the directory where the Dockerfile lives, run:

$: docker build -t tidyverse-ggthemes .

If the image build succeeded, you should see tidyverse-ggthemes in the list of images produced by running docker images. You can also do a quick test to verify that the package was installed properly:

$: docker run -ti --rm tidyverse-ggthemes R -e 'library(ggthemes)'

If you see:

Error in library(ggthemes) : there is no package called ‘ggthemes’
Execution halted

then something is wrong. Make sure the Dockerfile and build command are exactly as above. If instead you see:

> library(ggthemes)
>

then all is well! (This command also illustrates how it's possible to pass arguments to the process executable when running a container.)
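
Because R exits with a non-zero status when library() fails in a non-interactive session, the same check can be scripted. The helper below is a minimal sketch; the function name image_has_package is made up for this example, and it assumes the image has R on its PATH, as the Rocker images do.

```shell
# Hypothetical helper: succeeds if the R package "$2" loads in image "$1".
image_has_package() {
  image="$1"
  pkg="$2"
  # docker run passes the exit status of R back to the calling shell,
  # so a failed library() call makes this function fail as well.
  docker run --rm "$image" R -e "library($pkg)" >/dev/null 2>&1
}

# Example usage (needs Docker and the image built above):
# if image_has_package tidyverse-ggthemes ggthemes; then
#   echo "ggthemes is installed"
# else
#   echo "ggthemes is missing"
# fi
```

The same idea can be applied at build time: install.packages() generally reports a failed install as a warning rather than an error, so appending library(ggthemes) to the R expression in the RUN instruction is one way to make docker build stop when the package did not install.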

You can now run RStudio Server from this image, and ggthemes will be installed and ready to use:

$: docker run -d --name rstudio -p 80:8787 tidyverse-ggthemes
  

In the next post

In the third and final post in this series, we will see how to use Docker to reproduce research results in the form of an RMarkdown document knitted by a Docker container.