Introduction

In essence, liftr aims to solve the problem of persistent reproducible reporting. To achieve this goal, it extends the R Markdown metadata format, and uses Docker to containerize and render R Markdown documents.

Metadata for containerization

To containerize your R Markdown document, the first step is adding liftr fields to the YAML metadata section of the document. For example:

---
title: "The Missing Example of liftr"
author: "Author Name"
date: "2017-10-15"
output: rmarkdown::html_document
liftr:
  maintainer: "Maintainer Name"
  email: "name@example.com"
  from: "rocker/r-base:latest"
  pandoc: true
  texlive: false
  sysdeps:
    - gfortran
  cran:
    - glmnet
  bioc:
    - Gviz
  remotes:
    - "road2stat/liftr"
  include: "DockerfileSnippet"
---

All available metadata fields are explained below.

Required metadata

  • maintainer

    Maintainer’s name for the Dockerfile.

  • email

    Maintainer’s email address for the Dockerfile.
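Since both fields above are required, it can be handy to check for them before containerizing a document. The sketch below is a minimal illustration, not part of liftr: check_liftr_meta() is a hypothetical helper, and it assumes the liftr block has already been read into a named R list (for instance from the result of rmarkdown::yaml_front_matter()). The email address is a placeholder.

```r
# Hypothetical helper (not a liftr function): verify that the required
# liftr fields are present and non-empty in a parsed metadata list.
check_liftr_meta = function(meta) {
  required = c("maintainer", "email")
  is_missing = vapply(
    required,
    function(field) {
      value = meta[[field]]
      is.null(value) || !all(nzchar(value))
    },
    logical(1)
  )
  if (any(is_missing)) {
    stop(
      "Missing required liftr field(s): ",
      paste(required[is_missing], collapse = ", ")
    )
  }
  invisible(TRUE)
}

# Placeholder values, mirroring the example metadata above
meta = list(maintainer = "Maintainer Name", email = "name@example.com")
check_liftr_meta(meta)
```

Calling the helper with a list that lacks email (or maintainer) stops with an error naming the missing field.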

Optional metadata

  • from

    Base image for building the Docker image. Default is "rocker/r-base:latest". For R users, the images offered by the rocker project and Bioconductor are good starting points.

  • pandoc

    Should we install pandoc in the container? Default is true.

    If pandoc is already installed in the base image, this should be set to false to avoid potential errors. For rocker/rstudio images and bioconductor/... images, this option is automatically set to false since they already have pandoc installed.

  • texlive

    Is a TeX environment needed when rendering the document? Default is false. This should be set to true when the output format is PDF.

  • sysdeps

    Debian/Ubuntu system software packages that the document depends on.

    Please also include the system packages required by the R packages listed below. For example, gfortran is required here for compiling glmnet.

  • cran

    CRAN packages that the document depends on.

    If only pkgname is provided, liftr will install the latest version of the package on CRAN. To improve reproducibility, we recommend using the package name with a specified version number: pkgname/pkgversion (e.g. ggplot2/1.0.0), even if that version is currently the latest. Note: pkgversion must be provided to install archived versions of packages.

  • bioc

    Bioconductor packages that the document depends on.

  • remotes

    Remote R packages that are not available from CRAN or Bioconductor.

    The remote package naming specification from devtools is adopted here. Packages can be installed from GitHub, Bitbucket, Git/SVN servers, URLs, etc.

  • include

    The path to a text file that contains a custom Dockerfile snippet. The snippet will be included in the generated Dockerfile. This can be used to install additional software packages or further configure the system environment.

    Note that this file should be in the same directory as the input R Markdown file.
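To make the role of these fields more concrete, the following sketch shows one way a few of them could map to Dockerfile instructions, including the pkgname/pkgversion convention for CRAN packages. It is illustrative only: the helpers parse_cran_spec(), cran_install_call(), and sketch_dockerfile() are hypothetical, and the output does not reproduce the Dockerfile that liftr actually generates.

```r
# Illustrative only: these helpers are hypothetical and do not reproduce
# the Dockerfile that liftr actually generates.

# Split CRAN specs of the form "pkgname" or "pkgname/pkgversion".
parse_cran_spec = function(spec) {
  parts = strsplit(spec, "/", fixed = TRUE)
  data.frame(
    name = vapply(parts, function(p) p[1], character(1)),
    version = vapply(
      parts,
      function(p) if (length(p) > 1) p[2] else NA_character_,
      character(1)
    ),
    stringsAsFactors = FALSE
  )
}

# Turn one parsed CRAN spec into an R installation call (as a string).
cran_install_call = function(name, version) {
  if (is.na(version)) {
    sprintf("install.packages('%s')", name)
  } else {
    # Assumes the remotes package is available in the image;
    # remotes::install_version() can install archived CRAN versions.
    sprintf("remotes::install_version('%s', version = '%s')", name, version)
  }
}

# Assemble a rough Dockerfile from a subset of the metadata fields.
sketch_dockerfile = function(meta) {
  from = if (is.null(meta[["from"]])) "rocker/r-base:latest" else meta[["from"]]
  lines = c(
    paste("FROM", from),
    sprintf('LABEL maintainer="%s <%s>"', meta[["maintainer"]], meta[["email"]])
  )
  if (length(meta[["sysdeps"]]) > 0) {
    lines = c(lines, paste(
      "RUN apt-get update && apt-get install -y",
      paste(meta[["sysdeps"]], collapse = " ")
    ))
  }
  if (length(meta[["cran"]]) > 0) {
    specs = parse_cran_spec(meta[["cran"]])
    calls = mapply(
      cran_install_call, specs$name, specs$version,
      USE.NAMES = FALSE
    )
    lines = c(lines, sprintf('RUN Rscript -e "%s"', calls))
  }
  lines
}

# Placeholder metadata, similar to the example at the top of this section
meta = list(
  maintainer = "Maintainer Name",
  email = "name@example.com",
  sysdeps = "gfortran",
  cran = c("glmnet", "ggplot2/1.0.0")
)
writeLines(sketch_dockerfile(meta))
```

The sketch covers only from, maintainer, email, sysdeps, and cran; the remaining fields (pandoc, texlive, bioc, remotes, include) are left out to keep the example short.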

Containerize the document

After adding proper liftr metadata to the document YAML data block, we can use lift() to parse the document and generate a Dockerfile.

We will use a minimal example included in the liftr package. First, we create a new directory and copy the R Markdown document into the directory:

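# Create a working directory for the example
dir_example = "~/liftr-minimal/"
dir.create(dir_example)
# Copy the example document shipped with liftr into it
file.copy(system.file("examples/liftr-minimal.Rmd", package = "liftr"), dir_example)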

Then, we use lift() to parse the document and generate the Dockerfile:

library("liftr")

input = paste0(dir_example, "liftr-minimal.Rmd")
lift(input)

After successfully running lift(), the Dockerfile will be in the ~/liftr-minimal/ directory.
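To confirm that the file was written, and to inspect what will be built, the Dockerfile can be read back into R. The sketch below uses only base R; peek_dockerfile() is a hypothetical convenience helper, not a liftr function, and it assumes the file is named Dockerfile and sits directly in the given directory.

```r
# Hypothetical convenience helper (not part of liftr): return the first
# lines of the Dockerfile in a directory, or NULL if there is none.
peek_dockerfile = function(dir, n = 10L) {
  path = file.path(path.expand(dir), "Dockerfile")
  if (!file.exists(path)) {
    message("No Dockerfile found in ", dir)
    return(invisible(NULL))
  }
  readLines(path, n = n)
}

peek_dockerfile("~/liftr-minimal")
```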

Render the document

Now we can use render_docker() to render the document into an HTML file, under a Docker container:

render_docker(input)

The function render_docker() will parse the Dockerfile, build a new Docker image, and run a Docker container to render the input document. If the document is rendered successfully, the output liftr-minimal.html will be in the ~/liftr-minimal/ directory. You can also pass additional arguments accepted by rmarkdown::render() to this function.

In order to share the dockerized R Markdown document, simply share the .Rmd file. Other users can use the lift() and render_docker() functions to render the document as above.

Cleaning up

To clean up the (unused) Docker image after successful rendering, you can use purge_image():

purge_image(paste0(dir_example, "liftr-minimal.docker.yml"))

The above input YAML file contains the basic information of the Docker container, image, and commands to render the document. It is generated by setting purge_info = TRUE (default) in render_docker().

System requirements

Docker is an essential system requirement when using liftr to render R Markdown documents. Here is a detailed guide for installing Docker on major operating systems.

For Linux, we should configure Docker to run without sudo. To avoid sudo when using the docker command, simply create a group named docker and add yourself to it:

sudo groupadd docker
sudo usermod -aG docker $USER
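Group membership changes take effect only after logging out and back in. Once that is done, a quick check from R can tell whether the docker command is usable without sudo. The sketch below is a minimal illustration using base R only; docker_ready() is a hypothetical helper, and it assumes that `docker info` returns a non-zero exit status when the daemon cannot be reached, which is the typical behavior.

```r
# Hypothetical helper (not part of liftr): check that the docker command
# is on the PATH and that the current user can reach the Docker daemon.
docker_ready = function() {
  if (!nzchar(Sys.which("docker"))) {
    return(FALSE)
  }
  # `docker info` is expected to exit with a non-zero status when the
  # daemon is unreachable, e.g. because of missing group permissions.
  status = suppressWarnings(
    system2("docker", "info", stdout = FALSE, stderr = FALSE)
  )
  identical(as.integer(status), 0L)
}

docker_ready()
```

If this returns FALSE even though Docker is installed, the group change has most likely not been picked up by the current session yet.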