B Reproducible Report

Creating reproducible reports is important in data analysis and, more generally, in open science. The idea behind a reproducible report is that any reader should be able to run the same analysis on the same dataset and obtain the same results.

A reproducible report usually includes:

  • Your code for data processing and analysis
  • Your results of the data analysis
  • Your descriptions, summaries and/or interpretation of the analysis

A reproducible report can be presented in many different forms:

  • HTML
  • PDF
  • MS Word
  • Slide Formats
  • Website

B.1 R Markdown

R Markdown is a variant of Markdown, a markup language for creating easy-to-read plain text files that can incorporate formatted text, images, headers, and links to other documents. Another classic example of a markup language is HTML (Hypertext Markup Language).

B.2 Open Science

Open science is a common goal in academia. As members of the scientific community, we hope that everyone does as much as they can to make their data, methods, results, and inferences transparent and available to all.

There are several important tenets of open science (see Open science):

  • Transparency in experimental methodology, observation, collection of data and analytical methods.
  • Public availability and re-usability of scientific data
  • Public accessibility and transparency of scientific communication
  • Using web-based tools to facilitate scientific collaboration

Scenario I

After data collection, you load the data into R, write R code to explore and analyze the data, and save the code in an R script. Then you save the analysis results and plots as external files and manually combine all of these with your written prose in an MS Word document.

Scenario II

After data collection, you again use R for data exploration and analysis. But this time you include all the R code used for exploration and analysis, the analysis outputs (statistical reports and graphs), and your written text in one single R Markdown document. This R Markdown document can be used to automatically create the final document (e.g., PDF, Word, or HTML). Because nothing is copied by hand, the report in Scenario II can be regenerated whenever the data or the code changes, which is what makes it reproducible.
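As a minimal sketch of the automatic step in Scenario II, the code below renders one R Markdown file to several output formats with rmarkdown::render(). The helper name render_report and the file name report.Rmd are hypothetical and used only for illustration; the sketch assumes the rmarkdown package and pandoc are installed.

```r
# Hypothetical helper: render one .Rmd file to several output formats.
render_report <- function(rmd_file,
                          formats = c("html_document", "word_document")) {
  # Skip quietly when the input file or the rmarkdown package is missing
  if (!file.exists(rmd_file) ||
      !requireNamespace("rmarkdown", quietly = TRUE)) {
    return(character(0))
  }
  # render() returns the path of the output file it creates
  vapply(formats, function(fmt) {
    rmarkdown::render(rmd_file, output_format = fmt, quiet = TRUE)
  }, character(1))
}

# "report.Rmd" is an assumed file name; replace it with your own document
outputs <- render_report("report.Rmd")
```

Adding "pdf_document" to formats should also produce a PDF, provided that a LaTeX distribution is installed.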

B.3 Installing R Markdown

# Install from CRAN
install.packages('rmarkdown', dependencies = TRUE)

If you need to generate PDF output from R Markdown, you will need to install LaTeX. If you don’t have LaTeX yet, it is recommended that you install TinyTeX.

## Optional (Only needed for PDF format)
install.packages('tinytex')
tinytex::install_tinytex()  # install TinyTeX
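After installation, a rough check along the following lines can tell whether the tools for PDF output are visible from the current R session. This is only a sketch: it assumes that pdflatex (the default LaTeX engine used by rmarkdown) is on the search path, and a freshly installed TinyTeX may only become visible after restarting R.

```r
# Rough availability checks; none of these lines install anything.
has_rmarkdown <- requireNamespace("rmarkdown", quietly = TRUE)
has_pandoc    <- has_rmarkdown && rmarkdown::pandoc_available()
has_pdflatex  <- unname(nzchar(Sys.which("pdflatex")))

# Collect the results in one named logical vector
status <- c(rmarkdown = has_rmarkdown,
            pandoc    = has_pandoc,
            pdflatex  = has_pdflatex)
print(status)
```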

B.4 R Markdown Components

Let’s check the R Markdown Reference Guide.

  • YAML header: metadata and options for the entire document
  • Formatted text: text with markup formatting
  • Code chunks: R code (or any other language code)
  • Adding figures: R-generated graphs
  • Adding tables: R-generated tables
  • Inline R code: In-text R code
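The components listed above can be seen together in a small document. The sketch below writes a minimal R Markdown file from R; the file name example.Rmd, the title, and the chunk label are illustrative assumptions, and the built-in cars dataset stands in for real data.

```r
# Three backticks, built here so the chunk delimiters can be written as strings
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                                            # YAML header starts
  'title: "A Minimal Report"',
  "output: html_document",
  "---",                                            # YAML header ends
  "",
  "## Summary",                                     # formatted text
  "",
  "The **cars** dataset has `r nrow(cars)` rows.",  # inline R code
  "",
  paste0(fence, "{r speed-plot, echo=TRUE}"),       # code chunk starts
  "plot(cars)",                                     # R-generated figure
  "knitr::kable(head(cars))",                       # R-generated table
  fence                                             # code chunk ends
)

# Write the document to a temporary directory
rmd_file <- file.path(tempdir(), "example.Rmd")
writeLines(rmd_lines, rmd_file)
```

Passing rmd_file to rmarkdown::render() should then produce the HTML report, assuming rmarkdown and pandoc are installed.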

B.5 Tips

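  • We can control the line width of the code so that it does not run off the page edge when rendering R Markdown to PDF. The width.cutoff option takes effect only when tidy = TRUE, which requires the formatR package.
knitr::opts_chunk$set(
  message = FALSE,                     # hide messages in the output
  warning = FALSE,                     # hide warnings in the output
  tidy = TRUE,                         # reformat code so the width limit applies
  tidy.opts = list(width.cutoff = 60)  # wrap code at about 60 characters
)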
  • We can suppress the startup messages that a package prints when it is loaded.
suppressPackageStartupMessages(library(ggplot2))
  • We can have knitr reformat the R code shown in the rendered document in a tidy way (this uses the formatR package).
knitr::opts_chunk$set(message = FALSE, tidy.opts = list(width.cutoff = 60), tidy = TRUE)