[r-markdown] How to convert R Markdown to PDF?

I've previously asked about the commands for converting R Markdown to HTML.

What is a good way to convert R Markdown files to PDF documents?

A good solution would preserve as much as possible of the content (e.g., images, equations, html tables, etc.). The solution needs to be able to be run from the command-line. A good solution would also be cross-platform, and ideally minimise dependencies to make it easier to share makefiles and so forth.

Specifically, there are a lot of options:

  • Whether to convert RMD to MD to HTML to PDF; or RMD to MD to PDF; or RMD to PDF
  • If using the markdown package in R, which options to specify
  • Whether to use pandoc, a package built into R, or something else

Here's an example rmd file that presumably provides a reasonable test of any proposed solution. It was used as the basis for this blog post.

This question is related to r-markdown knitr pandoc

The answer is


I think you really need pandoc, which great software was designed and built just for this task :) Besides pdf, you could convert your md file to e.g. docx or odt among others.

Well, installing an up-to-date version of Pandoc might be challanging on Linux (as you would need the entire haskell-platform?to build from the sources), but really easy on Windows/Mac with only a few megabytes of download.

If you have the brewed/knitted markdown file you can just call pandoc in e.g bash or with the system function within R. A POC demo of that latter is implemented in the ?andoc.convert function of my little package (which you must be terribly bored of as I try to point your attention there at every opportunity).


Only two steps:

  1. Install the latest release "pandoc" from here:

    https://github.com/jgm/pandoc/releases

  2. Call the function pandoc in the library(knitr)

    library(knitr)
    pandoc('input.md', format = 'latex')
    

Thus, you can convert your "input.md" into "input.pdf".


I found using R studio the easiest way, but if wanting to control from the command line, then a simple R script can do the trick using rmarkdown render command (as mentioned above). Full script details here

#!/usr/bin/env R

# Render R markdown to PDF.
# Invoke with:
# > R -q -f make.R --args my_report.Rmd

# load packages
require(rmarkdown)

# require a parameter naming file to render
if (length(args) == 0) {
    stop("Error: missing file operand", call. = TRUE)
} else {
    # read report to render from command line
    for (rmd in commandArgs(trailingOnly = TRUE)) {
        # render Rmd to PDF
        if ( grepl("\\.Rmd$", rmd) && file.exists(rmd)) {
            render(rmd, pdf_document())
        } else {
            print(paste("Ignoring: ", rmd))
        }
    }
}

Follow these simple steps :

1: In the Rmarkdown script run Knit(Ctrl+Shift+K) 2: Then after the html markdown is opened click Open in Browser(top left side) and the html is opened in your web browser 3: Then use Ctrl+P and save as PDF .


If you don't want to install anything you can output html. Then open the html file - it should open in a browser window, then right click to print. In the print window, select "save as pdf" in the bottom right hand corner if you're on a Mac. Voila!


Updated Answer (10 Feb 2013)

rmarkdown package: There is now an rmarkdown package available on github that interfaces with Pandoc. It includes a render function. The documentation makes it pretty clear how to convert rmarkdown to pdf among a range of other formats. This includes including output formats in the rmarkdown file or running supplying an output format to the rend function. E.g.,

render("input.Rmd", "pdf_document")

Command-line: When I run render from the command-line (e.g., using a makefile), I sometimes have issues with pandoc not being found. Presumably, it is not on the search path. The following answer explains how to add pandoc to the R environment.

So for example, on my computer running OSX, where I have a copy of pandoc through RStudio, I can use the following:

Rscript -e "Sys.setenv(RSTUDIO_PANDOC='/Applications/RStudio.app/Contents/MacOS/pandoc');library(rmarkdown);  library(utils); render('input.Rmd', 'pdf_document')"

Old Answer (circa 2012)

So, a number of people have suggested that Pandoc is the way to go. See notes below about the importance of having an up-to-date version of Pandoc.

Using Pandoc

I used the following command to convert R Markdown to HTML (i.e., a variant of this makefile), where RMDFILE is the name of the R Markdown file without the .rmd component (it also assumes that the extension is .rmd and not .Rmd).

RMDFILE=example-r-markdown  
Rscript -e "require(knitr); require(markdown); knit('$RMDFILE.rmd', '$RMDFILE.md'); markdownToHTML('$RMDFILE.md', '$RMDFILE.html', options=c('use_xhml'))"

and then this command to convert to pdf

Pandoc -s example-r-markdown.html -o example-r-markdown.pdf


A few notes about this:

  • I removed the reference in the example file which exports plots to imgur to host images.
  • I removed a reference to an image that was hosted on imgur. Figures appear to need to be local.
  • The options in the markdownToHTML function meant that image references are to files and not to data stored in the HTML file (i.e., I removed 'base64_images' from the option list).
  • The resulting output looked like this. It has clearly made a very LaTeX style document in contrast to what I get if I print the HTML file to pdf from a browser.

Getting up-to-date version of Pandoc

As mentioned by @daroczig, it's important to have an up-to-date version of Pandoc in order to output pdfs. On Ubuntu as of 15th June 2012, I was stuck with version 1.8.1 of Pandoc in the package manager, but it seems from the change log that for pdf support you need at least version 1.9+ of Pandoc.

Thus, I installed caball-install. And then ran:

cabal update
cabal install pandoc

Pandoc was installed in ~/.cabal/bin/pandoc Thus, when I ran pandoc it was still seeing the old version. See here for adding to the path.


Right now (August 2014) You could use RStudio for converting R Markdown to PDF. Basically, RStudio use pandoc to convert Rmd to PDF.

You could change metadata to:

  1. Add table of contents
  2. Change figure options
  3. Change syntax highlighting style
  4. Add LaTeX options
  5. And many more...

For more details - http://rmarkdown.rstudio.com/pdf_document_format.htmlenter image description here


For an option that looks more like what you get when you print from a browser, wkhtmltopdf provides one option.

On Ubuntu

sudo apt-get install wkhtmltopdf

And then the same command as for the pandoc example to get to the HTML:

RMDFILE=example-r-markdown  
Rscript -e "require(knitr); require(markdown); knit('$RMDFILE.rmd', '$RMDFILE.md'); markdownToHTML('$RMDFILE.md', '$RMDFILE.html', options=c('use_xhml'))"

and then

wkhtmltopdf example-r-markdown.html example-r-markdown.pdf

The resulting file looked like this. It did not seem to handle the MathJax (this issue is discussed here), and the page breaks are ugly. However, in some cases, such a style might be preferred over a more LaTeX style presentation.