Sat 23 August 2014

Filed under Document Workflow

Tags Pandoc Windows OSX LaTeX PDF

Pardon the overflowing services menu.


This is, for me at least, the most magical thing about using Pandoc: I can set things up in OSX and Windows such that I can just right-click on a plain text file, pick an item out of the context menu, and have a well-formatted, professional-looking PDF (via LaTeX) or .docx file appear beside it. No need to go to the command line!

  • Here is the source[2] for this page
  • Here is a PDF generated by Pandoc[1]
  • Here is a .docx generated by Pandoc

It does take a bit of work to set up, but it's really not that hard.

Step 1: Install Pandoc

Installing Pandoc is pretty straightforward. (Usually.) Just download and run the installer for whichever OS you're on.

Update: Step 1.5: Install some TeX stuff

I forgot to mention this before: You will also need to install some TeX-related things. This also shouldn't be too much more difficult than downloading and running an installer. There are links at the Pandoc installation page.
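
Before wiring up any menus, it can save some head-scratching to confirm that Pandoc and a TeX engine are actually reachable. Here is a minimal sketch of such a check. The function name missing_tools is made up for this example, and which engines you need (pdflatex, xelatex) depends on your own setup.

```shell
# missing_tools NAME...
# Prints each named program that cannot be found on the current PATH.
missing_tools() {
    for tool in "$@"
    do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Same PATH additions the service scripts in this post use,
# so the check sees the same programs they would see.
PATH=$HOME/.cabal/bin:/usr/local/bin:/usr/texbin:$PATH

# Prints nothing if all three are installed.
missing_tools pandoc pdflatex xelatex
```

If this prints a name, that program is either not installed or lives somewhere the PATH line doesn't cover.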

Step 2: Set up the OSX Service or the Windows Whatever the Equivalent of Services Is

This is the fun part. Well, if you're me.

OSX

The Pandoc Extras page has a ton of different utilities, programs, etc. to help use Pandoc on different platforms. For this, I'm starting with David Sanson's Pandoc Droplets and Services.

Download the files (click the "Download Zip" button on the right side), un-zip them somewhere, and then double-click the one that starts with "Convert to PDF...". OSX should then prompt you to either open the service up in Automator, or install it. Don't install it yet. Instead, open it up, and you'll find it just contains a shell script.

PATH=$HOME/.cabal/bin:/usr/local/bin:/usr/texbin:$PATH

for file in "$@"
do
    output=${file%%.*}.pdf
    pandoc "$file" -o "$output" --latex-engine xelatex
done

NOTE: since these services were created, something changed in Pandoc's handling of relative paths that totally borked them. Now, I don't know anything about shell scripting, but some furious googling got me a solution that -- well, I don't know if it's the most correct, but it works.

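Just add cd "$(dirname "$file")" on a line after do, and build the output name from "$file" as well. (Inside the loop, use "$file" rather than "$@": "$@" expands to every selected file at once, so it only behaves when exactly one file is selected.) The result should look like this:

PATH=$HOME/.cabal/bin:/usr/local/bin:/usr/texbin:$PATH

for file in "$@"
do
    cd "$(dirname "$file")"
    output="$file.pdf"
    pandoc "$file" -o "$output" --latex-engine xelatex
done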
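
The output= line is where the PDF's name gets decided, and the different spellings are easy to mix up. Here is a small sketch of how three variants behave. The function names and the sample path are made up for illustration.

```shell
# Three ways to derive a PDF name from an input path.

# Cuts at the FIRST dot anywhere in the path (what ${file%%.*} does).
strip_from_first_dot() { printf '%s.pdf\n' "${1%%.*}"; }

# Cuts only the last extension.
strip_last_extension() { printf '%s.pdf\n' "${1%.*}"; }

# Leaves the name alone and appends .pdf.
append_pdf() { printf '%s.pdf\n' "$1"; }

sample="/Users/me/notes.v2/draft.md"   # hypothetical path with a dot in a folder name

strip_from_first_dot "$sample"   # /Users/me/notes.pdf
strip_last_extension "$sample"   # /Users/me/notes.v2/draft.pdf
append_pdf "$sample"             # /Users/me/notes.v2/draft.md.pdf
```

The first form lands the PDF in the wrong place whenever a folder name contains a dot, which is worth knowing if you keep files in versioned or dated folders.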

At this point you may want to pause and take a look at the possible options for running Pandoc. If there are any that you want to use when creating PDFs, you can add them to the shell script.

You can also create copies of the workflow, assign different options to them, and give them appropriate names.

For example, my most-used Pandoc service is called "Convert to PDF from MMD using Pandoc"[3], because I mostly write text in MultiMarkdown rather than Pandoc's own syntax.

PATH=$HOME/.cabal/bin:/usr/local/bin:/usr/texbin:$PATH

for file in "$@"
do
    cd "$(dirname "$file")"
    output="$file.pdf"
    pandoc -f markdown_mmd "$file" -o "$output" --latex-engine xelatex
done

By adding -f markdown_mmd, I've told Pandoc that the source is in MultiMarkdown, and Pandoc is (amazingly) clever enough to change how it reads the plain text, looking for MultiMarkdown syntax and even handling MultiMarkdown's metadata format, which is different from Pandoc's. Pandoc's MMD mode isn't perfect, but it's good enough for the vast majority of the MMD documents I create.

Once you've figured out what commands you want to give Pandoc and set the appropriate flags, you can then save the workflow, double-click on it again in the Finder, and choose "Install." Now it will be available in the Services context menu.

If you need to go back later and make changes, you can find the installed workflow in (User folder)/Library/Services. To get to the Library, you may have to use the "Go to Folder" command in the "Go" menu of the Finder, because OSX doesn't want you to know where the Library is for some reason.

You can repeat the same process for the .docx workflow and the other workflows, if desired. The .docx one is very handy if you need to be able to produce MS Word documents for distribution to folks who have to use Word for institutional reasons.
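
A .docx variant could look roughly like the sketch below. This is not the script from the download, just the same loop pointed at a different output format. The PANDOC variable and the convert_to_docx name are hypothetical additions so that the loop can be dry-run without touching any files.

```shell
PATH=$HOME/.cabal/bin:/usr/local/bin:/usr/texbin:$PATH

# Hypothetical override: set PANDOC to another command to dry-run the loop.
PANDOC=${PANDOC:-pandoc}

convert_to_docx() {
    for file in "$@"
    do
        # The subshell keeps each cd from leaking into the next file.
        (
            cd "$(dirname "$file")" || exit 1
            name=$(basename "$file")
            "$PANDOC" -f markdown_mmd "$name" -o "$name.docx"
        )
    done
}

# Automator passes the selected files as arguments.
convert_to_docx "$@"
```

Drop the -f markdown_mmd part if you write in Pandoc's own Markdown.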

Windows

For Windows folks, Luke Maciak created a .reg file to add similar context menu options for converting files within Windows Explorer.

And again, you can add whatever options you need to change Pandoc's behavior. Here's the addition of -f markdown_mmd that I use:

Windows Registry Editor Version 5.00

[HKEY_CLASSES_ROOT\*\shell\mkd2doc]
[HKEY_CLASSES_ROOT\*\shell\mkd2doc\command]
@="\"C:\\Users\\nshere\\AppData\\Local\\Pandoc\\pandoc.exe\" -s -S \"%1\" -f markdown_mmd -o \"%1.docx\""

[HKEY_CLASSES_ROOT\*\shell\mkd2pdf]
[HKEY_CLASSES_ROOT\*\shell\mkd2pdf\command]
@="\"C:\\Users\\nshere\\AppData\\Local\\Pandoc\\pandoc.exe\" \"%1\" -f markdown_mmd -o \"%1.pdf\""

By having this available on both my PC at work and my OSX machine, I can easily compose in plain text, save my files in Dropbox, and be able to instantly generate good-looking, consistent PDF and .docx files on either.

Templates

If you aren't happy with the look of stock LaTeX PDFs -- which are a little, um, retro -- then you may want to look into Pandoc's templating. Here is the relevant bit of the user guide. Here are the default templates in GitHub. Here are some user-contributed templates.

Note that among those, there is a very nice template by W. Caleb McDaniel for tufte-latex handouts.[4]

If you want to use this template, you need to save it on your computer and then pass Pandoc an option telling it which template to use.

The default location for templates on my OSX installation is (User folder)/.pandoc/templates. I think this is standard, but you can confirm the default data directory by opening up a terminal window[5] and typing pandoc --version. The output will include the default user data directory, like:

Default user data directory: /Users/kukkurovaca/.pandoc
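
If you'd rather have the shell pick that line out for you, something like this sketch should do it. The function name is made up, and the pattern assumes the line ends with "user data directory: " followed by the path, as in the output above. The exact wording may differ between Pandoc versions.

```shell
# data_dir_from_version
# Reads `pandoc --version` output on stdin and prints the user data directory.
data_dir_from_version() {
    sed -n 's/^.*[Uu]ser data directory: //p'
}

# Only ask Pandoc if it is actually installed.
if command -v pandoc >/dev/null 2>&1
then
    pandoc --version | data_dir_from_version
fi
```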

You will either have to use "Go to Folder" in the Finder or navigate in the terminal, etc., because again, this folder will be hidden in OSX.

Templates go in a folder called "templates" below that directory. In my case, I saved the template as wcaleb-tufte-handout.latex, then created a new copy of the PDF service, and modified it to include --template=wcaleb-tufte-handout.
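
Since the folder is hidden, copying the template into place from the terminal can be less fiddly than using the Finder. Here is a minimal sketch, assuming the data directory is $HOME/.pandoc unless you say otherwise. The install_template name is made up for this example.

```shell
# install_template FILE [DATA_DIR]
# Copies a template into DATA_DIR/templates, creating the folder if needed.
# DATA_DIR defaults to $HOME/.pandoc (an assumption; check pandoc --version).
install_template() {
    src=$1
    data_dir=${2:-$HOME/.pandoc}
    mkdir -p "$data_dir/templates" &&
    cp "$src" "$data_dir/templates/"
}

# Example, assuming the template was saved to the Downloads folder:
# install_template "$HOME/Downloads/wcaleb-tufte-handout.latex"
```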

Here's the complete script for that service:

PATH=$HOME/.cabal/bin:/usr/local/bin:/usr/texbin:$PATH

for file in "$@"
do
    cd "$(dirname "$file")"
    output="$file.pdf"
    pandoc -f markdown_mmd "$file" -o "$output" --template=wcaleb-tufte-handout
done

And here is the resulting PDF.

The same could be done with the other user-contributed templates, and of course, if you're savvy enough, you can also create your own. That's a bit out of my league for now, though.

Update: LaTeX / PDF Options in Pandoc

You can also set some options to control your PDF output without having to get into templating, by setting certain values either in the metadata of your files, or as part of the options that are passed to Pandoc.

To set variables on the command line, use -V or --variable. To set them from within your document, the format will vary depending on what flavor you're writing in. In my case, I'm relying on Pandoc's ability to read MultiMarkdown-style metadata with -f markdown_mmd, like this:

    Title: Instant PDFs
    Date: 2014-09-05
    fontfamily: libertine

That last line will tell Pandoc to include \usepackage{libertine} in the preamble of the LaTeX file that generates your PDF. That would give you a document set in Libertine and Biolinum.

Adding -V fontfamily=libertine to the options in your script will have the same effect. In terms of the workflow I explained in this post, whether you want to use metadata or set -V will depend on when and how you want to make choices about the options in question.

For example, if there are a couple different sets of fonts that you want to use regularly, you may want to set those up as separate menu options. But if that's a choice you'd want to make at the time you're writing the document, or if you're going to be changing things up all the time, it probably makes more sense to set it in the document metadata on a per-document basis. (Note: variables set with -V or --variable will override whatever's in the document metadata.)
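
To make the "separate menu options" idea concrete, here is a sketch of a service body where the font package is the only thing that changes between copies. The pdf_with_font name and the PANDOC override are hypothetical, added so the loop can be dry-run.

```shell
PATH=$HOME/.cabal/bin:/usr/local/bin:/usr/texbin:$PATH

# Hypothetical override: set PANDOC to another command to dry-run the loop.
PANDOC=${PANDOC:-pandoc}

# pdf_with_font FONT FILE...
# Converts each file to PDF with the given LaTeX font package.
pdf_with_font() {
    font=$1
    shift
    for file in "$@"
    do
        (
            cd "$(dirname "$file")" || exit 1
            name=$(basename "$file")
            "$PANDOC" -f markdown_mmd "$name" -o "$name.pdf" -V fontfamily="$font"
        )
    done
}

# One service hard-codes libertine; a copy could hard-code droid instead.
pdf_with_font libertine "$@"
```

Remember that the -V here wins over a fontfamily line in the document, so a service like this is a poor fit for documents that set their own font.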

You can take a look at the TUG Font Catalog for additional options. It may take some experimentation to find a font package or combination of font packages for your needs.

Note that some will include serif, sans serif, and mono faces (and beyond, sometimes), while others may only define one of those. If you need to use more than one font package, just separate them with commas, like tgschola,tgheros,tgcursor to use TeX Gyre Schola, Heros, and Cursor. (Note: You may need to install these packages, or they may already be installed with your TeX distribution.)

I'm not a font expert (at all), but from poking around a little bit, libertine and droid seem like good one-stop-shopping options for serif/sans/mono. And ebgaramond and fbb are really pretty serif faces.

In a similar fashion, you can set documentclass and classoption. You can also cause Pandoc to generate a table of contents in your PDF by adding -V toc, but I have not been able to figure out how to accomplish this using metadata in the document itself yet.

If you prefer to use arbitrary fonts you have installed on your system, rather than TeX font packages, you can do this with a different set of options:

mainfont, sansfont, monofont, mathfont

But if you do so, you should use --latex-engine xelatex when running Pandoc.
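
Put together, a call with system fonts might look like this sketch. The function name, the PANDOC override, and the font names in the usage line are placeholders. Substitute fonts that are actually installed on your machine.

```shell
# Hypothetical override: set PANDOC to another command to dry-run.
PANDOC=${PANDOC:-pandoc}

# pdf_with_system_fonts FILE MAINFONT MONOFONT
pdf_with_system_fonts() {
    file=$1 main=$2 mono=$3
    # Pandoc 2.0 and later renamed --latex-engine to --pdf-engine;
    # swap it in if your version rejects the older spelling.
    "$PANDOC" "$file" -o "$file.pdf" --latex-engine xelatex \
        -V mainfont="$main" -V monofont="$mono"
}

# Usage (font names are placeholders):
# pdf_with_system_fonts notes.md "Georgia" "Menlo"
```

Quoting the font variables matters here, since many system font names contain spaces.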


  1. You may notice there is a bit of an issue with a code block that stretches off the page. This is a soluble problem, but I can't be bothered to solve it today. Update: This may be useful later if I come back to it. 

  2. It probably won't be exactly current by the time this gets published. 

  3. I'm pretty literal-minded about naming things. 

  4. Tufte-latex is a popular LaTeX package for emulating the design of Edward Tufte books.  

  5. Sorry! Not totally command line free after all. 

Plain Text Adventure © kukkurovaca