For starters, this is how you might turn a well-written Markdown file (with common metadata fields in its front matter) into a properly typeset PDF document:
However, Markdown is not TeX. Not even close. Once you need fine-grained control over the typesetting outcome, or perhaps just a little refinement of the LaTeX templating, you’ll soon notice that Pandoc has its quirks and gotchas. I’ve been using Pandoc for all my serious academic writing (incl. homework reports) for years, ever since I gave up on learning more about the overwhelmingly sophisticated TeX ecosystem and turned to something that “just works”. Pandoc fits my needs well. And when it doesn’t, there’s almost always a workaround that achieves the same thing neatly. That is mostly what this write-up is about.
Tweaking the default template? Bad idea.
You could, of course, modify the default LaTeX template provided by Pandoc, as long as you’re no stranger to LaTeX. In this way, you can achieve anything you want, in pure LaTeX.
There are, however, a few problems with this naïve approach:
- If you are tweaking the template just for something you’re currently working on, you will end up with a highly document-specific, hardly reusable template. It also defeats the purpose of using Pandoc: you could just write plain LaTeX anyway.
- If Pandoc improves its default template for a newer version, your home-brewed template won’t benefit from this (unless you’re willing to merge the diffs and resolve any conflicts by hand).
I’m conservative about changing the templates. If it’s a general issue that needs to be fixed in the default template, sending a pull request to pandoc-templates might be a better idea. Of course, if there’s a certain submission format you have to stick with (e.g., a LaTeX template supplied by a conference), you will have to fall back on your own template.
Separating the formatting stuff
I wouldn’t claim that I know the best practice for using Pandoc, but there is one common idiom that cannot be overstressed: Separate presentation and content!
In the YAML front matter of the main Markdown file you’re writing, put only things that matter to your potential readers:
And in a separate YAML file, here goes the formatting stuff:
Above is my personal default, and it’s worth a few words to explain:
- is where you control the geometric settings of your document. For example, you may narrow down the page margin to , and this is equivalent to raw LaTeX:
- Set to any value other than if paragraph indentation is desired. (And it is often desired in formal publications.)
- is where you define your own macros, configure existing ones, or claim in case you want to use a package not enabled by Pandoc (e.g., ). Although you might as well define those in other places (e.g., in the content of a Markdown file), don’t do that.
- This decent Q.E.D. tombstone: is my favorite of all time. It doesn’t require the package.
With the formatting kept in a separate file, here we are:
While the Markdown syntax for citing is rather easy, it takes effort to get things right, especially if you have a preferred citation format (APA, MLA, Chicago, IEEE, etc.).
The suggestion is: Use pandoc-citeproc. Once you have a list of references you’re interested in, you need two things to typeset those nicely in your document:
- A CSL (Citation Style Language) file, to specify the citation format you want to use.
- A BibTeX file, which lists all entries you might cite.
  - Citation entries in BibTeX format are easy to find online through academic search engines and databases. Concatenate them one by one.
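Concatenating entries by hand works, but duplicate citation keys creep in easily. Below is a minimal sketch of a helper that joins entries and keeps only the first one for each key. It is an illustration rather than part of Pandoc: the function name `merge_bibtex` is made up here, and it assumes each input string is exactly one regular entry (no `@string` or `@comment` blocks).

```python
import re
from typing import Iterable, List

# Matches the start of an entry such as "@article{somekey," and captures the key.
ENTRY_KEY = re.compile(r"@\s*(\w+)\s*\{\s*([^,\s]+)\s*,")


def merge_bibtex(entries: Iterable[str]) -> str:
    """Concatenate BibTeX entries, skipping any whose citation key was already seen."""
    seen = set()
    kept: List[str] = []
    for raw in entries:
        entry = raw.strip()
        match = ENTRY_KEY.match(entry)
        if match is None:
            raise ValueError(f"not a BibTeX entry: {entry[:40]!r}")
        key = match.group(2)
        if key in seen:
            continue  # duplicate key: keep the first occurrence only
        seen.add(key)
        kept.append(entry)
    # Separate entries with a blank line and end the file with a newline.
    return "\n\n".join(kept) + "\n"
```

The returned string can be written straight to the BibTeX file that the metadata points to.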
As part of the YAML metadata (assuming you have the CSL file and the BibTeX file described above):
Using pandoc-citeproc as a filter, generate the document with citations:
The list of references is appended to the end of the document. It is often desirable to give the references an obvious title (“References”), start them on a new page and avoid any further indentation, so the following comes at the end of the Markdown source:
Putting it all together!
Basically, we need 5 files in total:
- For content:
  - Main text (Markdown, possibly mixed with LaTeX).
  - List of references (BibTeX/BibLaTeX format).
- For presentation:
  - Format-related metadata (YAML format).
  - Package imports and macro definitions (LaTeX format).
  - Citation style (CSL XML format).
And one command:
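As a rough sketch of how that one command might be assembled, the snippet below builds a Pandoc invocation from the five files. Every file name in it (`main.md`, `format.yml`, `header.tex`, `style.csl`, `refs.bib`, `main.pdf`) is a placeholder, and wrapping the call in a script is only one way to keep it reproducible. It assumes the formatting YAML is passed with `--metadata-file` and the LaTeX snippet with `--include-in-header`. On Pandoc 2.11 or later, the built-in `--citeproc` flag takes the place of `--filter pandoc-citeproc`.

```python
import shutil
import subprocess
from typing import List


def build_pandoc_command(
    main_md: str = "main.md",        # placeholder: main text
    format_yml: str = "format.yml",  # placeholder: format-related metadata
    header_tex: str = "header.tex",  # placeholder: package imports and macros
    csl: str = "style.csl",          # placeholder: citation style
    bib: str = "refs.bib",           # placeholder: list of references
    output: str = "main.pdf",        # placeholder: output file
    builtin_citeproc: bool = True,   # False: use the older pandoc-citeproc filter
) -> List[str]:
    """Return the argument list for one Pandoc run; nothing is executed here."""
    cmd = [
        "pandoc", main_md,
        "--metadata-file", format_yml,
        "--include-in-header", header_tex,
        "--bibliography", bib,
        "--csl", csl,
    ]
    cmd += ["--citeproc"] if builtin_citeproc else ["--filter", "pandoc-citeproc"]
    cmd += ["-o", output]
    return cmd


def run_pandoc(cmd: List[str]) -> None:
    """Run the command, failing early if pandoc is not on PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("pandoc executable not found on PATH")
    subprocess.run(cmd, check=True)
```

Calling `run_pandoc(build_pandoc_command())` performs the conversion. The bibliography and CSL paths could equally live in the YAML metadata instead of on the command line.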
Open question: Lightweight replacement for LaTeX theorem environments?
Pandoc doesn’t provide native support for theorem environments (and I wonder if it ever will). You can still have the same thing in Pandoc Markdown:
However, everything inside such an environment will be treated as raw LaTeX, and the expressiveness of Markdown is lost there. More importantly, this is purely a LaTeX-specific thing, so there’s no way for Pandoc to convert it to HTML or any other format (unless you have a filter that does the trick). Consequently, I tend to write all definitions / theorems (lemmas, claims, corollaries, propositions…) in simple Markdown:
It does have some advantages over raw LaTeX environments:
- With LaTeX environments, you cannot see the numbering of each theorem (definition, etc.) in the text editor (well, not without a dedicated plugin at least). This is inconvenient when you need to refer to a prior one later. By numbering them explicitly, you can clearly see these ordinals in the Markdown source.
- It is perfectly valid Markdown, so it converts to any format as you wish (HTML, for example).
This also has some drawbacks compared to using LaTeX environments, though:
- It doesn’t have theorem counters. You need to number things explicitly, manually. (Clearly you can’t have implicit numbering and explicit numbering at the same time, so here’s the trade-off.)
- It doesn’t have automatic formatting. That is, you could possibly get the style for a certain entry (plain, definition, remark) wrong.
- Semantically, they are not recognized as theorems, just normal text paragraphs. This is problematic if you want to prevent definitions and theorems from being indented, since there’s no way for LaTeX to tell them apart from normal text.
(Probably) The best solution is to write a filter that converts a conventional plain-text label (for a definition, theorem, etc.) at the beginning of a paragraph to proper Markdown (for the HTML target) or to the corresponding block (for the LaTeX target). Even better, it should be able to do cross-references accordingly (Referring back to an earlier theorem? Let’s put an anchored link on that!). This is yet to be done, but it would be very helpful to anyone who writes a lot of theorems and proofs and thus wants to avoid the kludge of mixing raw LaTeX with semantically evident Markdown.
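As a starting point only, here is a minimal sketch of what such a filter could look like, written against Pandoc's JSON AST with nothing but the Python standard library. The labeling convention is an assumption made for this example: a paragraph that opens with a kind and an explicit number, such as `Theorem 1.`. The sketch bolds the label and wraps the paragraph in a Div carrying an id like `theorem-1`, so that other text can link to it. It does not emit a LaTeX theorem environment, and it only looks at top-level paragraphs.

```python
import json
import re
import sys

# Hypothetical convention: a paragraph opens with "<Kind> <number>.", e.g. "Theorem 1."
KINDS = {"Definition", "Theorem", "Lemma", "Claim", "Corollary", "Proposition"}
NUMBER = re.compile(r"^(\d+)\.$")


def convert_block(block):
    """Wrap a labeled paragraph in a Div with an anchor id; return other blocks unchanged."""
    if block.get("t") != "Para":
        return block
    inlines = block["c"]
    if len(inlines) < 3:
        return block
    kind, space, number = inlines[0], inlines[1], inlines[2]
    if not (kind.get("t") == "Str" and kind["c"] in KINDS
            and space.get("t") == "Space"
            and number.get("t") == "Str" and NUMBER.match(number["c"])):
        return block
    ordinal = NUMBER.match(number["c"]).group(1)
    # Bold the label and keep the rest of the paragraph as it was.
    label = {"t": "Strong", "c": [kind, space, number]}
    para = {"t": "Para", "c": [label] + inlines[3:]}
    # Div attributes are [identifier, classes, key-value pairs].
    attr = [f"{kind['c'].lower()}-{ordinal}", [kind["c"].lower()], []]
    return {"t": "Div", "c": [attr, [para]]}


def convert_document(doc):
    """Apply convert_block to top-level blocks only (nested blocks are left alone)."""
    doc["blocks"] = [convert_block(b) for b in doc["blocks"]]
    return doc


def main():
    # Pandoc pipes the document AST as JSON through stdin and stdout.
    json.dump(convert_document(json.load(sys.stdin)), sys.stdout)


# Pandoc passes the target format as the first argument when it runs a filter,
# so the script only touches stdin when it is invoked that way.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Saved under a name of your choice and passed to Pandoc with `--filter`, this would let the Markdown source refer back with an ordinary internal link such as `[Theorem 1](#theorem-1)`. A LaTeX-specific branch that emits raw environment blocks is left open here.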
I am trying to use a custom citation style in a markdown file, but the citation uses the default (Chicago) style each time I knit. I have tried changing the output format from a JS reveal presentation to an HTML document to a PDF document, but it still does not work. I am using the knitcitations package to cite using the document's DOI, and the bibliography() function to write the bibliography. I have also tried using the apa.csl style found on Zotero, yet the citation is still rendered in the default style. The apa.csl file is stored in the same folder as the file that I am trying to use citations in, as is the newbiblio.bib file, in which I have stored the bibliographical information for the item I want to cite.
Below is my markdown code:
This link (http://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html) says that I should be able to format my YAML header like this:
However, when I do that, the file knits to a markdown (.md) file, but it is not processed into the output. I receive this error:
The contents of my .bib file are:
I also do not understand why the biblio-style option in the YAML header does not do anything. Essentially, all I need is a way to use a custom citation style I have already made with a markdown document. Any help would be greatly appreciated!