Why Linux Makes Pandoc More Powerful
Pandoc converts Markdown to DOCX reliably on every platform, but Linux is where it earns its reputation. On Linux you get a native package manager, shell scripting with no friction, cron-scheduled batch runs, and no installer wizard to click through. You also hit Linux-specific wrinkles: distribution repositories often ship Pandoc versions that are 12 to 18 months behind the upstream release, and some common errors — Unicode glyph fallbacks, image path resolution, missing fonts — behave differently than on macOS or Windows.
This guide is Linux-focused. If you want a cross-platform overview including Typora, VS Code extensions, and online tools, the Markdown to DOCX overview guide covers that ground. Here we go deep on installation across distros, reference-document workflow, batch scripting, and the errors that catch Linux users specifically.
Installing Pandoc on Linux
The approach that works best depends on your distribution and whether you need the current upstream version or can accept the packaged one.
Ubuntu and Debian
sudo apt update
sudo apt install pandoc
Caveat: Ubuntu LTS ships an older Pandoc. Ubuntu 22.04 (Jammy) ships Pandoc 2.9.2.1, and Ubuntu 24.04 (Noble) ships 3.1.3. Pandoc 3.x introduced breaking changes to the DOCX writer that affect table rendering and heading styles. Check your version:
pandoc --version
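In a script, the same check can gate a conversion on a minimum version. The sketch below is one way to do that. The helper names (version_ge, pandoc_version, require_pandoc) are illustrative and not part of Pandoc, and it assumes GNU sort (for -V) and that the first line of pandoc --version has the form "pandoc X.Y.Z".

```shell
#!/bin/sh
# version_ge A B: succeed when version A is >= version B (needs GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# pandoc_version: print the version number from the first line of
# `pandoc --version` (assumed to look like "pandoc 3.1.3").
pandoc_version() {
  pandoc --version | awk 'NR == 1 { print $2 }'
}

# require_pandoc MIN: fail with a message if pandoc is missing or older than MIN.
require_pandoc() {
  if ! command -v pandoc >/dev/null 2>&1; then
    echo "pandoc is not installed" >&2
    return 1
  fi
  have=$(pandoc_version)
  if ! version_ge "$have" "$1"; then
    echo "pandoc $have is older than required $1" >&2
    return 1
  fi
}
```

A build script could then start with `require_pandoc 3.0 || exit 1` before running any conversion.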
If you need the latest release (currently 3.6.x), bypass the APT package and install via the GitHub release:
# Download the .deb for your architecture (amd64 shown); the version in the path and the filename must match
curl -fL https://github.com/jgm/pandoc/releases/download/3.6/pandoc-3.6-1-amd64.deb -o pandoc.deb
sudo dpkg -i pandoc.deb
rm pandoc.deb
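If you install on machines with different architectures, the URL can be built from the version and the architecture instead of being typed by hand. This is a sketch only: pandoc_deb_url is a hypothetical helper, and it assumes release assets keep the pandoc-VERSION-1-ARCH.deb naming used above and that the release tag equals the version number. Check the releases page before relying on either assumption.

```shell
#!/bin/sh
# pandoc_deb_url VERSION ARCH: print the download URL for a Pandoc .deb.
# Assumption: assets are named pandoc-VERSION-1-ARCH.deb under the tag VERSION.
pandoc_deb_url() {
  printf 'https://github.com/jgm/pandoc/releases/download/%s/pandoc-%s-1-%s.deb\n' \
    "$1" "$1" "$2"
}

# Example use on a Debian-family system:
#   url=$(pandoc_deb_url 3.6 "$(dpkg --print-architecture)")
#   curl -fL "$url" -o pandoc.deb
```

Pinning the tag in the path keeps the download reproducible; the -f flag makes curl fail on a 404 instead of saving an error page as pandoc.deb.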
Alternatively, install via conda if you have Miniconda or Anaconda:
conda install -c conda-forge pandoc
Fedora and RHEL/CentOS Stream
sudo dnf install pandoc
Fedora tracks upstream more aggressively than Debian-family distros, so the packaged version is usually one minor release behind. RHEL 9 and CentOS Stream may ship an older build — check with pandoc --version.
For the latest version on RHEL-family systems, use the tarball distribution:
curl -fL https://github.com/jgm/pandoc/releases/download/3.6/pandoc-3.6-linux-amd64.tar.gz | tar xz
sudo mv pandoc-3.6/bin/pandoc /usr/local/bin/
Arch Linux and Manjaro
sudo pacman -S pandoc
Arch tracks upstream closely and ships Pandoc in its official repositories (the former community repository has been merged into extra). On Arch you rarely hit the version-lag issue that affects Ubuntu LTS.
openSUSE
sudo zypper install pandoc
Snap (Universal, Any Distro)
If you want version independence without manually downloading binaries:
sudo snap install pandoc --classic
The --classic flag is required because Pandoc needs filesystem access beyond the snap sandbox. Note that snap-installed Pandoc may start noticeably more slowly on first invocation; the delay comes from the snap runtime, not from Pandoc itself.
Flatpak (Not Recommended for CLI Use)
Flatpak Pandoc exists but wraps the binary in a sandbox that complicates file path resolution. For interactive DOCX generation from a GUI tool like Typora, it works. For scripted batch conversion, the path sandbox causes frustrating read/write permission errors. Use APT, DNF, Pacman, or the GitHub tarball instead.
Basic Markdown to DOCX Conversion
Once installed, the simplest conversion is:
pandoc input.md -o output.docx
Pandoc infers input format from the .md extension and output format from .docx. The result opens cleanly in LibreOffice Writer and Microsoft Word.
To be explicit about formats (useful when your file has a non-standard extension):
pandoc -f markdown -t docx input.md -o output.docx
Combining Multiple Files Into One Document
For reports or documentation split across chapters:
pandoc intro.md chapter1.md chapter2.md conclusion.md -o report.docx
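Pandoc reads the files left to right, concatenates them with a blank line between each, and writes a single DOCX. It does not insert page breaks between files, so a chapter starts on a new page only if the heading style in your reference document is set to break before it. Because the inputs are parsed as one document, numbering added with --number-sections continues across files.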
Controlling Styles: The Reference Document
By default, Pandoc applies a built-in style template that produces clean, functional DOCX output. For corporate documents, technical specifications, or anything that needs to match a house style, you need a reference document — a DOCX file whose paragraph and character styles Pandoc copies into every conversion.
Generating the Default Reference Document
pandoc -o reference.docx --print-default-data-file reference.docx
This creates a DOCX file in your current directory that contains Pandoc's built-in styles. Open it in LibreOffice Writer or Word:
- In LibreOffice: View → Styles (F11) to open the Styles panel
- Modify Heading 1, Heading 2, Body Text, Source Code, Verbatim Char, and Block Text to match your requirements
- Save the file (keep it as .docx, not .odt)
Applying the Reference Document
pandoc input.md --reference-doc=reference.docx -o output.docx
Every subsequent conversion using --reference-doc will inherit the fonts, spacing, margins, and heading colours from your reference file. The content comes from your Markdown; the presentation comes from the reference document.
What the Reference Document Does (and Does Not) Control
The reference document controls styles — it maps Pandoc's internal style names to your customised Word styles. It does not control inline formatting applied directly to text. For example, if your Markdown has **bold**, that will always be bold regardless of the reference document. But the paragraph style that wraps it (Body Text or First Paragraph) follows the reference document.
Styles that Pandoc uses from the reference document:
| Pandoc Style Name | Applies To |
|---|---|
| Heading 1 through Heading 6 | # through ###### headings |
| Body Text | Regular paragraphs |
| Verbatim Char | Inline code |
| Source Code | Fenced code blocks |
| Block Text | Blockquotes |
| Compact | Tight list items |
| Table | Table cell text |
| Caption | Image and table captions |
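To confirm which styles a reference document actually defines, you can read its styles.xml directly, since a DOCX file is a zip archive. A minimal sketch, assuming unzip is installed; the function names are illustrative. The IDs stored in the file typically drop the spaces (Heading1, BodyText, SourceCode), so expect them to differ slightly from the display names in the table.

```shell
#!/bin/sh
# extract_style_ids: read styles.xml on stdin and print each style ID once.
extract_style_ids() {
  grep -o 'w:styleId="[^"]*"' | sed 's/^w:styleId="//; s/"$//' | sort -u
}

# list_docx_styles FILE: print the style IDs defined in a DOCX file.
list_docx_styles() {
  unzip -p "$1" word/styles.xml | extract_style_ids
}
```

Running `list_docx_styles reference.docx` before and after editing the file is a quick way to see whether a style was renamed or removed by the word processor on save.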
YAML Front Matter: Document Metadata
Pandoc reads YAML front matter from the top of your Markdown file and populates DOCX document properties:
---
title: "Q3 Engineering Report"
author: "Engineering Team"
date: "2026-06-12"
abstract: "Summary of Q3 technical milestones."
---
These fields appear in File → Properties in Word / LibreOffice and are visible in document metadata tools. The abstract field is written as a styled paragraph before the body if the reference document has an Abstract style defined.
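Metadata can also be set from the command line with --metadata (short form -M), and a value given there overrides the one in the YAML block. The sketch below stamps each build with the current date; convert_with_date is a hypothetical helper name.

```shell
#!/bin/sh
# convert_with_date INPUT OUTPUT: convert INPUT to OUTPUT, setting the date
# field to today's date (YYYY-MM-DD) regardless of the YAML front matter.
convert_with_date() {
  pandoc "$1" --metadata date="$(date +%F)" -o "$2"
}

# Example: convert_with_date report.md report.docx
```

This keeps the source file free of dates that need manual updating, which is convenient for scheduled builds.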
Batch Conversion on Linux
Convert All .md Files in a Directory
for f in *.md; do
  pandoc "$f" -o "${f%.md}.docx"
done
The ${f%.md} parameter expansion strips the .md suffix. Each Markdown file becomes a sibling .docx file.
Batch With Reference Document Applied
for f in *.md; do
  pandoc "$f" --reference-doc=reference.docx -o "${f%.md}.docx"
done
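For larger directories it can be worth skipping files whose DOCX is already up to date. The following is a sketch under stated assumptions: convert_changed is an illustrative name, and the -nt file test is a widely supported shell extension (bash, dash) rather than strict POSIX. Add --reference-doc or other options to the pandoc call as needed.

```shell
#!/bin/sh
# convert_changed DIR: convert only the .md files in DIR that are newer than
# their .docx sibling, or that have no .docx yet. Prints each converted file.
convert_changed() {
  for f in "$1"/*.md; do
    [ -e "$f" ] || continue            # the glob matched nothing
    out="${f%.md}.docx"
    if [ ! -e "$out" ] || [ "$f" -nt "$out" ]; then
      if pandoc "$f" -o "$out"; then
        echo "converted: $f"
      else
        echo "failed: $f" >&2          # keep going with the remaining files
      fi
    fi
  done
}
```

Calling `convert_changed .` twice in a row converts everything on the first run and nothing on the second, until a Markdown file is edited.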
Recursive Conversion (All Subdirectories)
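# -print0 and read -d '' keep filenames with spaces or newlines intact (bash)
find . -name "*.md" -print0 | while IFS= read -r -d '' f; do
  # ${f%.md} keeps the directory part, so the .docx lands beside its source
  pandoc "$f" --reference-doc=reference.docx -o "${f%.md}.docx"
done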
This mirrors the directory tree, placing each .docx file beside its source .md.
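When the tree holds many files, the conversions can run in parallel, since each one is independent. A possible approach, assuming GNU find and xargs (for -print0, -0, -r, and -P); convert_tree_parallel is a hypothetical name.

```shell
#!/bin/sh
# convert_tree_parallel DIR JOBS: convert every .md under DIR, running up to
# JOBS pandoc processes at once. Each .docx is written beside its source.
convert_tree_parallel() {
  find "$1" -type f -name '*.md' -print0 |
    xargs -0 -r -P "$2" -n 1 sh -c 'pandoc "$1" -o "${1%.md}.docx"' _
}

# Example: convert_tree_parallel . 4
```

The null-delimited handoff keeps filenames with spaces intact, and -r stops xargs from running pandoc at all when no Markdown files are found.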
Combining All Files in Directory Order Into One Document
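pandoc *.md --reference-doc=reference.docx --toc -o combined.docx
Note: The shell expands *.md in sorted order and passes each filename as a separate argument, so names containing spaces are handled correctly. Avoid command substitution such as $(ls *.md) here, because it splits filenames on whitespace. If you need find to select the files, use find -print0 | xargs -0.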
Scheduled Conversion With Cron
For automated documentation pipelines, add a cron entry:
# crontab -e
0 2 * * 1 cd /home/user/docs && pandoc index.md --reference-doc=reference.docx -o /var/www/html/docs/spec.docx
This runs every Monday at 02:00 and regenerates the DOCX from the latest Markdown source.
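Cron runs jobs with a minimal environment: PATH is often just /usr/bin:/bin, so a Pandoc installed under /usr/local/bin or /snap/bin may not be found, and command output is lost unless it is redirected. A wrapper script along these lines addresses both. The run_logged helper and the log location are illustrative assumptions, not part of the setup described above.

```shell
#!/bin/sh
# Cron's default PATH is minimal; add the usual install locations explicitly.
PATH="/usr/local/bin:/snap/bin:$PATH"
export PATH

# run_logged LOGFILE CMD [ARGS...]: run CMD, append its output and exit
# status to LOGFILE, and return CMD's exit status.
run_logged() {
  log="$1"
  shift
  printf '%s start: %s\n' "$(date '+%F %T')" "$*" >> "$log"
  "$@" >> "$log" 2>&1
  status=$?
  printf '%s exit=%s\n' "$(date '+%F %T')" "$status" >> "$log"
  return "$status"
}

# Example call from the script that cron invokes:
#   cd /home/user/docs && run_logged "$HOME/pandoc-cron.log" \
#     pandoc index.md --reference-doc=reference.docx -o /var/www/html/docs/spec.docx
```

With the log in place, a failed Monday build leaves a timestamped record of Pandoc's error message instead of failing silently.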
Table of Contents and Section Numbers
# Add a table of contents
pandoc input.md --toc --toc-depth=3 -o output.docx
# Add numbered sections
pandoc input.md --number-sections -o output.docx
# Both together
pandoc input.md --toc --toc-depth=3 --number-sections -o output.docx
--toc-depth=3 includes H1, H2, and H3 in the TOC. The table of contents is a native Word field that Word and LibreOffice can update via right-click → Update Field.
Common Errors on Linux
pandoc: Could not find reference.docx
Pandoc resolves --reference-doc relative to the working directory, not the Markdown file's directory. If you run Pandoc from /home/user/ but your reference file is in /home/user/docs/, you need:
pandoc docs/input.md --reference-doc=docs/reference.docx -o docs/output.docx
Or use an absolute path:
pandoc input.md --reference-doc=/home/user/templates/reference.docx -o output.docx
Images Missing From Output
Pandoc resolves image paths relative to the working directory, not the source file. For a Markdown file at docs/guide.md that references an image by a relative path, run Pandoc from docs/:
cd docs && pandoc guide.md -o guide.docx
Or use --resource-path:
pandoc docs/guide.md --resource-path=docs/images -o guide.docx
For images under multiple subdirectories:
pandoc input.md --resource-path=.:images:assets/figures -o output.docx
The colon-separated list adds directories to Pandoc's search path. The . includes the working directory itself.
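Before converting, it can help to list the image references that will not resolve. The function below is a rough check, not a Markdown parser: missing_images is a hypothetical name, it only recognises inline ![alt](path) syntax, it skips remote URLs, and it tests each path relative to the Markdown file's own directory.

```shell
#!/bin/sh
# missing_images FILE: print image targets in FILE that do not exist relative
# to FILE's directory. Handles only inline ![alt](path) references.
missing_images() {
  dir=$(dirname "$1")
  grep -o '!\[[^]]*]([^) ]*' "$1" |
    sed 's/^!\[[^]]*](//' |
    while IFS= read -r target; do
      case "$target" in
        http://*|https://*|data:*|'') continue ;;   # not a local file
      esac
      [ -e "$dir/$target" ] || printf '%s\n' "$target"
    done
}

# Example: missing_images docs/guide.md
```

Any path it prints either needs fixing in the source or needs its parent directory added to --resource-path.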
Unicode Characters Appearing as Boxes or ?
This almost always means your terminal locale is set to something other than UTF-8, or the Markdown source file is not UTF-8 encoded.
Check and set your locale:
locale
# Should show UTF-8 in LANG and LC_ALL
# Set temporarily
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
# Set permanently (Debian/Ubuntu)
sudo locale-gen en_US.UTF-8
sudo update-locale LANG=en_US.UTF-8
To check and re-encode a Markdown file:
# Check encoding
file -i input.md
# Expected: text/plain; charset=utf-8
# Re-encode from ISO-8859-1 to UTF-8 if needed
iconv -f ISO-8859-1 -t UTF-8 input.md > input-utf8.md
pandoc input-utf8.md -o output.docx
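To check many files at once, iconv itself can act as a validator: converting from UTF-8 to UTF-8 fails on the first invalid byte sequence. A small sketch with illustrative function names:

```shell
#!/bin/sh
# is_utf8 FILE: succeed if FILE decodes cleanly as UTF-8.
is_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# report_non_utf8 FILE...: print the name of each file that is not valid UTF-8.
report_non_utf8() {
  for f in "$@"; do
    is_utf8 "$f" || printf '%s\n' "$f"
  done
}

# Example: report_non_utf8 *.md
```

This only tells you a file is not valid UTF-8; it does not identify the actual source encoding, so you still need to know (or guess) the -f value for the re-encoding step.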
[WARNING] Missing character: ...
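Pandoc relays this warning from the LaTeX engine when a PDF build meets a Unicode codepoint that the selected font has no glyph for, so you will typically see it when the same Markdown is also converted to PDF. The warning does not stop conversion but leaves a blank where the character should be. In DOCX output the same gap is silent: the character shows as a box or blank in Word or LibreOffice when the document font lacks the glyph.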
Fix: ensure your reference document uses a font with broad Unicode coverage. Liberation Serif, DejaVu Sans, Noto Serif, and Linux Libertine all cover Latin Extended, Cyrillic, Greek, and many other scripts. Noto fonts have the widest coverage and are free.
To install Noto fonts on Ubuntu/Debian:
sudo apt install fonts-noto
Then open your reference document and change the body font to Noto Serif or Noto Sans.
pandoc: /usr/bin/pandoc: not found After Snap Install
The Snap binary is at /snap/bin/pandoc, which requires /snap/bin to be in your PATH. Add it:
echo 'export PATH="/snap/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
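Re-running the echo command appends a duplicate line to ~/.bashrc each time. A guarded variant, sketched here with a hypothetical helper named path_prepend, changes PATH only when the directory is not already listed.

```shell
#!/bin/sh
# path_prepend DIR: put DIR at the front of PATH unless it is already listed.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already present, leave PATH unchanged
    *) PATH="$1:$PATH" ;;
  esac
  export PATH
}

# In ~/.bashrc:
#   path_prepend /snap/bin
```

Wrapping PATH in colons before matching avoids false positives on directories that merely share a prefix, such as /snap/bin-old.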
Older Pandoc Produces Different DOCX Styles
Pandoc 2.x and 3.x have different internal style names for some elements. If you generate a reference document with Pandoc 3.x and then use it with a Pandoc 2.x install (or vice versa), some styles will not apply. The solution is to generate the reference document with the same Pandoc version you use for conversion:
# Always generate reference.docx with the installed version
pandoc -o reference.docx --print-default-data-file reference.docx
Advanced: Lua Filters for Custom Output
Pandoc supports Lua filters — small scripts that transform the document AST during conversion. This is the right tool for changes that cannot be expressed through styles alone.
A practical example: automatically capitalising all Heading 1 elements:
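-- uppercase-h1.lua
-- Uppercase the text of every level-1 heading.
function Header(el)
  if el.level == 1 then
    -- walk_block applies the inner filter to the header's inline content
    -- and returns the modified header
    return pandoc.walk_block(el, {
      Str = function(s)
        -- string.upper only changes ASCII letters
        return pandoc.Str(string.upper(s.text))
      end
    })
  end
  return el
end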
Apply it:
pandoc input.md --lua-filter=uppercase-h1.lua -o output.docx
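When a filter does not seem to take effect, inspecting the document tree is usually faster than opening the DOCX. Pandoc's native output format prints the AST as text, so the result with and without the filter can be compared. A minimal sketch; show_ast is an illustrative helper name.

```shell
#!/bin/sh
# show_ast FILE [FILTER]: print Pandoc's native AST for FILE, optionally
# after applying the Lua filter FILTER.
show_ast() {
  if [ -n "${2:-}" ]; then
    pandoc "$1" --lua-filter="$2" -t native
  else
    pandoc "$1" -t native
  fi
}

# Example comparison:
#   show_ast input.md > before.txt
#   show_ast input.md uppercase-h1.lua > after.txt
#   diff before.txt after.txt
```

If the diff is empty, the filter function is not matching any element, which usually points to a wrong element name or condition rather than a DOCX styling problem.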
The Pandoc Lua filter library at github.com/pandoc/lua-filters has community-maintained filters for cross-references, list formatting, citation styles, and figure numbering.
When a GUI or Online Converter Is Easier
The command line is the right choice for automation, batch jobs, and reproducible pipelines. For single-file conversions where you do not need scripted output, there are simpler options.
ConvertIntoMP4's Markdown to DOCX converter runs Pandoc on the server and returns the DOCX file immediately — no install, no terminal. Upload your .md file, click Convert, and download the result. Useful when you are on a new machine without Pandoc installed, converting a file sent by a colleague, or helping a non-technical team member produce a Word document from a Markdown source.
For conversions that are part of a larger document format workflow — combining a DOCX with a PDF, converting DOCX to PDF for distribution — the document converter handles those steps.
Frequently Asked Questions
Does Pandoc on Linux support GitHub-Flavored Markdown (GFM)?
Yes. Use -f gfm as the input format:
pandoc -f gfm input.md -o output.docx
GFM adds strikethrough (~~text~~), task lists, and auto-linked URLs. The default markdown format is Pandoc's own extended Markdown dialect, which supports most GFM features already. If your Markdown source was written for GitHub, try the default first; if something renders incorrectly, switch to -f gfm.
Why does sudo apt install pandoc give me an old version?
APT installs the version in the Ubuntu repository, which lags behind upstream. Newer features (better DOCX table support, improved style mapping) require Pandoc 3.x. Install the .deb from the GitHub releases page or use conda install -c conda-forge pandoc to get the current upstream release.
Can I use a corporate Word template (.dotx) as the reference document?
Not directly. Pandoc's --reference-doc flag accepts a .docx file, not a .dotx template. Open your .dotx in LibreOffice Writer, save it as a regular .docx, and use that file as the reference document. The styles from the template will be preserved.
Does the reference document need to contain text?
No. Pandoc uses only the styles from the reference document — it ignores the actual content. Many teams maintain an empty reference document (just a DOCX with custom styles defined) specifically for Pandoc conversions. Some prefer to keep a few styled paragraphs as visual style guides within the file.
How do I add page numbers to the DOCX?
Page numbers are defined in the reference document's header/footer, not through a Pandoc command-line flag. Open your reference document in LibreOffice Writer, add a footer (Insert → Header and Footer → Footer → Default Page Style), insert a page number field (Insert → Field → Page Number), save, and use the file as your --reference-doc. All DOCX files produced with that reference document will include page numbers.
Is there a way to preview the output without opening LibreOffice?
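unoconv (if installed) can convert DOCX to PDF or HTML for quick inspection. It drives LibreOffice headlessly, so LibreOffice must be present on the system, but no window opens: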
unoconv -f pdf output.docx
Alternatively, Pandoc can convert the same Markdown directly to HTML for a fast browser preview, which is much quicker than the DOCX roundtrip:
pandoc input.md -o preview.html && xdg-open preview.html



