How to Convert PDF to EPUB for E-Readers — Blog | ConvertIntoMP4

What Converts Well (and What Does Not)

PDF Type	Conversion Quality	Notes
Text-only documents	Excellent	Clean paragraph extraction
Simple books (fiction)	Good	Minor formatting quirks
Technical books with code	Fair	Code blocks may lose formatting
Multi-column layouts	Poor	Columns get interleaved
Scanned PDFs (image-based)	Requires OCR first	See note below
Forms and interactive PDFs	Very poor	Interactivity is lost
PDFs with complex tables	Poor	Tables rarely survive intact

Scanned PDFs: If your PDF is a scan (the pages are images, not text), you must run OCR first to extract the text. See our guide on how to OCR scanned documents.
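
If you are not sure whether a PDF is a scan, a rough check is to count how many characters can be extracted from its first pages. The function below is a minimal sketch, assuming pdftotext (from poppler-utils) is installed; the name has_text_layer, the five-page sample, and the 100-character default threshold are illustrative choices, not fixed rules.

```shell
# Sketch: guess whether a PDF has a text layer by counting extracted characters.
# Assumes pdftotext (poppler-utils) is on PATH; the threshold is arbitrary.
has_text_layer() {
  pdf=$1
  min_chars=${2:-100}
  # Extract up to the first 5 pages to stdout and count non-whitespace characters.
  chars=$(pdftotext -f 1 -l 5 "$pdf" - 2>/dev/null | tr -d '[:space:]' | wc -c | tr -d ' ')
  [ "$chars" -ge "$min_chars" ]
}
```

For example, has_text_layer input.pdf || echo "Looks like a scan: run OCR first" prints the warning when little or no text comes out. A PDF with a poor OCR layer can still pass this check, so treat the result as a hint.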

Method 1: Using Calibre (Best Results)

Calibre is the gold standard for ebook conversion. Its PDF input plugin includes heuristic processing that detects paragraph boundaries, headers, and chapter breaks:

ebook-convert input.pdf output.epub \
  --enable-heuristics \
  --title "Book Title" \
  --authors "Author Name"

Key Calibre Options

ebook-convert input.pdf output.epub \
  --enable-heuristics \
  --unwrap-factor 0.45 \
  --no-images \
  --title "Book Title" \
  --authors "Author Name" \
  --language en \
  --chapter "//*[re:test(., 'Chapter|CHAPTER')]" \
  --page-breaks-before "//*[re:test(., 'Chapter|CHAPTER')]"

--enable-heuristics — activates paragraph detection and line unwrapping
--unwrap-factor 0.45 — controls how aggressively short lines are joined (0.0-1.0)
--no-images — skips images for a text-only conversion (faster, cleaner)
--chapter — XPath expression to detect chapter headings
--page-breaks-before — inserts page breaks before detected chapters
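
Before settling on a chapter pattern, it can help to see which lines of the PDF actually look like chapter headings. The sketch below filters extracted text with grep; the function name preview_chapters and the default pattern are assumptions for illustration. Calibre evaluates its XPath expression against the converted HTML, not raw text, so this only approximates what the converter will match.

```shell
# Sketch: list lines of extracted text that look like chapter headings.
# Reads plain text on stdin; the default pattern is an assumption to adapt per book.
preview_chapters() {
  pattern=${1:-'^[[:space:]]*(Chapter|CHAPTER)[[:space:]]+[0-9IVXLC]+'}
  # -n prefixes each match with its line number in the extracted text.
  grep -nE "$pattern"
}
```

Typical use would be pdftotext input.pdf - | preview_chapters. If ordinary sentences show up in the list, the pattern is too loose to use for chapter detection as is.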

Fine-Tuning Output

If Calibre merges lines incorrectly (common with poetry or code), reduce the unwrap factor:

ebook-convert input.pdf output.epub \
  --enable-heuristics --unwrap-factor 0.2

If headings are not detected, specify them manually:

ebook-convert input.pdf output.epub \
  --chapter "//h:h1|//h:h2" \
  --level1-toc "//h:h1" \
  --level2-toc "//h:h2"

Method 2: Using Pandoc

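Pandoc cannot read PDF directly (PDF is only an output format for it), so extract the text with pdftotext first and convert the result:

pdftotext input.pdf input.txt
pandoc input.txt -f markdown -o output.epub \
  --metadata title="Book Title" \
  --metadata author="Author Name" \
  --toc --toc-depth=2

This route is simpler than Calibre's. It works well for straightforward text documents but struggles with complex layouts, and because the extracted text carries no heading markup, the table of contents stays empty unless you add headings to input.txt before running Pandoc.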

Method 3: Online Conversion

Use the PDF to EPUB converter for quick conversions without installing any software. Upload your PDF and download a reflowable EPUB. For more ebook options, check our ebook format guide.

Improving Conversion Quality

Pre-Processing the PDF

For best results, prepare the PDF before conversion:

Extract text first to verify it is selectable (not scanned):
```
pdftotext input.pdf - | head -20
```

Remove headers and footers that will appear on every page in the EPUB:

# Calibre can strip known text with a regex search-and-replace
ebook-convert input.pdf output.epub \
  --sr1-search "running header text" \
  --sr1-replace ""

Check for embedded fonts — unusual fonts may cause character mapping issues
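
To find out which header or footer text to remove, one approach is to look for lines that repeat many times in the extracted text. The following is a minimal sketch; the function name repeated_lines and the default threshold of 5 occurrences are assumptions for illustration.

```shell
# Sketch: print lines that occur at least N times, most frequent first.
# Reads plain text on stdin, e.g. the output of pdftotext.
repeated_lines() {
  min_count=${1:-5}
  # Trim surrounding whitespace, drop empty lines, then count duplicates.
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
    | grep -v '^$' \
    | sort | uniq -c | sort -rn \
    | awk -v min="$min_count" '$1 >= min { sub(/^[[:space:]]*[0-9]+[[:space:]]/, ""); print }'
}
```

Running pdftotext input.pdf - | repeated_lines 10 lists candidates such as a book title printed on every page. Page numbers change from page to page, so they will not appear here and need a regex of their own.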

Post-Processing the EPUB

After conversion, open the EPUB in Calibre's editor (calibre → right-click → Edit Book) to:

Fix misdetected chapters
Remove duplicate headers/footers that survived conversion
Clean up formatting artifacts
Add a proper cover image
Edit the table of contents

Quality and Settings Tips

Font embedding: If the EPUB includes embedded fonts, use --subset-embedded-fonts to keep only the glyphs actually used in the text, which reduces file size.

Image quality: If the PDF contains images you want to preserve, Calibre extracts and re-encodes them. Control the quality with:

ebook-convert input.pdf output.epub \
  --output-profile kindle_oasis \
  --jpeg-quality 85

Cover image: PDFs do not have a designated cover. Extract the first page as an image and use it:

# Extract the first page as cover.png (-singlefile avoids a page-number suffix)
pdftoppm -f 1 -l 1 -png -singlefile input.pdf cover
# Use it in the conversion
ebook-convert input.pdf output.epub --cover cover.png

Table of contents: Calibre generates a TOC from detected chapters. If your PDF has a table of contents page, Calibre may detect it — but for best results, specify chapter detection patterns manually with --chapter.

For more on PDF processing, see our guide on how to convert PDF to Word.

Common Issues and Troubleshooting

Text runs together without paragraph breaks

The PDF uses line breaks instead of paragraph spacing. Increase the heuristic unwrap factor:

ebook-convert input.pdf output.epub \
  --enable-heuristics --unwrap-factor 0.6

Paragraphs are split mid-sentence

The unwrap factor is too low, so Calibre is not joining continued lines. Increase it to 0.45-0.55.
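
Because the right unwrap factor depends on the line lengths in each PDF, it is often quicker to convert the same file at several values and compare the results. The sketch below does that; the function name unwrap_sweep, the output naming scheme, and the EBOOK_CONVERT override variable are assumptions made for this example.

```shell
# Sketch: convert one PDF at several unwrap factors for side-by-side comparison.
# EBOOK_CONVERT is a hypothetical override for the converter command.
EBOOK_CONVERT=${EBOOK_CONVERT:-ebook-convert}

unwrap_sweep() {
  pdf=$1
  shift
  base=${pdf%.pdf}
  for factor in "$@"; do
    # e.g. book.pdf at 0.3 -> book-unwrap-0.3.epub
    "$EBOOK_CONVERT" "$pdf" "$base-unwrap-$factor.epub" \
      --enable-heuristics --unwrap-factor "$factor" || return 1
  done
}
```

For example, unwrap_sweep input.pdf 0.3 0.45 0.6 writes three EPUBs. Open the same chapter in each and keep the value that gives the cleanest paragraphs.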

Gibberish characters in output

The PDF uses non-standard character encoding or embedded fonts with custom character mappings. Try extracting text with pdftotext -layout input.pdf first. If the text is garbled there too, the issue is in the PDF itself — not the converter.
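
One way to narrow this down is to check whether the PDF's fonts carry a Unicode mapping, since a font without one is a common source of garbled extraction. The sketch below filters the output of pdffonts (poppler-utils). It assumes the usual column layout (name, type, encoding, emb, sub, uni, object ID) and font names without spaces; the function name fonts_without_unicode_map is illustrative.

```shell
# Sketch: from pdffonts output on stdin, print fonts that lack a ToUnicode map.
# Assumes the poppler column layout: name type encoding emb sub uni object ID.
fonts_without_unicode_map() {
  # Skip the two header lines. The "uni" column is third from the end
  # because "object ID" occupies the last two fields.
  awk 'NR > 2 && NF >= 5 && $(NF-2) == "no" { print $1 }'
}
```

Typical use would be pdffonts input.pdf | fonts_without_unicode_map. Any font it lists is a likely suspect, though a font with a mapping can still map characters incorrectly.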

Headers and footers repeated on every page

PDF headers/footers are positioned text, not metadata. The converter cannot automatically distinguish them from body text. Use Calibre's search-and-replace during conversion to remove known header/footer text, or manually remove them in the EPUB editor after conversion.

Images are missing or low quality

PDF images may be compressed in formats the converter does not handle well. Try converting with explicit image quality settings. For image-heavy PDFs (textbooks, art books), a fixed-layout EPUB may be more appropriate than a reflowable one.

Conclusion

PDF-to-EPUB conversion works best with simple, text-heavy documents — novels, reports, articles, and documentation. Calibre with heuristic processing gives the best results for most content. For complex, multi-column, or image-heavy PDFs, expect to do some post-conversion cleanup in Calibre's EPUB editor. The effort is worth it: a properly converted EPUB provides a dramatically better reading experience on e-readers and phones than a PDF ever can.

Ready to convert? Try our free PDF to EPUB converter — no registration required.