From Browser to Document
Web pages are designed for screens. PDFs are designed for paper. Converting between them requires bridging two fundamentally different layout models -- one that scrolls infinitely and adapts to any screen width, and one that has fixed pages with defined dimensions and margins.
Despite this mismatch, HTML-to-PDF conversion is one of the most common document workflows. Invoices generated by web applications need to be downloaded as PDFs. Articles and reports published online need printable versions. Web-based dashboards need to produce executive summaries. Documentation sites need offline-readable exports.
The good news is that the conversion tools have matured significantly. Browser engines now produce high-quality PDF output through their built-in print functionality. Automation tools like Puppeteer and Playwright make this programmable. CSS provides dedicated print media queries that let you control exactly how your web content translates to pages.
This guide covers every approach: manual browser-based conversion for one-off needs, automated conversion for production workflows, and CSS optimization for print-ready web pages.

Method 1: Browser Print-to-PDF
Every modern browser includes a built-in PDF generator in its print dialog. This is the simplest approach and works for most web pages, subject to the limitations listed at the end of this section.
Chrome / Edge / Brave
- Open the web page you want to convert
- Press Ctrl+P (Windows/Linux) or Cmd+P (macOS)
- Change the "Destination" to "Save as PDF"
- Configure options:
  - Layout: Portrait or Landscape
  - Paper size: Letter, A4, Legal, etc.
  - Margins: Default, None, Minimum, or Custom
  - Scale: Adjust to fit content (default is 100%)
  - Headers and footers: Toggle date, URL, page numbers, and title
  - Background graphics: Enable to include background colors and images
- Click "Save" and choose a file location
Firefox
- Open the web page
- Press Ctrl+P / Cmd+P
- Set the destination to "Save to PDF" (on Windows, the system "Microsoft Print to PDF" printer also works)
- Configure paper size, orientation, and scale
- Click Print/Save
Safari (macOS)
- Open the web page
- Press Cmd+P
- Click the "PDF" dropdown in the lower-left corner
- Select "Save as PDF"
- Configure options and save
Pro Tip: Before printing to PDF, check the page's print preview carefully. Many websites look dramatically different in print mode because their CSS includes @media print rules that hide navigation, ads, and interactive elements. If content that was visible on screen is missing from the PDF, the site's print stylesheet may be hiding it. If only background colors or images are missing, enable "Background graphics".
Browser Print-to-PDF Limitations
| Limitation | Impact | Workaround |
|---|---|---|
| No control over page breaks | Content splits awkwardly between pages | Use CSS page-break rules (see below) |
| Headers/footers limited | Only basic metadata (URL, date, page number) | Use CSS-based headers/footers |
| No programmatic access | Cannot automate or batch process | Use Puppeteer/Playwright |
| JavaScript rendering | Some SPAs may not render fully | Wait for page load, or use SSR version |
| Authentication | Cannot access login-protected pages easily | Log in first, then print |
| Dynamic content | Infinite scroll pages only capture visible content | Expand all content before printing |
Method 2: Online Conversion Tools
For quick one-off conversions without dealing with browser settings, online tools provide a streamlined workflow.
The HTML converter on ConvertIntoMP4 accepts HTML files or URLs and converts them to PDF with configurable options for page size, margins, and orientation. The PDF converter also handles HTML input as part of its multi-format conversion pipeline.
For converting other document types to PDF, the document converter supports a wide range of formats. If you are working with Markdown files specifically, see our guide on Markdown to PDF conversion, which covers the unique considerations for plain-text markup conversion.
Method 3: Puppeteer (Node.js Automation)
Puppeteer controls a headless Chrome browser programmatically, giving you full control over the PDF generation process. This is the standard approach for production applications that generate PDFs from HTML templates.
Basic Setup
npm install puppeteer
Basic Conversion
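const puppeteer = require("puppeteer");

async function htmlToPdf(url, outputPath) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // networkidle0: wait until there have been no network connections for 500 ms
    await page.goto(url, { waitUntil: "networkidle0" });
    await page.pdf({
      path: outputPath,
      format: "A4",
      margin: {
        top: "20mm",
        right: "15mm",
        bottom: "20mm",
        left: "15mm",
      },
      // Include background colors and images
      printBackground: true,
    });
  } finally {
    // Always close the browser, even if navigation or rendering fails
    await browser.close();
  }
}

htmlToPdf("https://example.com/report", "report.pdf").catch(console.error);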
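Launching a browser is usually the most expensive step, so when several pages need converting it can help to share one browser instance. The following is a minimal sketch under that assumption. The names convertAll and jobs are hypothetical, chosen for illustration, and the browser argument is expected to be whatever puppeteer.launch() returns.

```javascript
// Hypothetical helper: convert several URLs with one shared browser.
// `browser` is assumed to be the object returned by puppeteer.launch();
// `jobs` is an illustrative array of { url, outputPath } objects.
async function convertAll(browser, jobs, pdfOptions = { format: "A4", printBackground: true }) {
  const results = [];
  for (const { url, outputPath } of jobs) {
    const page = await browser.newPage();
    try {
      await page.goto(url, { waitUntil: "networkidle0" });
      await page.pdf({ ...pdfOptions, path: outputPath });
      results.push({ url, outputPath, ok: true });
    } catch (error) {
      // Record the failure and carry on with the remaining jobs
      results.push({ url, outputPath, ok: false, error: error.message });
    } finally {
      // Close each page so memory use stays flat across a long batch
      await page.close();
    }
  }
  return results;
}
```

The caller stays responsible for launching and closing the browser, which keeps the helper easy to test and lets one browser serve many batches.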
Advanced Options
await page.pdf({
  path: "report.pdf",
  format: "A4",
  landscape: false,
  printBackground: true,
  // The header and footer are drawn inside the top and bottom margins
  margin: { top: "25mm", right: "20mm", bottom: "25mm", left: "20mm" },
  displayHeaderFooter: true,
  headerTemplate: `
    <div style="font-size: 9px; width: 100%; text-align: center; color: #666;">
      Company Name - Confidential Report
    </div>
  `,
  footerTemplate: `
    <div style="font-size: 9px; width: 100%; text-align: center; color: #666;">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>
  `,
  // When true, an @page size declared in the page's CSS takes priority over `format`
  preferCSSPageSize: false,
  scale: 1,
});
Converting HTML Strings
You do not need a live URL. Puppeteer can convert raw HTML strings:
const html = `
  <!DOCTYPE html>
  <html>
    <head>
      <style>
        body { font-family: Arial, sans-serif; padding: 20px; }
        h1 { color: #333; }
        table { border-collapse: collapse; width: 100%; }
        th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
      </style>
    </head>
    <body>
      <h1>Invoice #12345</h1>
      <table>
        <tr><th>Item</th><th>Quantity</th><th>Price</th></tr>
        <tr><td>Widget A</td><td>10</td><td>$50.00</td></tr>
        <tr><td>Widget B</td><td>5</td><td>$30.00</td></tr>
      </table>
    </body>
  </html>
`;

await page.setContent(html, { waitUntil: "networkidle0" });
await page.pdf({ path: "invoice.pdf", format: "A4" });

Method 4: Playwright (Multi-Browser Automation)
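Playwright is a separate browser-automation library from Microsoft, started by engineers who had previously worked on Puppeteer, and it adds multi-browser support. The PDF API is similar: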
const { chromium } = require("playwright");

async function htmlToPdf(url, outputPath) {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle" });
  await page.pdf({
    path: outputPath,
    format: "A4",
    margin: { top: "20mm", right: "15mm", bottom: "20mm", left: "15mm" },
    printBackground: true,
  });
  await browser.close();
}
Playwright supports Chromium, Firefox, and WebKit, which matters if the same codebase also runs cross-browser tests. PDF generation itself, however, is currently supported only by the Chromium engine in Playwright, so PDF output cannot be compared across rendering engines.
Method 5: Command-Line Tools
wkhtmltopdf
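A standalone command-line tool that renders HTML to PDF with an older Qt WebKit engine. The project is no longer actively developed and its engine predates much of modern CSS (CSS Grid, for example), so it is best suited to simple, self-controlled templates: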
# Basic conversion
wkhtmltopdf https://example.com/report report.pdf

# With options
wkhtmltopdf \
  --page-size A4 \
  --margin-top 20mm \
  --margin-bottom 20mm \
  --margin-left 15mm \
  --margin-right 15mm \
  --header-center "Company Report" \
  --footer-center "Page [page] of [topage]" \
  --print-media-type \
  input.html output.pdf
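If the CLI is driven from a Node.js service, one option is to build the argument list in code and pass it to execFile, which avoids shell quoting problems. This is only a sketch: buildWkhtmltopdfArgs, runWkhtmltopdf, and the options shape are hypothetical names, and only flags already shown above are used. It assumes wkhtmltopdf is installed and on the PATH.

```javascript
const { execFile } = require("node:child_process");

// Hypothetical helper: turn an options object into wkhtmltopdf arguments.
function buildWkhtmltopdfArgs(input, output, options = {}) {
  const args = [];
  if (options.pageSize) args.push("--page-size", options.pageSize);
  for (const side of ["top", "bottom", "left", "right"]) {
    const value = options.margins && options.margins[side];
    if (value) args.push(`--margin-${side}`, value);
  }
  if (options.printMediaType) args.push("--print-media-type");
  // Input and output always come last
  args.push(input, output);
  return args;
}

// Hypothetical wrapper: resolves with the output path when the tool exits cleanly.
// execFile passes arguments as an array, so no shell interpolation takes place.
function runWkhtmltopdf(input, output, options) {
  return new Promise((resolve, reject) => {
    execFile("wkhtmltopdf", buildWkhtmltopdfArgs(input, output, options), (error) =>
      error ? reject(error) : resolve(output)
    );
  });
}
```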
WeasyPrint (Python)
A Python library that converts HTML/CSS to PDF using its own rendering engine:
Install it and convert from the command line:
pip install weasyprint
weasyprint https://example.com/report report.pdf
Or call it from Python:
from weasyprint import HTML

# From URL
HTML("https://example.com/report").write_pdf("report.pdf")
# From file
HTML(filename="report.html").write_pdf("report.pdf")
# From string
HTML(string="<h1>Hello World</h1>").write_pdf("hello.pdf")
WeasyPrint has excellent CSS support, including CSS Paged Media, making it ideal for documents that need precise page-level control.
| Tool | Rendering Engine | CSS Support | Speed | Best For |
|---|---|---|---|---|
| Browser Print | Blink / WebKit / Gecko | Full | Fast (manual) | One-off conversions |
| Puppeteer | Chromium | Full | Medium | Node.js automation |
| Playwright | Chromium (PDF output) | Full | Medium | Teams already using Playwright for browser testing |
| wkhtmltopdf | WebKit (older) | Good | Fast | Simple CLI automation |
| WeasyPrint | Custom | Excellent (Paged Media) | Slow | Precise page layout |
| Prince XML | Custom | Best (Paged Media) | Fast | Professional publishing |
CSS for Print: Controlling the Output
CSS provides dedicated features for controlling how content renders in print. These rules are ignored on screen but applied when generating PDFs.
The @media print Query
@media print {
  /* Hide navigation, ads, interactive elements */
  nav,
  .sidebar,
  .ads,
  .no-print,
  button {
    display: none !important;
  }

  /* Reset backgrounds and colors for print */
  body {
    background: white;
    color: black;
    font-size: 12pt;
    line-height: 1.5;
  }

  /* Ensure links show their URL */
  a[href]::after {
    content: " (" attr(href) ")";
    font-size: 10pt;
    color: #666;
  }
}
Page Break Control
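/* The page-break-* properties are the legacy names and remain widely supported.
   The current equivalents are break-before: page, break-after: avoid,
   and break-inside: avoid. */

/* Force a page break before chapter headings */
h1 {
  page-break-before: always;
}

/* Prevent headings from appearing at the bottom of a page */
h2,
h3 {
  page-break-after: avoid;
}

/* Keep images and their captions together */
figure {
  page-break-inside: avoid;
}

/* Keep table rows from splitting across pages */
tr {
  page-break-inside: avoid;
}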
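When the page being converted is not yours to edit, rules like these can be injected at conversion time instead. The sketch below assumes a Puppeteer page that has already loaded its content. The names addPrintBreakRules and PRINT_BREAK_CSS are hypothetical, while page.addStyleTag is the Puppeteer call that appends a style element to the document.

```javascript
// Print rules mirroring the page-break examples above
const PRINT_BREAK_CSS = `
@media print {
  h2, h3 { page-break-after: avoid; }
  figure, tr { page-break-inside: avoid; }
}`;

// Hypothetical helper: `page` is assumed to be a Puppeteer Page that has
// already navigated (or had setContent called) before this runs.
async function addPrintBreakRules(page, css = PRINT_BREAK_CSS) {
  // addStyleTag appends a <style> element containing `css` to the document
  await page.addStyleTag({ content: css });
  return css;
}
```

Call it after page.goto() or page.setContent() and before page.pdf(), so the injected rules are in place when the pages are laid out.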
CSS Page Size and Margins
@page {
  size: A4;
  margin: 25mm 20mm;
}

/* Different margins for first page */
@page :first {
  margin-top: 40mm;
}

/* Different margins for left/right pages (for binding) */
@page :left {
  margin-left: 30mm;
  margin-right: 15mm;
}

@page :right {
  margin-left: 15mm;
  margin-right: 30mm;
}
Pro Tip: Test your print CSS using Chrome DevTools. Open DevTools, press Ctrl+Shift+P (Command Palette), type "rendering," open the Rendering panel, and change "Emulate CSS media type" to "print." This shows you exactly how the page will render in a PDF without actually generating one, making it much faster to iterate on print styles.
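The same choice of media type is available in automation. Puppeteer's page.pdf() renders with print media by default, and page.emulateMediaType() can switch it to screen when the on-screen design is what should appear in the PDF. A minimal sketch follows, where pdfWithMedia is a hypothetical helper name and page is assumed to be a Puppeteer page.

```javascript
// Hypothetical helper: choose which CSS media type the PDF is rendered with.
// `page` is assumed to be a Puppeteer Page; page.pdf() uses "print" by default.
async function pdfWithMedia(page, mediaType, pdfOptions = {}) {
  if (mediaType !== "print" && mediaType !== "screen") {
    throw new TypeError(`Unsupported media type: ${mediaType}`);
  }
  // Applies the matching @media rules before the PDF is laid out
  await page.emulateMediaType(mediaType);
  return page.pdf({ printBackground: true, ...pdfOptions });
}
```

Rendering with screen media keeps navigation bars and other elements that a print stylesheet would hide, so it suits dashboards better than long-form articles.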

Headers, Footers, and Page Numbers
Browser and Puppeteer Headers/Footers
Puppeteer and browser-based PDF generation support header and footer templates with special CSS classes:
- <span class="date"></span> -- Current date
- <span class="title"></span> -- Document title
- <span class="url"></span> -- Document URL
- <span class="pageNumber"></span> -- Current page number
- <span class="totalPages"></span> -- Total page count
await page.pdf({
  displayHeaderFooter: true,
  headerTemplate: `
    <div style="font-size: 8px; width: 100%; padding: 0 20mm;">
      <span style="float: left;">My Company</span>
      <span style="float: right;"><span class="date"></span></span>
    </div>
  `,
  footerTemplate: `
    <div style="font-size: 8px; width: 100%; text-align: center;">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>
  `,
  margin: { top: "30mm", bottom: "25mm", left: "15mm", right: "15mm" },
});
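Note that the header and footer render in their own context with very limited CSS support. Styles must be embedded in the template itself (inline style attributes are the safest choice); the page's own styles and external stylesheets are not applied, and scripts inside the templates do not run. Setting an explicit font size is advisable, and the top and bottom margins must be large enough for the templates to fit.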
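Because the templates are plain HTML strings, any dynamic text placed in them (a company or customer name, for instance) should be HTML-escaped first. The sketch below shows one way to do that. Both escapeHtml and buildHeaderFooter are hypothetical helper names, and the returned object is meant to be spread into the options passed to page.pdf().

```javascript
// Escape text before placing it inside an HTML template
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: returns the header/footer part of the page.pdf() options.
function buildHeaderFooter(companyName) {
  // Inline styles only, since external stylesheets are not applied to templates
  const base = "font-size: 8px; width: 100%; text-align: center;";
  return {
    displayHeaderFooter: true,
    headerTemplate: `<div style="${base}">${escapeHtml(companyName)}</div>`,
    footerTemplate:
      `<div style="${base}">Page <span class="pageNumber"></span> ` +
      `of <span class="totalPages"></span></div>`,
  };
}
```

A call might look like page.pdf({ format: "A4", margin: { top: "30mm", bottom: "25mm" }, ...buildHeaderFooter("My Company") }).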
CSS-Based Page Numbers (WeasyPrint / Prince)
For tools with full CSS Paged Media support:
@page {
  @bottom-center {
    content: "Page " counter(page) " of " counter(pages);
    font-size: 9pt;
    color: #666;
  }
  @top-left {
    content: "Company Name";
    font-size: 9pt;
    color: #666;
  }
  @top-right {
    content: string(chapter-title);
    font-size: 9pt;
    color: #666;
  }
}

h1 {
  string-set: chapter-title content();
}
Common Conversion Challenges
Dynamic Content and Single-Page Applications
Modern web applications built with React, Vue, or Angular render content dynamically with JavaScript. A basic HTTP fetch of the page URL will get an empty shell. Browser-based tools (Puppeteer, Playwright, browser print) handle this correctly because they execute JavaScript. Non-browser tools (wget, curl, basic converters) will fail.
When using Puppeteer, wait for dynamic content to render:
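// Wait for network activity to settle during navigation
await page.goto(url, { waitUntil: "networkidle0" });
// Or, after navigating, wait for a specific element to appear
await page.waitForSelector(".report-data");
// Or wait for a fixed amount of time (last resort).
// page.waitForTimeout() has been removed from recent Puppeteer releases,
// so a plain timer is used here instead.
await new Promise((resolve) => setTimeout(resolve, 3000));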
Web Fonts
Web pages that load fonts from Google Fonts, Adobe Fonts, or custom servers need those fonts to be fully loaded before PDF generation. Most browser-based tools handle this automatically during the networkidle wait. If fonts are missing in the PDF, add an explicit font-loading wait:
await page.evaluateHandle("document.fonts.ready");
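The selector wait and the font wait are often needed together, so they can be wrapped in one step that runs just before page.pdf(). This is a sketch rather than a definitive recipe. The name waitUntilRenderable and the default timeout are assumptions, and page is assumed to be a Puppeteer page that has already navigated.

```javascript
// Hypothetical helper: `page` is assumed to be a Puppeteer Page that has
// already navigated to the document being converted.
async function waitUntilRenderable(page, selector, timeoutMs = 10000) {
  // Rejects with a timeout error if the element never appears
  await page.waitForSelector(selector, { timeout: timeoutMs });
  // Runs in the browser: document.fonts.ready resolves once pending
  // font loads have finished. Nothing is returned to Node.
  await page.evaluate(async () => {
    await document.fonts.ready;
  });
}
```

An explicit timeout makes a missing element fail quickly with a clear error instead of producing a PDF of a half-rendered page.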
Responsive Layouts
Responsive web pages adapt their layout based on viewport width. The PDF generator uses a specific viewport size that may not match the layout you expect. Set the viewport explicitly:
await page.setViewport({ width: 1200, height: 800 });
For content optimized for print, consider setting a viewport width that matches your page width (e.g., 794px for A4 at 96 DPI).
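The 794px figure comes from converting the paper width to inches and multiplying by 96 CSS pixels per inch: 210 mm / 25.4 * 96 is roughly 793.7. The sketch below generalizes that arithmetic; mmToPx, PAPER_MM, and viewportFor are hypothetical names for illustration.

```javascript
const CSS_PX_PER_INCH = 96;
const MM_PER_INCH = 25.4;

// Convert millimetres to CSS pixels, rounded to the nearest whole pixel
function mmToPx(mm) {
  return Math.round((mm / MM_PER_INCH) * CSS_PX_PER_INCH);
}

// Paper sizes in millimetres (ISO A4 and US Letter)
const PAPER_MM = {
  A4: { width: 210, height: 297 },
  Letter: { width: 215.9, height: 279.4 },
};

// Hypothetical helper: viewport dimensions matching a full sheet of paper
function viewportFor(paper, landscape = false) {
  const size = PAPER_MM[paper];
  if (!size) throw new RangeError(`Unknown paper size: ${paper}`);
  return {
    width: mmToPx(landscape ? size.height : size.width),
    height: mmToPx(landscape ? size.width : size.height),
  };
}

console.log(viewportFor("A4")); // { width: 794, height: 1123 }
```

The result can be passed straight to page.setViewport(). These values cover the full sheet, so the printable area is narrower once margins are subtracted.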
Authentication
Pages behind a login require authentication before conversion. With Puppeteer:
// Navigate to login page
await page.goto("https://example.com/login");
// Fill in credentials
await page.type("#email", "user@example.com");
await page.type("#password", "password");
// Start waiting for the navigation before clicking, so that a fast
// redirect is not missed
await Promise.all([
  page.waitForNavigation(),
  page.click("#login-button"),
]);
// Now convert the authenticated page
await page.goto("https://example.com/report");
await page.pdf({ path: "report.pdf", format: "A4" });
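Where the application accepts token-based authentication, scripting the login form can sometimes be avoided by sending an Authorization header instead. Whether this works depends entirely on the target application, so treat the following as a sketch under that assumption. The name openWithToken is hypothetical, and page.setExtraHTTPHeaders is the Puppeteer call that adds headers to the page's requests.

```javascript
// Hypothetical helper: attach a bearer token, then load the protected page.
// `page` is assumed to be a Puppeteer Page; the target application is
// assumed to accept "Authorization: Bearer <token>".
async function openWithToken(page, url, token) {
  if (!token) throw new Error("Missing API token");
  // The header is added to every request this page makes
  await page.setExtraHTTPHeaders({ Authorization: `Bearer ${token}` });
  await page.goto(url, { waitUntil: "networkidle0" });
}
```

Because the header accompanies every request the page issues, including requests to third-party hosts, this approach is best kept to pages whose resources you control.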
Production Workflow: Invoice Generation Example
Here is a complete example of a production invoice generation workflow using Puppeteer:
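const puppeteer = require("puppeteer");

// buildInvoiceHTML(invoiceData) is assumed to be defined elsewhere and to
// return a complete HTML document as a string.
async function generateInvoice(invoiceData) {
  const html = buildInvoiceHTML(invoiceData);
  const browser = await puppeteer.launch({
    // Disabling the sandbox is often required in containers, but it weakens
    // isolation, so only render HTML you trust with these flags.
    args: ["--no-sandbox", "--disable-setuid-sandbox"],
  });
  try {
    const page = await browser.newPage();
    await page.setContent(html, { waitUntil: "networkidle0" });
    const pdfBuffer = await page.pdf({
      format: "A4",
      margin: { top: "20mm", right: "15mm", bottom: "25mm", left: "15mm" },
      printBackground: true,
      displayHeaderFooter: true,
      // An empty header suppresses Chromium's default date/title header
      headerTemplate: "<span></span>",
      footerTemplate: `
        <div style="font-size: 8px; width: 100%; text-align: center; color: #999;">
          Invoice ${invoiceData.number} - Generated ${new Date().toLocaleDateString()}
          - Page <span class="pageNumber"></span>
        </div>
      `,
    });
    return pdfBuffer;
  } finally {
    // Close the browser even if rendering fails, to avoid leaking processes
    await browser.close();
  }
}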
This pattern works for any document type: reports, receipts, certificates, shipping labels, and more. For documents that start as Word files rather than HTML, see our guide on how to convert Word to PDF.
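The example calls buildInvoiceHTML without showing it. One possible shape is sketched below, modelled on the invoice markup from the "Converting HTML Strings" section. The field names (number, items, name, quantity, price) are assumptions about invoiceData, price is treated as a unit price, and the escapeHtml helper is included so that customer-supplied text cannot inject markup.

```javascript
// Escape text before placing it inside HTML
function escapeHtml(text) {
  return String(text).replace(/[&<>"]/g, (ch) =>
    ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" }[ch])
  );
}

// Hypothetical template builder. Assumes invoiceData looks like
// { number: "12345", items: [{ name, quantity, price }] } with numeric
// quantity and unit price.
function buildInvoiceHTML(invoiceData) {
  const rows = invoiceData.items
    .map(
      (item) =>
        `<tr><td>${escapeHtml(item.name)}</td><td>${item.quantity}</td>` +
        `<td>$${item.price.toFixed(2)}</td></tr>`
    )
    .join("\n");
  const total = invoiceData.items.reduce(
    (sum, item) => sum + item.quantity * item.price,
    0
  );
  return `<!DOCTYPE html>
<html>
<head>
<style>
  body { font-family: Arial, sans-serif; padding: 20px; }
  table { border-collapse: collapse; width: 100%; }
  th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
  tr { page-break-inside: avoid; }
</style>
</head>
<body>
<h1>Invoice #${escapeHtml(invoiceData.number)}</h1>
<table>
<tr><th>Item</th><th>Quantity</th><th>Unit price</th></tr>
${rows}
</table>
<p>Total: $${total.toFixed(2)}</p>
</body>
</html>`;
}
```

Keeping the template a pure function of its data means it can be unit-tested and previewed in a normal browser without launching Puppeteer.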
Wrapping Up
HTML-to-PDF conversion sits at the intersection of web development and document management. For casual one-off conversions, the browser's built-in print-to-PDF is sufficient. For production applications generating documents at scale, Puppeteer or Playwright provide the automation and control that developers need. And for pixel-perfect print output with advanced page features, WeasyPrint and Prince XML offer full CSS Paged Media support.
The key to good results is controlling the output through CSS @media print rules, explicit page break management, and proper header/footer configuration. Invest time in your print stylesheet, and the conversion becomes a single function call rather than a manual formatting exercise.



