OCR API — Free Tier + Pay-As-You-Go

Optical Character Recognition for PDFs and images via API. 100+ languages, searchable PDF output, plain-text extraction.

What it does

The OCR API runs Tesseract 5 over scanned PDFs, photographs of documents, and image files. It produces one of four outputs: plain text, hOCR (HTML-based markup with bounding boxes), TSV, or a searchable PDF (the original document with an invisible text layer added for full-text search).

Tesseract supports 100+ languages, with strong accuracy for Latin scripts (typically 95-99% character recognition on clean 300 DPI scans) and acceptable accuracy for Cyrillic, Greek, Arabic, Hebrew, Chinese, Japanese, and Korean (roughly 85-95% on sources of equivalent quality, depending on script).

Per-job parameters expose the full Tesseract surface: language (`ocrLanguage=eng+fra+deu` for multilingual documents), PSM (page segmentation mode, 1-13), OEM (engine mode, 0-3), confidence threshold, output format, dewarping (`autoRotate=true` corrects skew), and pre-processing (`enhance=true` applies adaptive thresholding before OCR, which improves results on noisy scans).
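As a rough illustration of how the per-job parameters fit together, the sketch below assembles the form fields for one request. Only `ocrLanguage`, `output`, `autoRotate`, and `enhance` appear on this page; the field names `psm` and `oem` and the helper `build_ocr_fields` are assumptions made for the example, so check the API docs for the real names. The default values 3 and 3 mirror Tesseract's own defaults.

```python
from typing import Dict, Iterable


def build_ocr_fields(
    languages: Iterable[str],
    output: str = "searchable-pdf",
    psm: int = 3,
    oem: int = 3,
    auto_rotate: bool = False,
    enhance: bool = False,
) -> Dict[str, str]:
    """Build form fields for one OCR job (hypothetical helper, not part of the SDK)."""
    langs = [lang.strip() for lang in languages if lang.strip()]
    if not langs:
        raise ValueError("at least one language code is required")
    if not 1 <= psm <= 13:  # range as stated on this page
        raise ValueError("psm must be between 1 and 13")
    if not 0 <= oem <= 3:  # range as stated on this page
        raise ValueError("oem must be between 0 and 3")
    return {
        "ocrLanguage": "+".join(langs),  # e.g. eng+fra+deu
        "output": output,
        "psm": str(psm),  # assumed field name
        "oem": str(oem),  # assumed field name
        "autoRotate": "true" if auto_rotate else "false",
        "enhance": "true" if enhance else "false",
    }


fields = build_ocr_fields(["eng", "fra", "deu"], enhance=True)
print(fields["ocrLanguage"])  # eng+fra+deu
```

Validating the ranges on the client side means a typo fails before the file is uploaded.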

For PDFs, OCR is applied page-by-page with parallel processing; multi-language detection runs first to pick the optimal language model per page when `ocrLanguage=auto` is set.
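The page-by-page fan-out can be pictured with a small sketch. This is not the service's implementation, only one way to run a per-page recognition function concurrently while keeping the results in page order; `recognise_page` stands in for whatever actually performs OCR and is purely hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def ocr_pages_in_parallel(
    pages: List[bytes],
    recognise_page: Callable[[bytes], str],
    max_workers: int = 4,
) -> List[str]:
    """Run recognise_page over every page concurrently; results keep page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, whichever page finishes first.
        return list(pool.map(recognise_page, pages))


def fake_recognise(page: bytes) -> str:
    """Stand-in for a real OCR call: upper-cases the page bytes."""
    return page.decode("ascii").upper()


texts = ocr_pages_in_parallel([b"page one", b"page two", b"page three"], fake_recognise)
print(texts)  # ['PAGE ONE', 'PAGE TWO', 'PAGE THREE']
```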

Supported formats

Source formats (9)

pdf
jpg
jpeg
png
tiff
tif
bmp
webp
heic

Target formats (4)

pdf
txt
hocr
tsv

Quick start

All three examples make the same request: a single POST to /v1/ocr with your API key in the X-Api-Key header.

curl

curl -X POST https://api.convertintomp4.com/v1/ocr \
  -H "X-Api-Key: ck_your_api_key" \
  -F "file=@scan.pdf" \
  -F "ocrLanguage=eng" \
  -F "output=searchable-pdf"

Node.js (@convertintomp4/sdk)

import { ConvertIntoMP4 } from "@convertintomp4/sdk";
import fs from "node:fs";

const client = new ConvertIntoMP4({ apiKey: process.env.CIM4_API_KEY });

const job = await client.ocr({
  file: fs.createReadStream("scan.pdf"),
  ocrLanguage: "eng",
  output: "searchable-pdf",
});

const result = await client.waitForJob(job.id);
console.log("Searchable PDF:", result.outputUrl);

Python (convertintomp4)

from convertintomp4 import Client

client = Client(api_key="ck_your_api_key")

with open("scan.pdf", "rb") as f:
    job = client.ocr(file=f, ocr_language="eng", output="searchable-pdf")

result = client.wait_for_job(job.id)
print("Searchable PDF:", result.output_url)

Features

Tesseract 5 with 100+ language packs
Output: searchable PDF, plain text, hOCR, TSV
Auto-rotation and dewarping
Adaptive thresholding for noisy scans
Per-page parallel processing
Auto-language detection (multi-page PDFs)
Confidence-threshold filtering
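To show what confidence-threshold filtering looks like on the client side, here is a minimal sketch that reads word boxes and confidences out of hOCR. It assumes the API returns hOCR in the shape Tesseract itself emits (`ocrx_word` spans whose `title` holds `bbox` and `x_wconf`); the sample markup and the threshold of 60 are made up for the example.

```python
import re
from html.parser import HTMLParser
from typing import List, NamedTuple, Optional, Tuple


class Word(NamedTuple):
    text: str
    bbox: Tuple[int, ...]  # (x0, y0, x1, y1) in pixels
    conf: int  # word confidence, 0-100


class HocrWords(HTMLParser):
    """Collect every ocrx_word span with its bounding box and confidence."""

    def __init__(self) -> None:
        super().__init__()
        self.words: List[Word] = []
        self._pending: Optional[Tuple[Tuple[int, ...], int]] = None
        self._text: List[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "span" and "ocrx_word" in (attr.get("class") or "").split():
            title = attr.get("title") or ""
            box = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", title)
            conf = re.search(r"x_wconf (\d+)", title)
            if box:
                bbox = tuple(int(v) for v in box.groups())
                self._pending = (bbox, int(conf.group(1)) if conf else 0)
                self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._pending is not None:
            bbox, conf = self._pending
            text = "".join(self._text).strip()
            if text:
                self.words.append(Word(text, bbox, conf))
            self._pending = None


sample = """<span class='ocr_line' title='bbox 36 92 580 122'>
<span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 96'>Invoice</span>
<span class='ocrx_word' id='word_1_2' title='bbox 109 92 160 116; x_wconf 41'>N0.</span>
</span>"""

parser = HocrWords()
parser.feed(sample)
kept = [w.text for w in parser.words if w.conf >= 60]
print(kept)  # ['Invoice']
```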

Pricing

From $9.99/mo (Pro) or $24.99/mo (Business) — or pay-as-you-go on the API plan.

Free tier: 5 conversions/day, files up to 100 MB, no API key required (IP-gated). Pro ($9.99/mo): 100 conversions/day (2,000/month), files up to 2 GB. Business ($24.99/mo): 1,000 conversions/day (20,000/month), files up to 10 GB, GPU encoding, dedicated support.
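A quick way to check which tier fits a workload is to compare it against all three limits at once. The sketch below encodes the numbers listed above; it treats 1 GB as 1,024 MB and leaves out pay-as-you-go because no rate is given here, and both choices are assumptions of the example.

```python
from typing import List, NamedTuple, Optional


class Plan(NamedTuple):
    name: str
    monthly_usd: float
    per_day: int
    per_month: Optional[int]  # None where this page states no monthly cap
    max_file_mb: int


# Limits copied from the pricing text; GB converted at 1,024 MB (assumption).
PLANS: List[Plan] = [
    Plan("Free", 0.00, 5, None, 100),
    Plan("Pro", 9.99, 100, 2_000, 2 * 1024),
    Plan("Business", 24.99, 1_000, 20_000, 10 * 1024),
]


def cheapest_plan(jobs_per_day: int, jobs_per_month: int, largest_file_mb: int) -> Optional[Plan]:
    """Return the cheapest listed plan whose limits cover the workload, else None."""
    for plan in PLANS:  # already ordered by price
        fits_day = jobs_per_day <= plan.per_day
        fits_month = plan.per_month is None or jobs_per_month <= plan.per_month
        fits_size = largest_file_mb <= plan.max_file_mb
        if fits_day and fits_month and fits_size:
            return plan
    return None


# 90 jobs/day fits Pro, but 2,500 jobs/month exceeds its 2,000 monthly cap.
print(cheapest_plan(90, 2_500, 500).name)  # Business
```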

See full pricing breakdown →

Built for production

99.9% uptime SLA

Multi-region failover, transparent status page, 60-second response-time guarantee on Business.

Encryption + auto-delete

TLS 1.2+ in transit, AES-256 at rest. Files deleted after 1h / 24h / 7d depending on plan, or instantly via DELETE endpoint. See the security page.

~7s median latency

Most sub-100 MB jobs complete in 6-9 seconds. Webhook-driven async for heavier workloads; waitForJob for synchronous flows.
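For synchronous flows, a wait loop along these lines is one plausible shape for what a helper such as waitForJob does. It is a sketch, not the SDK's code: the status values `processing`, `completed`, and `failed`, the `outputUrl` key in the fake response, and the back-off numbers are all assumptions.

```python
import time
from typing import Callable, Dict


def wait_until_done(
    fetch_status: Callable[[], Dict[str, str]],
    timeout_s: float = 60.0,
    first_delay_s: float = 0.5,
    max_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, str]:
    """Poll fetch_status with exponential back-off until the job finishes or times out."""
    deadline = clock() + timeout_s
    delay = first_delay_s
    while True:
        job = fetch_status()
        if job["status"] in ("completed", "failed"):  # assumed terminal states
            return job
        if clock() + delay > deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)  # back off, capped at max_delay_s


# Fake status source: two "processing" responses, then "completed".
responses = iter([
    {"status": "processing"},
    {"status": "processing"},
    {"status": "completed", "outputUrl": "https://example.invalid/out.pdf"},
])
job = wait_until_done(lambda: next(responses), sleep=lambda seconds: None)
print(job["status"])  # completed
```

For jobs well past the median latency, a webhook avoids holding a connection or a worker open while polling.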

Frequently Asked Questions

How accurate is OCR for English documents?

95-99% character recognition on clean 300 DPI scans of printed text in standard fonts. Accuracy drops for handwriting (50-70%), skewed scans (80-90%), low-DPI mobile photos (75-90%), and stylised fonts (85-95%). Use `enhance=true` to pre-process noisy sources for better results.
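To measure accuracy on your own documents, one common definition is 1 minus the character error rate, where the error rate is the edit distance between a hand-checked transcript and the OCR output divided by the transcript length. This page does not say how its percentages were measured, so treat the sketch below as one reasonable yardstick rather than the exact method behind the figures.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr_text: str) -> float:
    """Return 1 - CER, clamped at 0, for a ground-truth string and OCR output."""
    if not truth:
        raise ValueError("ground-truth text must not be empty")
    return max(0.0, 1.0 - edit_distance(truth, ocr_text) / len(truth))


truth = "Invoice No. 10482"
ocr_text = "Invoice N0. 1O482"  # two typical confusions: o/0 and 0/O
print(f"{character_accuracy(truth, ocr_text):.1%}")  # 88.2%
```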

What's a searchable PDF?

The original PDF with an invisible text layer placed behind the image content. Visually identical to the source, but full-text searchable in any PDF reader and indexable by search engines and document management systems. The most common OCR output mode for document archival.

Can the API detect language automatically?

Yes. Set `ocrLanguage=auto` and the API runs language detection on a per-page basis, then applies the optimal Tesseract language model. Slower than explicit language selection but invaluable for mixed-language archives where the language varies page-by-page.

Which languages have the best OCR accuracy?

Latin-script languages (English, French, German, Spanish, Italian, Portuguese, Dutch) — typically 95-99% on clean scans. Cyrillic (Russian, Bulgarian, Ukrainian) — 90-95%. Arabic, Hebrew, Chinese, Japanese, Korean — 85-95% depending on font and source quality. See Tesseract's per-language quality matrix in our docs.

Are tables preserved in OCR output?

For plain-text output, table structure is approximated via whitespace: cells are separated by spaces and rows by line breaks. For hOCR output, words and lines have explicit bounding boxes you can use to reconstruct the table. For native table extraction with row/column structure, use the PDF to DOCX API with OCR mode.
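Rebuilding a table from positions can be sketched with the TSV output, assuming it follows Tesseract's standard TSV columns (`level`, `page_num`, `block_num`, `par_num`, `line_num`, `word_num`, `left`, `top`, `width`, `height`, `conf`, `text`). Words on the same line are merged into one cell unless the horizontal gap between them exceeds a threshold; the 40-pixel threshold and the sample data are invented for the example and would need tuning per document.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, Tuple


def rows_from_tsv(tsv_text: str, gap_px: int = 40) -> List[List[str]]:
    """Group word records into lines, then split each line into cells at wide gaps."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    lines: Dict[Tuple[int, int, int, int], List[Tuple[int, int, str]]] = defaultdict(list)
    for rec in reader:
        text = (rec["text"] or "").strip()
        if rec["level"] != "5" or not text:  # level 5 marks a word record
            continue
        key = (int(rec["page_num"]), int(rec["block_num"]),
               int(rec["par_num"]), int(rec["line_num"]))
        left = int(rec["left"])
        lines[key].append((left, left + int(rec["width"]), text))

    rows: List[List[str]] = []
    for key in sorted(lines):
        words = sorted(lines[key])  # left-to-right within the line
        cells = [[words[0][2]]]
        prev_right = words[0][1]
        for left, right, text in words[1:]:
            if left - prev_right > gap_px:
                cells.append([text])  # wide gap: start a new cell
            else:
                cells[-1].append(text)  # narrow gap: same cell
            prev_right = right
        rows.append([" ".join(cell) for cell in cells])
    return rows


header = ("level page_num block_num par_num line_num word_num "
          "left top width height conf text").split()
records = [
    [5, 1, 1, 1, 1, 1, 40, 100, 60, 20, 96, "Item"],
    [5, 1, 1, 1, 1, 2, 300, 100, 50, 20, 95, "Qty"],
    [5, 1, 1, 1, 2, 1, 40, 130, 70, 20, 93, "Paper"],
    [5, 1, 1, 1, 2, 2, 115, 130, 50, 20, 91, "clips"],
    [5, 1, 1, 1, 2, 3, 300, 130, 30, 20, 94, "12"],
]
sample = "\n".join("\t".join(str(v) for v in row) for row in [header] + records)
print(rows_from_tsv(sample))  # [['Item', 'Qty'], ['Paper clips', '12']]
```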

Get an API key

Start integrating the OCR API in five minutes. Read the docs, grab a key, and ship your first conversion before the trial coffee cools.