DOC (Microsoft Word 97-2003)
The binary word processor format that defined office documents for two decades.
| Full name | Microsoft Word 97-2003 |
| Extension | .doc |
| MIME type | application/msword |
| Developer | Microsoft |
| Released | 1997 (Word 97); format frozen with Word 2003 |
| Type | Binary word processor document |
| Encoding | Compound File Binary Format (CFB / OLE2) |
| Superseded by | DOCX (Office Open XML), introduced in Word 2007 |
What is a DOC file?
DOC is the binary file format used by Microsoft Word from Word 97 through Word 2003. It stores text, formatting, images, tables, and embedded objects in a single binary file. It remained Word's default format until Word 2007 made DOCX the default.
A DOC file packages its content inside a Compound File Binary (CFB) structure, the same container format used across older Microsoft Office applications. Inside that container, Word stores a stream of formatted text, paragraph styles, fonts, and layout data in a proprietary binary encoding. The format also supports macros written in VBA, revision history, and embedded OLE objects such as spreadsheets or images. Because every element is packed into one binary blob, DOC files are not human-readable without special software.
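The container can be recognized from its first bytes. The following is a minimal sketch, assuming the header layout described in Microsoft's [MS-CFB] specification (an 8-byte signature, version fields at offsets 24 and 26, and the sector shift at offset 30). The names `CfbHeader` and `parse_cfb_header` are illustrative and not part of any library.

```python
import struct
from typing import NamedTuple

# Signature that opens every Compound File Binary container.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


class CfbHeader(NamedTuple):
    minor_version: int
    major_version: int     # 3 or 4 in files seen in practice
    sector_size: int       # bytes per sector, derived from the sector shift
    mini_sector_size: int  # bytes per mini sector


def parse_cfb_header(data: bytes) -> CfbHeader:
    """Read a few fixed-offset fields from the first 512 bytes of a CFB file."""
    if len(data) < 512:
        raise ValueError("a CFB header is 512 bytes long")
    if data[:8] != CFB_SIGNATURE:
        raise ValueError("not a Compound File Binary container")
    # Offsets 24..33 hold five little-endian uint16 values: minor version,
    # major version, byte-order mark, sector shift, mini sector shift.
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", data, 24
    )
    if byte_order != 0xFFFE:
        raise ValueError("unexpected byte-order mark")
    return CfbHeader(minor, major, 1 << sector_shift, 1 << mini_shift)
```

A matching signature only shows that the file is a CFB container. Other legacy Office formats such as XLS and PPT use the same container, so this check alone does not prove the file is a Word document.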
History
Microsoft introduced the .doc extension with Word 1.0 in 1983, but the internal format changed significantly with each major release. The Word 97 binary format became the stable baseline shared by Word 97, Word 2000, Word 2002, and Word 2003. Microsoft published the full specification for this format in 2008 under its Open Specification Promise, which made it easier for third-party applications to read and write DOC files accurately. Word 2007 replaced DOC with DOCX as the default, though current versions of Word can still open and save DOC files.
How it works
A DOC file begins with a Compound File Binary header, which organizes the content into storages and streams, much like a small file system embedded in the file. The main text lives in a stream called WordDocument, and the File Information Block (FIB) at the start of that stream acts as an index, recording where each data section lives. Character and paragraph formatting is located through PLCs (plex structures), which are arrays of character positions paired with data; most of them sit in a companion table stream named 0Table or 1Table. Embedded images, OLE objects, and macro code are kept in separate streams or storages within the same container.
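To make the role of the FIB concrete, here is a hedged sketch that reads the first few fields of a WordDocument stream. It assumes the stream bytes have already been extracted from the container (for example with a third-party CFB reader such as `olefile`) and follows the field offsets given in Microsoft's [MS-DOC] specification. The names `FibBase` and `parse_fib_base` are illustrative.

```python
import struct
from typing import NamedTuple

FIB_MAGIC = 0xA5EC  # wIdent value that marks a Word binary file


class FibBase(NamedTuple):
    n_fib: int         # format version number (0x00C1 for Word 97)
    encrypted: bool    # fEncrypted flag
    table_stream: str  # companion table stream selected by fWhichTblStm


def parse_fib_base(word_document: bytes) -> FibBase:
    """Parse the leading fields of the FIB from the WordDocument stream."""
    if len(word_document) < 12:
        raise ValueError("stream too short to contain a FIB")
    w_ident, n_fib = struct.unpack_from("<HH", word_document, 0)
    if w_ident != FIB_MAGIC:
        raise ValueError("WordDocument stream does not start with a FIB")
    # The 16-bit flag field sits at offset 10 (little-endian).
    (flags,) = struct.unpack_from("<H", word_document, 10)
    encrypted = bool(flags & 0x0100)                        # bit 8
    table_stream = "1Table" if flags & 0x0200 else "0Table"  # bit 9
    return FibBase(n_fib, encrypted, table_stream)
```

The rest of the FIB holds offset and length pairs that point into the selected table stream. A fuller reader would continue from here, but that is beyond this sketch.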
What it is used for
- Exchanging editable documents with colleagues still using Office 2003 or earlier
- Archiving contracts, reports, and letters from the pre-2007 era
- Opening legacy business documents from law firms, government offices, or educational institutions
- Converting old DOC files to PDF or DOCX for long-term storage
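For the conversion use case, one common local option is LibreOffice in headless mode. The sketch below assumes LibreOffice is installed and that its `soffice` executable is on the PATH. The helper names and the example paths are hypothetical.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_convert_command(
    source: Path, out_dir: Path, target: str = "pdf", soffice: str = "soffice"
) -> list[str]:
    """Build the LibreOffice command line for a headless conversion."""
    return [
        soffice,
        "--headless",
        "--convert-to", target,
        "--outdir", str(out_dir),
        str(source),
    ]


def convert(source: Path, out_dir: Path, target: str = "pdf") -> Path:
    """Convert one DOC file and return the expected output path."""
    if shutil.which("soffice") is None:
        raise RuntimeError("LibreOffice (soffice) was not found on PATH")
    subprocess.run(build_convert_command(source, out_dir, target), check=True)
    # LibreOffice names the output after the input file's stem; a target such
    # as "pdf:writer_pdf_Export" still yields a ".pdf" extension.
    extension = target.split(":")[0]
    return out_dir / f"{source.stem}.{extension}"
```

For example, `convert(Path("contract.doc"), Path("archive"), "docx")` would ask LibreOffice to write `archive/contract.docx`. Complex layouts may not convert perfectly, so spot-check the output before discarding the original.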
How to open it
Microsoft Word (Word 97 and later), LibreOffice Writer, and Google Docs all open DOC files directly. On mobile, Microsoft Word for iOS and Android can open DOC files without a separate conversion step.
Pros and cons
Strengths
- Readable by Word releases from Word 97 through current versions
- Single-file format keeps text, images, and formatting together without extra assets
- Supported natively by virtually every word processor on every platform
- Well-documented specification allows reliable third-party support
Trade-offs
- The uncompressed binary layout tends to produce larger files than DOCX and makes corrupted files harder to repair
- VBA macros embedded in DOC files are a common malware delivery vector
- No built-in support for modern features like advanced typography or live collaboration
- Superseded by DOCX, which is usually smaller, is standardized as ECMA-376 and ISO/IEC 29500, and is easier to parse
DOC FAQ
What is the difference between DOC and DOCX?
DOC is the older binary format used through Word 2003. DOCX is the newer format introduced in Word 2007, based on Office Open XML. DOCX files are ZIP archives of XML files, which usually makes them smaller and easier to inspect or repair.
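Because the two formats use different containers, they can be told apart from their leading bytes regardless of the file extension. This is a minimal sketch: the function name `sniff_word_format` is illustrative, and it assumes the conventional `word/document.xml` part name for the main body of a DOCX file.

```python
import io
import zipfile

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy binary container
ZIP_SIGNATURE = b"PK\x03\x04"                      # ZIP local file header


def sniff_word_format(data: bytes) -> str:
    """Classify file contents as 'cfb', 'docx', 'zip', or 'unknown'."""
    if data.startswith(CFB_SIGNATURE):
        # Could be DOC, but XLS, PPT and others share this container.
        return "cfb"
    if data.startswith(ZIP_SIGNATURE):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                if "word/document.xml" in archive.namelist():
                    return "docx"
        except zipfile.BadZipFile:
            pass  # truncated or damaged archive; fall through to 'zip'
        return "zip"
    return "unknown"
```

A check like this is useful when a file has been renamed, for instance a DOCX saved with a .doc extension, which Word will open but some other tools will reject.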
Can I open a DOC file without Microsoft Word?
Yes. LibreOffice Writer, Google Docs, WPS Office, and Apple Pages all open DOC files. Quality varies for complex documents with heavy formatting or macros.
Are DOC files safe to open?
Generally yes, but DOC files can contain VBA macros that run code when the file is opened with macros enabled. Keep macros disabled, or use a macro-free viewer, if the file comes from an unknown source.
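As a rough first check, a file can be screened for signs of a VBA project before it is opened. The sketch below is a deliberately crude heuristic, not a real parser. It assumes that Word keeps VBA projects in a storage named `Macros` containing a `_VBA_PROJECT` stream, and that container entry names are stored as UTF-16LE text, so it simply scans the raw bytes for those names. The function name `may_contain_vba` is illustrative.

```python
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")

# Entry names inside the container are stored as UTF-16LE text.
_MACROS_STORAGE = "Macros".encode("utf-16-le")
_VBA_PROJECT_STREAM = "_VBA_PROJECT".encode("utf-16-le")


def may_contain_vba(data: bytes) -> bool:
    """Heuristic: True if the raw bytes hint at an embedded VBA project.

    False positives are possible (the names may occur in document text),
    so treat the result as a hint rather than a verdict.
    """
    if not data.startswith(CFB_SIGNATURE):
        return False
    return _MACROS_STORAGE in data and _VBA_PROJECT_STREAM in data
```

A positive result is a reason to inspect the file with a dedicated analysis tool before opening it. A negative result does not prove the file is safe, since macros are not the only way a document can be malicious.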
Why do some DOC files look different when opened on different computers?
By default, DOC stores fonts by name rather than embedding the actual font data. If the fonts used in the original document are not installed on your system, the word processor substitutes a different font, which can shift line breaks and page layouts.