Three Formats With Different Histories
Spreadsheets have three common formats:
- XLS (BIFF8): Microsoft's binary format, used in Excel 97-2003
- XLSX (Office Open XML): Microsoft's modern format, since Excel 2007
- CSV (Comma-Separated Values): plain text, format-agnostic
Each has different strengths. XLS is mostly historical now. XLSX is the daily working format. CSV is the data-interchange standard.
This post covers when each format is right, the conversion workflows, and the gotchas that bite people. For other file formats, see our document converter.
Format Comparison
| Format | File extension | Type | Year | Status |
|---|---|---|---|---|
| XLS | .xls | Binary BIFF8 | 1997 | Legacy |
| XLSX | .xlsx | OOXML (ZIP) | 2007 | Current |
| XLSM | .xlsm | OOXML with macros | 2007 | Macro variant |
| XLSB | .xlsb | Binary OOXML | 2007 | Optimized binary |
| CSV | .csv | Plain text | 1972 | Universal |
| TSV | .tsv | Plain text (tabs) | 1972 | Tab-separated |
| ODS | .ods | OpenDocument Spreadsheet | 2005 | LibreOffice |
For most modern workflows: XLSX. For data interchange: CSV. For automation/scripting: CSV or TSV.
XLSX Internals
XLSX is a ZIP archive containing XML files:
example.xlsx
├── [Content_Types].xml
├── _rels/
├── docProps/
│ ├── app.xml
│ └── core.xml
├── xl/
│ ├── workbook.xml
│ ├── styles.xml
│ ├── sharedStrings.xml
│ └── worksheets/
│ └── sheet1.xml
└── ...
You can rename .xlsx to .zip and unzip to inspect. The XML files are human-readable.
This structure makes XLSX accessible to programming languages without Excel installed. Python's openpyxl, JavaScript's xlsx, R's readxl all parse XLSX directly.
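To see the package layout without renaming anything, Python's standard zipfile module can open an .xlsx directly. This is a minimal sketch: the function names are made up for illustration, and it assumes the file is a regular, unencrypted XLSX.

```python
import zipfile


def list_xlsx_parts(path):
    """Return the names of the files stored inside an XLSX package."""
    with zipfile.ZipFile(path) as archive:
        return sorted(archive.namelist())


def read_workbook_xml(path):
    """Return xl/workbook.xml as text; this part lists the workbook's sheets."""
    with zipfile.ZipFile(path) as archive:
        return archive.read("xl/workbook.xml").decode("utf-8")
```

Calling list_xlsx_parts("example.xlsx") should return entries like the ones in the tree above. Password-protected workbooks are stored in a different container and will not open this way.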
CSV Variants
CSV looks simple but has variants:
| Variant | Delimiter | Quote char | Encoding | Use |
|---|---|---|---|---|
| Standard CSV | comma | double quote | UTF-8 | Modern default |
| Excel CSV | comma | double quote | Windows-1252 | Excel default on Windows |
| TSV | tab | none | UTF-8 | Database export |
| PSV | pipe | none | UTF-8 | Mainframe legacy |
| Semicolon CSV | semicolon | double quote | UTF-8 | European Excel |
| RFC 4180 | comma | double quote | Not mandated (UTF-8 in practice) | Closest thing to a standard |
For cross-tool interchange: RFC 4180 standard CSV. For Excel-only: Excel CSV (semicolon variant in EU locales).
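Moving between variants is mostly a matter of re-parsing with one delimiter and re-writing with another, letting the parser handle the quoting. Here is a rough sketch with Python's csv module; convert_delimiter is a hypothetical helper, and it assumes the input is already decoded text with double-quote quoting.

```python
import csv
import io


def convert_delimiter(text, source_delimiter=";", target_delimiter=","):
    """Re-encode delimited text, letting the csv module handle quoting."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=source_delimiter)
    out = io.StringIO(newline="")
    # CRLF line endings and minimal double-quote quoting, as RFC 4180 describes
    writer = csv.writer(out, delimiter=target_delimiter, lineterminator="\r\n")
    writer.writerows(reader)
    return out.getvalue()
```

A cell such as "1,5" that was safe in a semicolon file gets quoted on the way out, because the comma is now the delimiter. A plain find-and-replace of ";" with "," would corrupt it.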
When to Use XLSX
Use XLSX when:
- Multiple sheets in one file
- Formulas, charts, formatting
- Data that fits within Excel's 1,048,576-row sheet limit (CSV has no row limit of its own, but Excel stops loading at the same row count)
- Complex spreadsheets with cell formatting, comments, conditional formatting
- File size under 10 MB (above this, performance suffers)
For most office work: XLSX is the default.
When to Use CSV
Use CSV when:
- Database export/import
- Programming language data interchange
- Large datasets (multi-GB)
- Cross-platform compatibility
- Version control (Git can diff CSV)
- Simple tabular data with no formatting needs
For scripting, ETL pipelines, scientific data: CSV.
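Part of what makes CSV workable for large datasets is that it can be streamed record by record instead of loaded whole. The sketch below counts records that way and checks the result against Excel's sheet limit. The function names are illustrative, and it assumes a UTF-8 file with standard double-quote quoting.

```python
import csv

EXCEL_MAX_ROWS = 1_048_576  # rows per worksheet, header row included


def count_csv_records(path, encoding="utf-8"):
    """Count records by streaming the file, so memory use stays flat."""
    with open(path, newline="", encoding=encoding) as handle:
        # csv.reader yields one record at a time, even for multi-line cells
        return sum(1 for _ in csv.reader(handle))


def fits_in_excel_sheet(path):
    """True if every record (header included) fits on one worksheet."""
    return count_csv_records(path) <= EXCEL_MAX_ROWS
```

Counting records rather than raw lines matters because a quoted cell can contain line breaks.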
When to Use XLS
XLS in 2026 is mostly:
- Historical files from before 2007
- Some specialty industrial software that requires it
- Legacy government databases
For new content: never write XLS. Use XLSX. If you inherit XLS files, convert each one to XLSX once, keep working in the XLSX copy, and archive the original XLS as a legacy artifact.
Conversion Workflows
For XLSX to CSV:
# Python with pandas
import pandas as pd
df = pd.read_excel("input.xlsx", sheet_name="Sheet1")
df.to_csv("output.csv", index=False, encoding="utf-8")
For each sheet in a multi-sheet XLSX:
import pandas as pd
# sheet_name=None loads every sheet into a dict of {sheet name: DataFrame}
sheets = pd.read_excel("input.xlsx", sheet_name=None)
for sheet_name, df in sheets.items():
    # Write one CSV per sheet, named after the sheet
    df.to_csv(f"{sheet_name}.csv", index=False, encoding="utf-8")
For CSV to XLSX:
import pandas as pd
df = pd.read_csv("input.csv")
df.to_excel("output.xlsx", index=False, engine="openpyxl")
Our XLSX to CSV converter handles single-file conversions in the browser.
For batch conversion patterns, see Batch Processing Files Guide.
File Size Reality
For 100,000 rows × 20 columns of typical sales data:
| Format | File size |
|---|---|
| XLSX (with formatting) | 12 MB |
| XLSX (basic) | 5 MB |
| XLS | 8 MB |
| XLSB | 3 MB |
| CSV | 18 MB |
| TSV | 17 MB |
| Parquet | 1 MB |
For multi-million-row datasets: CSV becomes unwieldy. Consider Apache Parquet for analytical workloads.
Date and Number Pitfalls
Dates and numbers are the most common gotcha:
Date format mismatch: Excel stores dates as serial numbers (days counted from the start of 1900). CSV stores dates as text. The conversion can mangle dates if the exporting and importing locales differ.
Excel cell (2026-05-08): displays as "5/8/2026"
CSV export (US locale): "5/8/2026"
CSV import in EU locale: "5/8/2026" read as 5 August 2026 instead of 8 May 2026
For unambiguous dates: ISO 8601 format (2026-05-08).
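If raw serial numbers or locale-formatted strings have already landed in a CSV, they can be normalized to ISO 8601 in a script. This sketch assumes the workbook uses Excel's default 1900 date system (not the 1904 system of older Mac files) and that you know which format the text was exported in. Both helper names are made up for this example.

```python
from datetime import date, datetime, timedelta

# Excel's 1900 date system treats 1900 as a leap year, so for any date from
# 1 March 1900 onward the effective epoch is 30 December 1899.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_iso(serial):
    """Convert a 1900-system serial day number (61 or later) to YYYY-MM-DD."""
    if serial < 61:
        raise ValueError("serials before 1 March 1900 are not handled here")
    # int() drops any time-of-day fraction
    return (EXCEL_EPOCH + timedelta(days=int(serial))).isoformat()


def to_iso(text, source_format):
    """Rewrite a date string as ISO 8601 using an explicit source format."""
    return datetime.strptime(text, source_format).date().isoformat()
```

The same string gives two different answers depending on the format you declare: to_iso("5/8/2026", "%m/%d/%Y") is 8 May, while to_iso("5/8/2026", "%d/%m/%Y") is 5 August. That is the ambiguity ISO 8601 removes.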
Leading zeros lost: ZIP codes, phone numbers with leading zeros become integers. "01234" → 1234.
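Scientific notation: large numbers display as "1.23E+15", and Excel keeps only 15 significant digits, so long IDs such as card numbers lose their trailing digits. Format the column as text in the source.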
Currency symbols: "$1,234.56" might be parsed as "1234.56" with symbol stripped. Or might fail entirely.
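The leading-zero and currency problems both come from a tool guessing types. One defensive pattern is to read every cell as text and convert only the columns you mean to. This is a minimal sketch with the standard library. read_as_text and parse_us_currency are hypothetical names, and the currency parser assumes US-style amounts (dollar sign, comma thousands separator, dot decimal).

```python
import csv
import io
from decimal import Decimal


def read_as_text(csv_text):
    """Parse CSV text, keeping every cell as a string (no type guessing)."""
    return list(csv.DictReader(io.StringIO(csv_text, newline="")))


def parse_us_currency(text):
    """Convert a US-style amount such as "$1,234.56" to a Decimal."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    return Decimal(cleaned)
```

With pandas, passing dtype=str to pd.read_csv has a similar effect: ZIP codes stay as "01234" until you convert a column explicitly.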
For specific format choices, our document converter handles common gotchas.
Macros and Code
XLSM (macro-enabled XLSX) and legacy XLS can contain embedded VBA code:
| File | Macro support | Security risk |
|---|---|---|
| XLS | Yes | High (VBA macros frequent malware vector) |
| XLSX | No | Low (cannot store VBA) |
| XLSM | Yes | High (similar to XLS) |
| XLSB | Yes | High (can carry VBA, like XLSM) |
| CSV | No | Low (no macros; formula injection is still possible) |
| ODS | Limited | Low |
For sensitive environments: XLSX or CSV preferred. For automation needing macros: XLSM with code signing.
For sharing across organizations: CSV avoids the macro risk entirely.
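Because the OOXML formats are ZIP packages, a script can check whether a workbook carries a VBA project before anyone opens it. In macro-enabled workbooks the project is normally stored as xl/vbaProject.bin. Treat the sketch below as a quick heuristic rather than a malware scanner: contains_vba_project is an illustrative name, and legacy .xls files are not ZIP archives, so they raise zipfile.BadZipFile here.

```python
import zipfile


def contains_vba_project(path):
    """Report whether an OOXML workbook (.xlsx, .xlsm, .xlsb) embeds VBA."""
    with zipfile.ZipFile(path) as archive:
        # The VBA project part is conventionally named vbaProject.bin
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in archive.namelist()
        )
```

A file with an .xlsx extension that returns True deserves a second look, since the extension and the contents disagree.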
OpenDocument Spreadsheet (ODS)
ODS is LibreOffice's native format:
- Open standard (ISO/IEC 26300)
- Similar features to XLSX
- Comparable file size to XLSX (both are ZIP archives of XML)
- Less Excel compatibility (some features lost in conversion)
For LibreOffice-only workflows: ODS. For Excel compatibility: XLSX.
Common Issues
"File is corrupted" message on XLSX: usually a ZIP integrity problem. Try Excel's Open and Repair first. If that fails, open the file as a ZIP, copy the contents into a fresh ZIP, and rename it to .xlsx.
CSV parsing fails on multi-line cells: cells with line breaks need proper quoting. Use a tool that handles RFC 4180 escaping.
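Mac CSV delimiter: Excel for Mac follows the system region settings, so it writes semicolons where the decimal separator is a comma. Change the region's number format in the macOS system settings, or export from a script with an explicit delimiter.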
Large XLSX takes minutes to open: Excel is parsing the XML and recalculating formulas. Save as XLSB for faster loading.
CSV characters look broken in Excel: encoding mismatch. Save as UTF-8 with BOM, or use Power Query to import.
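The multi-line and encoding issues above can both be handled at write time. A minimal sketch, assuming the file is headed for Excel: Python's "utf-8-sig" codec writes the BOM, and the csv module quotes any cell that contains a line break. The two function names are made up for this example.

```python
import csv


def write_excel_friendly_csv(path, rows):
    """Write UTF-8 with a BOM so Excel detects the encoding on open."""
    # newline="" lets the csv module control line endings itself
    with open(path, "w", newline="", encoding="utf-8-sig") as handle:
        csv.writer(handle).writerows(rows)


def read_csv_rows(path):
    """Read the file back; utf-8-sig strips the BOM if one is present."""
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.reader(handle))
```

Opening the file with newline="" on both sides matters: without it, line breaks inside quoted cells can be rewritten on some platforms.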
For batch automation, see Batch Processing Files Guide.
Frequently Asked Questions
Should I save as XLSX or XLSB?
XLSX for compatibility with non-Excel tools. XLSB for performance with very large workbooks (hundreds of thousands of rows).
Why is my CSV showing wrong characters in Excel?
Excel for Windows defaults to Windows-1252 encoding for CSV. Save as UTF-8 with BOM, or use Power Query (Data > From Text/CSV).
Is CSV always better than XLSX for big data?
For >1M rows: CSV is more practical (Excel's 1,048,576-row sheet limit is a hard cap). For under 1M rows with formulas/formatting: XLSX is fine.
Can I edit XLSX without Excel?
Yes. LibreOffice Calc, Google Sheets, Apple Numbers, Excel for Web (free), and dozens of other tools edit XLSX. The format is an open standard (ECMA-376, ISO/IEC 29500).
What about Google Sheets?
Google Sheets uses its own internal format. Export to XLSX or CSV for sharing. It imports most XLSX workbooks cleanly, though VBA macros do not run there.
How do I batch convert XLSX to CSV?
Python with pandas (above), Power Query for Excel users, or our XLSX to CSV converter for one-off jobs.
Bottom Line
For modern spreadsheet work: XLSX as the default. CSV for data interchange and large datasets. XLS only for legacy compatibility. ODS for LibreOffice workflows. Watch dates, encoding, and quoting in CSV. Our document converter handles XLSX-to-CSV and back.



