Why Not Just Open CSV in Excel?
You can double-click a CSV file and Excel will open it — so why bother with a proper conversion? Because Excel's automatic CSV import makes assumptions that silently corrupt your data.
The most notorious issue: leading zeros are stripped. A ZIP code like 07042 becomes 7042. A product code like 000123 becomes 123. Excel sees digits and converts to a number, permanently destroying the original text formatting. Similarly, dates formatted as 01/02/03 get interpreted differently depending on your locale — is that January 2, February 1, or 2003? Excel guesses, and often guesses wrong.
A proper CSV-to-XLSX conversion lets you control column data types, preserve text formatting, handle encoding correctly, and add formatting that makes the data usable. This guide covers both automated tools and manual methods to convert CSV to XLSX with data integrity intact.
Common CSV Import Problems
| Problem | Example | What Excel Does | Correct Behavior |
|---|---|---|---|
| Leading zeros | 07042 | Converts to 7042 | Preserves as text 07042 |
| Long numbers | 4111111111111111 | Rounds to 4.11111E+15 | Preserves full number |
| Dates | 01-02-2026 | May interpret as Feb 1 | Respects specified format |
| Encoding | Müller, José | Shows Müller, José | Displays correctly |
| Delimiters | "Smith, Jr.",John | May split on internal comma | Respects quoted fields |
Method 1: LibreOffice Command Line
LibreOffice handles CSV import with explicit encoding and delimiter settings:
libreoffice --headless --calc --convert-to xlsx \
--infilter="CSV:44,34,76,1" input.csv
The filter codes mean:
44— field delimiter (ASCII 44 = comma)34— text delimiter (ASCII 34 = double quote)76— character set (76 = UTF-8)1— starting row
For semicolon-delimited CSVs (common in European locales):
libreoffice --headless --calc --convert-to xlsx \
--infilter="CSV:59,34,76,1" input.csv
Batch Convert Multiple CSVs
mkdir -p xlsx
for file in *.csv; do
[ -f "$file" ] || continue
libreoffice --headless --calc --convert-to xlsx \
--infilter="CSV:44,34,76,1" --outdir xlsx "$file"
done
Method 2: Python with openpyxl (Full Control)
For maximum control over column types and formatting, Python gives you precision:
import csv
from openpyxl import Workbook
from openpyxl.utils import get_column_letter
wb = Workbook()
ws = wb.active
with open('input.csv', 'r', encoding='utf-8') as f:
reader = csv.reader(f)
for row_idx, row in enumerate(reader, 1):
for col_idx, value in enumerate(row, 1):
cell = ws.cell(row=row_idx, column=col_idx, value=value)
# First row is header - bold it
if row_idx == 1:
cell.font = cell.font.copy(bold=True)
# Auto-fit column widths
for col in ws.columns:
max_length = max(len(str(cell.value or '')) for cell in col)
col_letter = get_column_letter(col[0].column)
ws.column_dimensions[col_letter].width = min(max_length + 2, 50)
wb.save('output.xlsx')
This approach lets you explicitly control which columns are text (preserving leading zeros), which are numbers, and which are dates.
Method 3: Online Conversion
Use the CSV to Excel converter online for a quick conversion with proper encoding handling. The tool preserves text formatting and handles UTF-8 encoding automatically. For other spreadsheet conversions, try the Document Converter.
Handling Encoding Issues
CSV encoding is the single biggest source of conversion problems. The file might be UTF-8, Windows-1252 (Latin-1), Shift-JIS (Japanese), or any number of encodings — and it usually has no metadata declaring which one.
Detect Encoding
file -bi input.csv
# Output: text/plain; charset=utf-8
Or with Python:
import chardet
with open('input.csv', 'rb') as f:
result = chardet.detect(f.read())
print(result['encoding']) # e.g., 'utf-8', 'windows-1252'
Convert Encoding Before Import
If the CSV is not UTF-8, convert it first:
iconv -f WINDOWS-1252 -t UTF-8 input.csv > input_utf8.csv
Then convert the UTF-8 CSV to XLSX.
Preserving Data Types
Force Text for Specific Columns
When you need certain columns treated as text (ZIP codes, phone numbers, product IDs), the Python approach is most reliable:
# Columns that must remain text (0-indexed)
text_columns = {0, 3, 7} # e.g., ID, ZIP, phone
for row_idx, row in enumerate(reader, 1):
for col_idx, value in enumerate(row, 1):
cell = ws.cell(row=row_idx, column=col_idx)
if (col_idx - 1) in text_columns:
cell.value = value
cell.number_format = '@' # Text format
else:
# Try to parse as number
try:
cell.value = float(value) if '.' in value else int(value)
except ValueError:
cell.value = value
Date Formatting
Parse dates explicitly rather than letting Excel guess:
from datetime import datetime
date_columns = {2, 5} # Columns containing dates
date_format = '%Y-%m-%d' # Adjust to match your CSV
for col_idx, value in enumerate(row, 1):
if (col_idx - 1) in date_columns and row_idx > 1:
cell.value = datetime.strptime(value, date_format)
cell.number_format = 'YYYY-MM-DD'
Quality and Settings Tips
Large CSVs: Excel has a row limit of 1,048,576 rows. If your CSV exceeds this, split it across multiple sheets or use a database instead. LibreOffice Calc has the same limit.
Performance: For CSVs over 100,000 rows, openpyxl's write-only mode is significantly faster:
wb = Workbook(write_only=True)
ws = wb.create_sheet()
Number precision: Excel uses IEEE 754 double-precision floating point, which gives 15-16 significant digits. If your CSV has numbers longer than 15 digits (like credit card numbers or certain IDs), they must be stored as text to avoid rounding.
Separator detection: If you are unsure whether your CSV uses commas, semicolons, or tabs as delimiters, check the first few lines. Tools like csvlint can auto-detect the delimiter.
For more on spreadsheet formats, see our ODS vs XLSX comparison.
Common Issues and Troubleshooting
All data in one column
The CSV uses a delimiter different from what the converter expects. Most European CSVs use semicolons (;) instead of commas. Specify the correct delimiter in your conversion command.
Garbled characters (mojibake)
The encoding is wrong. Detect the actual encoding with file -bi or chardet, then specify it correctly during conversion. The most common culprit is opening a Windows-1252 CSV as if it were UTF-8.
Numbers displayed in scientific notation
Large numbers (12+ digits) default to scientific notation in Excel. Format those columns as Number (with 0 decimal places) or Text to show the full value. In the Python approach, set cell.number_format = '0' for those columns.
Date columns not recognized
Dates in non-standard formats (DD/MM/YYYY vs MM/DD/YYYY) confuse automatic parsers. Parse dates explicitly with the correct format string in Python, or pre-format them as ISO 8601 (YYYY-MM-DD) which is unambiguous.
Conclusion
A proper CSV-to-XLSX conversion preserves data integrity that a simple "open in Excel" does not. Leading zeros stay intact, dates are parsed correctly, encoding is handled, and you can add formatting that makes the data immediately usable. For quick conversions, LibreOffice command-line is the fastest path. For precision control over data types and formatting, Python with openpyxl gives you complete authority over every cell.
Ready to convert? Try our free CSV to Excel converter — no registration required.



