JSON to CSV With Nested Fields: Flattening, Pivoting, and the Pandas Workflow

Why JSON to CSV Is Non-Trivial

JSON allows nested objects and arrays:

{
  "user": {
    "id": 1,
    "name": "Alice",
    "address": {
      "city": "Boston",
      "country": "USA"
    },
    "orders": [
      { "id": 100, "amount": 50 },
      { "id": 101, "amount": 75 }
    ]
  }
}

CSV is flat:

id,name,city,country
1,Alice,Boston,USA

Converting nested JSON to CSV means choosing, for each nested field, one of three strategies:

Flatten nested objects (separate columns for address.city, address.country)
Explode arrays (create one row per array element, repeating the parent fields)
Keep the nested value as a JSON string in a single cell

Different decisions produce dramatically different CSV outputs. This post covers the practical pandas workflow. For broader CSV context, see XLS vs XLSX vs CSV.
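To make the first and third choices concrete before turning to pandas, here is a minimal standard-library sketch. The flatten helper is hypothetical (not part of any library), and the record is the user object from the example above. Nested objects become dotted column names, and arrays are kept as JSON strings.

```python
import csv
import io
import json


def flatten(obj, parent="", sep="."):
    """Flatten nested dicts into one level; keep lists as JSON strings."""
    flat = {}
    for key, value in obj.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Strategy 1: one column per nested key
            flat.update(flatten(value, name, sep))
        elif isinstance(value, list):
            # Strategy 3: keep the array as a JSON string in one cell
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


record = {
    "id": 1,
    "name": "Alice",
    "address": {"city": "Boston", "country": "USA"},
    "orders": [{"id": 100, "amount": 50}, {"id": 101, "amount": 75}],
}

row = flatten(record)

# Write the single flattened row to an in-memory CSV
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
```

The second strategy, exploding arrays into rows, is where pandas saves the most effort.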

Pandas Approach to Flat JSON

For simple JSON arrays:

import pandas as pd

# JSON array of flat objects
data = [
  {"id": 1, "name": "Alice", "city": "Boston"},
  {"id": 2, "name": "Bob", "city": "Seattle"},
]

df = pd.DataFrame(data)
df.to_csv("output.csv", index=False)

Most "data export" JSON is already flat like this and converts cleanly.

Flattening Nested Objects

For JSON with nested objects (not arrays):

import pandas as pd
import json

with open("input.json") as f:
    data = json.load(f)

# json_normalize flattens nested objects into columns
df = pd.json_normalize(data)
# Nested keys become 'parent.child' column names
df.to_csv("output.csv", index=False)

Result:

id,name,address.city,address.country
1,Alice,Boston,USA

The dot notation indicates nesting. It is tool-friendly but ugly, so rename the columns for production:

df.rename(columns={"address.city": "city", "address.country": "country"}, inplace=True)
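Renaming after the fact is one option. json_normalize also accepts sep and max_level parameters that shape the column names up front. The sketch below uses a made-up record with an extra geo level to show both; the field names are illustrative only.

```python
import pandas as pd

record = {
    "id": 1,
    "name": "Alice",
    "address": {"city": "Boston", "geo": {"lat": 42.36, "lon": -71.06}},
}

# sep="_" produces address_city instead of address.city
df_sep = pd.json_normalize(record, sep="_")

# max_level=1 flattens only one level; deeper objects stay as dicts in the cell
df_shallow = pd.json_normalize(record, max_level=1)

print(sorted(df_sep.columns))
print(sorted(df_shallow.columns))
```

A cell that still holds a dict (address.geo here) would be written to CSV as its Python repr, so it is usually worth serializing it with json.dumps before export.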

For batch processing, see Batch Processing Files Guide.

Exploding Arrays

For JSON with array fields:

import pandas as pd

data = {
    "user": "Alice",
    "orders": [{"id": 100, "amount": 50}, {"id": 101, "amount": 75}]
}

df = pd.json_normalize(data, record_path="orders", meta="user")

Result:

id,amount,user
100,50,Alice
101,75,Alice

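The array is "exploded" into multiple rows, and the non-array fields are repeated on each row.

json_normalize works most predictably when every element of the array is an object with a consistent set of keys. Arrays that mix types (strings next to objects, for example) usually need pre-processing first; see Mixed Array Types below.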
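One practical wrinkle when repeating parent fields: if the parent and the array elements share a key name, json_normalize raises a ValueError about conflicting metadata names. A minimal sketch of the usual workaround, assuming a user record whose orders also carry an id, is to pass record_prefix:

```python
import pandas as pd

data = {
    "id": 1,
    "name": "Alice",
    "orders": [{"id": 100, "amount": 50}, {"id": 101, "amount": 75}],
}

# Without record_prefix, the order "id" would collide with the user "id"
df = pd.json_normalize(
    data,
    record_path="orders",
    meta=["id", "name"],
    record_prefix="order_",
)

print(df.to_csv(index=False))
```

The meta_prefix parameter does the same job from the other side, by prefixing the repeated parent columns instead.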

Multi-level Nesting

For deeply nested JSON:

{
  "company": "Acme",
  "departments": [
    {
      "name": "Eng",
      "employees": [{ "name": "Alice" }, { "name": "Bob" }]
    },
    {
      "name": "Sales",
      "employees": [{ "name": "Charlie" }]
    }
  ]
}

To produce one row per employee:

data = {
    "company": "Acme",
    "departments": [
        {"name": "Eng", "employees": [{"name": "Alice"}, {"name": "Bob"}]},
        {"name": "Sales", "employees": [{"name": "Charlie"}]}
    ]
}

# Multi-level flatten
df = pd.json_normalize(
    data,
    record_path=["departments", "employees"],
    meta=["company", ["departments", "name"]]
)

Result:

name,company,departments.name
Alice,Acme,Eng
Bob,Acme,Eng
Charlie,Acme,Sales

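Here record_path is the list of keys to walk down to the array that becomes rows, and each meta entry is either a top-level key or a list of keys naming a field on one of the enclosing levels.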
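It can help to see what record_path and meta select. The plain-Python loops below are a rough equivalent of the call above, written only for illustration; pandas' internal implementation differs.

```python
data = {
    "company": "Acme",
    "departments": [
        {"name": "Eng", "employees": [{"name": "Alice"}, {"name": "Bob"}]},
        {"name": "Sales", "employees": [{"name": "Charlie"}]},
    ],
}

rows = []
for dept in data["departments"]:           # first key in record_path
    for emp in dept["employees"]:          # second key in record_path
        rows.append({
            "name": emp["name"],                 # the record itself
            "company": data["company"],          # meta "company"
            "departments.name": dept["name"],    # meta ["departments", "name"]
        })

for row in rows:
    print(row)
```

Each extra level of nesting adds one loop here, or one more key in record_path when using json_normalize.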

Mixed Array Types

Arrays of different types (strings + objects mixed) are tricky:

{
  "user": "Alice",
  "tags": ["important", { "label": "VIP", "level": 3 }]
}

Solution: pre-process to normalize types:

# Convert all elements to dicts
def normalize(item):
    return item if isinstance(item, dict) else {"value": item}

data["tags"] = [normalize(t) for t in data["tags"]]

For complex mixed-type arrays: write custom handling logic.
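Put together with the sample above, the pre-processing step feeds straight into json_normalize. This is a sketch under the assumption that bare strings should land in a column named value; that column name is a choice made here, not a pandas convention.

```python
import pandas as pd

data = {
    "user": "Alice",
    "tags": ["important", {"label": "VIP", "level": 3}],
}


def normalize(item):
    # Wrap bare values so every element is a dict
    return item if isinstance(item, dict) else {"value": item}


data["tags"] = [normalize(t) for t in data["tags"]]

# One row per tag; keys missing from a tag become NaN
df = pd.json_normalize(data, record_path="tags", meta="user")
print(df.to_csv(index=False))
```

Because the first tag has no level, that column contains a NaN and pandas stores it as float, so 3 is written as 3.0 unless the column is converted before export.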

Streaming Large JSON

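For multi-GB JSON files that don't fit in memory:

import ijson

def iter_records(path):
    # Yield one record at a time from the array under the top-level "items" key
    with open(path, "rb") as f:
        for record in ijson.items(f, "items.item"):
            yield record

ijson is a streaming JSON parser: it does not load the entire file into memory. The prefix "items.item" selects each element of the array stored under the top-level "items" key.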

For batch CSV writing:

import ijson
import csv

with open("large.json", "rb") as fin, open("output.csv", "w", newline="", encoding="utf-8") as fout:
    writer = csv.DictWriter(fout, fieldnames=["id", "name", "amount"])
    writer.writeheader()

    for record in ijson.items(fin, "items.item"):
        writer.writerow({
            "id": record["id"],
            "name": record["name"],
            "amount": record["amount"]
        })

Because each record is written as soon as it is parsed, memory use stays roughly constant however large the JSON file is.
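The loop above indexes record["amount"] directly, so a single record with a missing key stops the export with a KeyError. One way to tolerate ragged records, sketched here under the assumption that missing fields should become empty cells, is to let csv.DictWriter fill the gaps (restval) and drop unexpected keys (extrasaction). The write_records helper and the sample data are hypothetical; in practice the iterable would come from ijson.items(...).

```python
import csv
import io


def write_records(records, out, fieldnames):
    """Write an iterable of dicts to CSV, tolerating missing and extra keys."""
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        restval="",              # value used when a record lacks a field
        extrasaction="ignore",   # silently drop keys not listed in fieldnames
    )
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)
        count += 1
    return count


# Stand-in for a streaming source: an iterator, consumed one record at a time
sample = iter([
    {"id": 1, "name": "Alice", "amount": 50, "internal_note": "x"},
    {"id": 2, "name": "Bob"},
])

buffer = io.StringIO()
written = write_records(sample, buffer, ["id", "name", "amount"])
print(buffer.getvalue())
```

Silently dropping keys is convenient but can hide schema drift, so logging unexpected keys may be preferable in a production pipeline.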

CSV to JSON

For the reverse direction:

import pandas as pd

df = pd.read_csv("input.csv")
df.to_json("output.json", orient="records", indent=2)

orient="records" produces an array of objects (most common JSON format).

Other orients:

index: JSON object keyed by row index
columns: JSON object keyed by column name
values: just the data, no metadata
split: separate metadata and data
table: with schema metadata

For most CSV-to-JSON: orient="records".
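The differences between orients are easiest to see side by side. This sketch builds a two-row frame with made-up data and parses each to_json result back into Python objects for comparison:

```python
import json

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["Alice", "Bob"]})

# Serialize with each orient, then parse back to inspect the structure
shapes = {
    orient: json.loads(df.to_json(orient=orient))
    for orient in ("records", "index", "columns", "values", "split")
}

for orient, parsed in shapes.items():
    print(orient, parsed)

# records: a list with one object per row
# index:   an object keyed by row label ("0", "1"), each holding a row object
# columns: an object keyed by column name, each holding {row label: value}
# values:  a list of row lists, with no column names
# split:   an object with "columns", "index", and "data" keys
```

Note that row labels become strings in the index and columns orients, because JSON object keys are always strings.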

Pivot and Aggregation

For JSON like:

[
  { "date": "2026-01", "product": "A", "sales": 100 },
  { "date": "2026-01", "product": "B", "sales": 150 },
  { "date": "2026-02", "product": "A", "sales": 120 },
  { "date": "2026-02", "product": "B", "sales": 180 }
]

Pivot to wide format:

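import pandas as pd

# convert_dates=False keeps "date" as the literal strings "2026-01" and "2026-02";
# by default read_json may try to parse a column named "date" into timestamps
df = pd.read_json("data.json", convert_dates=False)
pivoted = df.pivot(index="date", columns="product", values="sales")
pivoted.to_csv("output.csv")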

Result:

date,A,B
2026-01,100,150
2026-02,120,180

For pivots that need aggregation (for example, several rows per date and product), use pandas's pivot_table with an aggregation function.
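df.pivot raises a ValueError when the same (date, product) pair appears more than once, because it has no rule for combining the values. A sketch of pivot_table handling that case by summing, with made-up sample rows:

```python
import pandas as pd

df = pd.DataFrame([
    {"date": "2026-01", "product": "A", "sales": 100},
    {"date": "2026-01", "product": "A", "sales": 30},   # duplicate (date, product) pair
    {"date": "2026-01", "product": "B", "sales": 150},
    {"date": "2026-02", "product": "A", "sales": 120},
])

pivoted = df.pivot_table(
    index="date",
    columns="product",
    values="sales",
    aggfunc="sum",    # how to combine duplicate pairs
    fill_value=0,     # value for combinations with no rows
)

print(pivoted.to_csv())
```

Whether a missing combination should be 0 or left empty depends on the data; dropping fill_value leaves it as NaN, which to_csv writes as an empty cell.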

Common Issues

Numbers showing as strings: CSV has no types of its own, so what ends up in the file depends on the types pandas inferred from the JSON. Force types on read:

df = pd.read_json("data.json", dtype={"id": int, "amount": float})
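When the source stores numbers as JSON strings and some of them are not clean, a forced dtype can fail outright. A more forgiving sketch, assuming unparseable values should become missing, converts after loading with pd.to_numeric. The sample values here are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame([
    {"id": "1", "amount": "50.5"},
    {"id": "2", "amount": "n/a"},
])

# errors="coerce" turns values that cannot be parsed into NaN instead of raising
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

# astype(int) is stricter: it raises if any value is not a valid integer
df["id"] = df["id"].astype(int)

print(df.dtypes)
```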

Date format inconsistent: parse explicitly:

df["date"] = pd.to_datetime(df["date"])
df["date"] = df["date"].dt.strftime("%Y-%m-%d")

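Encoding issues with special characters: ensure UTF-8 throughout. to_csv already defaults to UTF-8; if the file will be opened in Excel, encoding="utf-8-sig" adds the byte-order mark that Excel uses to detect UTF-8: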

df.to_csv("output.csv", encoding="utf-8", index=False)

Memory error on large JSON: use streaming with ijson.

Nested JSON in CSV cells: store as JSON string:

df["nested"] = df["nested"].apply(json.dumps)
df.to_csv("output.csv", index=False)

For batch CSV processing, see Batch Text Replacement in CSV.

Tools Beyond Pandas

Tool	Use case
jq (command-line)	Quick JSON manipulation
miller (mlr)	CSV/JSON command-line conversion
csvkit	CSV-focused tools
jc	Convert command output to JSON
dasel	Multi-format query language

For one-off conversions: jq or miller. For complex transformations or pipelines: pandas.

# jq example: extract specific field
jq -r '.users[] | [.id, .name, .email] | @csv' input.json > output.csv

# miller example: nested JSON to flat CSV
mlr --ijson --ocsv flatten input.json > output.csv

Frequently Asked Questions

Should I use pandas or jq for JSON to CSV?

For one-off command-line work: jq. For complex programmatic transformations or large data: pandas.

How do I handle null values?

Pandas loads JSON null as a missing value (NaN or None), and to_csv writes missing values as empty strings by default. To write an explicit marker instead:

df.to_csv("output.csv", na_rep="NULL")
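A small sketch of the difference, using a made-up two-row frame and writing to a string instead of a file:

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "name": ["Alice", None]})

# With no path argument, to_csv returns the CSV text
default_csv = df.to_csv(index=False)
marked_csv = df.to_csv(index=False, na_rep="NULL")

print(default_csv)
print(marked_csv)
```

An explicit marker only helps if the consumer knows to read it back as missing; pd.read_csv, for instance, treats both the empty string and NULL as missing by default.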

What about Excel format?

df.to_excel("output.xlsx", index=False, engine="openpyxl")

For Excel-specific work, see XLS vs XLSX vs CSV.

Can I convert JSON Lines (JSONL)?

Yes:

df = pd.read_json("input.jsonl", lines=True)
df.to_csv("output.csv", index=False)

JSONL has one JSON object per line. Common for log files and streaming data.

Performance for very large files?

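Streaming with ijson and the csv module keeps memory use lowest, since only one record is held at a time. Pandas with chunking works for moderate sizes, but read_json only accepts chunksize for JSON Lines input (lines=True).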
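A sketch of the chunked approach, assuming the input is JSON Lines. The jsonl_to_csv helper, the file names, and the chunk size are illustrative, and the demonstration writes a tiny temporary file so the example is self-contained.

```python
import os
import tempfile

import pandas as pd


def jsonl_to_csv(src_path, dst_path, chunksize=50_000):
    """Convert JSON Lines to CSV one chunk at a time; returns rows written."""
    total = 0
    with pd.read_json(src_path, lines=True, chunksize=chunksize) as reader:
        for i, chunk in enumerate(reader):
            chunk.to_csv(
                dst_path,
                mode="w" if i == 0 else "a",   # overwrite first, then append
                header=(i == 0),               # write the header only once
                index=False,
            )
            total += len(chunk)
    return total


# Tiny demonstration with a temporary five-line file
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "input.jsonl")
dst = os.path.join(workdir, "output.csv")
with open(src, "w", encoding="utf-8") as f:
    for i in range(5):
        f.write('{"id": %d, "amount": %d}\n' % (i, i * 10))

rows_written = jsonl_to_csv(src, dst, chunksize=2)
print(rows_written)
```

This assumes every chunk has the same columns. If keys vary between records, later chunks may not line up with the header written by the first one.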

How do I keep nested structure as a single CSV cell?

df["nested"] = df["nested"].apply(json.dumps)
df.to_csv("output.csv", index=False)

The cell holds a JSON string, which can be parsed again later with json.loads.
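A round-trip sketch, assuming a column named nested as in the snippet above and using an in-memory buffer in place of a file. On the way back in, read_csv's converters parameter applies json.loads to each cell of that column.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({
    "id": [1],
    "nested": [{"city": "Boston", "tags": ["a", "b"]}],
})

# Serialize the nested value so the CSV cell holds valid JSON
df["nested"] = df["nested"].apply(json.dumps)

buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read it back, parsing the JSON string into a Python object again
buffer.seek(0)
restored = pd.read_csv(buffer, converters={"nested": json.loads})
print(restored.loc[0, "nested"])
```

The CSV writer quotes the cell and doubles the inner quotes, so the JSON survives commas and quotation marks without extra escaping.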

Bottom Line

For JSON to CSV conversion: pandas with json_normalize for flattening nested objects, with record_path for exploding arrays. For large files: ijson streaming. For one-off work: jq or miller. Always handle types and encoding explicitly. Our document converter handles related format conversions in pipelines.
