JSON to CSV
Data Conversion
File Formats
Data Processing
Programming Guide

How can I convert JSON to CSV?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Converting JSON to CSV sounds simple until you hit nested objects, arrays, or records with inconsistent keys. CSV is a flat table format, so a good converter has to decide how to flatten structure and how to represent missing or repeated values consistently.

Start with the Simple Flat-Object Case

If the input is already a list of objects with the same keys, Python's standard library is enough.

python
1import csv
2import json
3
4with open('input.json', 'r', encoding='utf-8') as handle:
5    data = json.load(handle)
6
7fieldnames = ['id', 'name', 'email']
8
9with open('output.csv', 'w', newline='', encoding='utf-8') as handle:
10    writer = csv.DictWriter(handle, fieldnames=fieldnames)
11    writer.writeheader()
12    writer.writerows(data)

This works well only when the JSON schema is already flat and stable.

Flatten Nested Objects Before Writing CSV

Once the JSON contains nested dictionaries, you need to turn those nested paths into flat column names.

python
1def flatten(obj, parent_key='', sep='.'):
2    items = []
3    for key, value in obj.items():
4        new_key = f'{parent_key}{sep}{key}' if parent_key else key
5        if isinstance(value, dict):
6            items.extend(flatten(value, new_key, sep=sep).items())
7        else:
8            items.append((new_key, value))
9    return dict(items)

That helper converts a record such as {"user": {"name": "Ana"}} into a flat shape like {"user.name": "Ana"}.

You can then flatten every record before writing:

python
rows = [flatten(record) for record in data]

Build the Column Set Dynamically

Real JSON records often do not all contain the same keys. A safe converter gathers the union of keys across all flattened rows and uses that as the CSV header.

python
1import csv
2import json
3
4with open('input.json', 'r', encoding='utf-8') as handle:
5    raw = json.load(handle)
6
7rows = [flatten(record) for record in raw]
8columns = sorted({key for row in rows for key in row.keys()})
9
10with open('output.csv', 'w', newline='', encoding='utf-8') as handle:
11    writer = csv.DictWriter(handle, fieldnames=columns)
12    writer.writeheader()
13    for row in rows:
14        writer.writerow(row)

Missing keys naturally become empty cells in the CSV output.

Decide What Arrays Should Mean

Arrays are where JSON-to-CSV conversion becomes a design problem, not just a syntax problem. There is no single correct answer. Common strategies include:

  • join the array into one delimited cell
  • explode each array element into a separate row
  • keep only the first element

A simple join policy looks like this:

python
1def normalize_value(value):
2    if isinstance(value, list):
3        return '|'.join(map(str, value))
4    return value

Then call normalize_value on each non-dictionary value while flattening. The important part is to document the rule, because downstream users need to know what the CSV actually represents.

Pandas Is Convenient for Analysis Workflows

If the goal is quick analysis rather than a custom ETL tool, Pandas can reduce the amount of code.

python
1import pandas as pd
2
3frame = pd.read_json('input.json')
4flat = pd.json_normalize(frame.to_dict(orient='records'))
5flat.to_csv('output.csv', index=False)

This is very handy in notebooks, but you still need to understand how nested data and arrays are being interpreted.

Validate the Output

A trustworthy conversion step should check more than whether a file was created. Useful checks include row count, required columns, encoding, and whether cells with commas or newlines are quoted correctly.

These checks matter because CSV often becomes the boundary between engineering systems and business tooling. Silent data loss at this step is expensive.

Common Pitfalls

Assuming nested JSON maps directly to CSV columns without a flattening rule leads to missing or unusable data.

Ignoring arrays until the end forces an implicit policy decision that may not match downstream expectations.

Letting column order vary between runs makes output harder to diff and harder to consume reliably.

Summary

  • Flat JSON can be written directly with csv.DictWriter.
  • Nested objects need to be flattened before CSV export.
  • Arrays require an explicit policy such as joining, exploding, or truncating.
  • Dynamic column discovery is useful when records do not all share the same keys.

Course illustration
Course illustration

All Rights Reserved.