Pandas
Unix Time
Dataframe
Date Conversion
Python

Convert unix time to readable date in pandas dataframe

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Unix timestamps are efficient for storage and transport, but not ideal for analysis or reporting. In pandas, converting Unix time to readable datetime is common and simple with pd.to_datetime, yet errors appear when unit assumptions are wrong. Seconds, milliseconds, and nanoseconds are all common in real data. If you parse milliseconds as seconds, dates jump decades into the future; if you parse seconds as milliseconds, dates collapse near 1970.

Reliable conversion requires explicit unit, timezone handling, and validation steps. This article shows robust patterns for single columns, mixed datasets, and production data pipelines.

Core Sections

1. Basic conversion with explicit unit

python
1import pandas as pd
2
3df = pd.DataFrame({
4    "event_ts": [1704067200, 1704153600, 1704240000]  # seconds
5})
6
7df["event_time"] = pd.to_datetime(df["event_ts"], unit="s", utc=True)
8print(df)

Always pass unit explicitly. Do not rely on guesswork for timestamp scale.

2. Milliseconds and other scales

If source systems emit milliseconds (common in logs and JS APIs), use unit='ms'.

python
df = pd.DataFrame({"ts_ms": [1704067200000, 1704153600000]})
df["event_time"] = pd.to_datetime(df["ts_ms"], unit="ms", utc=True)

Other valid units include us and ns for microseconds and nanoseconds.

3. Convert timezone for local reporting

Store in UTC, then convert for presentation.

python
1df["event_time_toronto"] = (
2    pd.to_datetime(df["event_ts"], unit="s", utc=True)
3      .dt.tz_convert("America/Toronto")
4)

Avoid localizing naive times after the fact unless you are certain about source timezone.

4. Handle invalid or mixed data safely

Use errors='coerce' to avoid conversion crashes and inspect bad rows.

python
1df = pd.DataFrame({"raw_ts": [1704067200, "bad", None]})
2df["parsed"] = pd.to_datetime(df["raw_ts"], unit="s", errors="coerce", utc=True)
3
4invalid = df[df["parsed"].isna()]
5print(invalid)

This is essential when ingesting external CSV or API payloads.

5. Build a reusable conversion utility

python
1def parse_unix(series: pd.Series, unit: str = "s", tz: str = "UTC") -> pd.Series:
2    ts = pd.to_datetime(series, unit=unit, errors="coerce", utc=True)
3    return ts if tz == "UTC" else ts.dt.tz_convert(tz)
4
5orders["created_at"] = parse_unix(orders["created_unix"], unit="ms", tz="America/New_York")

Centralizing conversion avoids inconsistent timestamp handling across notebooks and services.

6. Format for human-readable output

For display/export only, format datetime strings explicitly.

python
df["event_time_str"] = df["event_time"].dt.strftime("%Y-%m-%d %H:%M:%S %Z")

Keep original datetime dtype for analysis; convert to string only when necessary.

Common Pitfalls

  • Parsing milliseconds as seconds (or vice versa), producing wildly incorrect dates.
  • Creating naive datetimes and then mixing them with timezone-aware columns.
  • Using string formatting too early and losing datetime operations.
  • Ignoring invalid timestamp rows instead of coercing and auditing them.
  • Relying on implicit timezone assumptions from source systems.

Summary

Converting Unix time in pandas is reliable when you specify unit and timezone explicitly. Use pd.to_datetime(..., unit=..., utc=True) as the baseline, convert to local timezone only for presentation, and keep datetimes typed until final output. Add coercion and validation for messy sources, and wrap the logic in a reusable helper to enforce consistency across pipelines.

A practical way to harden this topic in real projects is to add a small operational checklist and treat it as part of your engineering standard, not a one-off fix. Start by creating one minimal failing case and one passing case that represent real input from production logs. Then automate those checks in CI so regressions are caught before release. Add lightweight instrumentation around the critical branch where this logic runs, and include structured fields that let you filter by version, environment, and error type. This gives you fast feedback when behavior changes after dependency upgrades or refactors.

For long-term maintainability on convert unix time to readable date in pandas dataframe, keep one source of truth for helper logic instead of duplicating variants across services or UI layers. Document assumptions near the code, including data format, edge-case behavior, and expected fallback policy. During code review, verify that example inputs and tests cover empty values, malformed values, and high-volume scenarios. Teams that combine explicit assumptions, repeatable tests, and basic observability typically avoid the same category of bug recurring every quarter.


Course illustration
Course illustration

All Rights Reserved.