Add missing dates to pandas dataframe

pandas

data manipulation

dataframe

missing data

Python

Add missing dates to pandas dataframe

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

When a pandas DataFrame has gaps in its date sequence, the usual fix is to create the full date range you expect and reindex the data onto it. That gives you explicit rows for missing dates, after which you can decide whether to leave values as missing, fill them, or compute them from neighboring observations.

Reindex against a complete date range

The basic pattern is to convert the date column to datetime, set it as the index, create a full range, and call reindex.

python

1import pandas as pd
2
3sales = pd.DataFrame(
4    {
5        "date": ["2024-01-01", "2024-01-03", "2024-01-06"],
6        "orders": [10, 15, 7],
7    }
8)
9
10sales["date"] = pd.to_datetime(sales["date"])
11sales = sales.set_index("date")
12
13full_index = pd.date_range(sales.index.min(), sales.index.max(), freq="D")
14completed = sales.reindex(full_index)
15print(completed)

Missing dates now exist as rows with NaN values.

Fill missing values only after deciding what the gap means

Adding dates and filling values are separate decisions. Sometimes missing dates mean zero activity. Sometimes they mean data was not collected.

python

completed["orders"] = completed["orders"].fillna(0)
print(completed)

This is appropriate for cases such as daily order counts where a missing date should be interpreted as zero orders. It is not appropriate when the missing value means unknown.

Use `asfreq` when the index is already a `DatetimeIndex`

If the DataFrame already uses a datetime index, asfreq is a concise alternative.

python

1import pandas as pd
2
3series = pd.DataFrame(
4    {"orders": [10, 15, 7]},
5    index=pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-06"]),
6)
7
8completed = series.asfreq("D")
9print(completed)

This is especially clean in time-series pipelines where the index is already set up correctly.

Add missing dates per group when multiple entities exist

Real datasets often contain several independent series, such as one store per date. In that case, fill gaps within each group rather than across the entire table.

python

1import pandas as pd
2
3sales = pd.DataFrame(
4    {
5        "store": ["A", "A", "B", "B"],
6        "date": ["2024-01-01", "2024-01-03", "2024-01-01", "2024-01-02"],
7        "orders": [10, 15, 5, 8],
8    }
9)
10
11sales["date"] = pd.to_datetime(sales["date"])
12
13frames = []
14for store, group in sales.groupby("store"):
15    group = group.set_index("date").sort_index()
16    idx = pd.date_range(group.index.min(), group.index.max(), freq="D")
17    filled = group.reindex(idx)
18    filled["store"] = store
19    frames.append(filled.reset_index(names="date"))
20
21result = pd.concat(frames, ignore_index=True)
22print(result)

That keeps each store's timeline independent.

Sort first and validate the frequency assumption

Before filling dates, sort the data and confirm the intended frequency. Daily, hourly, business-day, and monthly data each need a different frequency string. Adding calendar days to business-day data can create rows you never wanted.

A clean fill starts with an explicit decision about the expected cadence, not with trial and error.

Preserve the date as a column when needed

After reindex, the generated dates live in the index. If downstream code expects a normal column, reset the index explicitly.

python

completed = completed.reset_index(names="date")
print(completed.head())

That small step often prevents confusion when joining the result back to non-time-series tables.

Common Pitfalls

Filling missing dates before converting the date column to actual datetime values.
Treating missing dates as zero automatically when the real meaning is unknown data.
Using the wrong frequency such as calendar days for business-day series.
Reindexing the full table at once when each entity should have its own date range.
Forgetting to sort the index before resampling or reindexing.

Summary

Create a complete date_range and reindex the DataFrame onto it.
Leave the inserted rows as missing until you decide how gaps should be interpreted.
Use asfreq when the DataFrame already has a proper DatetimeIndex.
Reindex within groups for multi-entity time series.
Be explicit about the intended date frequency before filling anything.

Add missing dates to pandas dataframe

Master System Design with Codemia

Introduction

Reindex against a complete date range

Fill missing values only after deciding what the gap means

Use asfreq when the index is already a DatetimeIndex

Add missing dates per group when multiple entities exist

Sort first and validate the frequency assumption

Preserve the date as a column when needed

Common Pitfalls

Summary

Use `asfreq` when the index is already a `DatetimeIndex`