Select DataFrame rows between two dates

pandas

DataFrame

date filtering

data analysis

Python

Select DataFrame rows between two dates

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

Filtering pandas rows between two dates is easy once the date column is actually stored as a datetime type. Most mistakes happen before the filter itself, such as comparing strings instead of timestamps or forgetting whether the date boundaries should be inclusive or exclusive.

Convert the column to datetime first

Do not assume a CSV-imported date column is already typed correctly. Convert it explicitly.

python

1import pandas as pd
2
3df = pd.DataFrame({
4    "order_date": ["2026-03-01", "2026-03-05", "2026-03-10"],
5    "amount": [100, 150, 90],
6})
7
8df["order_date"] = pd.to_datetime(df["order_date"])
9print(df.dtypes)

Once the column is datetime64, filtering becomes reliable and readable.

Boolean mask between two dates

The most common pattern is a boolean mask:

python

1start = pd.Timestamp("2026-03-02")
2end = pd.Timestamp("2026-03-08")
3
4filtered = df[(df["order_date"] >= start) & (df["order_date"] <= end)]
5print(filtered)

This gives an inclusive range on both boundaries.

If you want an exclusive end boundary, switch the second comparison:

python

filtered = df[(df["order_date"] >= start) & (df["order_date"] < end)]

Be explicit about that decision in analytics code because reporting boundaries often matter.

Use the index when the DataFrame is time-indexed

If the date column is the index, label-based slicing is cleaner.

python

df_indexed = df.set_index("order_date").sort_index()

print(df_indexed.loc["2026-03-02":"2026-03-08"])

For time-indexed workflows, this is often more natural than rebuilding boolean masks repeatedly.

Include times, not just dates

If the values include time-of-day, the boundary behavior becomes more subtle.

python

1df = pd.DataFrame({
2    "event_time": pd.to_datetime([
3        "2026-03-05 08:30:00",
4        "2026-03-05 18:00:00",
5        "2026-03-06 09:15:00",
6    ]),
7    "value": [1, 2, 3],
8})
9
10start = pd.Timestamp("2026-03-05 09:00:00")
11end = pd.Timestamp("2026-03-06 00:00:00")
12
13filtered = df[(df["event_time"] >= start) & (df["event_time"] < end)]
14print(filtered)

If you filter by date strings only, remember that midnight interpretation may include or exclude more rows than you expect.

Use `between` for readability

Pandas also offers Series.between:

python

mask = df["event_time"].between("2026-03-05", "2026-03-06", inclusive="left")
print(df[mask])

This can be cleaner than manually writing both comparisons, especially when the range logic appears in many places.

`query` can be readable for notebook workflows

For ad hoc analysis, query can make date filters easier to scan:

python

1start = pd.Timestamp("2026-03-05")
2end = pd.Timestamp("2026-03-10")
3
4filtered = df.query("@start <= order_date <= @end")
5print(filtered)

This is mainly a readability option. The core requirement is still the same: the column must be properly typed as datetime before you compare it.

Handle time zones deliberately

If your timestamps are timezone-aware, make sure the boundaries are too.

python

1df = pd.DataFrame({
2    "ts": pd.to_datetime([
3        "2026-03-05 08:00:00+00:00",
4        "2026-03-05 12:00:00+00:00",
5    ], utc=True)
6})
7
8start = pd.Timestamp("2026-03-05 09:00:00", tz="UTC")
9end = pd.Timestamp("2026-03-05 13:00:00", tz="UTC")
10
11print(df[df["ts"].between(start, end)])

Mixing naive and timezone-aware timestamps is a frequent source of errors.

Common Pitfalls

The most common mistake is filtering a string column as if it were a datetime column. Another is forgetting whether the end boundary should be inclusive, which causes quiet off-by-one-day or off-by-one-second bugs in reports. Developers also often use index slicing without sorting the datetime index first. Timezone mismatches are another major source of confusion when comparing aware and naive timestamps. Finally, when a column includes time-of-day, people often write an end date such as 2026-03-06 and accidentally exclude events later that day because the timestamp resolves to midnight.

Summary

Convert the date column with pd.to_datetime before filtering.
Use boolean masks or between for ordinary column-based filters.
Use .loc slicing when the DataFrame is indexed by datetime.
Be explicit about inclusive versus exclusive boundaries.
Handle time-of-day and timezone information deliberately.
Treat date filtering as a typing and boundary problem, not just a syntax problem.

Select DataFrame rows between two dates

Master System Design with Codemia

Introduction

Convert the column to datetime first

Boolean mask between two dates

Use the index when the DataFrame is time-indexed

Include times, not just dates

Use between for readability

query can be readable for notebook workflows

Handle time zones deliberately

Common Pitfalls

Summary

Use `between` for readability

`query` can be readable for notebook workflows