pandas
DataFrame
date filtering
data analysis
Python

Select DataFrame rows between two dates

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Filtering pandas rows between two dates is easy once the date column is actually stored as a datetime type. Most mistakes happen before the filter itself, such as comparing strings instead of timestamps or forgetting whether the date boundaries should be inclusive or exclusive.

Convert the column to datetime first

Do not assume a CSV-imported date column is already typed correctly. Convert it explicitly.

python
1import pandas as pd
2
3df = pd.DataFrame({
4    "order_date": ["2026-03-01", "2026-03-05", "2026-03-10"],
5    "amount": [100, 150, 90],
6})
7
8df["order_date"] = pd.to_datetime(df["order_date"])
9print(df.dtypes)

Once the column is datetime64, filtering becomes reliable and readable.

Boolean mask between two dates

The most common pattern is a boolean mask:

python
1start = pd.Timestamp("2026-03-02")
2end = pd.Timestamp("2026-03-08")
3
4filtered = df[(df["order_date"] >= start) & (df["order_date"] <= end)]
5print(filtered)

This gives an inclusive range on both boundaries.

If you want an exclusive end boundary, switch the second comparison:

python
filtered = df[(df["order_date"] >= start) & (df["order_date"] < end)]

Be explicit about that decision in analytics code because reporting boundaries often matter.

Use the index when the DataFrame is time-indexed

If the date column is the index, label-based slicing is cleaner.

python
df_indexed = df.set_index("order_date").sort_index()

print(df_indexed.loc["2026-03-02":"2026-03-08"])

For time-indexed workflows, this is often more natural than rebuilding boolean masks repeatedly.

Include times, not just dates

If the values include time-of-day, the boundary behavior becomes more subtle.

python
1df = pd.DataFrame({
2    "event_time": pd.to_datetime([
3        "2026-03-05 08:30:00",
4        "2026-03-05 18:00:00",
5        "2026-03-06 09:15:00",
6    ]),
7    "value": [1, 2, 3],
8})
9
10start = pd.Timestamp("2026-03-05 09:00:00")
11end = pd.Timestamp("2026-03-06 00:00:00")
12
13filtered = df[(df["event_time"] >= start) & (df["event_time"] < end)]
14print(filtered)

If you filter by date strings only, remember that midnight interpretation may include or exclude more rows than you expect.

Use between for readability

Pandas also offers Series.between:

python
mask = df["event_time"].between("2026-03-05", "2026-03-06", inclusive="left")
print(df[mask])

This can be cleaner than manually writing both comparisons, especially when the range logic appears in many places.

query can be readable for notebook workflows

For ad hoc analysis, query can make date filters easier to scan:

python
1start = pd.Timestamp("2026-03-05")
2end = pd.Timestamp("2026-03-10")
3
4filtered = df.query("@start <= order_date <= @end")
5print(filtered)

This is mainly a readability option. The core requirement is still the same: the column must be properly typed as datetime before you compare it.

Handle time zones deliberately

If your timestamps are timezone-aware, make sure the boundaries are too.

python
1df = pd.DataFrame({
2    "ts": pd.to_datetime([
3        "2026-03-05 08:00:00+00:00",
4        "2026-03-05 12:00:00+00:00",
5    ], utc=True)
6})
7
8start = pd.Timestamp("2026-03-05 09:00:00", tz="UTC")
9end = pd.Timestamp("2026-03-05 13:00:00", tz="UTC")
10
11print(df[df["ts"].between(start, end)])

Mixing naive and timezone-aware timestamps is a frequent source of errors.

Common Pitfalls

The most common mistake is filtering a string column as if it were a datetime column. Another is forgetting whether the end boundary should be inclusive, which causes quiet off-by-one-day or off-by-one-second bugs in reports. Developers also often use index slicing without sorting the datetime index first. Timezone mismatches are another major source of confusion when comparing aware and naive timestamps. Finally, when a column includes time-of-day, people often write an end date such as 2026-03-06 and accidentally exclude events later that day because the timestamp resolves to midnight.

Summary

  • Convert the date column with pd.to_datetime before filtering.
  • Use boolean masks or between for ordinary column-based filters.
  • Use .loc slicing when the DataFrame is indexed by datetime.
  • Be explicit about inclusive versus exclusive boundaries.
  • Handle time-of-day and timezone information deliberately.
  • Treat date filtering as a typing and boundary problem, not just a syntax problem.

Course illustration
Course illustration

All Rights Reserved.