loc function in pandas

Introduction

loc is pandas' label-based selection and assignment tool. Once you understand that it works with row labels and column labels rather than with numeric positions, a large part of everyday pandas slicing becomes easier and safer.

The Basic Shape of loc

The core form is:

python
df.loc[row_selector, column_selector]

Both selectors are label-oriented. If your index contains strings, dates, or other explicit labels, loc uses those values directly.

python
import pandas as pd

df = pd.DataFrame(
    {
        "city": ["Toronto", "Montreal", "Vancouver"],
        "sales": [120, 95, 150],
        "active": [True, False, True],
    },
    index=["a", "b", "c"],
)

print(df.loc["a"])                    # one row, returned as a Series
print(df.loc["a", "sales"])           # a single value: 120
print(df.loc[:, ["city", "sales"]])   # all rows, two columns

If you want positional indexing, that is iloc, not loc.
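The difference is easiest to see when the index holds integers that do not match row positions. The following is a minimal sketch, assuming a hypothetical frame named orders whose index labels are 10, 20, and 30:

```python
import pandas as pd

# Hypothetical frame: integer labels that are not the positions 0, 1, 2.
orders = pd.DataFrame({"sales": [120, 95, 150]}, index=[10, 20, 30])

by_label = orders.loc[10, "sales"]    # label 10 is the first row
by_position = orders.iloc[0, 0]       # position 0 is also the first row

try:
    orders.loc[0, "sales"]            # no row carries the label 0
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(by_label, by_position, label_zero_found)
```

Both lookups reach the same cell, but only because label 10 happens to sit at position 0. Asking loc for label 0 fails because no such label exists.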

Use Boolean Masks with loc

A very common pattern is combining loc with a boolean mask.

python
mask = (df["sales"] >= 100) & (df["active"])
result = df.loc[mask, ["city", "sales"]]
print(result)

This reads well because the rows are defined by a condition and the columns are defined explicitly.

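A boolean Series used as a mask is aligned to the DataFrame by index label, not by position. If the mask's index lacks labels that the DataFrame has, loc raises an error instead of guessing, so a mask built from a different or filtered frame is a common source of confusion.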
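To make the alignment point concrete, here is a small sketch with two hypothetical masks, shuffled_mask and short_mask. In recent pandas versions the failure is reported as an IndexingError, but the sketch catches a broad exception rather than rely on that class name.

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Toronto", "Montreal", "Vancouver"], "sales": [120, 95, 150]},
    index=["a", "b", "c"],
)

# Same labels in a different order: the mask is matched by label,
# so the True value attached to "c" selects row "c", not the first row.
shuffled_mask = pd.Series([True, False, False], index=["c", "a", "b"])
aligned = df.loc[shuffled_mask, "city"]

# A mask that lacks label "c" cannot be matched against df.
short_mask = pd.Series([True, False], index=["a", "b"])
try:
    df.loc[short_mask]
    error_name = None
except Exception as exc:  # broad on purpose; the exact class may vary by version
    error_name = type(exc).__name__

print(aligned)
print(error_name)
```

A mask derived from df itself, such as df["sales"] >= 100, always shares the frame's index, which is why that pattern rarely causes trouble.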

Label Slices Are Inclusive

One important difference from standard Python slicing is that label slices with loc include both endpoints.

python
print(df.loc["a":"c", "sales"])

This includes rows a, b, and c.

That behavior is often convenient for time-series and labeled data, but it surprises people who expect Python's usual exclusive upper bound.
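A short side-by-side comparison may help here. This sketch contrasts a label slice with a positional slice, then applies the same idea to a hypothetical daily series named ts:

```python
import pandas as pd

df = pd.DataFrame({"sales": [120, 95, 150]}, index=["a", "b", "c"])

label_slice = df.loc["a":"b", "sales"]   # rows a and b: the end label is included
position_slice = df.iloc[0:1, 0]         # row a only: the end position is excluded

# Hypothetical daily series; date-string slices include both endpoints too.
ts = pd.Series(range(5), index=pd.date_range("2026-01-01", periods=5, freq="D"))
date_slice = ts.loc["2026-01-02":"2026-01-04"]

print(len(label_slice), len(position_slice), len(date_slice))
```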

Use loc for Assignment

loc is also the preferred way to update values conditionally.

python
df.loc[df["city"] == "Toronto", "sales"] = 130
print(df)

This is much safer than chained indexing, which can assign to a temporary copy and trigger a SettingWithCopyWarning.
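The contrast can be sketched as follows. The value 999 is an arbitrary marker chosen for illustration, and the warning pandas would normally emit for the chained form is silenced so the output stays readable.

```python
import warnings

import pandas as pd

df = pd.DataFrame(
    {"city": ["Toronto", "Montreal", "Vancouver"], "sales": [120, 95, 150]},
    index=["a", "b", "c"],
)

# Chained indexing: df[...] builds a temporary filtered copy, and the
# assignment lands on that copy instead of on df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df[df["city"] == "Toronto"]["sales"] = 999
sales_after_chained = df.loc["a", "sales"]

# A single loc call resolves rows and column together on df itself.
df.loc[df["city"] == "Toronto", "sales"] = 130
sales_after_loc = df.loc["a", "sales"]

print(sales_after_chained, sales_after_loc)
```

The chained version leaves the original value in place, while the loc version updates it.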

A more complex example assigns categories based on rules.

python
df.loc[df["sales"] >= 130, "grade"] = "A"
df.loc[(df["sales"] >= 100) & (df["sales"] < 130), "grade"] = "B"
df.loc[df["sales"] < 100, "grade"] = "C"
print(df)

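Here the three conditions are mutually exclusive, so the order of the statements does not change the result. When conditions overlap, order matters: a later assignment overwrites an earlier one on any row that both conditions match.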
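As an illustration of overlapping rules, the sketch below uses two deliberately overlapping thresholds (sales of at least 100 and sales of at least 130) and applies them in both orders. The default grade "C" is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({"sales": [120, 95, 150]}, index=["a", "b", "c"])

# Broad rule first, narrow rule second: the narrow rule overwrites its rows.
df["grade"] = "C"
df.loc[df["sales"] >= 100, "grade"] = "B"
df.loc[df["sales"] >= 130, "grade"] = "A"
broad_then_narrow = df["grade"].tolist()

# Reversed order: the broad rule runs last and erases the "A".
df["grade"] = "C"
df.loc[df["sales"] >= 130, "grade"] = "A"
df.loc[df["sales"] >= 100, "grade"] = "B"
narrow_then_broad = df["grade"].tolist()

print(broad_then_narrow)
print(narrow_then_broad)
```

Writing the broadest rule first and the most specific rule last is one way to keep overlapping conditions predictable.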

Working with MultiIndex

loc also works well with hierarchical indexes.

python
arrays = [["2026", "2026", "2027", "2027"], ["Q1", "Q2", "Q1", "Q2"]]
idx = pd.MultiIndex.from_arrays(arrays, names=["year", "quarter"])

m = pd.DataFrame({"revenue": [100, 120, 140, 160]}, index=idx)
print(m.loc[("2026", "Q2")])
print(m.loc["2027"])

This is one reason loc is so central in serious pandas work: it scales from simple tables to more structured indexes naturally.
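Selecting on an inner level is a natural next step. One way to do it, sketched here with pd.IndexSlice on the same kind of year and quarter index, is to slice across the outer level and name the inner label:

```python
import pandas as pd

idx = pd.MultiIndex.from_arrays(
    [["2026", "2026", "2027", "2027"], ["Q1", "Q2", "Q1", "Q2"]],
    names=["year", "quarter"],
)
m = pd.DataFrame({"revenue": [100, 120, 140, 160]}, index=idx)

# Every year, but only Q1. The index here is already sorted, which
# slicing across MultiIndex levels generally expects.
q1_all_years = m.loc[pd.IndexSlice[:, "Q1"], :]

print(q1_all_years)
```

If the index were not sorted, calling sort_index() first would be a reasonable precaution before slicing this way.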

Reordering and Subsetting by Explicit Labels

Because loc respects label lists, it is also useful for custom ordering.

python
ordered = df.loc[["c", "a"], ["city", "sales"]]
print(ordered)

This is very handy when preparing reports or aligning data to a business-defined sequence that is not the same as the original order.
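One caveat when the label list comes from outside the data: loc raises a KeyError if any requested label is missing. The sketch below uses a hypothetical label "z" to show that, along with reindex as an alternative that keeps the requested order and fills the gap with NaN.

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Toronto", "Montreal", "Vancouver"], "sales": [120, 95, 150]},
    index=["a", "b", "c"],
)

wanted = ["c", "z", "a"]   # "z" is a hypothetical label that df lacks

try:
    df.loc[wanted]
    loc_raised = False
except KeyError:
    loc_raised = True

# reindex keeps the requested order and fills the missing row with NaN.
report = df.reindex(wanted)

print(loc_raised)
print(report)
```

Which behavior is preferable depends on whether a missing label should stop the report or appear as an empty row.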

Debugging loc Problems

When loc behaves unexpectedly, check these things first:

  • what the DataFrame index actually contains
  • whether you intended labels or positions
  • whether the boolean mask aligns with the DataFrame index
  • whether the slice endpoints are labels and therefore inclusive

A quick sanity check is often enough:

python
print(df.index)
print(mask.index.equals(df.index))

Most loc bugs come from label misunderstandings rather than from pandas doing something arbitrary.
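A typical label misunderstanding is a type mismatch between the lookup key and the index. This sketch assumes a hypothetical frame whose IDs were loaded as text, so the labels are the strings "1", "2", and "3" rather than integers:

```python
import pandas as pd

# Hypothetical frame whose index holds strings, not integers.
df = pd.DataFrame({"sales": [120, 95, 150]}, index=["1", "2", "3"])

has_int_label = 1 in df.index      # False: the labels are strings
has_str_label = "1" in df.index    # True

try:
    df.loc[1, "sales"]
    int_lookup_ok = True
except KeyError:
    int_lookup_ok = False

str_lookup = df.loc["1", "sales"]

print(df.index.dtype, has_int_label, has_str_label, int_lookup_ok, str_lookup)
```

Printing df.index.dtype alongside the index values usually reveals this kind of mismatch quickly.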

Common Pitfalls

A common mistake is using loc when you really meant positional access. That is iloc.

Another mistake is forgetting that label slices are inclusive.

Developers also often write chained assignments such as df[df["sales"] > 100]["city"] = ..., which can modify a temporary copy, instead of a single explicit call like df.loc[df["sales"] > 100, "city"] = ....

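Finally, a boolean mask whose index does not match the DataFrame's index is either realigned by label or rejected with an error, and neither outcome is obvious from the filtering line itself. Check alignment whenever a mask travels through a long pipeline.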

Summary

  • loc selects and updates pandas data by labels, not positions.
  • The basic pattern is df.loc[row_selector, column_selector].
  • Boolean masks and explicit column lists make loc especially powerful.
  • Label slices are inclusive, unlike normal Python slicing.
  • Use loc for conditional assignment to avoid ambiguous chained-index behavior.
