loc function in pandas
Introduction
loc is pandas' label-based selection and assignment tool. Once you understand that it works with row labels and column labels rather than with numeric positions, a large part of everyday pandas slicing becomes easier and safer.
The Basic Shape of loc
The core form is:
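A minimal sketch of that form, df.loc[row_selector, column_selector], assuming a small hypothetical DataFrame with string row labels (the data and column names are illustrative only):

```python
import pandas as pd

# Hypothetical data; the row labels are the strings "a" through "d".
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune", "Kobe"], "sales": [80, 120, 150, 95]},
    index=["a", "b", "c", "d"],
)

# One row label and one column label return a single value.
single_value = df.loc["b", "sales"]

# Lists of labels return a smaller DataFrame.
subset = df.loc[["a", "c"], ["city", "sales"]]

# A bare ":" selects every row; here only the "city" column is kept.
all_cities = df.loc[:, "city"]
```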
Both selectors are label-oriented. If your index contains strings, dates, or other explicit labels, loc uses those values directly.
For positional indexing (first row, third column, and so on), use iloc instead of loc.
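The difference is easiest to see when the index holds integers that do not match row positions. The sketch below assumes a hypothetical frame labeled 10, 20, 30:

```python
import pandas as pd

# Hypothetical frame whose integer labels differ from the row positions.
df = pd.DataFrame({"sales": [80, 120, 150]}, index=[10, 20, 30])

by_label = df.loc[10, "sales"]    # the row whose label is 10
by_position = df.iloc[2, 0]       # third row, first column

# There is no row labeled 2, so loc raises KeyError here.
try:
    df.loc[2, "sales"]
    label_2_found = True
except KeyError:
    label_2_found = False
```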
Use Boolean Masks with loc
A very common pattern is combining loc with a boolean mask.
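As an illustrative sketch, assuming hypothetical city and sales columns, a mask built from a comparison selects the rows while a list of labels selects the columns:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Lima", "Pune", "Kobe"],
        "sales": [80, 120, 150, 95],
        "region": ["north", "south", "south", "east"],
    }
)

# Boolean Series: True where the condition holds.
mask = df["sales"] > 100

# Rows come from the mask, columns from the explicit list.
high_sales = df.loc[mask, ["city", "sales"]]
```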
This reads well because the rows are defined by a condition and the columns are defined explicitly.
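When the mask is a boolean Series, loc aligns it to the DataFrame index by label rather than by position. A mask built from a different or filtered DataFrame can therefore select unexpected rows or raise an error.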
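A small sketch of label alignment, assuming a hypothetical mask whose index holds the same labels in a different order. Converting the mask to a NumPy array drops the labels, so the same values are then applied by position:

```python
import pandas as pd

df = pd.DataFrame({"sales": [50, 150, 200]}, index=["a", "b", "c"])

# Same labels as df, but in a different order: c=True, a=False, b=True.
mask = pd.Series([True, False, True], index=["c", "a", "b"])

# A Series mask is matched to df by label, so rows "b" and "c" are selected.
by_label = df.loc[mask]

# A plain array has no labels; the values apply by position: rows "a" and "c".
by_position = df.loc[mask.to_numpy()]
```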
Label Slices Are Inclusive
One important difference from standard Python slicing is that label slices with loc include both endpoints.
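A minimal sketch of the difference, assuming a hypothetical DataFrame with row labels a through d. The positional slice is shown first for contrast:

```python
import pandas as pd

df = pd.DataFrame({"sales": [80, 120, 150, 95]}, index=["a", "b", "c", "d"])

# Positional slicing follows Python rules: the upper bound is excluded.
position_slice = df.iloc[0:2]    # rows at positions 0 and 1

# Label slicing with loc keeps both endpoints.
label_slice = df.loc["a":"c"]
```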
This includes rows a, b, and c.
That behavior is often convenient for time-series and labeled data, but it surprises people who expect Python's usual exclusive upper bound.
Use loc for Assignment
loc is also the preferred way to update values conditionally.
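As a sketch, assuming a hypothetical tier column that should be marked "high" wherever sales exceed 100:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"city": ["Oslo", "Lima", "Pune", "Kobe"], "sales": [80, 120, 150, 95]}
)

# Start with a default value for every row.
df["tier"] = "standard"

# One indexing operation: the mask picks rows, "tier" picks the column.
df.loc[df["sales"] > 100, "tier"] = "high"
```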
This is much safer than chained indexing, which can trigger a SettingWithCopyWarning and may write to a temporary copy instead of the original DataFrame.
A more complex example assigns categories based on rules.
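One way to sketch this, assuming hypothetical sales thresholds of 50, 100, and 200, is a sequence of loc assignments that runs from the broadest rule to the narrowest:

```python
import pandas as pd

# Hypothetical data and thresholds for illustration.
df = pd.DataFrame({"sales": [40, 90, 130, 220]})

df["category"] = "low"                              # default for every row
df.loc[df["sales"] >= 50, "category"] = "medium"    # broadest rule first
df.loc[df["sales"] >= 100, "category"] = "high"     # overwrites part of "medium"
df.loc[df["sales"] >= 200, "category"] = "top"      # narrowest rule last
```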
The order matters when conditions overlap, because each later assignment overwrites the rows it matches.
Working with MultiIndex
loc also works well with hierarchical indexes.
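The sketch below assumes a hypothetical two-level index of region and year. It shows selection by outer label, by a full key tuple, and by inner level using pd.IndexSlice:

```python
import pandas as pd

# Hypothetical two-level index: (region, year).
index = pd.MultiIndex.from_tuples(
    [("north", 2023), ("north", 2024), ("south", 2023), ("south", 2024)],
    names=["region", "year"],
)
df = pd.DataFrame({"sales": [100, 120, 90, 110]}, index=index)

# Outer label only: every "north" row, indexed by year in the result.
north = df.loc["north"]

# Full key as a tuple plus a column label: a single value.
south_2024 = df.loc[("south", 2024), "sales"]

# IndexSlice selects on the inner level across all regions.
year_2024 = df.loc[pd.IndexSlice[:, 2024], :]
```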
This is one reason loc is so central in serious pandas work: it scales from simple tables to more structured indexes naturally.
Reordering and Subsetting by Explicit Labels
Because loc respects label lists, it is also useful for custom ordering.
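A minimal sketch, assuming a hypothetical report order for region labels. The result follows the order of the list, not the order of the original index:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"sales": [80, 120, 150], "returns": [5, 9, 4]},
    index=["east", "north", "west"],
)

# Hypothetical business-defined order for the report.
report_order = ["west", "east", "north"]

# Rows come back in the order of the label list.
ordered = df.loc[report_order]

# The same idea applies to columns.
ordered_cols = df.loc[report_order, ["returns", "sales"]]
```

In recent pandas versions, a label in the list that is missing from the index raises a KeyError. reindex is the usual alternative when missing labels should appear as empty rows.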
This is very handy when preparing reports or aligning data to a business-defined sequence that is not the same as the original order.
Debugging loc Problems
When loc behaves unexpectedly, check these things first:
- what the DataFrame index actually contains
- whether you intended labels or positions
- whether the boolean mask aligns with the DataFrame index
- whether the slice endpoints are labels and therefore inclusive
A quick sanity check is often enough:
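For instance, the sketch below assumes a hypothetical frame whose labels are the strings "1", "2", "3" rather than integers, a mismatch that is easy to miss when the data is printed:

```python
import pandas as pd

# Hypothetical frame: the labels look numeric but are strings.
df = pd.DataFrame({"sales": [80, 120, 150]}, index=["1", "2", "3"])

print(df.index)             # what the labels actually are
print(df.index.dtype)       # their dtype
print(df.index.is_unique)   # duplicate labels change what loc returns

has_int_label = 1 in df.index      # False: there is no integer label 1
has_str_label = "1" in df.index    # True: the label is the string "1"
```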
Most loc bugs come from label misunderstandings rather than from pandas doing something arbitrary.
Common Pitfalls
A common mistake is using loc when you really meant positional access, which is what iloc is for.
Another mistake is forgetting that label slices are inclusive.
Developers also often create chained indexing like df[df["sales"] > 100]["city"] = ... instead of using loc for explicit assignment.
Finally, a boolean mask whose index does not match the DataFrame's can select unexpected rows or raise an error. Check alignment whenever the mask is built in a different step of a complex pipeline.
Summary
- loc selects and updates pandas data by labels, not positions.
- The basic pattern is df.loc[row_selector, column_selector].
- Boolean masks and explicit column lists make loc especially powerful.
- Label slices are inclusive, unlike normal Python slicing.
- Use loc for conditional assignment to avoid ambiguous chained-index behavior.

