Why Use .loc in Pandas?

Introduction

Pandas gives you several ways to select data, and that flexibility is exactly why beginners get confused. .loc is worth learning early because it makes selections explicit: you choose rows and columns by label, and the same syntax works for reading, filtering, and assignment.

.loc Is Label-Based

The core idea is simple: .loc selects by index labels and column labels, not by integer position. That makes code easier to read when your rows and columns have meaningful names.

python
import pandas as pd

df = pd.DataFrame(
    {
        "city": ["Toronto", "Montreal", "Vancouver"],
        "sales": [120, 95, 140],
        "region": ["east", "east", "west"],
    },
    index=["a1", "a2", "a3"],
)

# A single row label returns that row as a Series.
print(df.loc["a2"])
# A label slice (inclusive) plus a list of column labels returns a DataFrame.
print(df.loc["a1":"a2", ["city", "sales"]])

With .loc, the intent is obvious: pick row labels a1 through a2, then only the city and sales columns.
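As a rough sketch of the other label-based forms, the snippet below reuses the same frame and shows a scalar lookup, a whole-column selection, and a selection with lists on both axes. The variable names (cell, column, frame) are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "city": ["Toronto", "Montreal", "Vancouver"],
        "sales": [120, 95, 140],
        "region": ["east", "east", "west"],
    },
    index=["a1", "a2", "a3"],
)

# Row label and column label together: a single value.
cell = df.loc["a2", "sales"]

# A bare colon means "all rows"; one column label gives a Series.
column = df.loc[:, "sales"]

# Lists of labels on both axes always give a DataFrame.
frame = df.loc[["a1", "a3"], ["city"]]

print(cell)
print(column)
print(frame)
```

In every case the first position inside the brackets refers to row labels and the second to column labels.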

Use .loc for Filtering

.loc becomes especially useful when you combine it with boolean conditions:

python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus", "Margaret"],
        "team": ["ml", "ml", "platform", "ml"],
        "score": [91, 88, 76, 95],
    }
)

# Row condition first, column labels second, all in one .loc call.
top_ml = df.loc[(df["team"] == "ml") & (df["score"] >= 90), ["name", "score"]]
print(top_ml)

This is better than chaining separate selections because the row filter and the chosen columns live in one expression. It is easier to review, and it avoids the intermediate object that a chained selection such as df[mask][columns] creates along the way.
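To make the comparison concrete, here is a small sketch that builds the same result both ways. The names mask, chained, and single are illustrative, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus", "Margaret"],
        "team": ["ml", "ml", "platform", "ml"],
        "score": [91, 88, 76, 95],
    }
)

mask = (df["team"] == "ml") & (df["score"] >= 90)

# Chained form: two indexing steps with an intermediate DataFrame in between.
chained = df[mask][["name", "score"]]

# Single .loc call: rows and columns are chosen together.
single = df.loc[mask, ["name", "score"]]

# For reading, both forms produce the same data.
print(chained.equals(single))
```

For reads the two forms agree, so the argument for .loc here is mostly about clarity. The difference becomes a correctness issue once you start assigning.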

Use .loc for Assignment

One of the biggest reasons to prefer .loc is safe, explicit mutation. If you need to update part of a frame, .loc is usually the clearest tool.

python
import pandas as pd

df = pd.DataFrame(
    {
        "product": ["A", "B", "C"],
        "inventory": [10, 3, 0],
    }
)

# Restock only the rows where inventory is zero.
df.loc[df["inventory"] == 0, "inventory"] = 5
print(df)

This avoids the chained-assignment style that often produces SettingWithCopyWarning and can end up writing to a temporary object instead of the original frame. In other words, .loc does not just read cleanly; it also encourages safer update patterns.
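The sketch below contrasts the two styles on the same inventory data. It assumes that boolean indexing returns a new object, so the chained write never reaches df; warnings are silenced only to keep the output short, and the exact warning you would see depends on your pandas version.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"product": ["A", "B", "C"], "inventory": [10, 3, 0]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the chained-assignment warning for this demo
    # Chained assignment: the filter result is a separate object,
    # so this write is applied to it and then discarded.
    df[df["inventory"] == 0]["inventory"] = 5

after_chained = df["inventory"].tolist()

# .loc assignment: one indexing operation directly on df.
df.loc[df["inventory"] == 0, "inventory"] = 5
after_loc = df["inventory"].tolist()

print(after_chained)  # original values, unchanged
print(after_loc)      # the zero has been replaced
```

In real code you would not suppress the warning; it is the signal that the update did not do what it looks like it does.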

.loc Versus .iloc

.loc and .iloc solve different problems:

  • .loc uses labels
  • .iloc uses integer positions

That distinction matters because Pandas index labels are not always numeric ranges. A DataFrame can have dates, strings, UUIDs, or custom identifiers as the index. In those cases .loc describes the data model, while .iloc describes physical position.

python
import pandas as pd

df = pd.DataFrame({"value": [100, 200, 300]}, index=["row10", "row20", "row30"])

print(df.loc["row20"])  # by label
print(df.iloc[1])       # by zero-based position

Both return the second row here, but for very different reasons: .loc matches the label row20, while .iloc takes whatever sits at position 1.
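The difference is easiest to see when the index holds integers that do not match the positions. The following sketch uses a made-up index of 10, 20, 30 to show that .loc treats those numbers as labels.

```python
import pandas as pd

# Integer labels that are NOT the same as the row positions.
df = pd.DataFrame({"value": [100, 200, 300]}, index=[10, 20, 30])

by_label = df.loc[10, "value"]   # label 10 is the first row
by_position = df.iloc[0, 0]      # position 0 is also the first row

# There is no row labelled 0, so .loc[0] raises KeyError.
try:
    df.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(by_label, by_position, label_zero_exists)
```

With a default RangeIndex the labels and positions happen to coincide, which is why this distinction is easy to miss until the index changes.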

Inclusive Slicing Is a Feature

Another useful detail is that .loc label slicing is inclusive on both ends when the labels exist. That is different from normal Python slice behavior.

python
import pandas as pd

df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])
# Both endpoint labels are included in the result.
print(df.loc["b":"c"])

This returns rows b and c. That behavior is handy for date ranges and named intervals, but it surprises people who expect the stop label to be excluded.
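As an illustration of the date-range case, the sketch below builds a small daily index (the dates and values are invented) and compares a label slice with a positional slice that starts at the same row.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=5, freq="D")
df = pd.DataFrame({"value": [1, 2, 3, 4, 5]}, index=dates)

# Label slice: January 2 through January 4, both endpoints included.
label_slice = df.loc["2024-01-02":"2024-01-04"]

# Positional slice: starts at position 1, stops before position 3.
position_slice = df.iloc[1:3]

print(len(label_slice))     # 3 rows
print(len(position_slice))  # 2 rows
```

The inclusive stop means a slice like "January 2 to January 4" reads the way you would say it out loud.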

Common Pitfalls

  • Using .loc[1] and expecting the second row only works if the second row's index label happens to be 1.
  • Forgetting that .loc slices are inclusive can return one more row than expected.
  • Chaining df[df["score"] > 80]["score"] = 100 is brittle; df.loc[df["score"] > 80, "score"] = 100 is the safer form.
  • Duplicate index labels make .loc return multiple rows for one label, which may be correct but surprising.
  • Mixing .loc and .iloc in the same block without checking the index often leads to off-by-one mistakes.
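The duplicate-label pitfall can be sketched with a tiny frame whose index repeats the label a. The data here is invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"score": [70, 85, 90]}, index=["a", "a", "b"])

dup = df.loc["a"]   # label appears twice: the result is a DataFrame with two rows
uniq = df.loc["b"]  # label appears once: the result is a Series

print(type(dup).__name__, len(dup))
print(type(uniq).__name__)

# A quick check before relying on one-row-per-label behaviour.
print(df.index.is_unique)
```

Checking df.index.is_unique up front is one simple way to catch this before downstream code assumes a single row per label.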

Summary

  • Use .loc when you want selections and assignments to be based on labels, not positions.
  • It works well for filtering rows and selecting only the columns you need in one step.
  • .loc is one of the best ways to avoid chained-assignment bugs in Pandas.
  • Remember that label slices with .loc include both endpoints.
