How can I map True/False to 1/0 in a Pandas DataFrame?

Introduction

Converting boolean values to 1 and 0 is a common cleanup step before exporting data or feeding it into a model. In pandas, the right approach depends on whether the column contains real booleans, string values, or nullable booleans with missing data.

Convert Real Boolean Columns with astype

If a column already has boolean dtype, the simplest solution is astype(int). True becomes 1 and False becomes 0.

python
import pandas as pd

df = pd.DataFrame({
    "is_active": [True, False, True],
    "is_admin": [False, False, True],
})

df["is_active"] = df["is_active"].astype(int)
df["is_admin"] = df["is_admin"].astype(int)

print(df)
print(df.dtypes)

This is concise and fast. It works best when the column is already clean and contains only boolean values.

If several columns share that dtype, convert them together:

python
bool_cols = ["is_active", "is_admin"]
df[bool_cols] = df[bool_cols].astype(int)

That keeps the transformation readable and avoids repeated lines.

Check the Actual Dtype Before Converting

Many DataFrame columns look boolean when printed but actually hold strings in a generic object dtype. That difference matters: on such a column, astype(int) raises a ValueError because the text "True" and "False" cannot be parsed as integers, and astype(bool) silently turns every non-empty string, including "False", into True.

python
import pandas as pd

df = pd.DataFrame({
    "flag": ["True", "False", "True"]
})

print(df.dtypes)

In this case, map the strings explicitly:

python
df["flag"] = df["flag"].map({"True": 1, "False": 0})
print(df)

This is why checking df.dtypes early saves time. The correct conversion depends on the actual dtype, not on how the values look in the notebook.
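
To make both failure modes concrete, here is a minimal sketch on a string column. The variable names and the explicit dtype=object are assumptions made for illustration; with object dtype, the integer cast is expected to raise, while the boolean cast runs without error but gives the wrong answer.

```python
import pandas as pd

# Strings that merely look like booleans (object dtype forced for illustration).
flags = pd.Series(["True", "False", "True"], dtype=object, name="flag")

# astype(int) cannot parse the text "True" and raises.
try:
    flags.astype(int)
    int_cast_failed = False
except ValueError:
    int_cast_failed = True

# astype(bool) does not raise, but any non-empty string is truthy,
# so even "False" ends up as True.
wrong = flags.astype(bool)

# An explicit mapping gives the intended result.
right = flags.map({"True": 1, "False": 0})

print(int_cast_failed)
print(wrong.tolist())
print(right.tolist())
```

The silent case is the more dangerous one, because nothing signals that the column is now all True.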

Handle Missing Values with Nullable Integers

Boolean-like data often includes missing entries. Plain astype(int) raises an error when nulls are present, because the regular NumPy integer dtype has no way to represent a missing value. Use pandas nullable types instead.

python
import pandas as pd

df = pd.DataFrame({
    "flag": pd.Series([True, False, None], dtype="boolean")
})

df["flag_num"] = df["flag"].astype("Int64")

print(df)
print(df.dtypes)

Now True becomes 1, False becomes 0, and the missing value stays missing instead of being forced into an invalid integer.

This is important in analytics pipelines where null means "unknown" rather than false.
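
The sketch below contrasts the options on a nullable boolean column. It is an illustration rather than a recommendation: the fillna(False) variant assumes your data allows "unknown" to be treated as false, which is often not the case.

```python
import pandas as pd

flag = pd.Series([True, False, None], dtype="boolean")

# A plain integer cast has nowhere to put the missing value, so it raises.
try:
    flag.astype(int)
    plain_cast_failed = False
except (ValueError, TypeError):
    plain_cast_failed = True

# Option 1: keep the missing value with the nullable Int64 dtype.
kept = flag.astype("Int64")

# Option 2: only if "unknown" may be treated as False, fill first, then cast.
filled = flag.fillna(False).astype(int)

print(plain_cast_failed)
print(kept)
print(filled)
```

Option 1 preserves the information that a value was never recorded, while option 2 discards it.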

Use replace for Mixed, Messy Columns

Some real datasets contain a mix of booleans and string values in the same column. It is better to normalize them explicitly than to rely on a cast that only works for one subtype.

python
import pandas as pd

df = pd.DataFrame({
    "flag": [True, False, "True", "False", None]
})

df["flag"] = df["flag"].replace({
    True: 1,
    False: 0,
    "True": 1,
    "False": 0,
})

print(df)

After the column is normalized, you can cast again if you need a specific numeric dtype. This is slightly more verbose, but it makes messy data rules explicit.
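
As one possible way to combine normalization and the final cast, the sketch below swaps replace for map with a lookup dictionary and then casts to Int64. The name TRUTHY_MAP is hypothetical, and the choice of map is an assumption for illustration: unlike replace, map turns any value that is not in the dictionary into a missing value rather than leaving it unchanged.

```python
import pandas as pd

# Hypothetical lookup table; extend it with whatever spellings your data uses.
TRUTHY_MAP = {True: 1, False: 0, "True": 1, "False": 0}

raw = pd.Series([True, False, "True", "False", None], dtype=object)

# map() looks each value up in the dict; anything not listed (here None)
# becomes a missing value.
normalized = raw.map(TRUTHY_MAP)

# Cast to nullable Int64 so the result is integer-typed and the gap stays missing.
encoded = normalized.astype("Int64")

print(encoded)
```

Whether unlisted values should become missing or be kept as they are depends on the dataset, so pick map or replace with that difference in mind.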

Select Boolean Columns Automatically

In a wider table, you may not want to list every boolean column by hand. pandas can select them dynamically:

python
import pandas as pd

df = pd.DataFrame({
    "is_active": [True, False, True],
    "score": [10, 20, 30],
    "is_admin": [False, True, False],
})

bool_cols = df.select_dtypes(include=["bool"]).columns
df[bool_cols] = df[bool_cols].astype(int)

print(df)

That is useful in preprocessing pipelines where the schema evolves but boolean columns should always be encoded numerically.
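
For a pipeline step, the same idea can be wrapped in a small function. This is a sketch under assumptions: encode_bool_columns is a hypothetical helper name, and it returns a new DataFrame instead of modifying its input so that the step is easy to test.

```python
import pandas as pd


def encode_bool_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of frame with every bool column cast to integers."""
    bool_cols = frame.select_dtypes(include=["bool"]).columns
    # astype with a per-column mapping leaves all other columns untouched.
    return frame.astype({col: int for col in bool_cols})


df = pd.DataFrame({
    "is_active": [True, False, True],
    "score": [10, 20, 30],
    "is_admin": [False, True, False],
})

encoded = encode_bool_columns(df)

print(encoded)
print(encoded.dtypes)
```

Because the function selects columns by dtype at call time, newly added boolean columns are picked up without a code change.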

Choose the Output Type for the Next Step

There is no single "best" numeric target. The right one depends on what consumes the data next:

  • A CSV export usually only needs literal 1 and 0 values in the file.
  • A feature matrix for a model typically wants a plain integer or other numeric array.
  • An analytic table may need a nullable integer dtype so that missing values survive.

Think beyond the immediate conversion. Data pipelines become more reliable when the output dtype matches the downstream contract.
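
The three targets listed above might look like this in practice. The frames clean and with_gaps are made-up examples, and the sketch only shows one reasonable conversion per consumer.

```python
import pandas as pd

clean = pd.DataFrame({"flag": [True, False]})
with_gaps = pd.DataFrame({
    "flag": pd.Series([True, False, None], dtype="boolean")
})

# CSV export: cast first so the file contains 1 and 0 instead of True and False.
csv_text = clean.astype(int).to_csv(index=False)

# Model features: a plain NumPy integer array.
features = clean.astype(int).to_numpy()

# Analytic table: nullable Int64 keeps the missing entry missing.
analytic = with_gaps.astype("Int64")

print(csv_text)
print(features)
print(analytic)
```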

Common Pitfalls

  • Using astype(int) on string values that only look like booleans.
  • Forgetting that missing values may require nullable Int64 instead of plain int.
  • Converting an entire DataFrame when only a few columns need mapping.
  • Using replace everywhere when a direct dtype conversion would be simpler.
  • Skipping dtype inspection and then debugging the wrong issue.

Summary

  • Use astype(int) for clean boolean columns.
  • Use map or replace for string or mixed true-and-false values.
  • Preserve nulls with pandas nullable integer dtype such as Int64.
  • Convert multiple boolean columns together when the schema is stable.
  • Inspect the real dtypes first so the conversion matches the actual data.
