How can I map True/False to 1/0 in a Pandas DataFrame?
Introduction
Converting boolean values to 1 and 0 is a common cleanup step before exporting data or feeding it into a model. In pandas, the right approach depends on whether the column contains real booleans, string values, or nullable booleans with missing data.
Convert Real Boolean Columns with astype
If a column already has boolean dtype, the simplest solution is astype(int). True becomes 1 and False becomes 0.
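A minimal sketch of that cast, assuming a small DataFrame with a hypothetical is_active column that already has bool dtype:

```python
import pandas as pd

# Hypothetical example data; the column name "is_active" is illustrative.
df = pd.DataFrame({"is_active": [True, False, True]})

# The column has dtype bool, so a direct cast is safe.
df["is_active"] = df["is_active"].astype(int)

print(df["is_active"].tolist())  # [1, 0, 1]
```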
This is concise and fast. It works best when the column is already clean and contains only boolean values.
If several columns share that dtype, convert them together:
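One way this might look, assuming two hypothetical boolean columns (is_active, is_admin) next to a non-boolean column that should stay untouched:

```python
import pandas as pd

# Hypothetical example data; all column names are illustrative.
df = pd.DataFrame({
    "is_active": [True, False, True],
    "is_admin": [False, False, True],
    "name": ["a", "b", "c"],
})

# List the boolean columns once and cast them in a single statement.
bool_cols = ["is_active", "is_admin"]
df[bool_cols] = df[bool_cols].astype(int)

print(df)
```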
That keeps the transformation readable and avoids repeated lines.
Check the Actual Dtype Before Converting
Many DataFrame columns look boolean when printed, but are really strings or generic object dtype. That difference matters because astype(int) raises a ValueError when the source values are "True" and "False" text instead of real booleans, and astype(bool) silently turns every non-empty string, including "False", into True.
In this case, map the strings explicitly:
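A short sketch, assuming a hypothetical flag column whose values are the exact strings "True" and "False". It first confirms that the column is not a real boolean column, then maps the text through a dictionary:

```python
import pandas as pd

# Hypothetical example data: the values are text, not booleans.
df = pd.DataFrame({"flag": ["True", "False", "True"]})

# The printed values look boolean, but the dtype check says otherwise.
looks_boolean = pd.api.types.is_bool_dtype(df["flag"])
print(looks_boolean)  # False

# Map each string to its numeric value.
# Any value missing from the dictionary would become NaN.
df["flag"] = df["flag"].map({"True": 1, "False": 0})

print(df["flag"].tolist())  # [1, 0, 1]
```

If the text can vary in case or carry whitespace, it would need to be normalized before mapping; that step is not shown here.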
This is why checking df.dtypes early saves time. The correct conversion depends on the actual dtype, not on how the values look in the notebook.
Handle Missing Values with Nullable Integers
Boolean-like data often includes missing entries. Plain astype(int) raises an error when nulls are present, because the regular NumPy-backed integer dtype cannot represent missing values. Use pandas nullable types instead.
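A minimal sketch, assuming a Series that arrives as object dtype with one missing entry. It goes through the nullable "boolean" dtype first and then to nullable "Int64":

```python
import pandas as pd

# Hypothetical example data: True/False plus one missing entry (object dtype).
s = pd.Series([True, False, None])

# Convert to the nullable boolean dtype, then to the nullable integer dtype.
as_int = s.astype("boolean").astype("Int64")

print(as_int)
```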
Now True becomes 1, False becomes 0, and the missing value stays missing instead of being forced into an invalid integer.
This is important in analytics pipelines where null means "unknown" rather than false.
Use replace for Mixed, Messy Columns
Some real datasets contain a mix of booleans and string values in the same column. It is better to normalize them explicitly than to rely on a cast that only works for one subtype.
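A sketch of that normalization, assuming the only values present are the booleans True and False and the exact strings "True" and "False". The replacement dictionary is illustrative and would need one entry for every spelling that actually occurs in the data:

```python
import pandas as pd

# Hypothetical messy column: real booleans mixed with string values.
s = pd.Series([True, "False", "True", False])

# Spell out every accepted representation and its numeric value.
rules = {True: 1, False: 0, "True": 1, "False": 0}
normalized = s.replace(rules)

# The dtype after replace can differ between pandas versions,
# so cast explicitly to the numeric dtype you need.
normalized = normalized.astype(int)

print(normalized.tolist())  # [1, 0, 1, 0]
```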
After the column is normalized, you can cast again if you need a specific numeric dtype. This is slightly more verbose, but it makes messy data rules explicit.
Select Boolean Columns Automatically
In a wider table, you may not want to list every boolean column by hand. pandas can select them dynamically:
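A minimal sketch using select_dtypes, assuming a hypothetical table with two bool columns and one float column:

```python
import pandas as pd

# Hypothetical example data; all column names are illustrative.
df = pd.DataFrame({
    "is_active": [True, False, True],
    "is_admin": [False, False, True],
    "score": [0.5, 0.7, 0.9],
})

# Find every column whose dtype is bool, then cast only those columns.
bool_cols = df.select_dtypes(include="bool").columns.tolist()
df[bool_cols] = df[bool_cols].astype(int)

print(bool_cols)  # ['is_active', 'is_admin']
```

Object columns that merely contain True/False values are not selected this way, so they still need the explicit mapping described earlier.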
That is useful in preprocessing pipelines where the schema evolves but boolean columns should always be encoded numerically.
Choose the Output Type for the Next Step
There is no single "best" numeric target. The right one depends on what consumes the data next:
- CSV export often only needs visible 1 and 0
- a feature matrix for a model may want a plain integer or numeric array
- analytic tables may need nullable integer dtype so missing values survive
Think beyond the immediate conversion. Data pipelines become more reliable when the output dtype matches the downstream contract.
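To illustrate, here is a rough sketch that produces three different outputs from the same hypothetical flag Series. Filling missing values with False for the feature array is an assumption made only for this example; whether that is acceptable depends on what a null means in your data.

```python
import pandas as pd

# Hypothetical example data: a nullable boolean Series with one missing entry.
flags = pd.Series([True, False, None], dtype="boolean", name="flag")

# Analytic table: nullable Int64 keeps the missing value.
nullable = flags.astype("Int64")

# Model features: a plain NumPy integer array.
# ASSUMPTION: missing is treated as False here, purely for illustration.
features = flags.fillna(False).astype(int).to_numpy()

# CSV export: write the 1/0 values as text (returned as a string here).
csv_text = nullable.to_csv(index=False)

print(features.tolist())  # [1, 0, 0]
```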
Common Pitfalls
- Using astype(int) on string values that only look like booleans.
- Forgetting that missing values may require nullable Int64 instead of plain int.
- Converting an entire DataFrame when only a few columns need mapping.
- Using replace everywhere when a direct dtype conversion would be simpler.
- Skipping dtype inspection and then debugging the wrong issue.
Summary
- Use astype(int) for clean boolean columns.
- Use map or replace for string or mixed true-and-false values.
- Preserve nulls with pandas nullable integer dtype such as Int64.
- Convert multiple boolean columns together when the schema is stable.
- Inspect the real dtypes first so the conversion matches the actual data.

