How to add a new column to an existing DataFrame

Introduction

Adding a column to a pandas DataFrame is one of the most common data-manipulation tasks in Python. The simplest form is direct assignment, but the best method depends on whether you want a fixed value, a computed expression, a specific column position, or a transformation that returns a new DataFrame instead of mutating the original one.

Core Sections

Direct assignment is the default tool

The most common way to add a column is direct assignment with bracket syntax.

python

import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
})

df["C"] = [7, 8, 9]
print(df)

This is concise and works for:

a list or array of matching length
a scalar value that should be broadcast to every row
a Series aligned by index

For many cases, this is the only method you need.
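The example above covers the list case. As a minimal sketch of the other two cases, the snippet below assigns a scalar and then a Series whose index labels are deliberately out of order. The column names `source` and `C` and the variable `extra` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Scalar: the single value is broadcast to every row.
df["source"] = "batch_1"

# Series: values are matched by index label, not by position.
# The labels here are deliberately out of order.
extra = pd.Series([30, 10, 20], index=[2, 0, 1])
df["C"] = extra

print(df)
```

Because assignment follows the labels, row 0 receives 10, row 1 receives 20, and row 2 receives 30, regardless of the order in which the Series was written.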

Computed columns should prefer vectorized expressions

When the new column comes from existing columns, pandas expressions are usually clearer and faster than row-wise apply.

python

import pandas as pd

df = pd.DataFrame({
    "price": [10, 20, 30],
    "quantity": [2, 1, 4],
})

df["total"] = df["price"] * df["quantity"]
print(df)

This is vectorized, which is the pandas-friendly way to do column math. It is usually better than iterating row by row.

Use `.insert()` when column position matters

If you need the new column at a specific location rather than at the end, use .insert().

python

import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
})

df.insert(1, "middle", [10, 20, 30])
print(df)

This places the new column at index position 1. It is useful when column order matters for reports, exports, or interactive inspection.
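Hard-coded positions can drift when the column layout changes. One possible approach, sketched here with the illustrative column name `after_A`, is to look up the position of an existing column first. The sketch also shows two behaviors worth knowing: .insert() modifies the DataFrame in place and returns None, and by default it rejects a column name that already exists.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Find the position of "A" and insert right after it.
position = df.columns.get_loc("A") + 1
result = df.insert(position, "after_A", [10, 20, 30])

print(result)            # insert() modifies df in place and returns None
print(list(df.columns))

# Inserting a name that already exists raises ValueError by default.
duplicate_rejected = False
try:
    df.insert(0, "A", [0, 0, 0])
except ValueError as exc:
    duplicate_rejected = True
    print("insert failed:", exc)
```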

Use `.assign()` when you want a new DataFrame

.assign() returns a new DataFrame instead of mutating the original one in place. That is useful when you want a more functional style or want to keep the original object untouched.

python

import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
})

new_df = df.assign(sum_col=lambda x: x["A"] + x["B"])
print(new_df)
print(df)

This is especially nice in method chains where you are transforming data step by step.
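A rough sketch of such a chain follows. The `orders` data and the `total` and `share` column names are made up for illustration. Each .assign() step receives the DataFrame produced by the previous step, so a later step can use a column created earlier.

```python
import pandas as pd

orders = pd.DataFrame({"price": [10, 20, 30], "quantity": [2, 1, 4]})

summary = (
    orders
    .assign(total=lambda d: d["price"] * d["quantity"])
    # A later assign can use a column created earlier in the chain.
    .assign(share=lambda d: d["total"] / d["total"].sum())
    .sort_values("total", ascending=False)
)

print(summary)
print(list(orders.columns))  # the original still has only its two columns
```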

Conditional columns with `where` or `np.where`

A new column often depends on a condition. In those cases, use vectorized condition handling rather than a Python loop.

python

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "score": [45, 72, 88, 59],
})

df["passed"] = np.where(df["score"] >= 60, "yes", "no")
print(df)

This is a common and efficient pattern for category flags, thresholds, and derived status columns.
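The pandas `where` method named in the heading works a little differently from np.where: it keeps the original value where the condition is True and substitutes a replacement where it is False. A small sketch, using a hypothetical `capped` column that limits scores to 80:

```python
import pandas as pd

df = pd.DataFrame({"score": [45, 72, 88, 59]})

# Series.where keeps values where the condition is True
# and substitutes the second argument where it is False.
df["capped"] = df["score"].where(df["score"] <= 80, 80)

print(df)
```

This tends to suit cases where most values stay as they are and only a few need replacing, while np.where fits cases where both branches produce new values.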

Avoid `apply(axis=1)` unless you really need it

Many tutorials jump to apply for computed columns, but row-wise apply(axis=1) calls a Python function once per row, so it is slower than a vectorized expression and usually unnecessary for simple arithmetic or conditions.

Use it only when the column logic genuinely requires row-level Python code that is hard to express with vectorized operations.
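To make the comparison concrete, here is a small sketch that computes the same column both ways. The column names `total_apply` and `total_vec` are illustrative. On three rows the speed difference is not visible, but the results match and the vectorized line is shorter.

```python
import pandas as pd

df = pd.DataFrame({"price": [10, 20, 30], "quantity": [2, 1, 4]})

# Row-wise: calls the Python function once per row.
df["total_apply"] = df.apply(lambda row: row["price"] * row["quantity"], axis=1)

# Vectorized: one operation over whole columns.
df["total_vec"] = df["price"] * df["quantity"]

# Both approaches produce the same values.
print((df["total_apply"] == df["total_vec"]).all())
```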

Common Pitfalls

Assigning a list of the wrong length raises a ValueError because pandas expects exactly one value per existing row.
Reaching for apply(axis=1) too early often makes the code slower and more complex than a vectorized expression.
Using .assign() while expecting the original DataFrame to change leads to confusion because .assign() returns a new object.
Forgetting that Series assignment aligns by index can produce NaN values when indexes do not match.
Inserting columns by position with .insert() without checking the target index can disrupt report or export layout unexpectedly.
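The length and index-alignment pitfalls are easy to reproduce. The sketch below uses a DataFrame with string index labels (an assumption chosen to make the mismatch obvious) and shows one possible way to assign by position when the Series labels are not meaningful.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]}, index=["x", "y", "z"])

# Pitfall 1: a list with the wrong number of values is rejected.
try:
    df["B"] = [10, 20]
except ValueError as exc:
    print("length mismatch:", exc)

# Pitfall 2: a Series with a default integer index shares no labels
# with df's index, so every value in the new column is NaN.
values = pd.Series([10, 20, 30])
df["misaligned"] = values

# One way to assign by position instead: drop the index first.
df["by_position"] = values.to_numpy()

print(df)
```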

Summary

Direct assignment with df["col"] = ... is the standard way to add a column.
Use vectorized expressions for computed columns whenever possible.
Use .insert() when the placement of the new column matters.
Use .assign() when you want a transformed copy instead of mutating the original DataFrame.
Prefer vectorized conditional logic over row-wise apply for performance and clarity.
