Add a string prefix to each value in a pandas string column
Introduction
Adding a prefix to every value in a Pandas string column is a common data transformation. The simplest approach is string concatenation with the + operator: df['col'] = 'prefix_' + df['col']. The .str accessor provides cat() for joining columns or adding a separator, and apply() or map() handle custom formatting. Concatenation and .str.cat() are vectorized, so no explicit Python loop is needed; apply() and map() instead call a Python function once per element.
Method 1: String Concatenation with +
This is the fastest and most readable approach for simple prefixes.
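A minimal sketch of the concatenation approach, assuming a small illustrative DataFrame (the column name col and the values are placeholders, not taken from any real dataset):

```python
import pandas as pd

# Illustrative data; "col" is a placeholder column name.
df = pd.DataFrame({"col": ["a", "b", "c"]})

# The scalar string is broadcast across every row of the Series.
df["col"] = "prefix_" + df["col"]

print(df["col"].tolist())
```

The result has to be assigned back to the column, since the + operator returns a new Series rather than modifying the existing one.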
Method 2: .str.cat() (Concatenation Accessor)
str.cat() is more useful for joining two columns or adding a separator:
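The sketch below shows both uses under assumed column names (region and name are hypothetical). Because str.cat() expects a list-like as its second operand, a constant prefix is first broadcast into a Series here; that is one possible way to do it, not the only one.

```python
import pandas as pd

# Hypothetical columns for illustration.
df = pd.DataFrame({"region": ["us", "eu"], "name": ["alice", "bob"]})

# Join two columns element-wise with a separator.
df["label"] = df["region"].str.cat(df["name"], sep="_")

# Constant prefix: build a Series of the prefix aligned to the same index,
# then concatenate the column onto it with a separator.
prefix = pd.Series("user", index=df.index)
df["prefixed"] = prefix.str.cat(df["name"], sep=":")

print(df[["label", "prefixed"]])
```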
Method 3: apply() with Lambda
apply is slower than vectorized operations but useful for complex transformations.
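As a sketch of the kind of transformation where apply() can be worth its cost, the example below picks the prefix from the value itself. The rule (short names get s_, longer ones get l_) is invented purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col": ["apple", "kiwi", "banana"]})

# The lambda runs once per element, so the prefix can depend on the value.
# Hypothetical rule: values of 4 characters or fewer get "s_", others "l_".
df["col"] = df["col"].apply(lambda v: f"{'s' if len(v) <= 4 else 'l'}_{v}")

print(df["col"].tolist())
```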
Method 4: map() with Format String
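One plausible reading of this method, sketched here as an assumption, is passing the bound str.format method of a template string to Series.map(). The template item_{} and the column name are placeholders.

```python
import pandas as pd

df = pd.DataFrame({"col": ["a", "b", "c"]})

# "item_{}".format is a callable; map() calls it once per element
# and substitutes the value into the template.
df["col"] = df["col"].map("item_{}".format)

print(df["col"].tolist())
```

Like apply(), this calls a Python function per element, so it is likely to be slower than plain concatenation on large columns.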
Adding Both Prefix and Suffix
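The + operator chains, so a prefix and a suffix can be added in one expression. A minimal sketch with placeholder strings:

```python
import pandas as pd

df = pd.DataFrame({"col": ["a", "b", "c"]})

# Evaluated left to right: the prefix is added first, then the suffix.
df["col"] = "pre_" + df["col"] + "_suf"

print(df["col"].tolist())
```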
Conditional Prefix
Add prefix only to rows that meet a condition:
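Two common ways to do this are np.where and boolean indexing with df.loc. The sketch below assumes a hypothetical price column and a threshold of 100 as the condition.

```python
import numpy as np
import pandas as pd

# Hypothetical data: prefix only the rows priced above 100.
df = pd.DataFrame({"name": ["basic", "pro", "max"], "price": [50, 150, 300]})
mask = df["price"] > 100

# Option A: np.where picks the prefixed value where mask is True,
# and the original value elsewhere, producing a new column.
df["label"] = np.where(mask, "premium_" + df["name"], df["name"])

# Option B: df.loc updates only the matching rows in place.
df.loc[mask, "name"] = "premium_" + df.loc[mask, "name"]

print(df)
```

Option A leaves the source column untouched, which can be convenient when the original values are still needed.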
Handling Non-String Columns
If the column contains numbers, convert to string first:
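A minimal sketch, assuming an integer column named id (a placeholder):

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3]})

# "ID_" + df["id"] would raise TypeError on an integer column,
# so the values are converted to strings first.
df["id_str"] = "ID_" + df["id"].astype(str)

print(df["id_str"].tolist())
```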
Handling NaN Values
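Missing values pass through concatenation as missing. The sketch below contrasts the default behavior with filling first; the data is illustrative, and which result is preferable depends on whether a bare prefix is meaningful for missing rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": ["a", np.nan, "c"]})

# Default: the missing row stays missing after concatenation.
propagated = "prefix_" + df["col"]

# Fill with an empty string first so every row receives the prefix.
filled = "prefix_" + df["col"].fillna("")

print(propagated.isna().tolist())
print(filled.tolist())
```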
Performance Comparison
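String concatenation with + can be 10-20x faster than apply on large columns, because apply calls a Python function for every element while + performs the whole operation inside Pandas' internal routines. The exact ratio depends on the data size, the column dtype, and the Pandas version.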
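A rough way to check the difference on your own machine is sketched below using timeit. The row count and repeat count are arbitrary choices, and no particular speedup is guaranteed; the numbers vary with hardware and library versions.

```python
import timeit

import pandas as pd

# Arbitrary size for illustration; increase it for a more stable measurement.
df = pd.DataFrame({"col": [f"item{i}" for i in range(100_000)]})


def with_concat():
    # Vectorized: one operation over the whole Series.
    return "prefix_" + df["col"]


def with_apply():
    # Element-wise: one Python function call per row.
    return df["col"].apply(lambda x: "prefix_" + x)


t_concat = timeit.timeit(with_concat, number=5)
t_apply = timeit.timeit(with_apply, number=5)

print(f"concat: {t_concat:.4f}s  apply: {t_apply:.4f}s")
```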
Common Pitfalls
- TypeError with non-string columns: 'prefix' + df['int_column'] raises TypeError. Convert the column to string first with .astype(str) before concatenation.
- NaN propagation: String concatenation with NaN produces NaN, not 'prefix_nan'. Use .fillna('') before concatenation if you want to preserve the prefix for missing values.
- Using apply for simple prefix: apply(lambda x: 'prefix_' + x) works but is 10-20x slower than 'prefix_' + df['col']. Use vectorized string operations for simple transformations.
- Modifying a copy instead of the original: df['col'].str.upper() returns a new Series. Assigning back with df['col'] = ... is required. Without assignment, the original DataFrame is unchanged.
- Mixed types in column: If a column has both strings and numbers (object dtype with mixed types), string concatenation may fail on numeric rows. Use df['col'].astype(str) to normalize the column first.
Summary
- Use 'prefix_' + df['col'] for the fastest and simplest prefix operation
- Use .astype(str) first if the column contains non-string values
- Use .fillna() to handle NaN values before concatenation
- Use np.where or df.loc for conditional prefix application
- Avoid apply() for simple string operations: vectorized concatenation is 10-20x faster

