How can I capitalize the first letter of each word in a string?

Introduction

Capitalizing the first letter of each word seems trivial until real user text includes punctuation, acronyms, multiple spaces, and language-specific casing rules. A good solution starts by defining what counts as a word and which terms should preserve original case. After that, you can choose between built-in title-casing and custom token logic.

Built-In Solutions for Simple Cases

For controlled text, built-in methods are fast and readable. In Python, str.title is the quickest option.

python
text = "hello world from codemia"
print(text.title())
# Hello World From Codemia

Another basic approach is splitting and capitalizing token by token.

python
text = "hello world from codemia"
result = " ".join(word.capitalize() for word in text.split())
print(result)
# Hello World From Codemia

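Both work for straightforward input, but each has gaps. str.title treats every non-letter as a word boundary, so "they're" becomes "They'Re". Both approaches lowercase the rest of each word, so "API" becomes "Api". The split-and-join version also collapses runs of whitespace into single spaces.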
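These gaps are easy to reproduce. The snippet below is only an illustration of the behavior, not a fix. string.capwords is a standard-library helper that splits on whitespace and capitalizes each piece; it is shown here as one possible alternative, not something the approaches above depend on.

```python
import string

sample = "they're reading the API docs"

# str.title() starts a new word after every non-letter, including apostrophes.
titled = sample.title()

# string.capwords() splits on whitespace only, so the apostrophe survives,
# but the rest of each word is still lowercased and spacing is collapsed.
capped = string.capwords(sample)

print(titled)  # They'Re Reading The Api Docs
print(capped)  # They're Reading The Api Docs
```

Neither result keeps "API" intact, which is why acronyms need explicit rules.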

Preserve Acronyms and Brand Casing

If your domain includes words like API, SQL, or iOS, default title-casing can corrupt output. Add explicit exception rules.

python
import re

# Terms that should always be fully uppercased, whatever the input casing.
ACRONYMS = {"API", "SQL", "SDK"}
# Brand spellings kept only when the input already matches them exactly.
PRESERVE = {"iOS", "macOS"}


def smart_title(text: str) -> str:
    def transform(token: str) -> str:
        if token in PRESERVE:
            return token
        if token.upper() in ACRONYMS:
            return token.upper()
        if not token:
            return token
        return token[0].upper() + token[1:].lower()

    # The capturing group keeps each whitespace run as its own list item.
    parts = re.split(r"(\s+)", text)
    output = [part if part.isspace() else transform(part) for part in parts]
    return "".join(output)


print(smart_title("learn api design on iOS"))
# Learn API Design On iOS

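This keeps intentional casing intact while still formatting normal words. Note that the PRESERVE check is an exact match, so lowercase input such as "ios" becomes "Ios", and a token with attached punctuation such as "api," is not recognized as an acronym.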
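If input casing and surrounding punctuation cannot be trusted, one option is a case-insensitive lookup applied per word rather than per whitespace-separated token. The sketch below is an assumption about how that could look, not part of the helper above; CANONICAL, WORD, and canonical_title are hypothetical names introduced for illustration.

```python
import re

# Hypothetical lookup table: lowercase key -> the exact casing to emit.
CANONICAL = {term.lower(): term for term in ("API", "SQL", "SDK", "iOS", "macOS")}

# A word is a run of letters or digits, optionally joined by inner apostrophes.
# Punctuation, hyphens, and whitespace fall outside the match and stay untouched.
WORD = re.compile(r"[^\W_]+(?:'[^\W_]+)*")


def canonical_title(text: str) -> str:
    def fix(match: re.Match) -> str:
        word = match.group(0)
        known = CANONICAL.get(word.lower())
        if known is not None:
            return known
        return word[0].upper() + word[1:].lower()

    return WORD.sub(fix, text)


print(canonical_title("the ios sdk, then  SQL (and macos)."))
# The iOS SDK, Then  SQL (And macOS).
```

Because only matched words are rewritten, original spacing and punctuation pass through unchanged. The trade-off is that every listed term is forced to its canonical casing even when the author typed something else on purpose.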

Handle Hyphens and Apostrophes Explicitly

Real text often contains hyphenated and apostrophe words. Decide your policy, then encode it directly.

Example policy:

  • split by spaces into words
  • within each word, split by hyphen and capitalize each segment
  • keep apostrophes inside segments

python
def hyphen_title(text: str) -> str:
    # Splitting on a single space keeps runs of spaces as empty words.
    words = text.split(" ")
    out = []

    for word in words:
        segments = word.split("-")
        cap_segments = []

        for seg in segments:
            if seg:
                # Only the first character is uppercased, so inner
                # apostrophes do not start a new capitalized part.
                cap_segments.append(seg[0].upper() + seg[1:].lower())
            else:
                cap_segments.append(seg)

        out.append("-".join(cap_segments))

    return " ".join(out)


print(hyphen_title("state-of-the-art model"))
# State-Of-The-Art Model
print(hyphen_title("rock'n'roll handbook"))
# Rock'n'roll Handbook

Custom policy avoids inconsistent formatting across screens and services.

Preserve Spacing or Normalize It

Some applications must preserve original spacing exactly, such as text editors or legal document tools. Others should normalize spacing for UI consistency.

  • Preserve spacing by splitting on a capturing group, as in re.split(r"(\s+)", text), so each whitespace run comes back as its own token.
  • Normalize spacing with a bare text.split() followed by " ".join(...).

Choose one policy and document it. Many bugs come from hidden whitespace normalization in helper functions.
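The two policies can be put side by side as follows. This is a minimal sketch using plain str.capitalize for the word step; the function names title_preserve_spacing and title_normalize_spacing are made up for this example.

```python
import re


def title_preserve_spacing(text: str) -> str:
    # The capturing group returns whitespace runs as separate items,
    # so they can be copied to the output unchanged.
    parts = re.split(r"(\s+)", text)
    return "".join(p if p.isspace() else p.capitalize() for p in parts)


def title_normalize_spacing(text: str) -> str:
    # split() with no argument drops leading/trailing whitespace and
    # treats any run of whitespace as a single separator.
    return " ".join(word.capitalize() for word in text.split())


sample = "  multiple   spaces\t here"
print(repr(title_preserve_spacing(sample)))   # '  Multiple   Spaces\t Here'
print(repr(title_normalize_spacing(sample)))  # 'Multiple Spaces Here'
```

Naming the policy in the function name is one way to keep the normalization from being hidden.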

Multi-Language and Unicode Notes

Capitalization is language-dependent. English-focused rules may not work correctly for all scripts and locale-specific characters. If your app is multilingual, evaluate locale-aware libraries or platform-specific internationalization APIs. At minimum, include representative international examples in tests before shipping.
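A few concrete cases show why English-centric helpers need testing. The snippet below relies only on Python's built-in string methods, which apply Unicode default case mappings and do not consult the user's locale. The variable names are illustrative.

```python
# Uppercasing can change the length of a string.
sharp_s_upper = "ß".upper()
print(sharp_s_upper)            # SS

# No locale tailoring: Turkish text would expect a dotted capital I (U+0130).
plain_i_upper = "i".upper()
print(plain_i_upper == "I")     # True

# Some characters have a titlecase form that differs from their uppercase form.
dz = "\u01c6"                   # LATIN SMALL LETTER DZ WITH CARON
dz_title = dz.title()           # titlecase form, U+01C5
dz_upper = dz[0].upper()        # what a hand-rolled token[0].upper() produces, U+01C4
print(dz_title == "\u01c5")     # True
print(dz_upper == "\u01c4")     # True
```

The last pair means that a custom helper built on token[0].upper() can disagree with str.title for such characters, so representative samples belong in the test suite.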

Testing Strategy

Build a focused test matrix with realistic samples:

python
cases = [
    "hello world",
    "  multiple   spaces",
    "learn api design",
    "state-of-the-art parser",
    "rock'n'roll handbook",
    "iOS sdk and sql",
]

for c in cases:
    print(c, "=>", smart_title(c))

Include negative tests where output should not change, such as already well-formatted branded names.
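Printing results is useful for exploration, but assertions catch regressions. The sketch below pairs each input with the output that the smart_title helper from earlier produces; the helper is repeated so the block runs on its own, and the EXPECTED table is an illustrative layout rather than a prescribed one. The last two rows are negative tests whose output must equal the input.

```python
import re

ACRONYMS = {"API", "SQL", "SDK"}
PRESERVE = {"iOS", "macOS"}


def smart_title(text: str) -> str:
    # Same logic as the earlier helper, repeated for a self-contained test.
    def transform(token: str) -> str:
        if token in PRESERVE:
            return token
        if token.upper() in ACRONYMS:
            return token.upper()
        if not token:
            return token
        return token[0].upper() + token[1:].lower()

    parts = re.split(r"(\s+)", text)
    return "".join(p if p.isspace() else transform(p) for p in parts)


# (input, expected output) pairs
EXPECTED = [
    ("hello world", "Hello World"),
    ("  multiple   spaces", "  Multiple   Spaces"),
    ("learn api design", "Learn API Design"),
    # smart_title does not split on hyphens; only the first letter changes.
    ("state-of-the-art parser", "State-of-the-art Parser"),
    ("rock'n'roll handbook", "Rock'n'roll Handbook"),
    ("iOS sdk and sql", "iOS SDK And SQL"),
    # Negative tests: already well-formatted text must not change.
    ("iOS SDK", "iOS SDK"),
    ("Learn API Design", "Learn API Design"),
]

for given, wanted in EXPECTED:
    actual = smart_title(given)
    assert actual == wanted, f"{given!r}: got {actual!r}, wanted {wanted!r}"

print("all cases passed")
```

The hyphenated row documents current behavior; if the product policy calls for "State-Of-The-Art", that row is where the mismatch would surface.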

Common Pitfalls

A common pitfall is relying on str.title for all user-generated content without checking acronym behavior. Another is unintentionally collapsing spaces while formatting words, which can break text alignment requirements. Teams also skip punctuation rules, causing inconsistent handling of hyphenated names and apostrophes. Multi-language text is often overlooked until production, where locale-specific casing issues surface. Finally, inconsistent implementations across backend and frontend create mismatched titles in different product surfaces. A capitalization helper becomes part of product polish once the same text appears in many places.

Summary

  • Use built-in title-casing for simple, controlled text.
  • Add custom rules when acronyms or branded terms must be preserved.
  • Define clear punctuation policy for hyphens and apostrophes.
  • Decide whether spacing should be preserved or normalized.
  • Test with realistic user text, including mixed-case and multilingual examples.
  • Keep one shared capitalization policy to avoid cross-system inconsistencies.
