anagrams
wordplay
string manipulation
algorithms
programming basics

finding if two words are anagrams of each other

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Two words are anagrams if they contain the same letters with the same frequencies, just in a different order. In programming terms, that means the strings must become identical after you account for character counts.

This is a simple problem, but it is a good example of choosing the right data structure. The two most common solutions are sorting the characters or counting them.

Sorting-Based Solution

A direct method is to sort both strings and compare the results. If the sorted character sequences match, the words are anagrams.

python
1def are_anagrams_sorting(a: str, b: str) -> bool:
2    return sorted(a) == sorted(b)
3
4
5print(are_anagrams_sorting("listen", "silent"))
6print(are_anagrams_sorting("hello", "world"))

This is easy to read and works well for small inputs. Its time complexity is O(n log n) because sorting dominates the work.

Counting-Based Solution

A more efficient approach is to count character frequencies. If both strings produce the same counts, they are anagrams.

python
1from collections import Counter
2
3
4def are_anagrams_counting(a: str, b: str) -> bool:
5    return Counter(a) == Counter(b)
6
7
8print(are_anagrams_counting("listen", "silent"))
9print(are_anagrams_counting("hello", "world"))

This usually runs in O(n) time because each string is scanned once.

Why Length Check Helps

Before sorting or counting, it is often useful to compare lengths.

python
1def are_anagrams(a: str, b: str) -> bool:
2    if len(a) != len(b):
3        return False
4    return Counter(a) == Counter(b)

If the lengths differ, the words cannot be anagrams. This is a small optimization, but it also makes the logic explicit.

Manual Frequency Array for Fixed Alphabets

If you know the input is limited to lowercase English letters, you can use a fixed-size integer array instead of a general-purpose dictionary.

python
1def are_anagrams_lowercase(a: str, b: str) -> bool:
2    if len(a) != len(b):
3        return False
4
5    counts = [0] * 26
6
7    for ch in a:
8        counts[ord(ch) - ord('a')] += 1
9
10    for ch in b:
11        counts[ord(ch) - ord('a')] -= 1
12
13    return all(value == 0 for value in counts)
14
15
16print(are_anagrams_lowercase("angel", "glean"))

This is still O(n) time and uses constant extra space relative to the alphabet size.

Case, Spaces, and Punctuation

Real-world anagram checks often need normalization. For example, should Dormitory and dirty room count as anagrams? If yes, you need to ignore case and spaces.

python
1from collections import Counter
2
3
4def normalize(text: str) -> str:
5    return "".join(ch.lower() for ch in text if ch.isalpha())
6
7
8def are_phrase_anagrams(a: str, b: str) -> bool:
9    a = normalize(a)
10    b = normalize(b)
11    return Counter(a) == Counter(b)
12
13
14print(are_phrase_anagrams("Dormitory", "Dirty room"))

This is often more useful than the raw string comparison because human-facing text frequently includes spaces, punctuation, and mixed capitalization.

Choosing the Best Approach

If you need the simplest readable solution, sorting is fine. If you want better asymptotic performance, counting is usually better.

A practical rule is:

  • use sorting when clarity matters most and inputs are small
  • use counting when performance matters or checks happen frequently
  • use normalization when the strings come from user-facing text

The problem itself is easy, so the right solution is often the one that matches the data assumptions cleanly.

Common Pitfalls

One common mistake is checking only whether the two strings contain the same set of characters. Sets ignore frequency, so aab and ab would look the same even though they are not anagrams.

Another issue is forgetting to normalize case or punctuation when the requirement is phrase-level anagrams rather than exact raw-string anagrams.

It is also easy to assume Unicode text behaves like plain ASCII text. If the input includes accented letters or composed characters, normalization rules may matter depending on the application.

Finally, do not overcomplicate the problem. For most normal inputs, a length check plus a frequency count is already an excellent solution.

Summary

  • Two words are anagrams if they contain the same characters with the same frequencies.
  • Sorting gives a simple O(n log n) solution.
  • Counting characters gives a typical O(n) solution.
  • Real applications often need normalization for case, spaces, or punctuation.
  • The best implementation depends on your input assumptions and whether clarity or raw performance matters more.

Course illustration
Course illustration

All Rights Reserved.