algorithm
string manipulation
coding challenge
data structures
programming techniques

Best way to find first non repeating character in a string

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Finding the first non-repeating character in a string is a classic interview problem because it looks simple but rewards a clean choice of data structure. The best general-purpose solution is usually a two-pass approach: count characters first, then scan again to find the first one whose count is 1.

Core Sections

Two-Pass Hash Map Solution

The idea is straightforward. In the first pass, store how many times each character appears. In the second pass, return the first character with count 1.

python
1def first_non_repeating_char(text: str) -> str | None:
2    counts: dict[str, int] = {}
3
4    for ch in text:
5        counts[ch] = counts.get(ch, 0) + 1
6
7    for ch in text:
8        if counts[ch] == 1:
9            return ch
10
11    return None
12
13
14print(first_non_repeating_char("swiss"))
15print(first_non_repeating_char("aabbcc"))
text
w
None

This runs in linear time because each pass touches the string once, and dictionary lookups are constant time on average.

Why This Is Usually the Best Choice

Compared with the naive nested-loop solution, the hash map version scales much better. A brute-force method checks each character against the rest of the string, which leads to quadratic time in the worst case.

With the dictionary approach:

  • time complexity is O(n)
  • extra space is O(k), where k is the number of distinct characters

That tradeoff is usually excellent unless memory is extremely constrained or the character set is fixed and tiny.

Returning the Index Instead of the Character

Some interview variants ask for the position rather than the character itself. You can adapt the second pass to return the index.

python
1def first_non_repeating_index(text: str) -> int:
2    counts: dict[str, int] = {}
3
4    for ch in text:
5        counts[ch] = counts.get(ch, 0) + 1
6
7    for index, ch in enumerate(text):
8        if counts[ch] == 1:
9            return index
10
11    return -1
12
13
14print(first_non_repeating_index("leetcode"))
15print(first_non_repeating_index("aabb"))
text
0
-1

Returning -1 is common when no non-repeating character exists, but None is also reasonable if your codebase uses optional return values.

Case Sensitivity and Unicode

Before you implement the function, decide what "same character" means for your problem. In Python, "A" and "a" are different characters unless you normalize the text first.

python
1def first_non_repeating_case_insensitive(text: str) -> str | None:
2    normalized = text.lower()
3    counts: dict[str, int] = {}
4
5    for ch in normalized:
6        counts[ch] = counts.get(ch, 0) + 1
7
8    for original, lowered in zip(text, normalized):
9        if counts[lowered] == 1:
10            return original
11
12    return None
13
14
15print(first_non_repeating_case_insensitive("sTreSS"))
text
T

That version preserves the original output character while counting in a case-insensitive way.

Unicode also matters in real applications. Python strings are Unicode-aware, so the algorithm still works, but normalization may be needed if visually similar characters can be represented in multiple forms.

Alternative for Small Fixed Alphabets

If you know the input contains only lowercase English letters, an array can be slightly simpler or faster than a dictionary.

python
1def first_non_repeating_lowercase(text: str) -> str | None:
2    counts = [0] * 26
3
4    for ch in text:
5        counts[ord(ch) - ord("a")] += 1
6
7    for ch in text:
8        if counts[ord(ch) - ord("a")] == 1:
9            return ch
10
11    return None

This avoids hash table overhead, but it only works when the character set is tightly constrained. For general text, the dictionary version is more flexible and clearer.

Common Pitfalls

  • Using a nested loop and turning a linear problem into quadratic work.
  • Forgetting to define what the function should return when no answer exists.
  • Losing order information by using a set instead of a frequency map plus second pass.
  • Ignoring case-sensitivity or normalization requirements in real text data.
  • Over-optimizing for fixed alphabets when the input can actually contain arbitrary Unicode characters.

Summary

  • The standard solution is a two-pass count-and-scan algorithm.
  • A hash map gives O(n) time with clear, maintainable code.
  • Return either the character or its index depending on the problem statement.
  • Decide whether matching should be case-sensitive and whether normalization is needed.
  • Use an array only when the input alphabet is small and fixed.

Course illustration
Course illustration

All Rights Reserved.