Complexity of in operator in Python
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
The in operator in Python checks whether a value exists in a collection. Its time complexity depends entirely on the data structure: O(1) average for sets and dictionaries, O(n) for lists and tuples, and O(n) for strings (where n is the string length). Choosing the right data structure for membership testing is one of the most impactful performance decisions in Python code.
Time Complexity by Data Structure
| Data Structure | Average Case | Worst Case | Notes |
list | O(n) | O(n) | Linear scan |
tuple | O(n) | O(n) | Linear scan |
set | O(1) | O(n) | Hash table lookup |
frozenset | O(1) | O(n) | Hash table lookup |
dict (keys) | O(1) | O(n) | Hash table lookup |
str | O(n) | O(n) | Substring search |
range | O(1) | O(1) | Arithmetic check |
List and Tuple: O(n)
Lists and tuples use sequential search. Python walks through each element and compares it to the target.
Set and Dict: O(1) Average
Sets and dictionaries use hash tables. Lookup involves computing a hash and checking one (or a few) buckets.
For dictionaries, in checks keys by default:
String: O(n)
For strings, in performs substring search, not character lookup.
Python uses a combination of Boyer-Moore and Horspool algorithms for substring search, giving O(n/m) average case where n is the text length and m is the pattern length. Worst case is O(n*m) but this rarely occurs with real text.
Range: O(1)
Python 3's range objects support O(1) membership testing through arithmetic:
Converting List to Set for Repeated Lookups
If you perform multiple membership tests against the same collection, converting a list to a set pays off quickly.
The set conversion costs O(n) once. Each lookup is then O(1). For m lookups, total goes from O(n*m) to O(n + m).
The O(n) Worst Case for Sets
Set lookup degrades to O(n) when many keys hash to the same bucket (hash collision). This is rare with Python's built-in types but can happen with adversarial input or poorly implemented __hash__.
not in Operator
not in has the same time complexity as in. It simply negates the result.
Common Pitfalls
- Using a list for repeated membership tests: Checking
x in large_listinside a loop is O(n) per check. Convert to a set first if you do more than a few lookups. The break-even point is typically 3-5 lookups. - Checking
in dict.values():key in dictis O(1) because it checks keys.value in dict.values()is O(n) because values are not hashed. If you need fast value lookup, build a reverse mapping. - Assuming
inon a generator is efficient:x in generatorconsumes the generator element by element until it finds a match or exhausts the generator. This is O(n) and the consumed elements are gone. - Confusing substring
inwith elementin: For strings,"ab" in "abc"checks for a substring. For lists,"ab" in ["a", "b", "abc"]checks for an exact element match. These are different operations. - Forgetting
rangeis O(1):1_000_000 in range(2_000_000)is instant in Python 3. In Python 2,range()returns a list (O(n)). Usexrange()in Python 2 for O(1) behavior.
Summary
inonlist/tuple: O(n) — sequential scaninonset/dict: O(1) average — hash table lookupinonstr: O(n) — substring search using optimized algorithmsinonrange: O(1) — arithmetic membership test- Convert lists to sets when doing repeated membership tests
key in dictis O(1) butvalue in dict.values()is O(n)- Python 3's
rangesupports O(1)inchecks; Python 2'srangedoes not

