How does this algorithm to count the number of set bits in a 32-bit integer work?
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Counting set bits in an integer is known as popcount or Hamming weight and appears in many systems tasks. The algorithm most people ask about is the one that repeatedly does x &= x - 1. It works because each iteration removes exactly one set bit, so loop count equals number of ones.
Binary Identity Behind the Algorithm
The key identity is:
- '
x - 1flips the rightmost set bit to zero and turns lower trailing zeros and ones accordingly' - '
x & (x - 1)clears the rightmost set bit ofx'
So if you repeat this operation until x becomes zero, the number of iterations is exactly the number of set bits.
Algorithm:
This is often faster than scanning all 32 positions when numbers are sparse.
Step-by-Step Example
Take x = 0b10110000.
Iteration detail:
x = 10110000,x - 1 = 10101111,x & (x - 1) = 10100000x = 10100000,x - 1 = 10011111,x & (x - 1) = 10000000x = 10000000,x - 1 = 01111111,x & (x - 1) = 00000000
Loop ran 3 times, which equals the number of set bits.
Compare with Naive Bit Scan
The naive method checks each bit position.
Complexity comparison for 32-bit values:
- naive scan: always 32 iterations
- Kernighan method:
kiterations wherekis number of set bits
Kernighan is particularly good when k is small.
Handling 32-Bit Semantics in Python
Python integers are unbounded, so negative numbers are not automatically limited to 32 bits. If you want true 32-bit behavior, mask first.
Without masking, negative values would not represent fixed-width two's complement values the way C does.
C Implementation with Unsigned Types
In C or C-like languages, use unsigned 32-bit types for predictable bit operations.
Using uint32_t avoids sign-extension pitfalls in shift and arithmetic operations.
Alternative: Built-ins and Intrinsics
In production code, compiler and runtime built-ins are usually preferred.
Python:
GCC or Clang C:
These often map to optimized CPU instructions where available.
Parallel Bit Count Method
There is also a branchless method that counts bits in groups using masks and arithmetic.
This is useful in low-level contexts where branch predictability matters.
Practical Use Cases
Popcount appears in:
- bitmap index operations
- Hamming distance and similarity scoring
- bloom filter diagnostics
- permission or feature flag analysis
- cryptographic bit-level routines
Because it can be called in tight loops, choice of method can affect performance.
Common Pitfalls
A common pitfall is forgetting fixed-width masking in Python for 32-bit semantics, especially with negative numbers.
Another issue is using signed integers in C and assuming behavior is identical across compilers and platforms.
Teams also over-optimize by writing clever bit hacks where built-ins are clearer and equally fast.
Benchmarking on unrealistic data is another mistake. Kernighan gains most on sparse bit patterns.
Finally, mixing 32-bit and 64-bit assumptions in one code path can produce subtle correctness bugs.
Summary
- '
x &= x - 1clears one set bit per iteration.' - Loop iterations equal the number of set bits.
- For fixed-width semantics, use unsigned types or masks.
- Built-ins such as
bit_countor compiler intrinsics are often best in production. - Choose method based on correctness needs and measured performance, not folklore.

