Hamming Weight
Bit Manipulation
Constant Time
Algorithm
Duplication

Calculating Hamming Weight in O1

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

The Hamming weight of an integer is the number of bits set to 1 in its binary representation. The phrase “O(1) Hamming weight” is only correct when the integer width is fixed, such as a 32-bit or 64-bit machine word; for arbitrary-length integers, the runtime must grow with the number of machine words being inspected.

What O(1) Really Means Here

For a fixed-width type like uint32_t, the number of bits is constant. That means any algorithm using a fixed number of bitwise operations is constant time with respect to input size.

For an integer type that can grow without bound, such as Python’s big integers, there is no true constant-time popcount across all possible values. The implementation must inspect however many machine words are needed to store the number.

So the right interpretation is:

  • fixed-size machine word: O(1) is meaningful
  • variable-size integer: runtime is proportional to representation size

A Classic SWAR Technique

One common constant-time technique for 32-bit values is a SWAR style algorithm, short for “SIMD within a register”. It groups partial bit counts and sums them inside the same word.

c
1#include <stdint.h>
2#include <stdio.h>
3
4uint32_t popcount32(uint32_t x) {
5    x = x - ((x >> 1) & 0x55555555u);
6    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
7    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
8    x = x + (x >> 8);
9    x = x + (x >> 16);
10    return x & 0x3Fu;
11}
12
13int main(void) {
14    uint32_t value = 0b101101001;
15    printf("%u\n", popcount32(value));
16    return 0;
17}

This uses a fixed number of operations regardless of the actual bit pattern. On a fixed 32-bit word, that is constant time.

Brian Kernighan’s Method

A popular alternative repeatedly clears the lowest set bit:

c
1#include <stdint.h>
2#include <stdio.h>
3
4uint32_t popcount_loop(uint32_t x) {
5    uint32_t count = 0;
6    while (x != 0) {
7        x &= (x - 1);
8        count++;
9    }
10    return count;
11}
12
13int main(void) {
14    printf("%u\n", popcount_loop(0b11101000u));
15    return 0;
16}

This is often very fast in practice, but it is not constant with respect to the number of set bits. Its runtime is proportional to the popcount itself.

That makes it a good practical algorithm, but not the best example if the discussion is strictly about fixed-operation O(1) techniques.

Compiler and Hardware Support

In real code, the best solution is often not a hand-written bit trick at all. Modern compilers expose built-in popcount operations that map to efficient instructions when the hardware supports them.

c
1#include <stdio.h>
2
3int main(void) {
4    unsigned int value = 0b111100001010u;
5    printf("%d\n", __builtin_popcount(value));
6    return 0;
7}

With GCC and Clang, __builtin_popcount and its wider variants are typically the clearest option. They are easier to read, easier to trust, and often compile to a dedicated CPU instruction.

If you are writing performance-sensitive code, inspect the generated assembly or benchmark on the target platform before assuming a hand-tuned trick is faster.

Lookup Tables

Another constant-time approach for fixed-width integers is a byte lookup table. Precompute popcounts for all 256 possible byte values, then sum the counts of the bytes in the word.

This can work well, but it trades arithmetic for memory access. On modern CPUs, built-ins or hardware popcount instructions usually win for both simplicity and speed.

Lookup tables still make sense in constrained environments or educational settings where you want to show how time-space tradeoffs work.

Common Pitfalls

The biggest pitfall is claiming true O(1) for arbitrary-size integers. That is mathematically misleading because the representation length still matters.

Another mistake is optimizing readability away. A clever SWAR implementation is fine in a textbook or interview answer, but production code is often better served by a compiler builtin.

A third issue is benchmarking incorrectly. Tiny microbenchmarks can be distorted by inlining, constant folding, or different compiler flags, so performance claims should be measured carefully.

Summary

  • Hamming weight is the count of set bits in a number.
  • 'O(1) only makes sense for fixed-width integers such as 32-bit or 64-bit words.'
  • SWAR methods compute the result in a fixed number of bitwise operations.
  • Brian Kernighan’s loop is elegant and fast, but its runtime depends on the number of set bits.
  • In production C or C++, a builtin popcount is usually the most practical choice.

Course illustration
Course illustration

All Rights Reserved.