XOR operation
summation techniques
formula derivation
computational mathematics
bitwise operations

Direct formula for summing XOR

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

When people ask for a direct formula for summing XOR, they usually want to avoid looping through every integer in a range. The standard trick is to work with prefix XOR instead of ordinary arithmetic sums. Once you know the repeating pattern for XOR from zero to n, any inclusive range can be answered in constant time.

Start with Prefix XOR

Define xor_upto(n) as the XOR of every integer from zero through n.

If you list the first few results, a pattern appears quickly:

  • 'xor_upto(0) = 0'
  • 'xor_upto(1) = 1'
  • 'xor_upto(2) = 3'
  • 'xor_upto(3) = 0'
  • 'xor_upto(4) = 4'
  • 'xor_upto(5) = 1'

The values repeat every four numbers. That gives a direct formula based on n % 4.

python
1def xor_upto(n: int) -> int:
2    remainder = n % 4
3    if remainder == 0:
4        return n
5    if remainder == 1:
6        return 1
7    if remainder == 2:
8        return n + 1
9    return 0

This function runs in constant time and replaces a loop that would otherwise take O(n) work.

Why the Four-Value Pattern Works

XOR behaves differently from addition, but it has algebraic properties that make cancellation easy:

  • 'x ^ x is 0'
  • 'x ^ 0 is x'
  • XOR is associative
  • XOR is commutative

Because of those rules, large parts of consecutive ranges cancel out cleanly. The modulo-four pattern is the compressed result of that cancellation behavior. You do not need to memorize a proof to use it, but you do need to remember that the repetition length is four, not two.

Compute XOR for Any Inclusive Range

Once you have prefix XOR, the XOR for an inclusive range from a to b is:

xor_range(a, b) = xor_upto(b) ^ xor_upto(a - 1)

That works because the shared prefix from zero through a - 1 appears in both expressions and cancels itself out.

python
1def xor_range(a: int, b: int) -> int:
2    if a == 0:
3        return xor_upto(b)
4    return xor_upto(b) ^ xor_upto(a - 1)
5
6
7print(xor_range(3, 9))

This makes large-range queries trivial even when b is enormous.

Verify the Formula with a Slow Version

A brute-force implementation is still useful for testing. It gives you a quick sanity check while writing or porting the formula.

python
1def xor_range_slow(a: int, b: int) -> int:
2    result = 0
3    for value in range(a, b + 1):
4        result ^= value
5    return result
6
7
8for start in range(0, 10):
9    for end in range(start, 15):
10        assert xor_range(start, end) == xor_range_slow(start, end)
11
12print("all tests passed")

That sort of test is especially helpful in interviews, contest code, or low-level code where one off-by-one mistake invalidates the whole solution.

Distinguish XOR from Ordinary Summation

The word "sum" causes confusion here. XOR is not ordinary addition, so familiar formulas such as arithmetic progression sums do not apply.

For example, the arithmetic sum from 1 to 4 is 10, but the XOR from 1 to 4 is:

python
print(1 ^ 2 ^ 3 ^ 4)  # 4

That is why the solution depends on cancellation properties, not on a numeric average or closed-form addition formula.

The Same Prefix Idea Applies to Arrays

The range trick is not limited to consecutive integers. You can also precompute prefix XOR for an arbitrary array and answer subarray XOR queries quickly.

python
1def build_prefix_xor(values):
2    prefix = [0]
3    for value in values:
4        prefix.append(prefix[-1] ^ value)
5    return prefix
6
7
8def xor_subarray(prefix, left, right):
9    return prefix[right + 1] ^ prefix[left]
10
11
12arr = [4, 7, 1, 3]
13prefix = build_prefix_xor(arr)
14print(xor_subarray(prefix, 1, 3))

The implementation details differ, but the underlying idea is the same: shared prefixes cancel out.

Common Pitfalls

The biggest pitfall is treating XOR like arithmetic addition and looking for the wrong kind of formula.

Another common mistake is forgetting the a - 1 term in the range formula. That creates an off-by-one error that may still pass a few small examples by accident.

People also sometimes remember that there is a repeating pattern but misremember the period. It repeats every four values, not every two.

Finally, range code should handle the a == 0 case carefully so you do not compute xor_upto(-1) by mistake.

Summary

  • Use prefix XOR rather than a loop when you need XOR over a range of integers.
  • 'xor_upto(n) follows a simple pattern based on n % 4.'
  • Any inclusive range XOR can be computed from two prefix XOR results.
  • A short brute-force validator is valuable for catching off-by-one mistakes.
  • The same cancellation idea also powers fast XOR queries on arrays.

Course illustration
Course illustration

All Rights Reserved.