Choosing a multiplier for a string hash function
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
If you are building a simple polynomial-style string hash, the multiplier controls how strongly earlier characters influence the final value. There is no single perfect constant, but good choices are usually odd numbers with decent mixing properties, and the best multiplier depends on the rest of the hash design, especially the modulus or overflow behavior.
The Role Of The Multiplier
A common string hash uses a recurrence like this:
Each new character shifts the influence of the previous characters by multiplying the accumulated hash before adding the next code point.
If the multiplier is too small or poorly matched to the modulus, different strings can collapse into similar patterns and cause more collisions.
Why Constants Like 31 And 33 Show Up So Often
Certain multipliers became popular because they worked reasonably well in practice and were cheap to compute on older hardware.
Examples:
- '
31in Java's classic string hash' - '
33in thedjb2family' - '
131in some BKDR-style hashes'
These constants are not magic. They are simply serviceable tradeoffs that produce acceptable mixing for many workloads.
Historically, small odd numbers were attractive because they avoided trivial even-number patterns and could sometimes be optimized efficiently. For example, multiplying by 31 can be rewritten as a shift and subtraction in low-level code, though modern compilers usually handle this for you.
The Multiplier Is Not The Whole Story
A strong hash depends on more than one constant.
You also need to consider:
- whether arithmetic wraps naturally or uses a modulus
- the expected character set
- the table resizing strategy
- whether a final mixing step is applied
For example, if your hash table size is a power of two, using an even multiplier is especially poor because low bits can lose variation quickly. Odd multipliers are usually safer in that setting.
If you are using a large prime modulus for rolling hashes, the multiplier is often called the base. In that case, many developers choose a base larger than the alphabet size and smaller than the modulus, then verify collision behavior empirically for the target workload.
Compare A Few Multipliers
A quick experiment can show how the constant affects bucket distribution.
This does not prove one multiplier is universally best, but it illustrates the real point: the constant changes how strings cluster into buckets, especially when the bucket count is small.
Practical Selection Rules
If you are implementing a basic non-cryptographic string hash, these rules are usually good enough:
- Prefer an odd multiplier.
- Avoid tiny values such as
1or2. - Match the choice to the modulus or bucket strategy.
- Test on realistic input rather than trusting folklore alone.
If you are implementing a rolling hash for substring algorithms, choose:
- a large modulus, often prime
- a base that is not too small
- sometimes multiple independent hashes if collision risk matters
For hash tables in production systems, it is often better to use a proven existing hash rather than hand-tune one multiplier in isolation.
When Not To Design Your Own Hash
If the goal is general-purpose key hashing for a production data structure, you usually should not invent a custom string hash unless you have a specific reason.
Well-tested implementations already account for:
- adversarial collision patterns
- platform-specific integer behavior
- final mixing quality
- performance tradeoffs on modern CPUs
Choosing a multiplier is a useful algorithm exercise, but it is not a substitute for a mature hashing strategy when correctness or security matters.
Common Pitfalls
The biggest mistake is assuming the multiplier alone guarantees a good hash. Poor modulus choice or poor bucket sizing can ruin the distribution even with a famous constant.
Another mistake is copying constants such as 31 or 33 without understanding the arithmetic model. A multiplier that behaves reasonably with one modulus or machine-word overflow pattern may behave differently in another setup.
People also underestimate the effect of input distribution. A multiplier that looks fine on random strings may perform badly on short identifiers, similar prefixes, or structured keys.
Finally, remember that simple polynomial hashes are not cryptographic. A good multiplier reduces accidental collisions, not deliberate attacks.
Summary
- The multiplier in a string hash controls how earlier characters influence later positions.
- Good choices are usually odd constants such as
31,33, or131, but none is universally best. - The right constant depends on the modulus, overflow behavior, and input distribution.
- Measure collision behavior on realistic data instead of relying only on tradition.
- For production systems, a proven hash function is often better than tuning one multiplier by hand.

