weighted random choice
Python programming
algorithms
random selection
coding techniques

A weighted version of random.choice

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

random.choice() selects an element uniformly at random — every element has equal probability. For weighted random selection where some elements are more likely than others, Python 3.6+ provides random.choices() with a weights parameter. For older Python versions or specialized needs, you can use numpy.random.choice(), bisect with cumulative weights, or build an alias table for O(1) sampling.

python
1import random
2
3items = ["common", "uncommon", "rare", "legendary"]
4weights = [70, 20, 8, 2]
5
6# Single selection
7result = random.choices(items, weights=weights, k=1)[0]
8print(result)  # Most likely "common"
9
10# Multiple selections (with replacement)
11results = random.choices(items, weights=weights, k=10)
12print(results)
13# ['common', 'common', 'uncommon', 'common', 'rare', 'common', ...]

Weights do not need to sum to 1 or 100 — Python normalizes them internally. [70, 20, 8, 2] means "common" has a 70% chance, "uncommon" 20%, "rare" 8%, "legendary" 2%.

Using cum_weights for Pre-Computed Cumulative Weights

python
1import random
2
3items = ["A", "B", "C", "D"]
4
5# Regular weights
6weights = [10, 30, 40, 20]
7
8# Equivalent cumulative weights: [10, 40, 80, 100]
9cum_weights = [10, 40, 80, 100]
10
11result = random.choices(items, cum_weights=cum_weights, k=1)[0]

Pass cum_weights instead of weights if you pre-compute them — avoids recalculating cumulative sums on each call.

numpy.random.choice() — For Large-Scale Sampling

python
1import numpy as np
2
3items = ["common", "uncommon", "rare", "legendary"]
4probs = [0.70, 0.20, 0.08, 0.02]  # Must sum to 1.0
5
6# Single pick
7result = np.random.choice(items, p=probs)
8
9# 1000 picks — much faster than calling random.choices 1000 times
10results = np.random.choice(items, size=1000, p=probs)
11print(np.unique(results, return_counts=True))
12# (array(['common', 'legendary', 'rare', 'uncommon']),
13#  array([702, 18, 81, 199]))
14
15# Without replacement
16unique_picks = np.random.choice(items, size=3, replace=False, p=probs)

NumPy requires probabilities that sum to 1.0, not arbitrary weights.

Manual Implementation with bisect (Python < 3.6)

python
1import random
2import bisect
3import itertools
4
5def weighted_choice(items, weights):
6    cum_weights = list(itertools.accumulate(weights))
7    total = cum_weights[-1]
8    r = random.uniform(0, total)
9    idx = bisect.bisect_right(cum_weights, r)
10    return items[idx]
11
12# Example
13items = ["common", "uncommon", "rare", "legendary"]
14weights = [70, 20, 8, 2]
15
16result = weighted_choice(items, weights)
17print(result)

This builds cumulative weights [70, 90, 98, 100], picks a random number in [0, 100), and uses binary search to find which bucket it falls in. Time complexity is O(n) to build cumulative weights and O(log n) per sample.

Real-World Examples

Loot Drop System

python
1import random
2from collections import Counter
3
4loot_table = {
5    "gold_coin": 50,
6    "health_potion": 25,
7    "rare_sword": 15,
8    "epic_armor": 8,
9    "legendary_gem": 2,
10}
11
12items = list(loot_table.keys())
13weights = list(loot_table.values())
14
15# Simulate 10,000 drops
16drops = random.choices(items, weights=weights, k=10_000)
17print(Counter(drops))
18# Counter({'gold_coin': 5012, 'health_potion': 2498, 'rare_sword': 1507,
19#          'epic_armor': 793, 'legendary_gem': 190})

Weighted Load Balancing

python
1import random
2
3servers = ["server-1", "server-2", "server-3"]
4capacities = [4, 2, 1]  # server-1 handles 4x more traffic
5
6def route_request():
7    return random.choices(servers, weights=capacities, k=1)[0]
8
9# Distribute 1000 requests
10from collections import Counter
11distribution = Counter(route_request() for _ in range(1000))
12print(distribution)
13# Counter({'server-1': 572, 'server-2': 286, 'server-3': 142})

A/B Testing with Unequal Split

python
1import random
2
3variants = ["control", "treatment_A", "treatment_B"]
4traffic_split = [80, 15, 5]  # 80% control, 15% A, 5% B
5
6def assign_variant(user_id):
7    # Use seed for deterministic assignment per user
8    rng = random.Random(user_id)
9    return rng.choices(variants, weights=traffic_split, k=1)[0]
10
11print(assign_variant(12345))  # Always same result for same user
12print(assign_variant(67890))

Performance Comparison

python
1import random
2import numpy as np
3import timeit
4
5items = list(range(1000))
6weights = list(range(1, 1001))
7probs = [w / sum(weights) for w in weights]
8
9# random.choices — good for small to medium sampling
10timeit.timeit(lambda: random.choices(items, weights=weights, k=1000), number=1000)
11# ~0.8s for 1M total samples
12
13# numpy — faster for large batches
14np_items = np.array(items)
15timeit.timeit(lambda: np.random.choice(np_items, size=1000, p=probs), number=1000)
16# ~0.3s for 1M total samples

For single picks, random.choices is faster (no NumPy overhead). For thousands of picks at once, NumPy wins.

Common Pitfalls

  • Weights summing to zero: If all weights are 0, random.choices raises ValueError: Total of weights must be greater than zero. Validate weights before calling.
  • Confusing weights with cum_weights: Passing cumulative weights to the weights parameter produces incorrect probabilities. Use cum_weights explicitly if your weights are already cumulative.
  • random.choices returns a list: Even with k=1, the return value is a list ["item"], not a single value. Use random.choices(..., k=1)[0] to get the item directly.
  • NumPy probabilities not summing to 1.0: numpy.random.choice raises ValueError if p does not sum to approximately 1.0. Normalize with p = [w/sum(weights) for w in weights] or use random.choices which accepts unnormalized weights.
  • Sampling without replacement with weights: random.choices always samples with replacement. For weighted sampling without replacement, use numpy.random.choice(..., replace=False) or implement reservoir sampling.

Summary

  • Use random.choices(items, weights=weights, k=n) for weighted selection in Python 3.6+
  • Weights do not need to sum to 1 — Python normalizes automatically
  • Use numpy.random.choice() for large-scale sampling or when probabilities must sum to 1.0
  • For Python < 3.6, use bisect with cumulative weights for O(log n) sampling
  • random.choices samples with replacement; use NumPy for weighted sampling without replacement
  • Seed the random generator for reproducible results in testing and A/B experiments

Course illustration
Course illustration

All Rights Reserved.