Algorithm to select a single, random combination of values?
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Selecting one random combination sounds simple until you define what "fair" means. If you need one subset of size k from n values, every valid k-combination must have the same probability; if you allow any subset size, the probability model changes again.
Define the Combination Space First
There are two common interpretations of this problem:
- Pick exactly
kdistinct items from a list ofnitems. - Pick any subset, including the empty subset and the full set.
Those are different problems. For fixed-size combinations, the sample space has n choose k possibilities. For arbitrary subsets, the sample space has 2^n possibilities because each element is either included or excluded.
If you skip this distinction, you can accidentally bias the result. A routine that flips a coin for each element produces a uniform random subset over all subset sizes, but it does not produce a uniform random k-combination.
Uniform Selection for a Fixed Size
For the fixed-size case, the most practical solution is to sample without replacement. In Python, random.sample already does that efficiently.
This works because every group of k unique positions is equally likely. If the input contains duplicate values and you care about positions rather than value labels, sampling the list directly is still correct. If duplicates should collapse into one logical value, deduplicate before sampling.
For a language without a built-in helper, reservoir-style selection is a good mental model. Walk the array once and decide whether to keep each item while preserving a final sample size of k.
The key invariant is that after each step, the algorithm still has enough remaining items to finish with exactly k picks.
Uniform Selection for Any Subset Size
If every subset is allowed, the simplest correct algorithm is to decide independently for each item whether it is included.
This is equivalent to generating a random bit mask. For n elements there are 2^n masks, and each one appears with probability 1 / 2^n.
You can also implement that idea explicitly:
The bitmask version is useful when you later need to store or compare the chosen subset compactly.
Complexity and Practical Limits
You do not need to generate all combinations to pick one random combination. That would be catastrophic once n grows.
The fixed-size sampling approach runs in linear time or close to linear time depending on the helper used, and it uses only the memory required for the output. The subset approach is also linear because you inspect each input item once.
The real limit is not time spent generating combinations, but whether your randomness source is appropriate. For simulations, the default pseudo-random generator is usually fine. For security-sensitive selection, use a cryptographically secure generator such as Python's secrets module.
Common Pitfalls
The most common mistake is shuffling the whole list and taking the first k items when the input contains repeated logical values. That samples positions correctly, but not unique value labels.
Another mistake is generating a random integer between 0 and 2^n - 1 and then filtering down to subsets of size k. That approach wastes work and becomes biased if you "retry until size matches" without thinking through the distribution.
Developers also sometimes confuse combinations with permutations. If order does not matter, ["a", "b"] and ["b", "a"] are the same combination and should not be counted twice.
Finally, avoid materializing all combinations with helpers like itertools.combinations just to pick one random entry. That turns a sampling problem into an enumeration problem.
Summary
- Decide whether you need a fixed-size combination or any subset at all.
- For exactly
kitems, sample without replacement instead of generating all combinations. - For arbitrary subsets, include or exclude each element independently.
- Do not enumerate the entire combination space unless the input is tiny.
- Use a secure randomness source only when the selection has security implications.

