Algorithm for finding multiset permutation given lexicographic index
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Finding the permutation of a multiset at a given lexicographic index is an unranking problem. The challenge is that repeated elements reduce the number of distinct permutations, so you cannot treat the multiset like a list of unique symbols and apply an ordinary factorial-number-system algorithm without overcounting.
Count Permutations with Multiplicity
If a multiset contains repeated elements, the number of distinct permutations is given by the multinomial formula. For counts c1, c2, and so on, the number of unique permutations is:
For example, the multiset AAB has three distinct permutations:
A naive unique-element algorithm would incorrectly count six arrangements because it would treat the two A symbols as different.
The Core Idea of the Unranking Algorithm
Build the answer one position at a time.
At each step:
- try each remaining symbol in lexicographic order
- temporarily place that symbol at the current position
- count how many distinct permutations remain with the reduced multiset
- compare that block size with the target index
- either keep the symbol or skip that entire block and subtract its size from the index
That is the multiset equivalent of walking lexicographic blocks.
A Runnable Python Implementation
The following code assumes a zero-based index. So index 0 means the very first permutation in lexicographic order.
Example output:
The last valid index for AABC is 11 because there are 4! / 2! = 12 distinct permutations.
Why the Block Counting Works
Suppose the first character is fixed to A. If the remaining multiset is ABC, then every distinct arrangement of ABC forms one lexicographic block beginning with A. The block size is just the number of permutations of the remaining multiset.
If your target index is larger than that block, you know the desired permutation does not start with A, so you skip the entire block and try the next symbol. This avoids generating permutations explicitly, which is exactly what makes the algorithm efficient.
Complexity and Practical Use
The algorithm is much better than enumerating every permutation and indexing into the result, especially when the permutation count is large. It still performs repeated counting work, so precomputing factorials can help if you call it many times.
In practice, this approach is useful in:
- combinatorial generation tools
- ranking and unranking problems
- search-space partitioning
- deterministic test case generation
Common Pitfalls
The most common mistake is forgetting whether the index is zero-based or one-based. The code above is zero-based. If your problem statement uses one-based indexing, subtract 1 before calling the function.
Another mistake is applying an ordinary factorial permutation algorithm to repeated symbols. That overcounts and produces wrong block sizes.
It is also easy to forget to restore the symbol count when you skip a block. If you decrement a count and do not put it back after skipping, the remaining counts become wrong and the whole result drifts.
Summary
- Multiset permutation unranking is a lexicographic block-counting problem.
- Use multinomial counts instead of ordinary factorial counts.
- Build the permutation left to right by testing symbols in sorted order.
- Skip whole blocks when the index lies beyond the current prefix.
- Be explicit about zero-based versus one-based indexing.

