multiset permutation
lexicographic index
algorithm
combinatorics
permutation generation

Algorithm for finding multiset permutation given lexicographic index

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Finding the permutation of a multiset at a given lexicographic index is an unranking problem. The challenge is that repeated elements reduce the number of distinct permutations, so you cannot treat the multiset like a list of unique symbols and apply an ordinary factorial-number-system algorithm without overcounting.

Count Permutations with Multiplicity

If a multiset contains repeated elements, the number of distinct permutations is given by the multinomial formula. For counts c1, c2, and so on, the number of unique permutations is:

text
n! / (c1! * c2! * ...)

For example, the multiset AAB has three distinct permutations:

text
AAB
ABA
BAA

A naive unique-element algorithm would incorrectly count six arrangements because it would treat the two A symbols as different.

The Core Idea of the Unranking Algorithm

Build the answer one position at a time.

At each step:

  1. try each remaining symbol in lexicographic order
  2. temporarily place that symbol at the current position
  3. count how many distinct permutations remain with the reduced multiset
  4. compare that block size with the target index
  5. either keep the symbol or skip that entire block and subtract its size from the index

That is the multiset equivalent of walking lexicographic blocks.

A Runnable Python Implementation

The following code assumes a zero-based index. So index 0 means the very first permutation in lexicographic order.

python
1from collections import Counter
2from math import factorial
3
4
5def multiset_count(counts):
6    total = sum(counts.values())
7    result = factorial(total)
8    for count in counts.values():
9        result //= factorial(count)
10    return result
11
12
13def unrank_multiset_permutation(items, index):
14    counts = Counter(items)
15    symbols = sorted(counts)
16    total = multiset_count(counts)
17
18    if index < 0 or index >= total:
19        raise IndexError("lexicographic index out of range")
20
21    result = []
22    remaining = sum(counts.values())
23
24    for _ in range(remaining):
25        for symbol in symbols:
26            if counts[symbol] == 0:
27                continue
28
29            counts[symbol] -= 1
30            block_size = multiset_count(counts)
31
32            if index < block_size:
33                result.append(symbol)
34                break
35
36            index -= block_size
37            counts[symbol] += 1
38
39    return "".join(result)
40
41
42print(unrank_multiset_permutation("AABC", 0))
43print(unrank_multiset_permutation("AABC", 5))
44print(unrank_multiset_permutation("AABC", 11))

Example output:

text
AABC
ACBA
CBAA

The last valid index for AABC is 11 because there are 4! / 2! = 12 distinct permutations.

Why the Block Counting Works

Suppose the first character is fixed to A. If the remaining multiset is ABC, then every distinct arrangement of ABC forms one lexicographic block beginning with A. The block size is just the number of permutations of the remaining multiset.

If your target index is larger than that block, you know the desired permutation does not start with A, so you skip the entire block and try the next symbol. This avoids generating permutations explicitly, which is exactly what makes the algorithm efficient.

Complexity and Practical Use

The algorithm is much better than enumerating every permutation and indexing into the result, especially when the permutation count is large. It still performs repeated counting work, so precomputing factorials can help if you call it many times.

In practice, this approach is useful in:

  • combinatorial generation tools
  • ranking and unranking problems
  • search-space partitioning
  • deterministic test case generation

Common Pitfalls

The most common mistake is forgetting whether the index is zero-based or one-based. The code above is zero-based. If your problem statement uses one-based indexing, subtract 1 before calling the function.

Another mistake is applying an ordinary factorial permutation algorithm to repeated symbols. That overcounts and produces wrong block sizes.

It is also easy to forget to restore the symbol count when you skip a block. If you decrement a count and do not put it back after skipping, the remaining counts become wrong and the whole result drifts.

Summary

  • Multiset permutation unranking is a lexicographic block-counting problem.
  • Use multinomial counts instead of ordinary factorial counts.
  • Build the permutation left to right by testing symbols in sorted order.
  • Skip whole blocks when the index lies beyond the current prefix.
  • Be explicit about zero-based versus one-based indexing.

Course illustration
Course illustration

All Rights Reserved.