Creating a random tree?
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Creating a random tree sounds simple until you decide what "random" should mean. Different generation methods produce very different distributions, so the right algorithm depends on whether you want a quick arbitrary tree, a random rooted tree, or a uniformly random labeled tree.
Decide What Kind Of Randomness You Need
A tree can be random in several senses:
- each new node chooses a random parent from earlier nodes
- each labeled tree on
nnodes is equally likely - a random spanning tree is drawn from a larger graph
Those are not equivalent. If you do not define the distribution first, it is easy to implement something random-looking but mathematically biased.
Simple Parent-Choice Method
A very easy generator for a rooted tree on nodes 0..n-1 is:
- start with node
0as the root - for each new node
i, choose a parent uniformly from0..i-1
This is fast and useful for testing, but it does not generate all labeled trees with equal probability.
Uniform Random Labeled Trees With Prüfer Codes
If you want every labeled tree on n vertices to be equally likely, Prüfer sequences are the standard method. A labeled tree on n nodes corresponds uniquely to a Prüfer code of length n - 2, where each entry is an integer from 0 to n - 1.
So the algorithm is:
- generate a random Prüfer sequence of length
n - 2 - decode it into a tree
This method gives a uniform random labeled tree, which is usually what people mean in combinatorics.
Random Spanning Tree On A Graph
If you are not generating from scratch but from an existing graph, the goal is often a random spanning tree. In that setting, algorithms such as Wilson’s algorithm or Aldous-Broder are more appropriate. They sample spanning trees of the given graph rather than arbitrary trees on n labeled nodes.
That distinction matters because the surrounding graph constrains which trees are even possible.
Representing The Result
A random tree can be returned in several formats:
- parent array
- edge list
- adjacency list
Choose the one that matches the next step of your application. A parent array is compact for rooted trees, while an edge list is more general and works well for graph algorithms.
Common Pitfalls
The most common mistake is saying "random tree" without defining the target distribution. A random parent generator and a uniform labeled-tree generator are not the same thing.
Another mistake is using the simple parent-choice method when theoretical uniformity matters. It is fine for simulation or test data, but not for combinatorial experiments that assume equal probability across trees.
A third issue is confusing rooted trees with unrooted labeled trees. The output representation changes the interpretation of the randomness.
Summary
- "Random tree" is ambiguous until you define the probability model.
- Choosing a random parent for each new node is simple but biased.
- Prüfer codes generate uniformly random labeled trees.
- Random spanning trees on existing graphs require different algorithms.
- Pick a representation such as parent array or edge list based on what comes next.

