Convert parent-child array to tree
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Converting a flat parent-child list into a tree is a common task in menus, comment threads, category hierarchies, and organization charts. The most reliable solution is a two-pass algorithm that first creates all nodes in a lookup map and then links each node to its parent.
That approach is both simple and efficient. It runs in linear time and makes it easier to validate bad input such as duplicate IDs, missing parents, or cyclic data.
Define the Input and Output Clearly
A typical input row contains:
- '
idas the node identifier' - '
parentIdas the parent reference ornullfor a root' - additional fields such as
name
A typical output is either:
- a list of root nodes, each with
children - or one synthetic root node that wraps the forest
Choose the output contract first, because UI code and API consumers need to know whether multiple roots are possible.
A Two-Pass Linear Algorithm
First create the node objects and store them by ID. Then do a second pass that either attaches each node to its parent or puts it into the root list.
This is O(n) time and O(n) space, which is much better than repeatedly scanning the whole list to find parents.
Example Data
The result is one root with children A and B, and A itself has child A1.
Add Validation for Real Systems
The tree-building algorithm is straightforward, but production input can still be bad.
Important checks include:
- duplicate IDs
- missing parents
- cycles
- nondeterministic child order
Duplicate IDs and missing parents are easy to catch during the two-pass build. Cycles require an extra traversal if your upstream data is untrusted.
Stable Ordering Matters
If the tree is rendered in a UI, child order should usually be deterministic.
Stable ordering helps with caching, snapshot tests, and predictable user experience.
Common Pitfalls
A common mistake is using nested loops to search for each parent repeatedly. That works on tiny datasets and then becomes unnecessarily slow on large ones.
Another issue is silently skipping nodes whose parents are missing. That may hide data-quality problems that should fail loudly.
Developers also often forget cycle detection because the input “should be a tree.” If upstream guarantees are weak, that assumption is dangerous.
Finally, decide what should happen when there are multiple roots. A tree API that assumes a single root but receives a forest will create confusion unless that possibility is handled explicitly.
Summary
- Use a two-pass map-based algorithm to convert flat parent-child rows into a tree efficiently.
- Validate duplicate IDs and missing parents during construction.
- Add cycle detection if input quality is not guaranteed.
- Sort children when deterministic rendering matters.
- Define clearly whether the output is a single root or a list of roots.

