Binary Search Trees

DSA Fundamentals

Binary Search Trees

Topics Covered

BST Property and Operations

The BST Property

Search: O(h) by Following the Order

Insertion: Find the Right Leaf Position

Deletion: Three Cases

Balanced vs Degenerate: O(log n) vs O(n)

BST Validation

The Common Mistake

Approach 1: Inorder Traversal

Approach 2: Min/Max Bounds (Preferred)

Why Bounds Is Preferred

Edge Cases to Watch

Kth Smallest and Inorder

Why Inorder Produces Sorted Order

Kth Smallest Element

Iterative Version with a Stack

Applications Beyond Kth Smallest

Lowest Common Ancestor

LCA in a General Binary Tree: The Baseline

LCA in a BST: Exploit the Ordering

Why This Is O(h) Time and O(1) Space

Edge Cases

When to Use Which Approach

Practice

A binary search tree is not just a binary tree with values in it. It is a binary tree with a strict ordering guarantee that makes search, insertion, and deletion dramatically faster than scanning every node. Understanding this ordering property is the foundation for every BST problem you will encounter.

The BST Property

For every node in a BST, all values in its left subtree are strictly less than the node's value, and all values in its right subtree are strictly greater. This is not just about the immediate children - it applies to the entire subtree. A node with value 10 guarantees that every single node reachable through its left child is less than 10, and every single node reachable through its right child is greater than 10.

1        8
2      /   \
3     3     10
4    / \      \
5   1   6      14
6      / \    /
7     4   7  13

In this tree, node 3's entire left subtree (just node 1) is less than 3, and its entire right subtree (nodes 6, 4, 7) is greater than 3. Node 10's right subtree (nodes 14, 13) is entirely greater than 10. Every node satisfies the property recursively.

This recursive ordering is what makes BSTs powerful. When you are looking for a value, you never need to search both subtrees - the ordering tells you exactly which direction to go.

Search: O(h) by Following the Order

To search for a value, start at the root. If the target equals the current node, you found it. If the target is less, go left. If greater, go right. Repeat until you find the value or hit a null pointer (meaning the value does not exist).

Each step eliminates an entire subtree from consideration. You follow a single path from root to leaf, making the time complexity O(h) where h is the height of the tree. This is the same principle as binary search on a sorted array - each comparison halves the remaining search space.

Notice that BST search is iterative, not recursive. You do not need to explore both subtrees or backtrack. The ordering property tells you exactly which way to go at every step, so a simple while loop suffices.

BST search

At each node, compare the target with the current value. Go left if smaller, right if larger. O(h) time.

Insertion: Find the Right Leaf Position

Insertion follows the same path as search. You walk down the tree comparing values, going left or right, until you reach a null pointer. That null position is exactly where the new node belongs - inserting it there preserves the BST property.

The key insight is that a new node always becomes a leaf. You never need to rearrange existing nodes during a basic BST insertion - you just find the correct empty spot. The path you take to find that spot is identical to the path you would take if you were searching for the value. If the value already exists, this implementation does nothing (skips the insertion), which is the standard behavior for BSTs that do not allow duplicates.

Deletion: Three Cases

Deletion is the most complex BST operation because removing a node from the middle of the tree must preserve the ordering property. There are three cases, and each one requires a different strategy:

Case 1: Leaf node. Simply remove it. No children to worry about. Set the parent's pointer to null and the node is gone.

Case 2: One child. Replace the node with its single child. The child's subtree is already correctly ordered relative to the deleted node's parent. Think of it as "bypassing" the deleted node - the child takes its place and the tree remains valid.

Case 3: Two children. This is the tricky one. You cannot just remove the node because both subtrees need a parent. The solution is to find the inorder successor - the smallest node in the right subtree - copy its value into the node being deleted, and then delete the inorder successor from the right subtree. The inorder successor has at most one child (it cannot have a left child, since it is the leftmost node in its subtree), so deleting it reduces to Case 1 or Case 2.

You could also use the inorder predecessor (the largest node in the left subtree) instead of the inorder successor. Both approaches preserve the BST property for the same reason - the replacement value sits at the boundary between the two subtrees.

Interview Tip

The inorder successor is the leftmost node in the right subtree. It is guaranteed to be greater than everything in the left subtree (because it is in the right subtree) and less than everything else in the right subtree (because it is the smallest there). This is why copying its value preserves the BST property.

Balanced vs Degenerate: O(log n) vs O(n)

The height h determines performance for every BST operation. A balanced BST with n nodes has height O(log n), so search, insertion, and deletion all run in O(log n) time. But if you insert sorted data (1, 2, 3, 4, 5), each node becomes the right child of the previous one, forming a straight line - a degenerate tree. The height becomes n, and every operation degrades to O(n), no better than scanning a linked list.

1Balanced (h = log n):       Degenerate (h = n):
2        4                     1
3      /   \                    \
4     2     6                    2
5    / \   / \                    \
6   1   3 5   7                    3
7                                   \
8                                    4
9                                     \
10                                      5

This is why self-balancing trees (AVL, Red-Black) exist - they perform rotations after insertions and deletions to guarantee O(log n) height. In interviews, you usually work with plain BSTs, but you should mention that the O(log n) guarantee requires balanced structure. If an interviewer asks about worst-case performance, the degenerate case is the answer.

The practical takeaway: a plain BST gives you O(h) operations where h depends on insertion order. A self-balancing BST gives you guaranteed O(log n) regardless of insertion order. Java's TreeMap and C++'s std::map use Red-Black trees internally. Python does not have a built-in balanced BST, but the sortedcontainers library provides one.

Operation	Balanced BST	Degenerate BST	Sorted Array
Search	O(log n)	O(n)	O(log n)
Insert	O(log n)	O(n)	O(n)
Delete	O(log n)	O(n)	O(n)
Min/Max	O(log n)	O(n)	O(1)

BST Validation

Validating whether a binary tree is a valid BST is one of the most frequently asked interview questions - and one of the most commonly solved incorrectly. The intuitive approach has a subtle bug that catches most candidates, and understanding why it fails teaches you something fundamental about the BST property.

The Common Mistake

The most natural first attempt is to check each node against its immediate children: is the left child smaller and the right child larger? This seems correct but fails on trees like this:

1        5
2      /   \
3     1     6
4          / \
5         3   7

Node 6 has left child 3 and right child 7. That looks fine locally - 3 is less than 6 and 7 is greater than 6. But 3 is also in the right subtree of 5, and 3 is less than 5. This violates the BST property. The local check at node 6 passes, but the global constraint fails.

The BST property is not about parent-child relationships - it is about entire subtrees. Every node in the right subtree of 5 must be greater than 5, including node 3. Checking only immediate children misses these inherited constraints from ancestors higher up in the tree.

Approach 1: Inorder Traversal

One correct approach exploits the fact that an inorder traversal of a valid BST produces values in strictly increasing order. If you perform an inorder traversal and the sequence is not strictly increasing, the tree is not a valid BST.

This works because inorder traversal visits nodes in left-root-right order. In a valid BST, this order matches the sorted order of values. If any value is not strictly greater than the previous one, the BST property is violated somewhere in the tree.

The inorder approach is intuitive if you already understand the connection between BSTs and sorted order. However, it has a subtle drawback: you must visit all left descendants of a node before you can check the node itself. If the violation is at the root, the inorder approach still processes the entire left subtree before discovering the problem.

Approach 2: Min/Max Bounds (Preferred)

The more elegant and interview-preferred approach passes valid bounds down the recursion. Each node must fall within a range determined by its ancestors. The root can be any value (range is negative infinity to positive infinity). When you go left, the current node becomes the upper bound. When you go right, the current node becomes the lower bound.

Step through the validation and see how min/max bounds tighten at each level:

Loading animator...

BST validation with bounds

Each node inherits a valid range [min, max] from ancestors. Going left tightens the max, going right tightens the min.

When validating node 3 in the example above, the bounds would be (5, 6) - it must be greater than 5 (because it is in the right subtree of 5) and less than 6 (because it is in the left subtree of 6). Since 3 is not greater than 5, validation fails immediately.

The bounds tighten as you descend. They never loosen. Every time you go left, the upper bound gets smaller. Every time you go right, the lower bound gets larger. By the time you reach any node, the bounds represent the intersection of all constraints inherited from every ancestor on the path from the root.

Why Bounds Is Preferred

Both approaches are O(n) time and O(h) space. But the bounds approach can short-circuit: as soon as it finds a violation, it returns false without visiting the rest of the tree. The inorder approach also short-circuits in the implementation above, but the bounds approach is conceptually cleaner because it directly encodes what the BST property means - every node has a valid range inherited from its ancestors.

Interview Tip

Interviewers use BST validation to test whether you truly understand the BST property. If your first instinct is to check only immediate children, pause and ask yourself: does a left child of a right child inherit constraints from the grandparent? The answer is yes, and that realization leads you directly to the bounds approach.

The bounds approach also generalizes well. If you need to validate a BST with additional constraints (like all values must be positive, or within some application-specific range), you simply adjust the initial bounds. The inorder approach would require adding separate range checks that are disconnected from the traversal logic.

Edge Cases to Watch

Duplicate values. The standard BST property uses strict inequality: left values are strictly less, right values are strictly greater. If the problem allows duplicates, clarify which subtree holds them (commonly left subtree allows equal values, using <= on the left). Adjust your bounds comparison accordingly - change node.val <= low to node.val < low for the left subtree if equal values are allowed on the left.

Integer overflow. If node values can be INT_MIN or INT_MAX, using integer bounds will overflow or collide with valid values. Use float('-inf') and float('inf') in Python, or Long.MIN_VALUE/Long.MAX_VALUE in Java to avoid this. The bounds must be strictly outside the range of any possible node value.

Single node. A tree with one node and no children is always a valid BST. Make sure your base case handles this - an empty tree (null root) is also valid. These trivial cases should return true immediately.

Left-skewed and right-skewed trees. A tree where every node has only a left child or only a right child is still a valid BST as long as the values are in the correct order. Do not assume invalid structure just because the tree is unbalanced.

Kth Smallest and Inorder

The connection between inorder traversal and sorted order is the single most useful property of BSTs for interview problems. Once you internalize that inorder traversal of a BST visits nodes in ascending order, a whole category of problems - finding the kth smallest, the median, range queries - becomes straightforward.

Why Inorder Produces Sorted Order

Inorder traversal visits left subtree, then root, then right subtree. In a BST, the left subtree contains all values smaller than the root, and the right subtree contains all values larger. So inorder visits all smaller values first, then the current value, then all larger values. Applied recursively at every node, this produces the full sorted sequence.

For the tree below, inorder traversal produces [1, 3, 4, 6, 7, 8, 10, 13, 14]:

1        8
2      /   \
3     3     10
4    / \      \
5   1   6      14
6      / \    /
7     4   7  13

This is not a coincidence or a trick - it is a direct consequence of the BST property. At every level of recursion, the left subtree contains smaller values (visited first), the root is in the middle (visited second), and the right subtree contains larger values (visited third). The entire traversal reconstructs the sorted order.

Inorder traversal of BST

Inorder traversal of a BST visits nodes in ascending sorted order, a direct consequence of the BST property.

Kth Smallest Element

To find the kth smallest element, perform an inorder traversal and return the kth value you visit. The naive approach collects all values into a list and returns the (k-1)th element, but that visits every node even if k is small.

The efficient approach counts nodes during traversal and stops as soon as you reach the kth one:

This visits at most k nodes plus the ancestors needed to reach them - significantly less than n when k is small. The early termination check (if result is not None: return) prevents unnecessary traversal of the right subtree once the answer is found. For k=1 (finding the minimum), this visits only the leftmost path - O(h) nodes in a balanced tree.

Iterative Version with a Stack

The iterative approach gives you even more explicit control over when to stop. It uses a stack to simulate the recursion, pushing left children until you reach the leftmost node, then popping and processing:

Watch the stack-based inorder traversal count down to the kth element:

Loading animator...

The outer while loop processes nodes in inorder. The inner while loop pushes all left descendants onto the stack, ensuring you visit the smallest unvisited node next. When you pop a node, it is the next in sorted order. Incrementing the counter and checking against k gives you the answer without collecting all values.

This iterative pattern is worth memorizing. It appears in many BST problems beyond just kth smallest: implementing a BST iterator, finding the next larger element, scanning a range of values, and verifying BST validity iteratively. The stack always holds at most h nodes (one per level of the current path), so space complexity is O(h).

Applications Beyond Kth Smallest

Finding the median. If you know the tree has n nodes, the median is the (n/2)th smallest for even n, or the ((n+1)/2)th smallest for odd n. Use the kth smallest algorithm with the appropriate k. If you do not know n, count nodes first with a separate O(n) traversal, then find the median. The total time is still O(n), but you visit the tree twice.

Range queries. To find all values between low and high, perform an inorder traversal but skip subtrees that cannot contain values in the range. If the current node is less than or equal to low, skip the left subtree entirely - it contains only smaller values. If the current node is greater than or equal to high, skip the right subtree. This prunes the search space significantly.

The pruning conditions (node.val > low before going left, node.val < high before going right) eliminate entire subtrees that fall outside the range. On a balanced BST with n nodes, a range query returning k results runs in O(log n + k) time - O(log n) to navigate to the range boundaries, plus O(k) to collect the results. This is dramatically better than the O(n) full traversal when k is much smaller than n.

Finding the successor of a given value. The inorder successor of a node is the smallest value larger than it. If the node has a right subtree, the successor is the leftmost node in that right subtree. If it does not, the successor is the nearest ancestor for which the node is in the left subtree. Both cases are O(h).

Lowest Common Ancestor

The lowest common ancestor (LCA) of two nodes in a tree is the deepest node that is an ancestor of both. In a general binary tree, finding the LCA requires visiting potentially every node. But in a BST, the ordering property gives you a shortcut that makes this problem elegant and efficient.

LCA in a General Binary Tree: The Baseline

In a binary tree without ordering, you must search both subtrees for the two target nodes. The standard approach recurses into both children and checks whether each subtree contains one of the targets:

This works but visits O(n) nodes in the worst case because you cannot predict which subtree contains which target. You must explore everything. The recursion also uses O(h) space on the call stack. For a balanced tree with a million nodes, that is up to a million comparisons and 20 stack frames. For a degenerate tree, both the time and stack depth reach n.

LCA in a BST: Exploit the Ordering

In a BST, you know exactly where values live based on comparisons with the current node. If both target values are less than the current node, both nodes must be in the left subtree - so the LCA must be there too. If both are greater, the LCA is in the right subtree. If one is less and one is greater (or one equals the current node), you have found the split point - the current node is the LCA.

Watch how the algorithm walks down the tree until the two search paths diverge:

Loading animator...

LCA in a BST

Walk from the root: when one target pulls left and the other right, you are at their lowest common ancestor.

The logic is simple: follow the path from the root, going left when both values are smaller and right when both are larger. The moment the two values diverge - one pulling left and one pulling right - you are at their lowest common ancestor. This is the point where the search paths to the two nodes split apart.

Think of it as walking down from the root with two search queries simultaneously. As long as both queries go the same direction, they share a path. The first node where they would go different directions is the deepest ancestor they both pass through.

Why This Is O(h) Time and O(1) Space

You follow a single path from the root downward, making one comparison per level. In a balanced BST, this is O(log n). In the worst case (degenerate tree), it is O(n). The space is O(1) because the iterative version uses no recursion and no extra data structures - just a single pointer that walks down the tree.

Compare this to the general binary tree LCA, which is O(n) time and O(h) space (for the recursion stack). The BST version is faster because the ordering property lets you eliminate half the tree at each step instead of searching both subtrees. This is the same divide-and-conquer advantage that makes BST search O(h) instead of O(n).

Algorithm	Time	Space	Requires BST?
General LCA	O(n)	O(h)	No
BST LCA	O(h)	O(1)	Yes

For a balanced tree with 1 million nodes, the general approach might visit all 1 million nodes, while the BST approach visits at most about 20. That is a 50,000x difference. Even for degenerate trees where both are O(n), the BST version is still faster because it does less work per node (one comparison vs two recursive calls).

Interview Tip

The BST LCA is essentially a binary search for the split point. Think of it as asking: where do the search paths for p and q diverge? As long as both values are on the same side of the current node, they share a path. The first node where they go different directions is the LCA.

Edge Cases

One node is the ancestor of the other. If p is an ancestor of q (meaning q is in p's subtree), then p itself is the LCA. The algorithm handles this correctly - when you reach p, one value equals the current node and the other is in a subtree, so the else branch fires and returns the current node.

Both nodes have the same value. If p.val equals q.val, both conditions (both less and both greater) will be false at the node with that value, and the else branch returns it. This is correct - a node is its own ancestor.

The root is the LCA. If one target is in the left subtree and one is in the right subtree, the very first comparison at the root triggers the else branch. The algorithm correctly identifies the root as the LCA without descending at all.

Nodes not in the tree. The standard LCA problem assumes both nodes exist in the tree. If they might not, you need an additional verification step after finding the candidate LCA: confirm that both p and q actually exist by searching for them in the candidate's subtree.

When to Use Which Approach

If the problem specifies a BST, always use the BST-specific approach. It is simpler, faster, and uses less space. If the problem says "binary tree" without the BST constraint, you must use the general approach because you cannot rely on ordering.

In interviews, recognizing that a tree is a BST and leveraging the ordering property is often the difference between an O(n) solution and an O(log n) one. The interviewer is testing whether you see that additional structure in the problem gives you information for free - and whether you exploit that information or ignore it.

Practice

Loading problem...

Done

Course

DSA Fundamentals

Arrays and Hashing

Two Pointers and Sliding Window

Stacks and Queues

Linked Lists

Trees and Graphs

Dynamic Programming

Advanced Data Structures

Advanced Patterns