How to optimize quicksort

quicksort optimization

algorithm efficiency

sorting algorithms

performance tuning

computer science

How to optimize quicksort

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

Quicksort is fast on average, but its real-world performance depends heavily on pivot choice, partition strategy, and recursion behavior. Optimizing quicksort is less about changing its big-O headline and more about reducing bad partitions, handling duplicates efficiently, and cutting overhead on small subarrays.

Start with the Real Weaknesses

A plain quicksort suffers in three predictable situations:

poor pivots create highly unbalanced partitions
many duplicate values waste work in two-way partitioning
recursion overhead dominates on small slices

Those issues are why a naive implementation can perform much worse than a production-grade sort even though both are “quicksort.”

Improve Pivot Selection

Choosing the first element as the pivot is easy and often bad. Sorted or nearly sorted input can produce very unbalanced partitions.

A common improvement is median-of-three: use the median of the first, middle, and last elements.

python

1def median_of_three(a, lo, hi):
2    mid = (lo + hi) // 2
3    x = [(a[lo], lo), (a[mid], mid), (a[hi], hi)]
4    x.sort(key=lambda pair: pair[0])
5    return x[1][1]

Another reasonable choice is random pivoting, which makes adversarial patterns much less likely in practice.

python

import random

pivot_index = random.randint(lo, hi)

Median-of-three reduces the chance of bad partitions on partially ordered data. Random pivoting reduces predictability. Both are much better than blindly choosing one fixed edge.

Use Three-Way Partitioning for Duplicates

If many elements equal the pivot, classic two-way partitioning keeps doing unnecessary comparisons and recursive calls. Three-way partitioning splits the array into:

less than pivot
equal to pivot
greater than pivot

python

1def partition3(a, lo, hi):
2    pivot = a[hi]
3    lt = lo
4    i = lo
5    gt = hi
6
7    while i <= gt:
8        if a[i] < pivot:
9            a[lt], a[i] = a[i], a[lt]
10            lt += 1
11            i += 1
12        elif a[i] > pivot:
13            a[i], a[gt] = a[gt], a[i]
14            gt -= 1
15        else:
16            i += 1
17
18    return lt, gt

This is especially important when the input contains many repeated keys. Without it, quicksort can waste a lot of effort reprocessing equal regions.

Switch to Insertion Sort for Small Partitions

Recursing down to tiny subarrays is usually not worth it. A common optimization is to stop quicksort below a small threshold and finish those pieces with insertion sort.

python

1def insertion_sort(a, lo, hi):
2    for i in range(lo + 1, hi + 1):
3        key = a[i]
4        j = i - 1
5        while j >= lo and a[j] > key:
6            a[j + 1] = a[j]
7            j -= 1
8        a[j + 1] = key

This works because insertion sort has low overhead and performs well on tiny or nearly sorted partitions. In practice, thresholds somewhere around 8 to 32 elements are common, though the best number depends on language and runtime.

Recurse on the Smaller Side First

Quicksort's recursion depth can be reduced by always recursing into the smaller partition first and then iterating on the larger side.

python

1def quicksort(a, lo, hi):
2    while lo < hi:
3        if hi - lo < 16:
4            insertion_sort(a, lo, hi)
5            return
6
7        pivot_index = median_of_three(a, lo, hi)
8        a[pivot_index], a[hi] = a[hi], a[pivot_index]
9        lt, gt = partition3(a, lo, hi)
10
11        if lt - lo < hi - gt:
12            quicksort(a, lo, lt - 1)
13            lo = gt + 1
14        else:
15            quicksort(a, gt + 1, hi)
16            hi = lt - 1

This keeps stack usage more controlled, especially on awkward inputs.

Cache and Data Movement Still Matter

Quicksort is usually cache-friendly because it works in place and scans contiguous memory, but poor partition logic can still cause unnecessary swaps. Optimizing the partition loop and avoiding redundant work on equal elements often matters more than inventing a complicated pivot rule.

In other words, the main wins come from better structure, not from micro-optimizing every comparison.

Know When to Stop Tuning Quicksort

Many production libraries do not use plain quicksort at all. They use hybrids such as introsort, which starts with quicksort and falls back to heap sort if recursion gets too deep. That protects against worst-case behavior while keeping quicksort's average speed.

If you are writing your own sort for learning, the optimizations above are the right place to focus. If you are writing production application code, using the standard library sort is often the better engineering choice.

Common Pitfalls

Choosing a fixed edge element as the pivot and suffering on sorted input.
Ignoring duplicate-heavy data and using only two-way partitioning.
Recursing all the way down to tiny partitions instead of switching to insertion sort.
Allowing recursion depth to grow unnecessarily instead of handling the smaller side first.
Benchmarking one special input shape and assuming the result generalizes.

Summary

Better pivot selection reduces the risk of unbalanced partitions.
Three-way partitioning is a major improvement when duplicates are common.
Insertion sort is usually faster for small subarrays.
Recurse on the smaller side first to reduce stack depth.
For production code, a library sort or introsort-style hybrid is often better than a hand-written naive quicksort.