How to optimize quicksort
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
Quicksort is fast on average, but its real-world performance depends heavily on pivot choice, partition strategy, and recursion behavior. Optimizing quicksort is less about changing its big-O headline and more about reducing bad partitions, handling duplicates efficiently, and cutting overhead on small subarrays.
Start with the Real Weaknesses
A plain quicksort suffers in three predictable situations:
- poor pivots create highly unbalanced partitions
- many duplicate values waste work in two-way partitioning
- recursion overhead dominates on small slices
Those issues are why a naive implementation can perform much worse than a production-grade sort even though both are “quicksort.”
Improve Pivot Selection
Choosing the first element as the pivot is easy and often bad. Sorted or nearly sorted input can produce very unbalanced partitions.
A common improvement is median-of-three: use the median of the first, middle, and last elements.
Another reasonable choice is random pivoting, which makes adversarial patterns much less likely in practice.
Median-of-three reduces the chance of bad partitions on partially ordered data. Random pivoting reduces predictability. Both are much better than blindly choosing one fixed edge.
Use Three-Way Partitioning for Duplicates
If many elements equal the pivot, classic two-way partitioning keeps doing unnecessary comparisons and recursive calls. Three-way partitioning splits the array into:
- less than pivot
- equal to pivot
- greater than pivot
This is especially important when the input contains many repeated keys. Without it, quicksort can waste a lot of effort reprocessing equal regions.
Switch to Insertion Sort for Small Partitions
Recursing down to tiny subarrays is usually not worth it. A common optimization is to stop quicksort below a small threshold and finish those pieces with insertion sort.
This works because insertion sort has low overhead and performs well on tiny or nearly sorted partitions. In practice, thresholds somewhere around 8 to 32 elements are common, though the best number depends on language and runtime.
Recurse on the Smaller Side First
Quicksort's recursion depth can be reduced by always recursing into the smaller partition first and then iterating on the larger side.
This keeps stack usage more controlled, especially on awkward inputs.
Cache and Data Movement Still Matter
Quicksort is usually cache-friendly because it works in place and scans contiguous memory, but poor partition logic can still cause unnecessary swaps. Optimizing the partition loop and avoiding redundant work on equal elements often matters more than inventing a complicated pivot rule.
In other words, the main wins come from better structure, not from micro-optimizing every comparison.
Know When to Stop Tuning Quicksort
Many production libraries do not use plain quicksort at all. They use hybrids such as introsort, which starts with quicksort and falls back to heap sort if recursion gets too deep. That protects against worst-case behavior while keeping quicksort's average speed.
If you are writing your own sort for learning, the optimizations above are the right place to focus. If you are writing production application code, using the standard library sort is often the better engineering choice.
Common Pitfalls
- Choosing a fixed edge element as the pivot and suffering on sorted input.
- Ignoring duplicate-heavy data and using only two-way partitioning.
- Recursing all the way down to tiny partitions instead of switching to insertion sort.
- Allowing recursion depth to grow unnecessarily instead of handling the smaller side first.
- Benchmarking one special input shape and assuming the result generalizes.
Summary
- Better pivot selection reduces the risk of unbalanced partitions.
- Three-way partitioning is a major improvement when duplicates are common.
- Insertion sort is usually faster for small subarrays.
- Recurse on the smaller side first to reduce stack depth.
- For production code, a library sort or introsort-style hybrid is often better than a hand-written naive quicksort.

