Amortized analysis of stdvector insertion
Master System Design with Codemia
Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.
Introduction
std::vector::push_back is a classic example of amortized analysis. A single insertion can occasionally cost O(n) when the vector has to reallocate and move existing elements, but across a long sequence of insertions the average cost per insertion is still constant.
Why a Single Insertion Can Be Expensive
A vector stores elements in one contiguous block of memory. When there is still unused capacity, push_back only needs to construct the new element at the end.
As long as capacity remains available, that end insertion is effectively constant time.
The expensive case happens when size() == capacity(). Then the vector typically:
- allocates a larger block
- moves or copies the existing elements
- destroys the old storage
- appends the new element
That one insertion costs linear time in the number of existing elements.
Why the Amortized Cost Is Still O(1)
Amortized analysis asks for the average cost over a long sequence of operations, not the worst cost of one isolated operation.
Suppose capacity grows geometrically, often by roughly doubling. Starting from capacity 1, the expensive reallocations happen around these sizes:
- '
1' - '
2' - '
4' - '
8' - '
16'
If you insert n elements, the total amount of copying or moving from all reallocations is roughly:
That geometric sum is O(n). The n normal insertions themselves are also O(n) total. So across n insertions, the total work is still O(n), which means the amortized cost per insertion is O(1).
That is the key argument. Reallocations are expensive, but they are infrequent enough that their cost spreads out over many cheap insertions.
A Concrete Intuition
Imagine pushing eight integers into an initially empty vector whose capacity doubles.
- insert
1: cheap - insert
2: reallocate and move1old element - insert
3: reallocate and move2old elements - insert
4: cheap - insert
5: reallocate and move4old elements - inserts
6,7,8: cheap
Even though a few steps are expensive, most are cheap. The expensive work does not happen on every insertion, so the average remains constant.
The Growth Factor Is Not Fixed by the Standard
One subtle point is that the C++ standard does not require exact doubling. Different standard library implementations can use different geometric growth strategies.
The amortized O(1) result still relies on the same idea: capacity must grow by more than a constant additive amount. If capacity only increased by 1 each time, the total cost would become quadratic. Geometric growth is what keeps the total reallocation work linear over many insertions.
So the precise constants differ by implementation, but the asymptotic amortized argument remains the same.
push_back Versus insert
This discussion is specifically about insertion at the end. Insertion in the middle of a vector is different because existing elements after the insertion point must be shifted even when no reallocation is needed.
That operation is generally O(n) because the vector must move trailing elements to make room. So do not generalize the amortized O(1) result from push_back to arbitrary insertion positions.
reserve Can Reduce Reallocations
If you know approximately how many elements will be inserted, reserve can remove much of the reallocation overhead.
This does not change the formal amortized analysis of ordinary push_back, but it does improve real-world performance by avoiding repeated growth and movement.
It is a performance hint, not a semantic requirement. The vector still behaves correctly without it.
Amortized Does Not Mean Every Operation Is Cheap
A common misunderstanding is that amortized O(1) means each insertion is fast. That is false. Some insertions are still O(n). Amortized analysis says that if you average the cost over many insertions, the average cost per insertion stays bounded by a constant.
That distinction matters in latency-sensitive code. If one expensive reallocation at the wrong time matters, amortized complexity alone may not be enough. In those cases, reserve or a different data structure may be worth considering.
Common Pitfalls
One common mistake is thinking push_back is worst-case O(1). It is only amortized O(1); a reallocation can still make one insertion linear.
Another pitfall is assuming the same argument applies to insertion in the middle of the vector. It does not, because element shifting already costs linear time.
A third issue is assuming the standard guarantees a specific capacity-doubling factor. It guarantees complexity behavior, not an exact growth formula.
Finally, developers sometimes ignore reserve when the final size is known in advance, which can leave easy performance gains unused.
Summary
- '
std::vector::push_backis amortizedO(1), not worst-caseO(1).' - Individual insertions can cost
O(n)when reallocation and movement occur. - Geometric capacity growth keeps the total cost of many insertions linear.
- The amortized argument applies to end insertion, not arbitrary middle insertion.
- '
reserveis a practical way to reduce reallocation cost when the expected size is known.'

