Amortized analysis of stdvector insertion

algorithm analysis

data structures

std::vector

C++ programming

amortized complexity

Amortized analysis of stdvector insertion

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

std::vector::push_back is a classic example of amortized analysis. A single insertion can occasionally cost O(n) when the vector has to reallocate and move existing elements, but across a long sequence of insertions the average cost per insertion is still constant.

Why a Single Insertion Can Be Expensive

A vector stores elements in one contiguous block of memory. When there is still unused capacity, push_back only needs to construct the new element at the end.

cpp

1#include <iostream>
2#include <vector>
3
4int main() {
5    std::vector<int> values;
6    values.reserve(4);
7
8    values.push_back(10);
9    values.push_back(20);
10    values.push_back(30);
11
12    std::cout << values.size() << '\n';
13}

As long as capacity remains available, that end insertion is effectively constant time.

The expensive case happens when size() == capacity(). Then the vector typically:

allocates a larger block
moves or copies the existing elements
destroys the old storage
appends the new element

That one insertion costs linear time in the number of existing elements.

Why the Amortized Cost Is Still `O(1)`

Amortized analysis asks for the average cost over a long sequence of operations, not the worst cost of one isolated operation.

Suppose capacity grows geometrically, often by roughly doubling. Starting from capacity 1, the expensive reallocations happen around these sizes:

'1'
'2'
'4'
'8'
'16'

If you insert n elements, the total amount of copying or moving from all reallocations is roughly:

text

1 + 2 + 4 + 8 + ... + n

That geometric sum is O(n). The n normal insertions themselves are also O(n) total. So across n insertions, the total work is still O(n), which means the amortized cost per insertion is O(1).

That is the key argument. Reallocations are expensive, but they are infrequent enough that their cost spreads out over many cheap insertions.

A Concrete Intuition

Imagine pushing eight integers into an initially empty vector whose capacity doubles.

insert 1: cheap
insert 2: reallocate and move 1 old element
insert 3: reallocate and move 2 old elements
insert 4: cheap
insert 5: reallocate and move 4 old elements
inserts 6, 7, 8: cheap

Even though a few steps are expensive, most are cheap. The expensive work does not happen on every insertion, so the average remains constant.

The Growth Factor Is Not Fixed by the Standard

One subtle point is that the C++ standard does not require exact doubling. Different standard library implementations can use different geometric growth strategies.

The amortized O(1) result still relies on the same idea: capacity must grow by more than a constant additive amount. If capacity only increased by 1 each time, the total cost would become quadratic. Geometric growth is what keeps the total reallocation work linear over many insertions.

So the precise constants differ by implementation, but the asymptotic amortized argument remains the same.

`push_back` Versus `insert`

This discussion is specifically about insertion at the end. Insertion in the middle of a vector is different because existing elements after the insertion point must be shifted even when no reallocation is needed.

cpp

1#include <vector>
2
3int main() {
4    std::vector<int> values = {1, 2, 3, 4};
5    values.insert(values.begin() + 1, 99);
6}

That operation is generally O(n) because the vector must move trailing elements to make room. So do not generalize the amortized O(1) result from push_back to arbitrary insertion positions.

`reserve` Can Reduce Reallocations

If you know approximately how many elements will be inserted, reserve can remove much of the reallocation overhead.

cpp

1#include <vector>
2
3int main() {
4    std::vector<int> values;
5    values.reserve(1000);
6
7    for (int i = 0; i < 1000; ++i) {
8        values.push_back(i);
9    }
10}

This does not change the formal amortized analysis of ordinary push_back, but it does improve real-world performance by avoiding repeated growth and movement.

It is a performance hint, not a semantic requirement. The vector still behaves correctly without it.

Amortized Does Not Mean Every Operation Is Cheap

A common misunderstanding is that amortized O(1) means each insertion is fast. That is false. Some insertions are still O(n). Amortized analysis says that if you average the cost over many insertions, the average cost per insertion stays bounded by a constant.

That distinction matters in latency-sensitive code. If one expensive reallocation at the wrong time matters, amortized complexity alone may not be enough. In those cases, reserve or a different data structure may be worth considering.

Common Pitfalls

One common mistake is thinking push_back is worst-case O(1). It is only amortized O(1); a reallocation can still make one insertion linear.

Another pitfall is assuming the same argument applies to insertion in the middle of the vector. It does not, because element shifting already costs linear time.

A third issue is assuming the standard guarantees a specific capacity-doubling factor. It guarantees complexity behavior, not an exact growth formula.

Finally, developers sometimes ignore reserve when the final size is known in advance, which can leave easy performance gains unused.

Summary

'std::vector::push_back is amortized O(1), not worst-case O(1).'
Individual insertions can cost O(n) when reallocation and movement occur.
Geometric capacity growth keeps the total cost of many insertions linear.
The amortized argument applies to end insertion, not arbitrary middle insertion.
'reserve is a practical way to reduce reallocation cost when the expected size is known.'

Amortized analysis of stdvector insertion

Master System Design with Codemia

Introduction

Why a Single Insertion Can Be Expensive

Why the Amortized Cost Is Still O(1)

A Concrete Intuition

The Growth Factor Is Not Fixed by the Standard

push_back Versus insert

reserve Can Reduce Reallocations

Amortized Does Not Mean Every Operation Is Cheap

Common Pitfalls

Summary

Why the Amortized Cost Is Still `O(1)`

`push_back` Versus `insert`

`reserve` Can Reduce Reallocations