Create a list with initial capacity in Python
Introduction
Python lists do not have a direct "initial capacity" parameter like Java's ArrayList(capacity). To pre-allocate a list of known size, create it with placeholder values: [None] * n or [0] * n. This avoids repeated memory reallocations during append operations. For the best performance with numeric data, use array.array or NumPy arrays, which pre-allocate contiguous memory blocks. In practice, Python's over-allocation strategy makes append() amortized O(1), so pre-allocation is only necessary for very large lists or performance-critical code.
Pre-allocate with Placeholder Values
[None] * n allocates a list with exactly n slots in one operation, avoiding incremental growth.
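As a minimal sketch of the idea above (the variable names are illustrative only), the list is created at its final length and then filled by index rather than grown with append():

```python
n = 5

# Create the list at its final length in a single allocation.
# Every slot initially refers to the same None object.
slots = [None] * n

# Fill by index; the list never has to grow.
for i in range(n):
    slots[i] = i * i

# The same pattern works with any immutable placeholder, such as 0.
zeros = [0] * n

print(slots)
print(zeros)
```

Placeholders should be immutable values such as None or 0; repeating a mutable object this way shares one object across all slots.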
Pre-allocate vs Append Performance
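One way to compare the approaches is a small timeit benchmark. The sketch below is an assumption about how such a comparison might be set up, not a reference measurement; the function names are hypothetical, and the timings depend heavily on the machine and Python version, so no numbers are quoted here.

```python
import timeit


def build_with_append(n):
    """Grow an empty list one element at a time."""
    out = []
    for i in range(n):
        out.append(i)
    return out


def build_preallocated(n):
    """Allocate n slots up front, then assign by index."""
    out = [None] * n
    for i in range(n):
        out[i] = i
    return out


def build_with_comprehension(n):
    """Let a list comprehension build the list."""
    return [i for i in range(n)]


if __name__ == "__main__":
    n = 100_000
    for fn in (build_with_append, build_preallocated, build_with_comprehension):
        seconds = timeit.timeit(lambda: fn(n), number=20)
        print(f"{fn.__name__}: {seconds:.4f}s for 20 runs")
```

All three functions return the same list, so any difference in the printed times comes from how the list is built.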
How Python Lists Grow Internally
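In CPython, when an append finds the list full, the new capacity is computed as roughly new_size + (new_size >> 3) + 6, where new_size is the length being requested. This is an implementation detail, and the exact formula has changed between releases. The effect is that the list over-allocates by about 12.5% each time it runs out of space, which keeps the cost of append() amortized O(1).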
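The over-allocation can be observed indirectly with sys.getsizeof, which in CPython reports the bytes held by the list object, including its pointer array. The helper below is a hypothetical name used for illustration; the exact sizes it reports vary by Python version and platform.

```python
import sys


def capacity_steps(n):
    """Return (length, size_in_bytes) each time the list's allocation changes."""
    lst = []
    steps = []
    last_size = sys.getsizeof(lst)
    for i in range(n):
        lst.append(i)
        size = sys.getsizeof(lst)
        if size != last_size:
            # The reported size changed, so this append triggered a reallocation.
            steps.append((len(lst), size))
            last_size = size
    return steps


if __name__ == "__main__":
    for length, size in capacity_steps(100):
        print(f"len={length:>3}  bytes={size}")
```

Far fewer rows are printed than appends performed, which is the over-allocation at work: most appends reuse spare capacity.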
NumPy Arrays for Numeric Data
NumPy arrays use a fixed amount of memory per element (e.g., 8 bytes for float64). In 64-bit CPython, a list of floats uses about 24 bytes per float object plus 8 bytes per list pointer, roughly 4x more memory.
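A rough sketch of both points, assuming NumPy is installed and a 64-bit CPython build: the array is pre-allocated with np.zeros and filled by index, and its footprint is compared with an equivalent list of float objects. The list figure is an estimate built from sys.getsizeof.

```python
import sys

import numpy as np

n = 1_000

# Pre-allocate a float64 array, then fill it by index.
arr = np.zeros(n, dtype=np.float64)
for i in range(n):
    arr[i] = i * 0.5

# The same values held as a list of separate Python float objects.
values = [i * 0.5 for i in range(n)]

# nbytes counts only the element buffer: 8 bytes per float64.
array_bytes = arr.nbytes

# Estimate for the list: the pointer array plus each float object.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print(f"array: {array_bytes} bytes, list: {list_bytes} bytes")
```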
array.array for Typed Arrays
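The standard-library array module stores values of a single C type in a compact buffer. A minimal sketch of pre-allocating one, assuming a known length n: a one-element array is repeated n times, much like [0] * n for lists.

```python
from array import array

n = 1_000

# "d" is the typecode for C double (8 bytes per item).
floats = array("d", [0.0]) * n
floats[0] = 3.14

# "q" is the typecode for a signed 64-bit integer.
ints = array("q", [0]) * n
ints[-1] = 42

print(len(floats), floats.itemsize, floats[0])
print(len(ints), ints.itemsize, ints[-1])
```

Unlike a list, a typed array rejects values of the wrong type, for example assigning a string to a slot in floats raises TypeError.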
bytearray for Byte Buffers
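For raw bytes, bytearray(size) creates a mutable buffer of size zero bytes in one step. The sketch below assumes a fixed 1024-byte buffer and writes into it with equal-length slice assignment, which changes the contents but not the length.

```python
size = 1024

# bytearray(size) allocates `size` bytes, all initialized to zero.
buf = bytearray(size)

# Assigning to a slice of the same length overwrites in place.
buf[0:5] = b"hello"

# A memoryview writes into the same buffer without copying it.
view = memoryview(buf)
view[5:7] = b"!!"

print(bytes(buf[:7]))
print(len(buf))
```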
When Pre-allocation Matters
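One case where pre-allocation is needed for correctness rather than speed is when results arrive out of order and must land at specific positions, since append() can only add to the end. The sketch below is an illustrative assumption, and place_by_index is a hypothetical helper, not something from a library.

```python
def place_by_index(pairs, n):
    """Place each (index, value) pair into its slot; pairs may arrive in any order."""
    result = [None] * n
    for index, value in pairs:
        result[index] = value
    return result


# Results arriving out of order, e.g. from tasks that finish at different times.
pairs = [(2, "c"), (0, "a"), (3, "d"), (1, "b")]
ordered = place_by_index(pairs, 4)
print(ordered)
```

Any slot that never receives a value stays None, which makes missing results easy to detect afterwards.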
Common Pitfalls
- Using [[]] * n for a list of lists: [[]] * n creates n references to the same list object. Modifying one modifies all: a = [[]] * 3; a[0].append(1) gives [[1], [1], [1]]. Use [[] for _ in range(n)] to create independent lists.
- Pre-allocating when a list comprehension would be simpler: result = [None] * n followed by a loop is more verbose and often slower than result = [f(i) for i in range(n)]. List comprehensions are optimized in CPython and are the idiomatic way to build lists.
- Using list.append() in performance-critical NumPy code: Appending to a Python list and converting to NumPy at the end (np.array(lst)) is much slower than pre-allocating a NumPy array with np.empty(n) and filling it with index assignment. The list approach creates Python objects that NumPy must then unbox.
- Confusing np.empty with np.zeros: np.empty allocates memory without initializing it, so the array contains whatever was previously in that memory location. This is faster than np.zeros but dangerous if you forget to fill every element. Use np.zeros unless you are certain every element will be written before it is read.
- Over-optimizing small lists: For lists under 10,000 elements, the difference between append and pre-allocation is microseconds. Python's over-allocation strategy handles small lists efficiently. Focus optimization efforts on the actual bottleneck, not on pre-allocating a list of 100 items.
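The shared-reference pitfall in the first item is easy to reproduce. The short sketch below contrasts the two constructions and applies the same fix to a two-dimensional grid; the variable names are illustrative only.

```python
# Pitfall: the outer list holds three references to one inner list.
shared = [[]] * 3
shared[0].append(1)
print(shared)  # [[1], [1], [1]]

# Fix: the comprehension creates a new inner list on every iteration.
independent = [[] for _ in range(3)]
independent[0].append(1)
print(independent)  # [[1], [], []]

# The same rule applies to a pre-allocated 2D grid:
# repeat the immutable 0 within a row, but build each row separately.
rows, cols = 2, 3
grid = [[0] * cols for _ in range(rows)]
grid[0][0] = 9
print(grid)  # [[9, 0, 0], [0, 0, 0]]
```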
Summary
- Use [value] * n to pre-allocate a Python list with a known size
- Use list comprehensions ([f(i) for i in range(n)]) for the most Pythonic and often fastest approach
- Use numpy.zeros(n) or numpy.empty(n) for numeric arrays with fixed memory per element
- Python lists grow by about 12.5% each time, so append() is already amortized O(1)
- Pre-allocation provides measurable speedup only for very large lists (millions of elements)

