NumPy
arrays
Python programming
data manipulation
array concatenation

Concatenate a NumPy array to another NumPy array

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Array concatenation in NumPy is simple when shapes align, and frustrating when they do not. Most runtime errors come from misunderstanding axis rules or using stack where concatenate is needed. A reliable approach is to inspect shapes first, then choose the API that matches the intended dimensional result.

Understand the Shape Rule

np.concatenate joins arrays along an existing axis. Every non-target axis must match exactly.

python
1import numpy as np
2
3a = np.array([[1, 2], [3, 4]])
4b = np.array([[5, 6]])
5
6print(a.shape)  # (2, 2)
7print(b.shape)  # (1, 2)

Because column count matches, row-wise concatenation is valid.

Concatenate by Rows and Columns

Use axis zero for adding rows and axis one for adding columns in two-dimensional arrays.

python
1rows_added = np.concatenate([a, b], axis=0)
2print(rows_added)
3
4col_block = np.array([[7], [8]])
5cols_added = np.concatenate([a, col_block], axis=1)
6print(cols_added)

If shape mismatch occurs, print each shape before the operation so errors are obvious.

Use stack When You Need a New Axis

np.stack does not extend an existing axis. It inserts a new dimension.

python
1x = np.array([1, 2, 3])
2y = np.array([4, 5, 6])
3
4stacked = np.stack([x, y], axis=0)
5print(stacked)
6print(stacked.shape)  # (2, 3)

Choose stack for batching separate samples and concatenate for extending one dimension of existing layout.

One-Dimensional Helpers and Subtle Differences

For one-dimensional arrays, np.concatenate and np.hstack often produce identical output. For higher-dimensional arrays, helper behavior can differ, so verify shape explicitly.

python
1p = np.array([1, 2, 3])
2q = np.array([4, 5])
3
4print(np.concatenate([p, q]))
5print(np.hstack([p, q]))

Relying on intuition across dimensions causes subtle bugs in reusable utility code.

Avoid Repeated Concatenation in Loops

Concatenating inside loops reallocates arrays repeatedly and can become very slow. Collect chunks first, concatenate once.

python
1chunks = []
2for i in range(5):
3    chunks.append(np.full((2, 2), i))
4
5result = np.concatenate(chunks, axis=0)
6print(result.shape)

This pattern is both faster and easier to reason about.

Guard Shape Contracts Explicitly

If arrays come from external stages, validate dimensions before concatenation and raise clear errors.

python
1def concat_rows(arrays):
2    if not arrays:
3        raise ValueError("arrays must not be empty")
4
5    col_count = arrays[0].shape[1]
6    for idx, arr in enumerate(arrays):
7        if arr.ndim != 2:
8            raise ValueError(f"array {idx} must be 2D")
9        if arr.shape[1] != col_count:
10            raise ValueError(f"array {idx} has column mismatch")
11
12    return np.concatenate(arrays, axis=0)

Shape checks near boundary points save debugging time later.

Watch Dtype Promotion

Concatenation can promote dtype unexpectedly, changing memory use and numeric behavior.

python
1x = np.array([1, 2, 3], dtype=np.int32)
2y = np.array([0.5, 1.5], dtype=np.float64)
3z = np.concatenate([x, y])
4
5print(z.dtype)  # float64

If stable dtype is required, cast inputs explicitly before concatenating.

Practical Pipeline Pattern

In feature pipelines, a good pattern is batch-level collection followed by one-stage concatenation, then immediate shape assertion.

python
batches = [np.random.randn(128, 32) for _ in range(10)]
features = np.concatenate(batches, axis=0)
assert features.shape == (1280, 32)

This keeps pipeline stages deterministic and test-friendly.

Concatenation Copies and Memory Layout

np.concatenate creates a new array, so it is not an in-place append. For large tensors, that copy cost matters in both memory and time. If you need repeated growth semantics, collect blocks first and concatenate once at stage boundaries.

python
1left = np.arange(6).reshape(2, 3)
2right = np.arange(6, 12).reshape(2, 3)
3joined = np.concatenate([left, right], axis=0)
4print(joined.flags["C_CONTIGUOUS"])

Checking memory layout is useful when downstream native code expects contiguous arrays.

Common Pitfalls

  • Choosing the wrong axis and producing an unexpected layout.
  • Using stack when concatenate is intended.
  • Concatenating inside tight loops and causing large reallocation cost.
  • Ignoring dtype promotion and getting silent precision changes.
  • Skipping shape validation for upstream-generated arrays.

Summary

  • Check shapes first and ensure non-target axes match.
  • Use concatenate to extend existing axes.
  • Use stack only when you need an additional dimension.
  • Prefer single concatenation after collecting chunks.
  • Add shape and dtype checks for stable production behavior.

Course illustration
Course illustration

All Rights Reserved.