Numpy
Python
Array Normalization
Data Processing
Machine Learning

How to normalize a NumPy array to within a certain range?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

To normalize a NumPy array into a chosen range, the usual technique is min-max scaling. You shift the data so the original minimum becomes the lower bound, then rescale the spread so the original maximum becomes the upper bound.

The Standard Min-Max Formula

If the array has minimum old_min and maximum old_max, and you want a new range from new_min to new_max, the formula is:

new_min + (x - old_min) * (new_max - new_min) / (old_max - old_min)

In NumPy, that becomes straightforward code:

python
1import numpy as np
2
3arr = np.array([2.0, 4.0, 6.0, 8.0], dtype=float)
4new_min, new_max = 0.0, 1.0
5
6old_min = arr.min()
7old_max = arr.max()
8
9normalized = new_min + (arr - old_min) * (new_max - new_min) / (old_max - old_min)
10print(normalized)

This maps the input to the interval [0, 1], but the same formula works for any numeric destination range.

Normalize to Another Range

For example, to scale the same array into [-1, 1]:

python
1import numpy as np
2
3arr = np.array([2.0, 4.0, 6.0, 8.0], dtype=float)
4new_min, new_max = -1.0, 1.0
5
6old_min = arr.min()
7old_max = arr.max()
8
9scaled = new_min + (arr - old_min) * (new_max - new_min) / (old_max - old_min)
10print(scaled)

Only the target bounds change. The scaling logic stays the same.

Handle Constant Arrays Explicitly

If every value in the array is the same, then old_max - old_min equals zero and the usual formula divides by zero. You should handle that case directly:

python
1import numpy as np
2
3
4def minmax_scale(arr, new_min=0.0, new_max=1.0):
5    arr = np.asarray(arr, dtype=float)
6    old_min = arr.min()
7    old_max = arr.max()
8
9    if old_max == old_min:
10        return np.full_like(arr, fill_value=new_min)
11
12    return new_min + (arr - old_min) * (new_max - new_min) / (old_max - old_min)
13
14
15print(minmax_scale([5.0, 5.0, 5.0]))
16print(minmax_scale([2.0, 4.0, 6.0], -1.0, 1.0))

There is no single universally correct output for a constant array, but returning the lower bound or the midpoint is much better than allowing a divide-by-zero warning.

Normalize Along an Axis

Sometimes you do not want to scale the entire array globally. In machine learning, it is common to normalize each feature column independently.

python
1import numpy as np
2
3arr = np.array([
4    [1.0, 10.0],
5    [2.0, 20.0],
6    [3.0, 30.0]
7], dtype=float)
8
9col_min = arr.min(axis=0)
10col_max = arr.max(axis=0)
11
12normalized = (arr - col_min) / (col_max - col_min)
13print(normalized)

Axis-aware scaling matters because global normalization can mix unrelated feature ranges together.

When Min-Max Scaling Is Not the Best Choice

Min-max normalization is useful for bounded inputs, visualization, and models that benefit from fixed numeric ranges. But it is sensitive to outliers. If a few extreme values dominate the range, most other values may get compressed into a narrow band.

In those cases, standardization or robust scaling may be more appropriate. So the right question is not only "how do I normalize" but also "is min-max normalization the right preprocessing step for this data."

Common Pitfalls

  • Forgetting the constant-array case and dividing by zero.
  • Normalizing the whole matrix globally when each feature should be scaled separately.
  • Letting extreme outliers determine the scale for the entire dataset.
  • Using different min and max values for training and inference by accident.
  • Assuming min-max scaling is always the right preprocessing choice.

Summary

  • Min-max scaling maps an array from its original range into a chosen target range.
  • The same formula works for [0, 1], [-1, 1], or any other interval.
  • Constant arrays need explicit handling because the denominator becomes zero.
  • Feature-wise normalization often matters more than global normalization.
  • Pick min-max scaling because it suits the data and model, not just because it is easy to code.

Course illustration
Course illustration

All Rights Reserved.