Rounding
Python
List
Nearest Value
Programming

Rounding a list of values to the nearest value from another list in python

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Mapping each value in one list to the nearest value in a reference list is a common quantization task in analytics, pricing, and sensor processing. The key choices are performance and tie-breaking behavior. A clear implementation should define both explicitly so results stay stable across data sizes and code revisions.

For small inputs, the simplest method is often best.

python
1
2def nearest_value(value, refs):
3    return min(refs, key=lambda r: abs(value - r))
4
5
6def map_nearest(values, refs):
7    return [nearest_value(v, refs) for v in values]
8
9
10values = [2.2, 7.8, 15.1]
11refs = [0, 5, 10, 20]
12print(map_nearest(values, refs))

This is readable and correct, but time complexity is O(len(values) * len(refs)).

Faster Lookup with Sorted References and bisect

If reference list is reused many times, sort it once and use binary search.

python
1from bisect import bisect_left
2
3
4def nearest_sorted(refs_sorted, v):
5    i = bisect_left(refs_sorted, v)
6    if i == 0:
7        return refs_sorted[0]
8    if i == len(refs_sorted):
9        return refs_sorted[-1]
10
11    left = refs_sorted[i - 1]
12    right = refs_sorted[i]
13    if abs(v - left) <= abs(v - right):
14        return left
15    return right
16
17
18def map_nearest_fast(values, refs):
19    refs_sorted = sorted(refs)
20    return [nearest_sorted(refs_sorted, v) for v in values]
21
22
23print(map_nearest_fast(values, refs))

This reduces per-value lookup to logarithmic time after sorting.

Decide Tie-Breaking Policy Explicitly

When a value sits exactly between two references, decide deterministic behavior.

Common policies:

  • lower neighbor wins,
  • upper neighbor wins,
  • custom domain-specific rule.

Lower-wins tie handling in previous code uses <= on left distance. If you want upper-wins, change the comparison.

python
1# upper-wins tie
2if abs(v - left) < abs(v - right):
3    return left
4return right

Document this rule because tie behavior can affect downstream business logic.

Vectorized Option with NumPy for Large Arrays

For numeric workloads, NumPy can be much faster than Python loops.

python
1import numpy as np
2
3
4def map_nearest_numpy(values, refs):
5    values_arr = np.asarray(values)
6    refs_arr = np.asarray(sorted(refs))
7
8    idx = np.searchsorted(refs_arr, values_arr, side="left")
9    idx = np.clip(idx, 1, len(refs_arr) - 1)
10
11    left = refs_arr[idx - 1]
12    right = refs_arr[idx]
13
14    choose_left = np.abs(values_arr - left) <= np.abs(values_arr - right)
15    out = np.where(choose_left, left, right)
16    return out.tolist()
17
18
19print(map_nearest_numpy(values, refs))

Vectorization is useful when values is large and numeric types are consistent.

Input Validation and Robustness

Defensive checks prevent runtime surprises.

python
1
2def validate_inputs(values, refs):
3    if not refs:
4        raise ValueError("refs must not be empty")
5    if any(r is None for r in refs):
6        raise ValueError("refs contains None")

Also decide behavior for NaN values. Many pipelines either drop them or map them to a sentinel bucket.

Testing Cases You Should Include

Minimum tests:

  • empty values list,
  • single reference value,
  • value below min reference,
  • value above max reference,
  • exact midpoint tie,
  • negative and positive number mix.

Example assertions:

python
assert map_nearest_fast([], [0, 10]) == []
assert map_nearest_fast([5], [0, 10])[0] == 0  # lower-wins tie policy

These tests lock expected behavior and prevent accidental policy drift.

Handling Domain-Specific Constraints

In production systems, nearest value is often constrained by policy. For example, pricing may require rounding only upward to available tiers, while sensor normalization may require downward clipping for safety.

Upward-only example:

python
1from bisect import bisect_left
2
3def nearest_upward(refs_sorted, v):
4    i = bisect_left(refs_sorted, v)
5    if i >= len(refs_sorted):
6        return refs_sorted[-1]
7    return refs_sorted[i]

Downward-only variant uses bisect_left with index minus one and boundary checks. Defining these variants explicitly avoids accidental misuse of symmetric nearest logic in places where policy requires directional rounding.

When teams share rounding utilities across products, clear naming conventions and policy-specific helpers prevent subtle billing and analytics inconsistencies.

Common Pitfalls

  • Forgetting to sort references before binary-search lookup.
  • Leaving tie behavior implicit and getting inconsistent business outcomes.
  • Re-sorting references for every value instead of once per batch.
  • Ignoring empty reference list validation.
  • Mixing incompatible numeric types without explicit conversion.

Summary

  • Direct min search is clear and good for small datasets.
  • Sorted references plus bisect scale better for repeated lookups.
  • Tie-breaking policy must be explicit and documented.
  • NumPy vectorization is effective for large numeric arrays.
  • Add input validation and edge-case tests for stable production behavior.

Course illustration
Course illustration

All Rights Reserved.