How many kinds of Distance Function can we use?

Introduction

There is no fixed number of distance functions you can use. The real question is which distance function matches the structure of your data and the assumptions of your algorithm. In mathematics there are infinitely many possible metrics, and in applied work there are also many non-metric similarity measures that people informally call distances.

What Makes Something a True Distance Metric

A true distance metric must satisfy four properties for points x, y, and z:

non-negativity: the distance is never negative
identity: the distance is zero if and only if the points are the same
symmetry: d(x, y) = d(y, x)
triangle inequality: d(x, z) <= d(x, y) + d(y, z)

If a function breaks one of those rules, it may still be useful, but it is not a metric in the strict mathematical sense.

That distinction matters because some algorithms assume a real metric and may behave poorly if you give them a measure that does not satisfy those properties.
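The four properties can be spot-checked numerically on a handful of points. The sketch below is an illustration, not a proof: `check_metric_axioms` is a hypothetical helper written for this example, and the sample points are arbitrary. Passing on a sample does not establish that a function is a metric, but a single failing triple does show that it is not one. The example compares Euclidean distance with squared Euclidean distance, which fails the triangle inequality.

```python
import itertools
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def squared_euclidean(a, b):
    """Squared straight-line distance; useful, but not a metric."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def check_metric_axioms(d, points, tol=1e-12):
    """Return the names of the axioms that d violates on the given points."""
    violated = set()
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < -tol:
            violated.add("non-negativity")
        if (abs(d(x, y)) <= tol) != (x == y):
            violated.add("identity")
        if abs(d(x, y) - d(y, x)) > tol:
            violated.add("symmetry")
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            violated.add("triangle inequality")
    return violated


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
print(check_metric_axioms(euclidean, points))          # no violations found
print(check_metric_axioms(squared_euclidean, points))  # triangle inequality fails
```

The squared version fails on the collinear points (0, 0), (1, 0), (2, 0): going directly costs 4, while the detour through the middle point costs 1 + 1 = 2.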

Common Distance Functions

Several distance functions appear repeatedly in data analysis and machine learning.

Euclidean distance is the ordinary straight-line distance and is common for continuous numeric features.

Manhattan distance adds absolute coordinate differences and is often useful when movement or difference happens along independent axes.

Chebyshev distance uses the largest coordinate difference and is useful in grid-like motion problems.

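Minkowski distance generalizes the three metrics above with a parameter p: p = 1 gives Manhattan distance, p = 2 gives Euclidean distance, and the limit as p goes to infinity gives Chebyshev distance.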
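To make the role of p concrete, here is a minimal sketch that implements the Minkowski formula directly in plain Python. The function name `minkowski` and the sample vectors are illustrative choices for this example; for real work SciPy provides its own `scipy.spatial.distance.minkowski`.

```python
import math


def minkowski(a, b, p):
    """Minkowski distance of order p (p >= 1) between two sequences."""
    if p == math.inf:
        # Limiting case: the largest absolute coordinate difference.
        return max(abs(ai - bi) for ai, bi in zip(a, b))
    return sum(abs(ai - bi) ** p for ai, bi in zip(a, b)) ** (1.0 / p)


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]

print("p=1   (Manhattan):", minkowski(x, y, 1))
print("p=2   (Euclidean):", minkowski(x, y, 2))
print("p=inf (Chebyshev):", minkowski(x, y, math.inf))
```

As p grows, the largest coordinate difference contributes more and more of the total, which is why the limit is the Chebyshev distance.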

Mahalanobis distance accounts for covariance structure and is helpful when features are correlated.
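The effect of the covariance structure is easiest to see on a small example. The following is a sketch under stated assumptions: the data matrix `X` is made up for illustration, and `mahalanobis` is a local helper that applies the textbook formula with NumPy (SciPy also ships `scipy.spatial.distance.mahalanobis`, which likewise takes the inverse covariance matrix). Two points are placed at the same Euclidean distance from the mean, one along the direction in which the features vary together and one against it.

```python
import numpy as np

# Hypothetical sample: two strongly correlated features.
X = np.array([
    [1.0, 2.1],
    [2.0, 3.9],
    [3.0, 6.2],
    [4.0, 7.8],
    [5.0, 10.1],
])

mean = X.mean(axis=0)
cov = np.cov(X, rowvar=False)   # sample covariance of the columns
cov_inv = np.linalg.inv(cov)


def mahalanobis(u, v, cov_inv):
    """Mahalanobis distance between u and v given an inverse covariance."""
    diff = np.asarray(u) - np.asarray(v)
    return float(np.sqrt(diff @ cov_inv @ diff))


# Both points are the same Euclidean distance from the mean, but only the
# first lies along the direction in which the two features vary together.
along = mean + np.array([1.0, 2.0])
against = mean + np.array([2.0, -1.0])

print("Euclidean:  ", np.linalg.norm(along - mean), np.linalg.norm(against - mean))
print("Mahalanobis:", mahalanobis(along, mean, cov_inv), mahalanobis(against, mean, cov_inv))
```

The point that moves against the correlation comes out much farther away in Mahalanobis terms, because such a combination of values is unusual for this data even though its straight-line offset is the same.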

Hamming distance counts mismatched positions in fixed-length categorical or binary data.

Jaccard distance compares sets or binary feature presence and absence.
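Hamming and Jaccard distance are both simple enough to write out by hand. The sketch below uses plain Python with illustrative function names and inputs. Two conventions are assumptions of this example: Hamming distance is returned as a raw count (SciPy's `scipy.spatial.distance.hamming` returns the proportion of mismatched positions instead), and the Jaccard distance between two empty sets is taken to be 0.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(ai != bi for ai, bi in zip(a, b))


def jaccard_distance(s, t):
    """1 - |intersection| / |union| for two sets."""
    s, t = set(s), set(t)
    if not s and not t:
        return 0.0  # convention assumed here: two empty sets are identical
    return 1.0 - len(s & t) / len(s | t)


print(hamming_distance("karolin", "kathrin"))               # differs at 3 positions
print(jaccard_distance({"a", "b", "c"}, {"b", "c", "d"}))   # 1 - 2/4 = 0.5
```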

Cosine similarity measures the angle between two vectors rather than a distance. People often convert it into a cosine distance, defined as 1 minus the similarity, for text and embedding work, but the result is not a metric in the strict sense.

A Small Python Example

You can compute several of these with NumPy and SciPy.

```python
import numpy as np
from scipy.spatial.distance import euclidean, cityblock, chebyshev, cosine

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])  # same direction as x, twice the length

print("Euclidean:", euclidean(x, y))
print("Manhattan:", cityblock(x, y))   # SciPy calls Manhattan distance "cityblock"
print("Chebyshev:", chebyshev(x, y))
print("Cosine distance:", cosine(x, y))  # 1 - cosine similarity
```

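Even on the same pair of vectors, each measure tells a different story. Here y is exactly 2x, so the cosine distance is 0 (up to floating-point rounding), while the Euclidean distance is about 3.74, the Manhattan distance is 6, and the Chebyshev distance is 3. Each measure emphasizes a different aspect of difference.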

Choosing the Right Distance

The choice depends on data type and task.

Use Euclidean distance when continuous features are on comparable scales and straight-line geometry makes sense.

Use Manhattan distance when axis-by-axis difference matters more than geometric straight-line distance.

Use Hamming or Jaccard for binary or categorical presence-absence style data.

Use cosine-style comparison when direction matters more than magnitude, as in text vectors or embeddings.

Use Mahalanobis distance when correlated numeric features would make plain Euclidean distance misleading.

The important point is that distance is not just a formula. It encodes what “similar” means in your problem.
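One way to see this is to ask which candidate is "nearest" to a query under different measures. The sketch below is a toy example in plain Python; the query, the two candidate points, and their labels are invented for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norms = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return 1.0 - dot / norms


query = (1.0, 1.0)
candidates = {
    "same_direction_far": (3.0, 3.0),
    "different_direction_near": (1.5, 0.2),
}

nearest = {}
for name, d in [("euclidean", euclidean), ("manhattan", manhattan), ("cosine", cosine_distance)]:
    # Pick the candidate label with the smallest distance to the query.
    nearest[name] = min(candidates, key=lambda k: d(query, candidates[k]))
    print(f"{name:>9}: nearest is {nearest[name]}")
```

Euclidean and Manhattan distance both prefer the point that is physically close, while the cosine-based measure prefers the point that lies in the same direction as the query, regardless of how far away it is.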

Scaling Often Matters More Than the Formula

People often debate Euclidean versus Manhattan while ignoring feature scale. If one feature ranges from 0 to 10000 and another ranges from 0 to 1, the large-scale feature can dominate many distance calculations.

That is why normalization or standardization is often required before applying a distance-based method.

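```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales.
X = np.array([
    [1.0, 1000.0],
    [2.0, 1500.0],
    [3.0, 9000.0],
])

# Rescale each column to zero mean and unit variance.
scaled = StandardScaler().fit_transform(X)
print(scaled)
```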

Without scaling, your chosen formula may be mathematically correct but practically unhelpful.
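To see how much scale changes the picture, the following sketch compares pairwise Euclidean distances before and after standardization on the same three rows. It standardizes by hand with NumPy, using the population standard deviation, which is assumed here to match the default behavior of `StandardScaler`; `pairwise_euclidean` is a helper written for this example.

```python
import numpy as np

X = np.array([
    [1.0, 1000.0],
    [2.0, 1500.0],
    [3.0, 9000.0],
])

# Standardize each column: zero mean, unit (population) standard deviation.
scaled = (X - X.mean(axis=0)) / X.std(axis=0)


def pairwise_euclidean(A):
    """Matrix of Euclidean distances between all rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


print("raw:\n", pairwise_euclidean(X).round(2))
print("scaled:\n", pairwise_euclidean(scaled).round(2))
```

On the raw data, the distance between the first two rows is almost exactly 500, the difference in the second feature alone; the first feature is effectively invisible. After standardization both features contribute.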

Metric Versus Similarity Measure

Not everything called a “distance” is a strict metric. Cosine-based measures are a good example. They are often excellent for ranking similar documents or embeddings, even though cosine distance can violate the triangle inequality and is zero for any two vectors that point in the same direction, not only for identical ones.

This is not a problem by itself. It only becomes a problem when the downstream algorithm assumes metric properties that the measure does not satisfy.
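A small counterexample shows where cosine distance stops behaving like a metric. The sketch below, in plain Python with hand-picked vectors, takes cosine distance to mean 1 minus cosine similarity and checks the triangle inequality for three directions that are 45 degrees apart.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return 1.0 - dot / (norm_a * norm_b)


x = (1.0, 0.0)
y = (1.0, 1.0)   # 45 degrees from both x and z
z = (0.0, 1.0)

direct = cosine_distance(x, z)                          # 1 - cos(90 degrees)
detour = cosine_distance(x, y) + cosine_distance(y, z)  # 2 * (1 - cos(45 degrees))

print(direct, detour)
print("triangle inequality holds:", direct <= detour)
```

The direct distance is 1, while the detour through y costs about 0.586, so the triangle inequality fails. An algorithm that prunes its search using that inequality could silently skip valid neighbors with this measure.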

Common Pitfalls

A common mistake is asking for the “best” distance function in the abstract. There is no universally best choice.

Another mistake is using Euclidean distance on mixed or poorly scaled features without preprocessing.

A third issue is treating similarity and distance as interchangeable without checking whether the algorithm expects one or the other.

Finally, do not assume that because two formulas both produce numbers, they encode the same notion of closeness. They may emphasize very different structure in the data.

Summary

There is no fixed number of usable distance functions
A true metric must satisfy non-negativity, identity, symmetry, and triangle inequality
Different data types and tasks call for different distance choices
Feature scaling often matters as much as the distance formula itself
Pick the measure that matches your data semantics, not just the most familiar name