How can I compare arbitrary version numbers?

Introduction

Comparing version numbers sounds simple until you run into inputs like 1.2, 1.2.0, 2.0-rc1, or 2024.10.15. The right comparison rule depends on the versioning scheme, but most practical solutions follow the same pattern: split the version into meaningful parts, compare numeric pieces numerically, and compare pre-release markers with explicit rules.

Start With the Versioning Rules

Before writing code, decide what counts as the same version and what counts as newer. For example:

  • Is 1.2 the same as 1.2.0?
  • Is 1.0-rc1 older than 1.0?
  • Does build metadata affect precedence?

If your project follows Semantic Versioning, the rules are already defined. If your versions are arbitrary internal strings, you need to define the rules yourself instead of relying on guesswork.
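One way to pin the rules down before writing a comparator is to record them as expected outcomes and check any candidate function against them. The sketch below is only an illustration: `RULES`, `failing_rules`, and `naive_compare` are hypothetical names, and the expectations encode one possible policy rather than a universal standard.

```python
from typing import Callable, List, Tuple

# Hypothetical rule table: (left, right, expected sign of compare(left, right)).
# These expectations are an assumed policy, not a standard.
RULES: List[Tuple[str, str, int]] = [
    ("1.2", "1.2.0", 0),       # a missing trailing segment counts as zero
    ("1.0-rc1", "1.0", -1),    # a pre-release is older than its final release
    ("1.0+build5", "1.0", 0),  # build metadata does not affect precedence
]


def failing_rules(
    compare: Callable[[str, str], int],
) -> List[Tuple[str, str, int, int]]:
    """Return (left, right, expected, actual) for every rule the comparator breaks."""
    failures = []
    for left, right, expected in RULES:
        actual = compare(left, right)
        if actual != expected:
            failures.append((left, right, expected, actual))
    return failures


def naive_compare(a: str, b: str) -> int:
    """Plain string comparison, used here only as a deliberately weak baseline."""
    return (a > b) - (a < b)


# Each rule that plain string comparison violates is printed as a tuple.
for failure in failing_rules(naive_compare):
    print(failure)
```

Plain string comparison breaks all three rules in this table, which is a quick way to see that the policy has to be implemented explicitly.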

Numeric Segment Comparison

The simplest case is dotted numeric versions. Split on dots, convert segments to integers, and compare left to right.

python
def compare_numeric_versions(a: str, b: str) -> int:
    """Return -1, 0, or 1 as a is older than, equal to, or newer than b."""
    left = [int(part) for part in a.split(".")]
    right = [int(part) for part in b.split(".")]

    # Pad the shorter list with zeros so 1.2 and 1.2.0 compare equal.
    max_len = max(len(left), len(right))
    left += [0] * (max_len - len(left))
    right += [0] * (max_len - len(right))

    # Compare segment by segment, left to right.
    for x, y in zip(left, right):
        if x < y:
            return -1
        if x > y:
            return 1
    return 0


print(compare_numeric_versions("1.2", "1.2.0"))   # 0
print(compare_numeric_versions("2.10", "2.2"))    # 1

This already handles two important cases correctly:

  • 2.10 is greater than 2.2 because 10 > 2
  • trailing missing numeric segments are treated as zero
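When the goal is sorting or picking the newest version rather than comparing one pair, a key function can be more convenient than a three-way comparator. The following is a minimal sketch under the same assumptions as above (dotted, purely numeric versions); `numeric_key` is a hypothetical helper name, and stripping trailing zeros is used here as an alternative to padding.

```python
from typing import List, Tuple


def numeric_key(version: str) -> Tuple[int, ...]:
    """Turn "1.2.0" into (1, 2) so equivalent versions share one key."""
    parts = [int(part) for part in version.split(".")]
    # Drop trailing zeros so "1.2" and "1.2.0" produce the same tuple.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


versions: List[str] = ["2.2", "1.2.0", "2.10", "1.2", "1.10.1"]

print(sorted(versions, key=numeric_key))
# ['1.2.0', '1.2', '1.10.1', '2.2', '2.10']
print(max(versions, key=numeric_key))
# 2.10
```

Because Python compares tuples element by element, the key gives the same ordering as zero padding for non-negative segments, and equal versions keep their input order since `sorted` is stable.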

Handling Pre-Release Labels

Pre-release labels such as alpha, beta, and rc should not be compared with naive string ordering alone. You need a rule set.

One common convention is:

  • final release is newer than any pre-release of the same numeric version
  • alpha < beta < rc < final

Here is a simple comparison function for versions shaped like 1.2.0 or 1.2.0-rc1:

python
import re

# Rank of each pre-release label; the empty label means a final release.
PRECEDENCE = {"alpha": 0, "beta": 1, "rc": 2, "": 3}


def parse_version(version: str):
    """Split "1.2.0-rc1" into ([1, 2, 0], "rc", 1)."""
    main, *suffix = version.split("-", 1)
    nums = [int(part) for part in main.split(".")]
    label = ""
    label_num = 0

    if suffix:
        match = re.fullmatch(r"([a-zA-Z]+)(\d*)", suffix[0])
        if match:
            label = match.group(1).lower()
            label_num = int(match.group(2) or 0)
        else:
            # Reject suffixes such as "rc.1" instead of silently
            # treating the version as a final release.
            raise ValueError(f"unsupported pre-release suffix: {suffix[0]!r}")

    return nums, label, label_num


def compare_versions(a: str, b: str) -> int:
    """Return -1, 0, or 1 as a is older than, equal to, or newer than b."""
    nums_a, label_a, num_a = parse_version(a)
    nums_b, label_b, num_b = parse_version(b)

    # Pad the shorter numeric list with zeros so 1.2 equals 1.2.0.
    max_len = max(len(nums_a), len(nums_b))
    nums_a += [0] * (max_len - len(nums_a))
    nums_b += [0] * (max_len - len(nums_b))

    if nums_a != nums_b:
        return -1 if nums_a < nums_b else 1

    # Labels missing from PRECEDENCE (for example "dev") rank below alpha.
    rank_a = PRECEDENCE.get(label_a, -1)
    rank_b = PRECEDENCE.get(label_b, -1)

    if rank_a != rank_b:
        return -1 if rank_a < rank_b else 1

    # Same label: the trailing number decides, so rc1 < rc2.
    if num_a != num_b:
        return -1 if num_a < num_b else 1

    return 0

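This is not a full SemVer implementation: it does not handle dot-separated pre-release identifiers such as rc.1 or build metadata after a + sign. It does show the structure clearly.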
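For comparison, here is a sketch of how the SemVer 2.0.0 precedence rules could be expressed as a sort key. It assumes strictly three numeric components, uses a simplified pattern rather than the official grammar, and `SEMVER_RE` and `semver_key` are hypothetical names chosen for this example.

```python
import re

# Simplified pattern: the official SemVer grammar is stricter
# (for example, it rejects leading zeros in numeric parts).
SEMVER_RE = re.compile(
    r"(\d+)\.(\d+)\.(\d+)"      # major.minor.patch
    r"(?:-([0-9A-Za-z.-]+))?"   # optional pre-release identifiers
    r"(?:\+[0-9A-Za-z.-]+)?"    # optional build metadata, ignored for ordering
)


def semver_key(version: str) -> tuple:
    """Build a tuple that sorts in SemVer precedence order."""
    match = SEMVER_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a SemVer-style version: {version!r}")

    core = tuple(int(match.group(i)) for i in (1, 2, 3))
    prerelease = match.group(4)
    if prerelease is None:
        # Flag 1: a final release sorts after any pre-release of the same core.
        return core, 1, ()

    identifiers = []
    for ident in prerelease.split("."):
        if ident.isdigit():
            # Numeric identifiers compare as integers and rank below text.
            identifiers.append((0, int(ident), ""))
        else:
            # Text identifiers compare by ASCII order.
            identifiers.append((1, 0, ident))
    # Flag 0: pre-release. A longer identifier list wins when prefixes match.
    return core, 0, tuple(identifiers)


candidates = ["1.0.0", "1.0.0-rc.1", "1.0.0-alpha", "1.0.0-beta.11", "1.0.0-beta.2"]
print(sorted(candidates, key=semver_key))
# ['1.0.0-alpha', '1.0.0-beta.2', '1.0.0-beta.11', '1.0.0-rc.1', '1.0.0']
```

Tagging each identifier with 0 or 1 keeps integers and strings from ever being compared directly, which would raise a `TypeError` in Python 3.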

Prefer Existing Parsers When the Ecosystem Has One

If you are comparing versions from a real package ecosystem, using the ecosystem's parser is safer than inventing your own. Python packages, NuGet packages, Maven artifacts, and OS package managers may all have slightly different precedence rules.

That matters because "arbitrary version numbers" is really a requirements question. Two teams may both say "compare versions" while meaning different ordering semantics.

Common Pitfalls

One common mistake is comparing version strings lexicographically. That makes 2.10 look smaller than 2.2, which is wrong for numeric version semantics.

Another issue is silently treating pre-release tags as plain suffix text without deciding how they rank against final releases.

It is also easy to overlook trailing zeros. Many systems consider 1.2 and 1.2.0 equivalent, but a naive segment-by-segment comparison can mishandle that if you do not normalize lengths.
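The first and last of these pitfalls are easy to reproduce in a few lines. This is a small demonstration rather than a recommended implementation, and `pad` is a hypothetical helper introduced only for the example.

```python
# Pitfall: string comparison goes character by character,
# so "2.10" sorts before "2.2" because "1" < "2".
print("2.10" < "2.2")        # True, which is wrong for numeric versions
print((2, 10) < (2, 2))      # False, integers compare as expected

# Pitfall: unequal lengths. Python treats a list that is a strict
# prefix of another as smaller, so 1.2 would look older than 1.2.0.
print([1, 2] == [1, 2, 0])   # False
print([1, 2] < [1, 2, 0])    # True


def pad(parts, length):
    """Return parts extended with zeros up to the given length."""
    return parts + [0] * (length - len(parts))


# After normalizing lengths the two versions compare equal.
print(pad([1, 2], 3) == [1, 2, 0])   # True
```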

Summary

  • Version comparison needs explicit rules before you write code.
  • Numeric segments should be compared numerically, not as plain strings.
  • Trailing missing segments are often treated as zeros.
  • Pre-release labels need defined precedence rather than ad hoc string comparison.
  • If the version source already has a standard parser, use it instead of inventing a new one.
