collinear triples
array mathematics
computational geometry
collinearity in arrays
algorithm challenges

Count Number of Triples in an array that are collinear

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

To count collinear triples, you need to count how many groups of three points lie on the same straight line. The brute-force solution checks every triple directly in O(n^3), but a more useful approach fixes one anchor point at a time, groups other points by slope, and then counts combinations.

Collinearity Test for Three Points

Three points are collinear when the signed area of the triangle they form is zero. For points (x1, y1), (x2, y2), and (x3, y3), the test is:

python
1def are_collinear(p1, p2, p3):
2    x1, y1 = p1
3    x2, y2 = p2
4    x3, y3 = p3
5    return (x2 - x1) * (y3 - y1) == (x3 - x1) * (y2 - y1)

This avoids division, so it is safer than comparing floating-point slopes directly.

Brute Force Baseline

The direct solution tries every triple:

python
1from itertools import combinations
2
3
4def count_collinear_bruteforce(points):
5    total = 0
6    for a, b, c in combinations(points, 3):
7        if are_collinear(a, b, c):
8            total += 1
9    return total

This is easy to understand and correct, but it runs in O(n^3), which becomes expensive quickly.

A Better Counting Idea

Instead of testing every triple explicitly, fix one point as an anchor and group all other points by the slope they make with that anchor.

If m other points share the same slope relative to the anchor, then they contribute:

text
C(m, 2)

collinear triples involving that anchor, because any two of those m points combined with the anchor form a triple on the same line.

The catch is that each triple gets counted once from each of its three points as anchor, so the final total must be divided by 3.

Normalized Slope Representation

You should not use raw floating-point slopes as dictionary keys. Instead, normalize the direction vector with a greatest-common-divisor reduction.

python
1from collections import defaultdict
2from math import gcd
3
4
5def normalized_slope(p1, p2):
6    dx = p2[0] - p1[0]
7    dy = p2[1] - p1[1]
8
9    if dx == 0:
10        return (0, 1)
11    if dy == 0:
12        return (1, 0)
13
14    g = gcd(dx, dy)
15    dx //= g
16    dy //= g
17
18    if dx < 0:
19        dx = -dx
20        dy = -dy
21
22    return (dx, dy)

This makes equivalent slopes hash to the same key, including negative directions.

An O(n^2) Counting Approach

python
1from collections import defaultdict
2from math import comb, gcd
3
4
5def normalized_slope(p1, p2):
6    dx = p2[0] - p1[0]
7    dy = p2[1] - p1[1]
8
9    if dx == 0:
10        return (0, 1)
11    if dy == 0:
12        return (1, 0)
13
14    g = gcd(dx, dy)
15    dx //= g
16    dy //= g
17
18    if dx < 0:
19        dx = -dx
20        dy = -dy
21
22    return (dx, dy)
23
24
25def count_collinear_triples(points):
26    total = 0
27    n = len(points)
28
29    for i in range(n):
30        slope_counts = defaultdict(int)
31
32        for j in range(n):
33            if i == j:
34                continue
35            slope = normalized_slope(points[i], points[j])
36            slope_counts[slope] += 1
37
38        for count in slope_counts.values():
39            if count >= 2:
40                total += comb(count, 2)
41
42    return total // 3
43
44
45points = [(0, 0), (1, 1), (2, 2), (0, 1)]
46print(count_collinear_triples(points))

This runs in O(n^2) slope grouping time plus hashing overhead, which is much better than brute force for larger inputs.

Why Divide by Three

Suppose points A, B, and C are collinear. The anchor method counts them:

  • once with A as anchor
  • once with B as anchor
  • once with C as anchor

So every true triple is counted three times. Dividing by 3 fixes that overcounting.

For lines with more than three points, the same correction still works because the anchor-based combination count overcounts every triple exactly three times.

Common Pitfalls

  • Using floating-point slopes directly and getting precision bugs.
  • Forgetting to normalize slope direction, which splits equivalent lines into different hash keys.
  • Counting anchor-based combinations and forgetting to divide the final answer by 3.
  • Assuming the brute-force solution is acceptable for large n just because the collinearity test itself is cheap.
  • Ignoring duplicate points, which may require separate handling depending on the problem definition.

Summary

  • Three points are collinear when the triangle area test evaluates to zero.
  • Brute force works in O(n^3) by checking every triple directly.
  • A more efficient method fixes one anchor, groups other points by normalized slope, and counts combinations.
  • Use reduced integer slope pairs instead of floating-point values.
  • The anchor-based total must be divided by 3 because each triple is counted once from each of its three points.

Course illustration
Course illustration

All Rights Reserved.