Challenge,how to implement an algorithm for six degree of separation?

Algorithm Implementation

Six Degrees of Separation

Network Theory

Graph Theory

Social Networks

Challenge,how to implement an algorithm for six degree of separation?

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

The six degrees of separation problem is a shortest-path problem on an unweighted graph. Each person is a node, each acquaintance relationship is an edge, and the practical question is whether two nodes are connected by a path of length six or less and, often, what that path is.

Model the Network as a Graph

The first design choice is the graph representation. For large sparse social graphs, an adjacency list is the standard choice because most people are connected to a small fraction of the whole network.

python

1graph = {
2    "Alice": ["Bob", "Carol"],
3    "Bob": ["Alice", "Dave"],
4    "Carol": ["Alice", "Eve"],
5    "Dave": ["Bob"],
6    "Eve": ["Carol"],
7}

This representation makes neighbor lookup cheap and works well with breadth-first search.

Use Breadth-First Search for the Shortest Path

Because every edge has the same cost, breadth-first search, usually abbreviated BFS, finds the shortest connection path in terms of number of relationships.

python

1from collections import deque
2
3
4def shortest_path(graph, start, target):
5    queue = deque([(start, [start])])
6    visited = {start}
7
8    while queue:
9        node, path = queue.popleft()
10
11        if node == target:
12            return path
13
14        for neighbor in graph.get(node, []):
15            if neighbor not in visited:
16                visited.add(neighbor)
17                queue.append((neighbor, path + [neighbor]))
18
19    return None
20
21
22print(shortest_path(graph, "Alice", "Eve"))

If the returned path has length 7 nodes, that means six edges of separation. If the returned path is longer, the two people are connected but not within six degrees.

Check the Six-Degree Limit Explicitly

If the question is only whether two people are within six degrees, you do not need to explore the entire component. Stop once depth exceeds six.

python

1from collections import deque
2
3
4def within_six_degrees(graph, start, target):
5    queue = deque([(start, 0)])
6    visited = {start}
7
8    while queue:
9        node, depth = queue.popleft()
10
11        if node == target:
12            return True
13
14        if depth == 6:
15            continue
16
17        for neighbor in graph.get(node, []):
18            if neighbor not in visited:
19                visited.add(neighbor)
20                queue.append((neighbor, depth + 1))
21
22    return False

This bounded BFS is useful when you care about the six-degree rule specifically and want to keep the search tighter.

Use Bidirectional BFS for Large Graphs

For very large social graphs, standard BFS can still expand too many nodes. A common optimization is bidirectional BFS: search forward from the start and backward from the target at the same time until the frontiers meet.

This often reduces the explored search space dramatically because you are effectively growing two shallow trees instead of one large one.

The idea is:

keep a frontier from the start node
keep a frontier from the target node
alternate expansions
stop when the frontiers intersect

That is usually the right next step when the graph is large and queries are frequent.

Think About Data Quality, Not Just the Algorithm

Real social networks are messy. Duplicate identities, asymmetric edges, missing relationships, and stale data all affect the result.

For example, if your graph is supposed to be undirected, but one side of the relationship is missing, BFS will under-report connectivity. So before optimizing the search, confirm the graph semantics are correct:

is the edge directed or undirected
are duplicate people merged correctly
are isolated nodes expected
is the graph updated consistently

Many "algorithm problems" in production are actually graph-construction problems.

Complexity and Practical Limits

A BFS traversal runs in O(V + E) time in the explored region, where V is the number of visited nodes and E is the number of traversed edges. For a one-off query on a sparse graph, this is often fine.

For repeated queries at scale, you may want:

bidirectional BFS
graph partitioning or indexing
caching popular path queries
precomputed neighborhoods for small radii

But the baseline algorithm is still BFS.

Common Pitfalls

Using depth-first search for shortest path in an unweighted graph gives the wrong tool for the job. BFS is the correct baseline.
Forgetting a visited set can cause cycles and explosive repeated work.
Treating a directed graph as if it were undirected changes the meaning of connection distance completely.
Exploring the entire graph when you only care about six hops wastes time. Stop expanding past depth six if that is the actual rule.
Debugging the search before validating the graph data can hide the real issue when relationships are incomplete or inconsistent.

Summary

Six degrees of separation is a shortest-path problem on an unweighted graph.
Represent the social network as an adjacency list and use BFS to find the minimum number of hops.
If you only care about the six-hop rule, bound the search depth at six.
For very large graphs, bidirectional BFS is a strong optimization.
Correct graph construction matters as much as the search algorithm itself.