team formation
skill-based grouping
algorithm design
data science
skill assessment

Algorithm that creates teams based on a numeric skill value

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Introduction

Creating teams from a list of numeric skill values is a balancing problem. You want groups that are fair enough to compete, but you also need an algorithm that is easy to explain and fast enough to run when the roster changes.

Start With the Goal of Balance

A team-building algorithm needs an explicit objective. Common goals are:

  • keep total skill per team close
  • keep the number of players per team equal
  • avoid putting all top players together
  • optionally preserve roles, departments, or availability

If every person has only one numeric score, a practical definition of balance is to minimize the difference between team totals. This is related to partitioning problems, which become expensive to solve exactly as the group grows. In most applications, a greedy approach is a better fit than exhaustive search.

A Greedy Assignment Algorithm

A reliable baseline is:

  1. sort players by skill from highest to lowest
  2. always assign the next player to the currently weakest team
  3. track both team size and total score

That strategy spreads strong players early instead of letting them cluster by accident. A min-heap makes it efficient to repeatedly fetch the lowest-total team.

python
1from heapq import heappop, heappush
2from typing import List, Tuple
3
4Player = Tuple[str, int]
5
6
7def create_balanced_teams(players: List[Player], team_count: int) -> List[List[Player]]:
8    if team_count <= 0:
9        raise ValueError("team_count must be positive")
10    if team_count > len(players):
11        raise ValueError("team_count cannot exceed number of players")
12
13    sorted_players = sorted(players, key=lambda item: item[1], reverse=True)
14
15    heap = []
16    teams: List[List[Player]] = [[] for _ in range(team_count)]
17
18    for team_index in range(team_count):
19        heappush(heap, (0, 0, team_index))
20
21    for player in sorted_players:
22        total_skill, size, team_index = heappop(heap)
23        teams[team_index].append(player)
24        heappush(heap, (total_skill + player[1], size + 1, team_index))
25
26    return teams
27
28
29players = [
30    ("Ava", 95), ("Ben", 88), ("Chen", 84), ("Diya", 82),
31    ("Eli", 76), ("Finn", 72), ("Gita", 70), ("Hana", 65),
32]
33
34for index, team in enumerate(create_balanced_teams(players, 2), start=1):
35    total = sum(skill for _, skill in team)
36    print(f"Team {index}: {team} total={total}")

This does not guarantee the mathematically optimal split, but it performs well and is easy to justify in production systems.

Why Sorting First Matters

If you assign players in random order, the result depends too heavily on input order. Sorting forces the hardest decisions to happen first. The strongest players are distributed before small skill differences dominate the totals.

For example, with scores 95, 94, 60, 59, random order can easily create 95 + 94 on one team and 60 + 59 on another. Sorting and assigning to the weakest team prevents that obvious failure.

A variation called “snake draft” can also work well. After sorting, assign left to right across teams, then right to left on the next pass. It is simple to implement and naturally spreads skill.

python
1
2def snake_draft(players: List[Player], team_count: int) -> List[List[Player]]:
3    teams: List[List[Player]] = [[] for _ in range(team_count)]
4    ordered = sorted(players, key=lambda item: item[1], reverse=True)
5    direction = 1
6    team_index = 0
7
8    for player in ordered:
9        teams[team_index].append(player)
10
11        if direction == 1:
12            if team_index == team_count - 1:
13                direction = -1
14            else:
15                team_index += 1
16        else:
17            if team_index == 0:
18                direction = 1
19            else:
20                team_index -= 1
21
22    return teams

The heap-based method tends to react better when team sizes and totals both matter, while snake draft is easier to explain to non-technical users.

Measuring Whether the Result Is Good

Do not stop at producing teams. Measure the result. At minimum, compute:

  • total skill per team
  • average skill per team
  • difference between strongest and weakest team
  • standard deviation of team totals if you want a single balance score

If one extra person is allowed on some teams, you should also track the largest size gap. A team that is numerically balanced but short one member may still be unfair.

Common Pitfalls

The first pitfall is assuming one number captures all of skill. In real settings, a single rating may hide role differences. Two high scorers who both play the same position can make a team less functional than the totals suggest.

Another mistake is optimizing only for total score and ignoring team size. A team with three strong players can look balanced on paper against four moderate players, but the gameplay or workload may not be comparable.

Some implementations also forget deterministic tie-breaking. If two teams have the same total, always pick the smaller team first or use a stable team index. That makes the algorithm repeatable and easier to test.

Finally, avoid promising “perfectly fair” output from a greedy heuristic. It is fair to say the algorithm produces balanced teams quickly, but exact optimal partitioning is a harder problem.

Summary

  • Treat team creation as a balancing problem with a clearly defined objective.
  • Sorting by skill before assignment greatly improves practical results.
  • A greedy heap-based algorithm is a strong default for equalizing totals.
  • Snake draft is simpler but slightly less adaptive.
  • Evaluate team totals and team sizes after assignment instead of trusting intuition.

Course illustration
Course illustration

All Rights Reserved.