Representing and solving a maze given an image

Maze solving

Image processing

Algorithm design

Computer vision

Pathfinding

Representing and solving a maze given an image

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

Solving a maze from an image is really two problems: converting pixels into a traversable representation, and then running a pathfinding algorithm on that representation. The first step is usually thresholding and grid extraction. The second step is typically BFS for shortest path in an unweighted maze or A* if you want a heuristic-driven search.

Turn the Image into a Binary Maze

Most maze images are black walls on a white background or the reverse. The first job is to reduce the image to passable versus blocked cells.

A simple OpenCV-based example:

python

1import cv2
2import numpy as np
3
4img = cv2.imread("maze.png", cv2.IMREAD_GRAYSCALE)
5_, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
6
7# Convert to 0 for wall, 1 for open cell
8maze = (binary == 255).astype(np.uint8)
9
10print(maze.shape)
11print(maze[:5, :5])

This gives a matrix where 1 means open space and 0 means wall. If your maze colors are inverted, flip the comparison or use THRESH_BINARY_INV.

Reduce Noise Before Solving

Real images often contain blur, anti-aliasing, or small specks. A little preprocessing can make the grid much more reliable.

python

1import cv2
2
3img = cv2.imread("maze.png", cv2.IMREAD_GRAYSCALE)
4blurred = cv2.GaussianBlur(img, (5, 5), 0)
5_, binary = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY)

For noisier inputs, morphological operations can help close tiny gaps or remove isolated dots:

python

kernel = np.ones((3, 3), np.uint8)
clean = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)

The point is to make the passable-cell map structurally accurate before pathfinding begins.

Represent the Maze as a Grid Graph

Once the image is binary, the maze is naturally a grid graph. Every open cell is a node, and its neighbors are the adjacent open cells.

For four-direction movement:

python

1def neighbors(r, c, maze):
2    rows, cols = maze.shape
3    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
4        nr, nc = r + dr, c + dc
5        if 0 <= nr < rows and 0 <= nc < cols and maze[nr, nc] == 1:
6            yield nr, nc

This implicit graph is much cheaper than materializing a separate adjacency list for every pixel.

Solve with Breadth-First Search

If every move costs the same, BFS gives the shortest path in number of steps.

python

1from collections import deque
2
3
4def bfs_path(maze, start, goal):
5    q = deque([start])
6    parent = {start: None}
7
8    while q:
9        cur = q.popleft()
10        if cur == goal:
11            break
12
13        for nxt in neighbors(cur[0], cur[1], maze):
14            if nxt not in parent:
15                parent[nxt] = cur
16                q.append(nxt)
17
18    if goal not in parent:
19        return []
20
21    path = []
22    cur = goal
23    while cur is not None:
24        path.append(cur)
25        cur = parent[cur]
26    path.reverse()
27    return path

You still need valid start and goal coordinates. In simple mazes, those are often the first open cell on the top border and the first open cell on the bottom border.

Detect Entrances Automatically

For many maze images, the entrances are openings on the border.

python

1def find_border_openings(maze):
2    rows, cols = maze.shape
3    openings = []
4
5    for c in range(cols):
6        if maze[0, c] == 1:
7            openings.append((0, c))
8        if maze[rows - 1, c] == 1:
9            openings.append((rows - 1, c))
10
11    for r in range(1, rows - 1):
12        if maze[r, 0] == 1:
13            openings.append((r, 0))
14        if maze[r, cols - 1] == 1:
15            openings.append((r, cols - 1))
16
17    return openings

If there are exactly two openings, they are often start and goal. More complex inputs may need metadata or manual selection.

Draw the Solution Back onto the Image

Once you have the path, visualize it on the original image.

python

1color = cv2.cvtColor(binary, cv2.COLOR_GRAY2BGR)
2
3for r, c in path:
4    color[r, c] = (0, 0, 255)
5
6cv2.imwrite("solved_maze.png", color)

This overlays the solved path in red. For thick walls and large passages, you may instead want to solve on a downsampled grid and then upscale the route.

When BFS Is Not Enough

BFS is correct for unweighted mazes, but not every maze image is best handled at full pixel resolution.

Use A* when:

the maze is large and you want faster focused search
you have a natural heuristic such as Manhattan distance

Use graph compression when:

corridors are long and mostly straight
you want to collapse entire corridor segments into fewer nodes

The best representation depends on image size and maze style.

Common Pitfalls

One common mistake is thresholding the image without checking whether walls and paths were inverted. That makes the solver treat walls as open space.

Another mistake is running pathfinding directly on noisy antialiased pixels, which creates broken corridors and false walls.

Developers also materialize enormous pixel-level graphs when an implicit neighbor function is enough.

Finally, many solvers assume the start and goal are known. In real image inputs, entrance detection often needs explicit logic.

Summary

Solve image mazes in two stages: image-to-grid conversion and pathfinding.
Thresholding and cleanup are critical before graph traversal.
Represent the maze as a grid where open cells are traversable nodes.
BFS is the right baseline for shortest path in an unweighted maze.
Visualize the recovered path on the source image to validate the pipeline.