Algorithm to count occurrences of a matrix inside a larger one

matrix processing

algorithm design

pattern matching

computational mathematics

matrix operations

Algorithm to count occurrences of a matrix inside a larger one

Master System Design with Codemia

Enhance your system design skills with over 120 practice problems, detailed solutions, and hands-on exercises.

Start Practicing Learn More

Introduction

Counting how many times a small matrix appears inside a larger matrix is a 2D pattern-matching problem. A direct brute-force approach works for modest sizes and is usually easiest to implement correctly. For larger inputs, performance can be improved using rolling hashes or convolution-like techniques. The right choice depends on matrix size, value range, and whether approximate matches are allowed.

Core Sections

1) Brute-force sliding window

The basic idea is to slide the small matrix over every possible top-left position in the large matrix and compare all entries.

python

1def count_occurrences(big, small):
2    R, C = len(big), len(big[0])
3    r, c = len(small), len(small[0])
4
5    if r > R or c > C:
6        return 0
7
8    total = 0
9    for i in range(R - r + 1):
10        for j in range(C - c + 1):
11            match = True
12            for x in range(r):
13                for y in range(c):
14                    if big[i + x][j + y] != small[x][y]:
15                        match = False
16                        break
17                if not match:
18                    break
19            if match:
20                total += 1
21    return total

This is O((R-r+1)(C-c+1)rc) time and O(1) extra space.

2) Early rejection optimization

A simple speed-up is checking a few sentinel values first (for example corners) before full compare.

python

1if big[i][j] != small[0][0]:
2    continue
3if big[i + r - 1][j + c - 1] != small[r - 1][c - 1]:
4    continue

This can reduce full comparisons when mismatches are common.

3) Rolling hash for larger matrices

For large problems, compute hash for every r x c window and compare against small matrix hash. On hash match, do exact verification to avoid collision issues.

python

1# Conceptual flow
2# 1) hash(small)
3# 2) rolling_hash over big windows
4# 3) verify exact equality on hash hit

2D rolling hash reduces average work significantly for large dense inputs.

4) Overlap behavior and correctness

Define whether overlapping matches count. Most implementations count overlaps naturally.

Example:

big: [[1,1,1],[1,1,1]]
small: [[1,1]]
count with overlap = 4

Make this explicit in tests to avoid ambiguity.

Validation and Production Readiness

After implementing any fix or pattern from this topic, validate behavior using a repeatable workflow rather than ad hoc spot checks. The most reliable process has three stages: reproduce baseline behavior, apply one focused change, then verify both expected and adjacent scenarios. This avoids false confidence from a single green run and helps isolate which change actually solved the problem.

A practical command-driven template:

bash

1# 1) capture baseline output/state
2./run_case.sh > before.txt
3
4# 2) apply one focused change from this guide
5# edit code/config and keep the diff minimal
6
7# 3) verify behavior and compare outputs
8./run_case.sh > after.txt
9diff -u before.txt after.txt

If your project includes automated tests, convert the original failure into a regression test immediately. This is the fastest way to prevent the same issue from reappearing during later refactors, dependency upgrades, or environment changes.

bash

1# example quality gate sequence
2./lint.sh
3./test.sh
4./smoke.sh

Also validate edge cases explicitly. Many production defects occur not on the nominal path, but on boundary inputs such as empty collections, null/none values, unusual encodings, or large payloads. Define a compact table of edge scenarios and expected outcomes so reviewers can reproduce your checks quickly.

Before rollout, confirm environment parity. A fix that works in local development can fail in staging or production when runtime versions, OS behavior, file systems, networking, or resource limits differ. Capture version metadata and infrastructure assumptions in your PR or runbook.

bash

1# capture runtime context (example)
2python --version
3node --version
4dotnet --info

Finally, define rollback criteria before deployment. If metrics or logs indicate regressions, teams should know exactly which change to revert and what signals trigger that decision. This operational discipline turns one-off troubleshooting into a maintainable engineering practice and significantly reduces incident recovery time.

Common Pitfalls

Forgetting to handle the case where small matrix is larger than big matrix.
Breaking loops incorrectly and counting partial matches.
Assuming non-overlapping counting when requirements expect overlaps.
Using hash-only comparison without exact verification on collisions.
Skipping edge-case tests such as 1x1 matrices and repeated values.

Summary

The simplest and most reliable solution is sliding-window brute force with early exits. For larger workloads, add rolling-hash filtering plus exact verification. Clarify overlap semantics and test boundaries carefully to keep matrix occurrence counts correct and predictable.