Implementing a Harris corner detector

Introduction

The Harris corner detector is a classic computer-vision algorithm for finding points where image intensity changes strongly in two directions at once. Those points, often called corners or interest points, are useful in tracking, matching, stitching, and geometric alignment.

Core Sections

The intuition behind Harris corners

A flat image patch does not change much if you shift it slightly. An edge patch changes strongly in one direction but not the other. A corner changes strongly in both directions.

The Harris detector measures that idea by looking at image gradients around each pixel and building a small local matrix, often called the structure tensor or second-moment matrix.

For each pixel, you estimate:

horizontal gradient Ix
vertical gradient Iy
local sums of Ix^2, Iy^2, and IxIy

From these values, the detector computes a response score that is large for corners.
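As a rough illustration of the flat / edge / corner distinction, the sketch below sums the gradient products over three synthetic 9 x 9 patches and prints the eigenvalues of the resulting 2 x 2 matrix. The helper name structure_tensor, the patch sizes, and the use of np.gradient for derivatives are assumptions made for this example, not part of any library's Harris implementation.

```python
import numpy as np

def structure_tensor(patch):
    """Sum gradient products over the whole patch and return the 2x2 matrix."""
    # np.gradient returns derivatives along axis 0 (vertical) then axis 1 (horizontal).
    iy, ix = np.gradient(patch.astype(np.float64))
    return np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                     [np.sum(ix * iy), np.sum(iy * iy)]])

# Hypothetical test patches: uniform, vertical step edge, and an L-shaped corner.
flat = np.zeros((9, 9))
edge = np.zeros((9, 9))
edge[:, 5:] = 1.0
corner = np.zeros((9, 9))
corner[5:, 5:] = 1.0

results = {}
for name, patch in [("flat", flat), ("edge", edge), ("corner", corner)]:
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    results[name] = np.linalg.eigvalsh(structure_tensor(patch))
    print(name, results[name])
```

The flat patch gives two zero eigenvalues, the edge patch gives one zero and one positive eigenvalue, and the corner patch gives two clearly positive eigenvalues, which is the pattern the response score is designed to pick out.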

The Harris response formula

The standard Harris response is:

R = det(M) - k * trace(M)^2

Here M is the 2 x 2 second-moment matrix built from the locally summed gradient products, and k is an empirical sensitivity constant, usually chosen between 0.04 and 0.06.
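Written out in full, the matrix and the response can be expressed as follows. The window W, the weighting function w, and the shorthand A, B, C for the summed products are notation introduced here for convenience.

```latex
M = \sum_{(x,y) \in W} w(x,y)
\begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}
= \begin{bmatrix} A & C \\ C & B \end{bmatrix},
\qquad
\det(M) = AB - C^2, \quad \operatorname{trace}(M) = A + B,
\qquad
R = AB - C^2 - k\,(A + B)^2 .
```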

Interpretation:

large positive R suggests a corner
large negative R often indicates an edge
R close to zero suggests a flat region
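The sign pattern follows from the eigenvalues of M, since det(M) is their product and trace(M) is their sum. The small sketch below plugs in three made-up eigenvalue pairs; the function name harris_response and the sample values are purely illustrative.

```python
def harris_response(lambda1, lambda2, k=0.04):
    """Harris score from the two eigenvalues of M: det(M) - k * trace(M)^2."""
    det = lambda1 * lambda2
    trace = lambda1 + lambda2
    return det - k * trace ** 2

corner_score = harris_response(100.0, 100.0)  # both eigenvalues large
edge_score = harris_response(100.0, 0.0)      # one dominant eigenvalue
flat_score = harris_response(0.1, 0.1)        # both eigenvalues small

print(corner_score, edge_score, flat_score)
```

With these inputs the corner case gives a large positive score (about 8400), the edge case gives a negative score (about -400), and the flat case stays close to zero.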

The practical pipeline is:

convert image to grayscale
compute gradients
smooth the gradient products
compute R
threshold and apply non-maximum suppression

A compact OpenCV implementation

OpenCV already provides a Harris implementation, which is useful for learning and for practical work.

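```python
import cv2
import numpy as np

image = cv2.imread("chessboard.png")
if image is None:
    raise FileNotFoundError("chessboard.png could not be read")

# cornerHarris expects a single-channel float32 image.
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)

# blockSize: neighbourhood size, ksize: Sobel aperture, k: Harris constant.
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
# Dilate so each strong response marks a slightly larger, more visible area.
response = cv2.dilate(response, None)

# Mark pixels above 1% of the maximum response in red (BGR order).
threshold = 0.01 * response.max()
image[response > threshold] = [0, 0, 255]

cv2.imwrite("corners.png", image)
```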

This marks high-response pixels in red. It is a good starting point, but raw thresholding alone usually produces clusters of nearby points rather than one clean point per corner.

A simple manual implementation sketch

If you want to understand the internals, compute the gradient products explicitly and then build the response yourself.

```python
import cv2
import numpy as np

image = cv2.imread("chessboard.png", cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError("chessboard.png could not be read")
image = np.float32(image)

# First-order derivatives in x and y.
ix = cv2.Sobel(image, cv2.CV_32F, 1, 0, ksize=3)
iy = cv2.Sobel(image, cv2.CV_32F, 0, 1, ksize=3)

# Smoothed gradient products: the entries of the second-moment matrix M.
ix2 = cv2.GaussianBlur(ix * ix, (3, 3), 1)
iy2 = cv2.GaussianBlur(iy * iy, (3, 3), 1)
ixiy = cv2.GaussianBlur(ix * iy, (3, 3), 1)

# R = det(M) - k * trace(M)^2, evaluated per pixel.
k = 0.04
response = (ix2 * iy2 - ixiy * ixiy) - k * (ix2 + iy2) ** 2

print(response.shape)
```

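This version shows the mathematics more directly and helps explain what cv2.cornerHarris is doing under the hood. The numeric values will not match cv2.cornerHarris exactly, because OpenCV sums the gradient products over an unweighted blockSize window rather than a Gaussian and scales the derivatives differently, but the structure of the computation is the same.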
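One way to sanity-check a manual implementation is to run it on a synthetic image whose corners are known in advance. The sketch below is a NumPy-only variant so that it needs no image file: it swaps in np.gradient for the Sobel derivatives and a 3 x 3 box sum for the Gaussian smoothing, so it is a simplified stand-in rather than a reproduction of the code above. The names box_sum3 and harris_numpy are hypothetical helpers.

```python
import numpy as np

def box_sum3(a):
    """Sum each pixel's 3x3 neighbourhood, treating pixels outside the image as zero."""
    h, w = a.shape
    p = np.pad(a, 1)
    return sum(p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))

def harris_numpy(image, k=0.04):
    """Harris response using central differences and a 3x3 box window."""
    iy, ix = np.gradient(image.astype(np.float64))
    a = box_sum3(ix * ix)
    b = box_sum3(iy * iy)
    c = box_sum3(ix * iy)
    return (a * b - c * c) - k * (a + b) ** 2

# Synthetic image: a bright square (rows and columns 10..29) on a dark 40x40 background.
image = np.zeros((40, 40))
image[10:30, 10:30] = 1.0

response = harris_numpy(image)
peak_row, peak_col = np.unravel_index(np.argmax(response), response.shape)
print("strongest response at", peak_row, peak_col)
print("response on an edge midpoint:", response[10, 20])
print("response in a flat region:", response[2, 2])
```

The strongest response should land on one of the four corner pixels of the square, the edge midpoint should be negative, and the flat region should be exactly zero.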

Non-maximum suppression matters

Without non-maximum suppression, Harris often gives a blob of strong responses around each corner instead of a single precise point. A simple threshold only tells you where the response is high, not which pixel is the representative location.

In practical systems, you usually combine:

thresholding to remove weak responses
non-maximum suppression to keep local peaks only

This step is what turns a response map into a usable set of feature points.
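A minimal sketch of threshold plus non-maximum suppression is shown below, using NumPy only. The function name non_max_suppression, the radius parameter, and the toy response map are assumptions for illustration; production code often adds a minimum-distance rule or sub-pixel refinement on top of this.

```python
import numpy as np

def non_max_suppression(response, threshold, radius=1):
    """Return (row, col) positions above threshold that are the maximum of their neighbourhood."""
    h, w = response.shape
    # Pad with -inf so border pixels are only compared against real neighbours.
    padded = np.pad(response, radius, mode="constant", constant_values=-np.inf)
    local_max = np.full(response.shape, -np.inf)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            local_max = np.maximum(local_max, padded[dy:dy + h, dx:dx + w])
    # Ties on a plateau are all kept; a stricter rule would need a tie-break.
    keep = (response >= local_max) & (response > threshold)
    return np.argwhere(keep)

# Toy response map with two blobs of strong responses.
response = np.zeros((7, 7))
response[1:4, 1:4] = [[1, 2, 1], [2, 5, 2], [1, 2, 1]]
response[4:7, 4:7] = [[1, 1, 1], [1, 3, 1], [1, 1, 1]]

above_threshold = int(np.count_nonzero(response > 0.5))
peaks = non_max_suppression(response, threshold=0.5)
print(above_threshold)   # 18 pixels pass the threshold alone
print(peaks.tolist())    # [[2, 2], [5, 5]]
```

Thresholding alone keeps 18 pixels here, while the suppression step reduces them to one location per blob.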

Where Harris still fits today

Harris is older than detectors such as FAST, SIFT, ORB, and learned keypoint methods, but it is still important because:

it is conceptually clean
it is mathematically instructive
it performs reasonably well in many structured scenes

It is especially useful for education and for understanding the broader family of feature detectors.

Common Pitfalls

Applying Harris directly to a noisy image without smoothing often creates unstable or excessive detections.
Treating the raw response map as the final corner set without non-maximum suppression produces clustered points rather than clean features.
Choosing the response threshold arbitrarily can either flood the output with weak points or remove genuine corners.
Forgetting to convert the image to grayscale and a floating-point type can lead to incorrect gradient calculations.
Assuming Harris is invariant to every real-world imaging change ignores its sensitivity to scale changes and some illumination conditions.
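The floating-point pitfall is easy to reproduce. The sketch below computes a simple horizontal difference on a tiny, made-up uint8 row, first directly and then after conversion to float32; the pixel values are arbitrary.

```python
import numpy as np

row = np.array([[10, 200, 10]], dtype=np.uint8)

# Forward difference computed directly on uint8: negative results wrap around modulo 256.
diff_uint8 = row[:, 1:] - row[:, :-1]

# Same difference after converting to float32: the sign is preserved.
row_f = row.astype(np.float32)
diff_float = row_f[:, 1:] - row_f[:, :-1]

print(diff_uint8.tolist())   # [[190, 66]]
print(diff_float.tolist())   # [[190.0, -190.0]]
```

The falling step of -190 silently becomes 66 in uint8 arithmetic, which would distort every gradient product built from it.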

Summary

The Harris detector finds points where image intensity changes strongly in two directions.
It works by building a local gradient-based matrix and computing a corner response score.
OpenCV provides a quick implementation, but understanding the manual pipeline is valuable.
Thresholding alone is not enough; non-maximum suppression is usually required.
Harris remains a useful detector for learning, prototyping, and some practical vision tasks.