Convert own image to MNIST's image

Introduction

Converting your own digit image to something MNIST-like means more than just resizing it to 28x28. MNIST digits are centered, grayscale, size-normalized, and usually represented with a bright digit on a dark background, so the closer your preprocessing matches that distribution, the better a model trained on MNIST is likely to perform.

What MNIST Images Look Like

A typical MNIST sample has these properties:

one grayscale channel
28x28 pixels
centered handwritten digit
relatively tight crop around the foreground
consistent foreground and background polarity

If your custom image is a phone photo, a scanned page, or a dark digit on a light background, those differences matter. A classifier trained on MNIST expects a bright digit on a dark background, tightly cropped and centered.
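
Polarity can be checked automatically before any other step. The following is a minimal sketch, not a definitive method: it assumes the outermost ring of pixels is mostly background, and the function names `needs_inversion` and `to_mnist_polarity` are hypothetical helpers introduced here for illustration.

```python
import numpy as np


def needs_inversion(gray):
    """Return True if the background looks bright (dark ink on light paper).

    Assumption: the outermost pixel ring is mostly background.
    """
    gray = np.asarray(gray)
    border = np.concatenate([gray[0, :], gray[-1, :], gray[:, 0], gray[:, -1]])
    return bool(np.median(border) > 127)


def to_mnist_polarity(gray):
    """Return a uint8 array with a bright digit on a dark background."""
    gray = np.asarray(gray, dtype=np.uint8)
    return 255 - gray if needs_inversion(gray) else gray


# Dark stroke on white paper: this one should be inverted.
paper = np.full((10, 10), 255, dtype=np.uint8)
paper[3:7, 4:6] = 0
fixed = to_mnist_polarity(paper)
```

A border-based check tends to be more robust than a whole-image average when the digit fills much of the frame, but it can misfire if strokes run along the image edge.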

A Practical Conversion Pipeline

A good conversion workflow is:

load the image in grayscale
normalize foreground and background polarity
threshold away background noise
crop to the digit’s bounding box
resize while keeping aspect ratio
center the digit on a 28x28 canvas
normalize pixel values for model input

Skipping the crop and centering steps is one of the biggest reasons custom images fail against MNIST-trained models.

Python Example with Pillow

python

from PIL import Image, ImageOps
import numpy as np


def convert_to_mnist_style(input_path, output_path):
    img = Image.open(input_path).convert("L")

    # Invert only if the source looks like dark ink on light paper.
    # Heuristic (an assumption): a bright mean value means a light background.
    if np.asarray(img).mean() > 127:
        img = ImageOps.invert(img)
    arr = np.array(img, dtype=np.uint8)

    # Remove weak background noise.
    arr = np.where(arr > 40, arr, 0).astype(np.uint8)

    ys, xs = np.nonzero(arr)
    if xs.size == 0:
        raise ValueError("No foreground digit found")

    # Crop to the bounding box of the remaining foreground pixels.
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    cropped = arr[y_min:y_max + 1, x_min:x_max + 1]

    # Scale the longer side to 20 pixels, preserving aspect ratio.
    # resize() is used because thumbnail() only shrinks and would leave
    # a small crop at its original size.
    digit = Image.fromarray(cropped)
    scale = 20 / max(digit.size)
    new_size = (max(1, round(digit.width * scale)),
                max(1, round(digit.height * scale)))
    digit = digit.resize(new_size, Image.Resampling.LANCZOS)

    # Center the digit's bounding box on a black 28x28 canvas.
    canvas = Image.new("L", (28, 28), 0)
    x_off = (28 - digit.width) // 2
    y_off = (28 - digit.height) // 2
    canvas.paste(digit, (x_off, y_off))

    canvas.save(output_path)
    final_arr = np.array(canvas, dtype=np.float32) / 255.0
    return final_arr


sample = convert_to_mnist_style("digit.png", "digit_mnist.png")
print(sample.shape, sample.min(), sample.max())

This gives you a saved 28x28 image and a float32 array scaled to the range 0 to 1, suitable for model input.

Why the 20x20 Fit Is Useful

The original MNIST digits were size-normalized to fit a 20x20 pixel box and then centered in a 28x28 field. Fitting your digit into roughly a 20x20 region and placing it inside the full 28x28 canvas reproduces that layout. It leaves a margin around the digit, which mimics the spatial style of MNIST rather than cramming the strokes against the edges.

If you resize directly to 28x28 without preserving aspect ratio or margins, thin digits can become distorted or off-center.
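
The original MNIST description centers each digit by its center of mass rather than by its bounding box, so the two can differ for lopsided digits. The sketch below shows one way to approximate that with NumPy. The function name `center_by_mass` is hypothetical, offsets are rounded to whole pixels, and the input is assumed to be an already resized digit no larger than the canvas.

```python
import numpy as np


def center_by_mass(digit, size=28):
    """Place a small digit array on a size x size canvas so that its
    intensity-weighted center of mass lands near the canvas center."""
    digit = np.asarray(digit, dtype=np.float32)
    h, w = digit.shape
    if h > size or w > size:
        raise ValueError("digit must not be larger than the canvas")
    total = float(digit.sum())
    if total == 0:
        raise ValueError("digit has no foreground pixels")

    # Intensity-weighted mean row and column inside the digit array.
    rows, cols = np.indices(digit.shape)
    cy = float((rows * digit).sum()) / total
    cx = float((cols * digit).sum()) / total

    # Offset that moves the center of mass to the canvas center,
    # clamped so the digit stays fully inside the canvas.
    center = (size - 1) / 2
    top = min(max(int(round(center - cy)), 0), size - h)
    left = min(max(int(round(center - cx)), 0), size - w)

    canvas = np.zeros((size, size), dtype=np.float32)
    canvas[top:top + h, left:left + w] = digit
    return canvas


# A 20x4 vertical bar, roughly like a handwritten "1".
bar = np.ones((20, 4), dtype=np.float32)
centered = center_by_mass(bar)
```

For roughly symmetric digits this gives the same placement as bounding-box centering. The difference shows up for shapes whose ink is concentrated on one side.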

Model Input Shape Matters Too

The image file is only half the problem. You also need the tensor shape expected by the model.

For convolutional models, a single sample is often shaped like:

python

sample = sample.reshape(1, 28, 28, 1).astype(np.float32)

For dense models trained on flattened inputs, it might instead be:

python

sample = sample.reshape(1, 784).astype(np.float32)

A perfect image conversion can still fail if the input tensor shape is wrong.
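
A small helper can make the expected layout explicit instead of scattering `reshape` calls. This is a sketch under assumptions: the function `to_model_input` and its layout labels are hypothetical names used only here. Channels-last is the usual Keras convention and channels-first the usual PyTorch one, but the shape your particular model expects should be checked against how it was built.

```python
import numpy as np


def to_model_input(sample, layout="nhwc"):
    """Add batch (and channel) axes to one 28x28 sample.

    The layout names are labels used in this sketch only:
    "nhwc" -> (1, 28, 28, 1), "nchw" -> (1, 1, 28, 28), "flat" -> (1, 784).
    """
    sample = np.asarray(sample, dtype=np.float32)
    if sample.shape != (28, 28):
        raise ValueError(f"expected a 28x28 array, got {sample.shape}")
    if layout == "nhwc":
        return sample[np.newaxis, :, :, np.newaxis]
    if layout == "nchw":
        return sample[np.newaxis, np.newaxis, :, :]
    if layout == "flat":
        return sample.reshape(1, -1)
    raise ValueError(f"unknown layout: {layout!r}")


blank = np.zeros((28, 28), dtype=np.float32)
batch_conv = to_model_input(blank, "nhwc")
batch_dense = to_model_input(blank, "flat")
```

Raising on an unexpected input shape surfaces a preprocessing mistake early, rather than letting it appear as a confusing error inside the model.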

Batch Conversion

If you are building a custom dataset, apply the same preprocessing to every image.

python

from pathlib import Path

src = Path("raw_digits")
dst = Path("mnist_like")
dst.mkdir(exist_ok=True)

# Run every source image through the same conversion function.
for path in src.glob("*.png"):
    convert_to_mnist_style(str(path), str(dst / path.name))

Consistency matters more than clever per-image tweaks. A stable preprocessing pipeline produces training and inference data that match each other.

Visual Verification Is Worth It

Before trusting the converted images, inspect several outputs manually. Look for:

clipped strokes
inverted polarity
too much blank margin
noisy backgrounds
squashed aspect ratio

Visual checks catch many problems faster than staring at model accuracy alone.
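
When no image viewer is handy, a text rendering of the array is often enough to spot inverted polarity or an off-center digit. The following is a rough sketch: `ascii_preview` is a hypothetical helper, and it assumes the sample is a 2-D array already scaled to the range 0 to 1.

```python
import numpy as np


def ascii_preview(sample, chars=" .:*#"):
    """Render a 2-D array with values in [0, 1] as rows of text."""
    sample = np.clip(np.asarray(sample, dtype=np.float32), 0.0, 1.0)
    # Map each pixel to an index into the character ramp (dark to bright).
    idx = np.minimum((sample * len(chars)).astype(int), len(chars) - 1)
    return "\n".join("".join(chars[i] for i in row) for row in idx)


demo = np.zeros((28, 28), dtype=np.float32)
demo[4:24, 13:15] = 1.0  # a crude vertical stroke, like the digit 1
print(ascii_preview(demo))
```

A correctly converted sample should show the digit drawn in the denser characters, surrounded by a blank margin on all four sides.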

Common Pitfalls

Resizing directly to 28x28 without cropping and centering the digit first.
Forgetting to align the image polarity with what the MNIST-trained model expects.
Applying a threshold that removes thin strokes along with the background noise.
Feeding raw uint8 values when the model expects normalized floating-point input.
Matching the image size correctly but using the wrong tensor shape at prediction time.
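
Several of these pitfalls can be caught with a quick automated check before prediction. The sketch below is illustrative only: `check_mnist_like` is a hypothetical helper, and its rules (border median for polarity, any bright border pixel for clipping) are assumptions chosen for simplicity rather than established thresholds.

```python
import numpy as np


def check_mnist_like(sample):
    """Return a list of warnings for one candidate sample (empty if none)."""
    sample = np.asarray(sample)
    if sample.shape != (28, 28):
        return [f"shape is {sample.shape}, expected (28, 28)"]

    problems = []
    if not np.issubdtype(sample.dtype, np.floating):
        problems.append(f"dtype is {sample.dtype}, expected a float type")

    lo, hi = float(sample.min()), float(sample.max())
    if lo < 0.0 or hi > 1.0:
        problems.append("values fall outside [0, 1]")
    if hi == lo:
        problems.append("image is blank")
        return problems

    # The outer pixel ring should be dark background in an MNIST-like sample.
    border = np.concatenate([sample[0], sample[-1], sample[:, 0], sample[:, -1]])
    if float(np.median(border)) > (lo + hi) / 2:
        problems.append("bright border: polarity may be inverted")
    elif float(border.max()) > lo:
        problems.append("foreground touches the border: digit may be clipped")
    return problems


good = np.zeros((28, 28), dtype=np.float32)
good[4:24, 13:15] = 1.0
raw = (good * 255).astype(np.uint8)  # unnormalized uint8 input
inverted = 1.0 - good                # dark digit on a bright background

for name, candidate in [("good", good), ("raw", raw), ("inverted", inverted)]:
    print(name, check_mnist_like(candidate))
```

A check like this does not replace looking at the images, but it flags the mechanical mistakes (dtype, value range, polarity) cheaply.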

Summary

Converting an image to MNIST style requires cropping, centering, grayscale normalization, and size normalization, not just resizing.
Matching MNIST polarity and margins is important for a model trained on MNIST.
A 20x20 fit inside a 28x28 canvas is a practical approximation of the original dataset style.
The output tensor shape must match the model’s expected input format.
Always inspect a few converted samples visually before assuming the preprocessing is correct.
