How to predict my own image using a CNN in Keras after training on the MNIST dataset

Introduction

If a CNN trained on MNIST predicts your own digit image badly, the model is often not the main problem. The usual culprit is preprocessing. MNIST images are 28 by 28 grayscale images with a light digit centered on a dark background, and training code typically scales the pixel values to the range 0.0 to 1.0. Your custom image has to be transformed into that same format before prediction.

What the Model Expects

A typical MNIST CNN is trained on tensors shaped like one of these:

  • (batch, 28, 28, 1) for channels-last models
  • (batch, 1, 28, 28) for channels-first models

and the pixel values are usually scaled to the range 0.0 to 1.0.

That means your custom image must usually be:

  1. resized to 28x28
  2. converted to grayscale
  3. converted to float32 and scaled to the range used in training (usually 0.0 to 1.0)
  4. reshaped to include batch and channel dimensions

If any one of those steps is wrong, the prediction can be poor even if training went fine.
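
The last two steps can be sketched with NumPy alone. The helper name `to_model_input` and its `channels_first` flag are illustrative and not part of Keras; the sketch assumes the image has already been converted to a 28 by 28 grayscale array with values from 0 to 255.

```python
import numpy as np


def to_model_input(gray28, channels_first=False):
    """Scale a 28x28 grayscale array to [0, 1] and add batch/channel axes.

    Hypothetical helper: assumes 0-255 pixel values and a model
    trained on inputs scaled to 0.0-1.0.
    """
    arr = np.asarray(gray28, dtype="float32") / 255.0
    if arr.shape != (28, 28):
        raise ValueError(f"expected a (28, 28) array, got {arr.shape}")
    if channels_first:
        return arr.reshape(1, 1, 28, 28)  # (batch, channel, height, width)
    return arr.reshape(1, 28, 28, 1)      # (batch, height, width, channel)
```

If your model was trained with a different scaling (for example mean and standard deviation normalization), the division by 255 would need to change to match.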

Load and Preprocess a Custom Image

Here is a common Keras-friendly pipeline using Pillow and NumPy:

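```python
from PIL import Image, ImageOps
import numpy as np


def preprocess_mnist_image(path):
    # Load as 8-bit grayscale and resize to the MNIST resolution.
    img = Image.open(path).convert("L")
    img = img.resize((28, 28))

    # MNIST digits are light on a dark background. This inversion assumes
    # a dark digit on a light background; skip it if yours is already light-on-dark.
    img = ImageOps.invert(img)

    # Scale to [0, 1] and add batch and channel dimensions (channels-last).
    arr = np.array(img).astype("float32") / 255.0
    arr = arr.reshape(1, 28, 28, 1)
    return arr
```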

Then use it with your trained model:

```python
x = preprocess_mnist_image("my_digit.png")
pred = model.predict(x)
print(pred)
print(np.argmax(pred, axis=1)[0])
```

The inversion step matters a lot. Many hand-drawn images are black digits on white background, while MNIST is effectively the opposite.
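
Because inverting an image that is already light-on-dark would make things worse, it can help to decide automatically. The following is a rough heuristic, not a standard API: it assumes the border of the image is mostly background and checks whether that border is bright. The function name `needs_inversion` and the `border` width are illustrative choices.

```python
import numpy as np


def needs_inversion(gray, border=2):
    """Guess whether a grayscale array (0-255) has a light background.

    Heuristic sketch: assumes the outer `border` pixels are mostly background.
    """
    gray = np.asarray(gray)
    edge = np.concatenate([
        gray[:border, :].ravel(),   # top rows
        gray[-border:, :].ravel(),  # bottom rows
        gray[:, :border].ravel(),   # left columns
        gray[:, -border:].ravel(),  # right columns
    ])
    return float(edge.mean()) > 127.5
```

With Pillow, this could be used as `if needs_inversion(np.array(img)): img = ImageOps.invert(img)`. The heuristic can fail on photos with uneven lighting or digits touching the edge, so it is worth checking the result visually.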

A Minimal End-to-End Example

```python
from tensorflow import keras
from PIL import Image, ImageOps
import numpy as np

# Load a previously trained and saved MNIST model.
model = keras.models.load_model("mnist_cnn.keras")

# Load as grayscale, resize, and invert (assumes a dark digit on a light background).
img = Image.open("digit.png").convert("L")
img = img.resize((28, 28))
img = ImageOps.invert(img)

# Scale to [0, 1] and add batch and channel dimensions.
x = np.array(img).astype("float32") / 255.0
x = x.reshape(1, 28, 28, 1)

# The highest output (a probability if the model ends in softmax) gives the label.
pred = model.predict(x)
label = int(np.argmax(pred, axis=1)[0])
confidence = float(np.max(pred))

print("predicted digit:", label)
print("confidence:", confidence)
```

This is the core workflow: load, transform, reshape, predict.
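
When a prediction looks wrong, the runner-up classes are often more informative than the single top label. A small sketch, assuming the model output is a vector of ten class scores such as softmax probabilities; `top_k` is a hypothetical helper name:

```python
import numpy as np


def top_k(scores, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    scores = np.asarray(scores, dtype="float64").reshape(-1)
    order = np.argsort(scores)[::-1][:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in order]
```

For a single image this could be called as `top_k(model.predict(x)[0])`. If the correct digit appears in second place with a similar score, the preprocessing is probably close and only needs small adjustments.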

Why Your Own Image Often Fails

MNIST is a very specific dataset. The digits are:

  • centered
  • tightly cropped
  • grayscale
  • low resolution
  • drawn with consistent stroke thickness

A phone photo of a handwritten number is usually not like that at all. It may have shadows, off-center writing, thick borders, or too much blank space.

That means simple resizing is sometimes not enough. You may also need to:

  • crop the digit region
  • center it in the canvas
  • remove extra background noise
  • preserve aspect ratio when resizing

Preserve Aspect Ratio When Resizing

A safer preprocessing path is to resize while preserving aspect ratio and then paste the digit into a 28 by 28 canvas:

```python
from PIL import Image, ImageOps
import numpy as np


def preprocess_mnist_style(path):
    # Load as grayscale and invert (assumes a dark digit on a light background).
    img = Image.open(path).convert("L")
    img = ImageOps.invert(img)

    # Shrink in place to fit within 20x20 while keeping the aspect ratio.
    img.thumbnail((20, 20))
    canvas = Image.new("L", (28, 28), color=0)

    # Paste the digit in the middle of a black 28x28 canvas.
    x = (28 - img.width) // 2
    y = (28 - img.height) // 2
    canvas.paste(img, (x, y))

    arr = np.array(canvas).astype("float32") / 255.0
    return arr.reshape(1, 28, 28, 1)
```

This often gives better results than stretching the original image directly to 28 by 28.
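
The original MNIST digits were positioned by their center of mass rather than by their bounding box, so a further refinement is to shift the pasted digit accordingly. The sketch below is one way to do that with NumPy on a 2D light-on-dark array; `center_by_mass` is a hypothetical helper, and pixels shifted past the edge are simply dropped.

```python
import numpy as np


def center_by_mass(img):
    """Shift a 2D light-on-dark array so its center of mass lands on the canvas center."""
    img = np.asarray(img)
    total = img.sum()
    if total == 0:
        return img.copy()  # blank image: nothing to center
    h, w = img.shape
    rows, cols = np.indices(img.shape)
    cy = (rows * img).sum() / total  # intensity-weighted mean row
    cx = (cols * img).sum() / total  # intensity-weighted mean column
    dy = int(round((h - 1) / 2 - cy))
    dx = int(round((w - 1) / 2 - cx))

    shifted = np.zeros_like(img)
    # Copy the overlapping region from source to shifted destination.
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = img[src_y, src_x]
    return shifted
```

To apply it to the output of a function like `preprocess_mnist_style`, you could take the 2D slice `arr[0, :, :, 0]`, center it, and reshape back to `(1, 28, 28, 1)`.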

Inspect the Processed Image

Before blaming the model, print or save the processed image and inspect it. If the processed result does not visually resemble an MNIST digit, the prediction will likely be bad.

For example:

```python
processed = preprocess_mnist_style("digit.png")
print(processed.shape)
print(processed.min(), processed.max())
```

You can also visualize it with matplotlib:

```python
import matplotlib.pyplot as plt

plt.imshow(processed[0, :, :, 0], cmap="gray")
plt.show()
```

This is one of the best debugging steps in custom-image prediction.
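
If no display is available (for example over SSH), a rough text rendering serves the same purpose. This is an illustrative sketch, assuming a single processed image with 784 values scaled to 0.0 to 1.0; `ascii_preview` and its character ramp are hypothetical choices.

```python
import numpy as np


def ascii_preview(x, levels=" .:*#"):
    """Render one 28x28 image (values in [0, 1]) as 28 lines of text."""
    img = np.asarray(x, dtype="float32").reshape(28, 28)
    # Map each pixel to an index into the character ramp, dark to bright.
    idx = np.clip((img * (len(levels) - 1)).round().astype(int), 0, len(levels) - 1)
    return "\n".join("".join(levels[i] for i in row) for row in idx)
```

Calling `print(ascii_preview(processed))` should show a recognizable digit made of `#` and `*` characters on a blank background. If the background is filled with `#` instead, the image probably still needs inverting.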

Common Pitfalls

The biggest pitfall is forgetting to match the training preprocessing. If the model was trained on normalized grayscale images with a dark background, your prediction input needs the same treatment.

Another common issue is shape mismatch. A model trained on (28, 28, 1) input will not accept a raw (28, 28) array without reshaping.
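
A quick shape check before calling `predict` can turn a confusing error into a clear one. In Keras, `model.input_shape` for a single-input model is a tuple such as `(None, 28, 28, 1)`, where `None` stands for the batch size. The helper below is a hypothetical sketch that compares an array shape against such a tuple.

```python
def shape_matches(array_shape, model_input_shape):
    """Return True if `array_shape` fits `model_input_shape`, treating None as a wildcard."""
    if len(array_shape) != len(model_input_shape):
        return False
    return all(
        expected is None or expected == actual
        for actual, expected in zip(array_shape, model_input_shape)
    )
```

For example, `shape_matches(x.shape, model.input_shape)` would be false for a raw `(28, 28)` array and true once it has been reshaped to `(1, 28, 28, 1)` for a channels-last model.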

People also often forget background inversion. A white background with a dark digit may look correct to a human but be effectively reversed relative to MNIST.

Finally, poor cropping hurts a lot. If the digit occupies only a small corner of the image, the network is seeing mostly empty space.

Summary

  • Your custom image must be preprocessed to match the format used during MNIST training.
  • Convert to grayscale, resize carefully, normalize, and add batch and channel dimensions.
  • Invert the image if your source background is opposite to the MNIST style.
  • Inspect the processed image itself when predictions look wrong.
  • Most bad custom-image predictions come from preprocessing mismatch, not from the CNN architecture.
