How to predict my own image using a CNN in Keras after training on the MNIST dataset
Introduction
If a CNN trained on MNIST predicts your own digit image badly, the model is often not the main problem. The usual issue is preprocessing. MNIST images are 28 by 28 grayscale images of a light digit centered on a dark background, and training code usually scales their 0 to 255 pixel values to a fixed range such as 0.0 to 1.0. Your custom image has to be transformed into that same format before prediction.
What the Model Expects
A typical MNIST CNN is trained on tensors shaped like one of these:
- (batch, 28, 28, 1) for channels-last models
- (batch, 1, 28, 28) for channels-first models
and the pixel values are usually scaled to the range 0.0 to 1.0.
That means your custom image must usually be:
- resized to 28x28
- converted to grayscale
- converted to float32 and scaled to the range used in training (usually 0.0 to 1.0)
- reshaped to include batch and channel dimensions
If any one of those steps is wrong, the prediction can be poor even if training went fine.
Load and Preprocess a Custom Image
Here is a common Keras-friendly pipeline using Pillow and NumPy:
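The following is a minimal sketch of such a pipeline, not a definitive implementation. The function name `preprocess_image` and its `invert` flag are hypothetical, and the sketch assumes a channels-last model trained on pixel values scaled to 0.0 to 1.0.

```python
import numpy as np
from PIL import Image, ImageOps


def preprocess_image(path, invert=True):
    """Load an image file and return a (1, 28, 28, 1) float32 batch in [0, 1].

    Assumption: the model is channels-last and was trained on inputs
    divided by 255. Set invert=False if the source image is already a
    light digit on a dark background.
    """
    img = Image.open(path).convert("L")        # 8-bit grayscale
    if invert:
        img = ImageOps.invert(img)             # dark-on-light -> light-on-dark
    img = img.resize((28, 28), Image.LANCZOS)  # plain stretch to MNIST resolution
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return arr.reshape(1, 28, 28, 1)           # add batch and channel axes
```

Note that this version stretches the image directly to 28 by 28, so it works best on inputs that are already roughly square and tightly framed around the digit.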
Then use it with your trained model:
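A small helper along these lines can turn the model output into a digit and a confidence value. The name `predict_digit` is hypothetical, and the sketch assumes the model ends in a 10-class softmax layer so that `model.predict` returns an array of shape (1, 10) for a single image.

```python
import numpy as np


def predict_digit(model, batch):
    """Return (digit, confidence) for a batch holding one preprocessed image.

    Assumption: model.predict(batch) yields class probabilities of
    shape (1, 10), as a softmax output layer would.
    """
    probs = np.asarray(model.predict(batch))
    digit = int(np.argmax(probs, axis=1)[0])
    return digit, float(probs[0, digit])


# Typical usage with Keras (the file names here are placeholders):
#   from tensorflow import keras
#   model = keras.models.load_model("mnist_cnn.keras")
#   digit, confidence = predict_digit(model, batch)  # batch: (1, 28, 28, 1) float32
```

Looking at the confidence as well as the predicted class is useful: a low maximum probability often hints that the input does not look like the training data.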
The inversion step matters a lot. Many hand-drawn images are dark digits on a white background, while MNIST digits are light strokes on a black background.
A Minimal End-to-End Example
This is the core workflow: load, transform, reshape, predict.
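Those four steps can be sketched as one function. The name `classify_image` is hypothetical, and the code assumes a channels-last model trained on inputs scaled to 0.0 to 1.0 with a 10-class output; adjust it if your training setup differed.

```python
import numpy as np
from PIL import Image, ImageOps


def classify_image(model, path, invert=True):
    """Load, transform, reshape, and predict for a single image file."""
    # 1. Load as 8-bit grayscale.
    img = Image.open(path).convert("L")

    # 2. Transform: match MNIST polarity and resolution, then scale to [0, 1].
    if invert:
        img = ImageOps.invert(img)
    img = img.resize((28, 28))
    x = np.asarray(img, dtype=np.float32) / 255.0

    # 3. Reshape: (28, 28) -> (1, 28, 28, 1).
    x = x[np.newaxis, :, :, np.newaxis]

    # 4. Predict: take the most probable class from the first (only) row.
    probs = np.asarray(model.predict(x))[0]
    return int(np.argmax(probs)), probs
```

With a trained Keras model loaded, a call such as `classify_image(model, "my_digit.png")` would return the predicted digit together with the full probability vector.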
Why Your Own Image Often Fails
MNIST is a very specific dataset. The digits are:
- centered
- tightly cropped
- grayscale
- low resolution
- drawn with consistent stroke thickness
A phone photo of a handwritten number is usually not like that at all. It may have shadows, off-center writing, thick borders, or too much blank space.
That means simple resizing is sometimes not enough. You may also need to:
- crop the digit region
- center it in the canvas
- remove extra background noise
- preserve aspect ratio when resizing
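Cropping to the digit region, for instance, can be approximated with a bounding box over the pixels that are clearly brighter than the background. This is only a sketch: the function name `crop_to_digit`, the `threshold` of 0.1, and the `margin` of 2 pixels are illustrative assumptions, and the input is assumed to be a 2-D array in the range 0.0 to 1.0 with a light digit on a dark background.

```python
import numpy as np


def crop_to_digit(arr, threshold=0.1, margin=2):
    """Crop a 2-D light-on-dark array to the bounding box of its strokes."""
    rows, cols = np.where(arr > threshold)
    if rows.size == 0:
        return arr  # nothing above the threshold, so leave the image as it is

    # Expand the box by `margin` pixels, clamped to the image borders.
    top = max(int(rows.min()) - margin, 0)
    bottom = min(int(rows.max()) + margin + 1, arr.shape[0])
    left = max(int(cols.min()) - margin, 0)
    right = min(int(cols.max()) + margin + 1, arr.shape[1])
    return arr[top:bottom, left:right]
```

A fixed threshold is fragile on photos with shadows or uneven lighting, so treat it as a starting point rather than a robust segmentation step.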
Preserve Aspect Ratio When Resizing
A safer preprocessing path is to resize while preserving aspect ratio and then paste the digit into a 28 by 28 canvas:
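One way to do this with Pillow is sketched below. The function name `fit_to_canvas` and its parameters are hypothetical. The default `box_size` of 20 mirrors the way the original MNIST digits were size-normalized into a 20 by 20 box inside the 28 by 28 frame. MNIST then centered each digit by its center of mass, whereas this sketch simply centers the resized rectangle.

```python
from PIL import Image


def fit_to_canvas(img, canvas_size=28, box_size=20):
    """Scale a light-on-dark image to fit box_size, centered on a black canvas."""
    img = img.convert("L")

    # One scale factor for both axes keeps the aspect ratio intact.
    scale = box_size / max(img.size)
    new_w = max(1, round(img.width * scale))
    new_h = max(1, round(img.height * scale))
    resized = img.resize((new_w, new_h), Image.LANCZOS)

    # Paste onto a black square so the digit sits in the middle.
    canvas = Image.new("L", (canvas_size, canvas_size), 0)
    offset = ((canvas_size - new_w) // 2, (canvas_size - new_h) // 2)
    canvas.paste(resized, offset)
    return canvas
```

The input should already be inverted (light digit, dark background) and ideally cropped to the digit, since the padding added here is black.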
This often gives better results than stretching the original image directly to 28 by 28.
Inspect the Processed Image
Before blaming the model, print or save the processed image and inspect it. If the processed result does not visually resemble an MNIST digit, the prediction will likely be bad.
For example:
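A minimal sketch that prints basic statistics and writes the model input back to disk might look like this. The function name `save_processed` and the default file name are hypothetical, and the batch is assumed to be channels-last with values in 0.0 to 1.0.

```python
import numpy as np
from PIL import Image


def save_processed(batch, path="processed_digit.png"):
    """Print basic stats and save the first image of a (N, 28, 28, 1) batch."""
    batch = np.asarray(batch)
    arr = batch[0, :, :, 0]
    print("shape:", batch.shape, "dtype:", batch.dtype,
          "min:", float(arr.min()), "max:", float(arr.max()))

    # Convert [0, 1] floats back to 8-bit grayscale for viewing.
    pixels = (np.clip(arr, 0.0, 1.0) * 255).round().astype(np.uint8)
    Image.fromarray(pixels).save(path)
    return path
```

Opening the saved file next to a real MNIST sample makes polarity, centering, and stroke-size problems easy to spot.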
You can also visualize it with matplotlib:
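A short plotting sketch follows. The function name `plot_processed` is hypothetical, and the batch is again assumed to be channels-last with values in 0.0 to 1.0.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_processed(batch, title="Processed input"):
    """Draw the first image of a (N, 28, 28, 1) batch exactly as the model sees it."""
    fig, ax = plt.subplots(figsize=(3, 3))
    # Fix vmin/vmax so a washed-out image is not silently contrast-stretched.
    ax.imshow(np.asarray(batch)[0, :, :, 0], cmap="gray", vmin=0.0, vmax=1.0)
    ax.set_title(title)
    ax.axis("off")
    return fig


# In an interactive session:
#   plot_processed(batch)
#   plt.show()
```

Fixing the color range matters here: with the default auto-scaling, a faint gray digit would be displayed at full contrast and look healthier than it really is.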
This is one of the best debugging steps in custom-image prediction.
Common Pitfalls
The biggest pitfall is forgetting to match the training preprocessing. If the model was trained on normalized grayscale images with a dark background, your prediction input needs the same treatment.
Another common issue is shape mismatch. A model trained on (28, 28, 1) input will not accept a raw (28, 28) array without reshaping.
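The missing axes can be added with NumPy indexing, as in this small sketch. The helper name `to_model_batch` is hypothetical; which layout you need depends on how the model was built.

```python
import numpy as np


def to_model_batch(arr, channels_last=True):
    """Turn a single (28, 28) image into a batch of one with a channel axis."""
    arr = np.asarray(arr, dtype=np.float32)
    if arr.shape != (28, 28):
        raise ValueError(f"expected a (28, 28) image, got {arr.shape}")
    if channels_last:
        return arr[np.newaxis, :, :, np.newaxis]  # (1, 28, 28, 1)
    return arr[np.newaxis, np.newaxis, :, :]      # (1, 1, 28, 28)
```

In Keras, `model.input_shape` reports the shape the model was built with, which is a quick way to confirm which layout to use.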
People also often forget background inversion. A white background with a dark digit may look correct to a human but be effectively reversed relative to MNIST.
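One rough heuristic for deciding whether to invert is to look at the border pixels, which are usually background. The sketch below is an assumption-laden illustration, not a standard API: the name `ensure_dark_background` and the 0.5 cutoff are made up for this example, and the input is assumed to be a 2-D array in the range 0.0 to 1.0.

```python
import numpy as np


def ensure_dark_background(arr):
    """Invert a 2-D image in [0, 1] if its border is mostly light."""
    # Collect the outermost rows and columns as a sample of the background.
    border = np.concatenate([arr[0, :], arr[-1, :], arr[:, 0], arr[:, -1]])
    if np.median(border) > 0.5:
        return 1.0 - arr  # light background: flip to MNIST polarity
    return arr
```

The heuristic breaks down when the digit touches the image edge or the photo has a dark frame, so it is worth confirming the result visually.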
Finally, poor cropping hurts a lot. If the digit occupies only a small corner of the image, the network is seeing mostly empty space.
Summary
- Your custom image must be preprocessed to match the format used during MNIST training.
- Convert to grayscale, resize carefully, normalize, and add batch and channel dimensions.
- Invert the image if your source background is opposite to the MNIST style.
- Inspect the processed image itself when predictions look wrong.
- Most bad custom-image predictions come from preprocessing mismatch, not from the CNN architecture.

