Keras model gets constant loss and accuracy
Introduction
When loss and accuracy stay almost constant across epochs, the model is usually not making meaningful weight updates. The cause is rarely “Keras is broken”; it is usually a mismatch between data, labels, model output, loss function, or optimizer behavior.
Start with the Simplest Sanity Check
Before tuning hyperparameters, confirm the model can overfit a tiny subset of the training data. If it cannot learn even 20 samples, the training pipeline has a structural issue.
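A minimal sketch of this check, assuming synthetic data and a small binary classifier. The arrays x_small and y_small, the layer sizes, and the learning rate are illustrative stand-ins, not values from any particular project. On real data you would slice about 20 samples from your own training set instead.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 20 samples, 8 features, binary labels.
# The label depends only on feature 0, so it is clearly learnable.
rng = np.random.default_rng(0)
x_small = rng.normal(size=(20, 8)).astype("float32")
y_small = (x_small[:, 0] > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-2),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Train on the tiny subset only; one batch holds all 20 samples.
history = model.fit(x_small, y_small, epochs=200, batch_size=20, verbose=0)

first_loss = history.history["loss"][0]
last_loss = history.history["loss"][-1]
print(f"loss: {first_loss:.4f} -> {last_loss:.4f}")
```

A healthy pipeline should drive the loss on these few samples close to zero. If the two printed numbers are nearly equal, the problem is likely in the data, labels, or loss setup rather than in model capacity.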
If a simple model cannot overfit a tiny batch, check the rest of the training setup before experimenting with architecture changes.
Match Output Layer, Labels, and Loss
One of the most common reasons for flat metrics is an incompatible combination of output activation, label encoding, and loss function.
Examples:
- Binary classification: one output unit with sigmoid plus binary_crossentropy
- Multiclass classification with integer labels: Dense(num_classes, activation="softmax") plus sparse_categorical_crossentropy
- Multiclass classification with one-hot labels: Dense(num_classes, activation="softmax") plus categorical_crossentropy
Incorrect pairing example: compiling the model with a loss such as categorical_crossentropy while labels are plain integers and the output layer is shaped for binary classification. That mismatch can make training appear frozen or meaningless.
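The valid pairings listed above can be sketched as one small function. The name build_classifier, its parameters, and the hidden layer size are hypothetical and exist only for illustration; the point is that the output layer and the loss are chosen together from the label format.

```python
import tensorflow as tf

def build_classifier(num_features, num_classes, one_hot_labels=False):
    """Hypothetical helper: compile a model whose head and loss match the labels."""
    if num_classes == 2:
        # Binary case: assumes 0/1 labels in a single column.
        units, activation, loss = 1, "sigmoid", "binary_crossentropy"
    elif one_hot_labels:
        # Labels shaped (n, num_classes) with a single 1 per row.
        units, activation, loss = num_classes, "softmax", "categorical_crossentropy"
    else:
        # Labels are integer class ids shaped (n,).
        units, activation, loss = num_classes, "softmax", "sparse_categorical_crossentropy"

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_features,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(units, activation=activation),
    ])
    model.compile(optimizer="adam", loss=loss, metrics=["accuracy"])
    return model

binary_model = build_classifier(num_features=4, num_classes=2)
sparse_model = build_classifier(num_features=4, num_classes=3)
onehot_model = build_classifier(num_features=4, num_classes=3, one_hot_labels=True)
```

Comparing the model's output shape with the shape of the label array is a quick way to spot a mismatch before training starts.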
Inspect the Labels Directly
Print shapes and unique values before training.
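One possible way to do this with NumPy alone is sketched below. The helper describe_labels and the sample array y_train are assumptions made for the example; the one-hot test is a simple heuristic, not an exhaustive check.

```python
import numpy as np

def describe_labels(y):
    """Hypothetical helper: summarize shape, dtype, unique values, and counts."""
    y = np.asarray(y)
    values, counts = np.unique(y, return_counts=True)
    # Heuristic: 2-D array, only 0/1 entries, and exactly one 1 per row.
    looks_one_hot = (
        y.ndim == 2
        and y.shape[1] > 1
        and set(values.tolist()) <= {0, 1}
        and bool(np.all(y.sum(axis=1) == 1))
    )
    return {
        "shape": y.shape,
        "dtype": str(y.dtype),
        "values": values.tolist(),
        "counts": counts.tolist(),
        "looks_one_hot": bool(looks_one_hot),
    }

y_train = np.array([0, 2, 1, 2, 2, 0])  # illustrative integer labels
print(describe_labels(y_train))
```

A single unique value, an unexpected dtype, or a one-hot shape where integers were expected each point to one of the label problems described in this section.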
Frequent label problems include:
- all labels are the same class
- labels are shifted or paired with the wrong examples
- labels are one-hot encoded but treated as sparse integers, or the reverse
- regression targets are accidentally passed into a classification model
If labels are wrong, no optimizer setting will rescue the run.
Check Learning Rate and Frozen Weights
A learning rate that is too low can make updates so tiny that metrics barely move. A rate that is too high can also produce apparent stagnation because training bounces around without converging.
You can inspect or change the optimizer's learning rate explicitly.
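A short sketch of reading and changing the learning rate, assuming an Adam optimizer with a fixed rate (not a schedule). The toy model and the specific values 1e-3 and 1e-4 are placeholders.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Pass an optimizer object instead of a string so the rate is explicit.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
model.compile(optimizer=optimizer, loss="binary_crossentropy")

# Read the rate the model will actually train with.
old_lr = float(model.optimizer.learning_rate.numpy())

# Change it in place, for example before resuming training.
model.optimizer.learning_rate = 1e-4
new_lr = float(model.optimizer.learning_rate.numpy())

print(f"learning rate: {old_lr:.0e} -> {new_lr:.0e}")
```

Trying rates a factor of 10 apart on the tiny-subset check is usually enough to tell whether the learning rate is the cause of the flat curve.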
Also check whether layers are accidentally frozen.
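The sketch below lists each layer's trainable flag and trainable parameter count. The two-layer model and the layer names hidden and head are made up for the example, and the freeze is applied on purpose to imitate a setting left over from an earlier experiment.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu", name="hidden"),
    tf.keras.layers.Dense(1, activation="sigmoid", name="head"),
])

# Simulate a freeze left over from a transfer-learning experiment.
model.get_layer("hidden").trainable = False

for layer in model.layers:
    # Count only the parameters the optimizer is allowed to update.
    n_trainable = sum(w.numpy().size for w in layer.trainable_weights)
    print(f"{layer.name:10s} trainable={layer.trainable} trainable_params={n_trainable}")

frozen = [layer.name for layer in model.layers if not layer.trainable]
print("frozen layers:", frozen)
```

If you change a trainable flag after calling compile(), call compile() again so the change takes effect in training.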
If the relevant layers are not trainable, the optimizer skips their weights, so the parts of the model you expect to learn stay fixed.
Normalize Inputs and Watch for Bad Data Pipelines
Flat metrics can also come from input problems:
- values are all zeros or almost constant
- inputs are not normalized while the model expects them to be
- the generator or dataset keeps yielding the same batch
- train and label arrays are misaligned after preprocessing
Inspect a real batch to confirm shapes, value ranges, and label alignment.
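As a rough sketch, the input problems listed above can be checked on NumPy arrays taken from whatever produces your batches. The helpers summarize_batch and batches_repeat are hypothetical names introduced here, and the example batch is synthetic.

```python
import numpy as np

def summarize_batch(x_batch, y_batch):
    """Hypothetical helper: basic statistics for one batch of inputs and labels."""
    x = np.asarray(x_batch, dtype="float64")
    y = np.asarray(y_batch)
    return {
        "x_shape": x.shape,
        "y_shape": y.shape,
        "x_min": float(x.min()),
        "x_max": float(x.max()),
        "x_mean": float(x.mean()),
        "x_std": float(x.std()),
        "has_nan": bool(np.isnan(x).any()),
        "is_constant": bool(x.std() == 0.0),  # all zeros or one repeated value
    }

def batches_repeat(batch_a, batch_b):
    """True if two consecutive batches are element-wise identical."""
    return bool(np.array_equal(np.asarray(batch_a), np.asarray(batch_b)))

# Illustrative batch: unnormalized pixel-like values in [0, 255].
rng = np.random.default_rng(0)
x_batch = rng.integers(0, 256, size=(32, 10)).astype("float32")
y_batch = rng.integers(0, 2, size=(32,))
print(summarize_batch(x_batch, y_batch))
```

With tf.data, two consecutive batches can be pulled as NumPy arrays through dataset.take(2).as_numpy_iterator() and passed to the same helpers. A large x_max on inputs the model expects in the range 0 to 1, or a True result from batches_repeat, narrows the search quickly.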
If you are using tf.data or generators, verify that batching and shuffling actually work as intended.
Use Gradient and Prediction Checks
Another useful debugging step is to inspect the raw predictions before and after a few training steps. If predictions never change, the model may not be receiving informative gradients.
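A minimal sketch of the prediction check, assuming synthetic data and a small binary classifier. The data, model size, number of epochs, and the probe slice are arbitrary choices for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data with a learnable label.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8)).astype("float32")
y = (x[:, 0] > 0).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-2),
    loss="binary_crossentropy",
)

# Fixed probe inputs so the two prediction sets are comparable.
probe = x[:5]
before = model.predict(probe, verbose=0)
model.fit(x, y, epochs=5, batch_size=16, verbose=0)
after = model.predict(probe, verbose=0)

max_change = float(np.max(np.abs(after - before)))
print("before:", before.ravel())
print("after: ", after.ravel())
print("largest change:", max_change)
```

A largest change of exactly zero means no weight that affects the output was updated. Predictions that all sit at the same value for every input suggest the model is ignoring its inputs.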
If predictions are identical, revisit the loss setup, trainable variables, and optimizer.
Common Pitfalls
The most common mistake is mismatching the final activation, label encoding, and loss function.
Another issue is debugging architecture before confirming data and labels are correct. Constant loss often starts with the dataset, not with the number of layers.
People also overlook frozen layers, especially after transfer-learning experiments.
Finally, do not trust accuracy alone. On imbalanced data, a model can report nearly constant accuracy while learning nothing useful.
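To make the imbalance point concrete, the sketch below computes the accuracy of always predicting the most frequent class. The helper majority_baseline_accuracy and the 95/5 label split are assumptions chosen for the example.

```python
import numpy as np

def majority_baseline_accuracy(y_true):
    """Hypothetical helper: accuracy of always predicting the most common class."""
    y = np.asarray(y_true).ravel()
    _, counts = np.unique(y, return_counts=True)
    return counts.max() / y.size

# Illustrative imbalanced labels: 95 negatives, 5 positives.
y_true = np.array([0] * 95 + [1] * 5)
baseline = majority_baseline_accuracy(y_true)
print(f"majority-class baseline accuracy: {baseline:.2f}")
```

If the reported training accuracy is stuck at this baseline, the model is probably predicting one class for every input, and metrics such as precision, recall, or a confusion matrix will show it more clearly than accuracy does.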
Summary
- Constant loss and accuracy usually mean the training pipeline has a structural mismatch.
- First verify the model can overfit a tiny subset of data.
- Make sure labels, output layer, and loss function agree.
- Check learning rate, trainable layers, and data normalization.
- Inspect predictions and batches directly instead of guessing from metrics alone.

