
Keras early stopping callback error, val_loss metric not available

Introduction

Keras reports that val_loss is not available when EarlyStopping monitors a metric that training never logs. The most common cause is calling fit without validation data, so no validation metrics are produced. The fix is usually to supply validation data or to change the monitored metric.

Why val_loss Is Missing

val_loss exists only if model training runs validation each epoch. If fit has no validation_data and no validation_split, callbacks cannot read val_loss.

python
from tensorflow import keras

# Small regression model: 10 input features, one output
model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(10,)),
    keras.layers.Dense(1)
])
model.compile(optimizer="adam", loss="mse")

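At this point, a callback that monitors val_loss will not work unless validation is configured in fit. Keras typically reports that the metric is unavailable and lists the metrics that were logged, and early stopping never triggers.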
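To make the naming rule concrete, here is a minimal sketch in plain Python. The function expected_log_keys is a hypothetical helper, not part of Keras; it only models the convention described above, where each logged name gets a val_ counterpart when validation runs.

```python
def expected_log_keys(metric_names, has_validation):
    """Return the log keys a fit() call is expected to produce.

    Simplified model (an assumption, not Keras code): the loss plus each
    compiled metric, and a "val_"-prefixed copy of each only when
    validation runs.
    """
    keys = ["loss"] + list(metric_names)
    if has_validation:
        keys += ["val_" + k for k in keys]
    return keys


print(expected_log_keys(["mae"], has_validation=False))  # ['loss', 'mae']
print(expected_log_keys(["mae"], has_validation=True))   # ['loss', 'mae', 'val_loss', 'val_mae']
```

Without validation, no key starts with val_, which is exactly the situation in which a val_loss monitor has nothing to read.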

Correct Usage with Validation Split

If data is in arrays, provide validation_split.

python
import numpy as np

# Synthetic data: 500 samples, 10 features
X = np.random.randn(500, 10)
y = np.random.randn(500, 1)

es = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=5,
    restore_best_weights=True
)

history = model.fit(
    X,
    y,
    epochs=50,
    batch_size=32,
    validation_split=0.2,  # hold out 20% of the data for validation
    callbacks=[es],
    verbose=0
)

# The logged keys now include val_loss
print(history.history.keys())

Now history.history contains val_loss, so early stopping can monitor it.

Correct Usage with Explicit Validation Data

For train and validation sets prepared separately, pass validation_data.

python
# Manual split: first 400 samples for training, last 100 for validation
X_train, X_val = X[:400], X[400:]
y_train, y_val = y[:400], y[400:]

history = model.fit(
    X_train,
    y_train,
    validation_data=(X_val, y_val),
    epochs=50,
    callbacks=[es],
    verbose=0
)

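This is preferred when you need explicit control over which samples are held out. With validation_split, Keras always takes the last fraction of the arrays, before any shuffling.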

Monitoring Training Loss Instead

If validation is intentionally disabled, monitor loss rather than val_loss.

python
es_train = keras.callbacks.EarlyStopping(
    monitor="loss",  # training loss is always logged
    patience=3,
    restore_best_weights=True
)

This still stops training once the training loss plateaus, but it does not guard against overfitting as well as validation-based monitoring, because training loss can keep improving while performance on unseen data degrades.

Inspect Available Metrics from History

When you are unsure which metric names a callback can monitor, inspect the keys of history.history after a short training run.

python
for k in history.history.keys():
    print(k)

Use these exact names in callback monitor settings.

Other callback issues that look similar:

  • typo in monitor string such as val_los
  • monitored metric not produced because it was never passed to compile
  • custom training loop not logging metric expected by callback

Always align callback monitor names with actual logs.
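One way to enforce this alignment is a small guard that compares the monitor string with the logged keys. The sketch below is plain Python; check_monitor is a hypothetical helper, not a Keras function, and it assumes you already have the key names, for example from history.history after a one-epoch trial run.

```python
import difflib


def check_monitor(monitor, available_keys):
    """Return `monitor` if it is logged, otherwise raise a descriptive error."""
    available = list(available_keys)
    if monitor in available:
        return monitor
    # Suggest the closest logged key to catch typos such as "val_los"
    hint = difflib.get_close_matches(monitor, available, n=1)
    suggestion = f" Did you mean {hint[0]!r}?" if hint else ""
    raise ValueError(
        f"Monitor {monitor!r} is not logged. "
        f"Available: {sorted(available)}.{suggestion}"
    )


# Keys as they might appear after training with validation enabled
logged = ["loss", "val_loss"]
print(check_monitor("val_loss", logged))

try:
    check_monitor("val_los", logged)
except ValueError as err:
    print(err)
```

Failing loudly here is a design choice: a mismatch surfaces before a long training run starts instead of showing up as early stopping that never triggers.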

Practical Callback Configuration

A robust early stopping setup often includes:

  • patience tuned to noise level
  • min_delta to ignore tiny fluctuations
  • restore_best_weights=True to keep best epoch state

python
es = keras.callbacks.EarlyStopping(
    monitor="val_loss",
    mode="min",
    patience=8,
    min_delta=1e-4,
    restore_best_weights=True
)

This avoids stopping too early due to minor oscillations.
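To see how patience and min_delta interact, the following plain-Python sketch applies a simplified version of the stopping rule to a list of loss values. It is an illustration under stated assumptions, not the Keras implementation: stopping_epoch is a hypothetical function, it handles only "min" mode, and it ignores options such as baseline and start_from_epoch.

```python
def stopping_epoch(losses, patience, min_delta=0.0):
    """Return the 0-based epoch at which training would stop, or None.

    Simplified "min" mode rule: an epoch counts as an improvement only if
    the loss beats the best value so far by more than min_delta.
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses with small oscillations
val_losses = [1.00, 0.80, 0.79, 0.81, 0.78, 0.785, 0.79, 0.80]

print(stopping_epoch(val_losses, patience=2))                  # 6
print(stopping_epoch(val_losses, patience=3))                  # 7
print(stopping_epoch(val_losses, patience=2, min_delta=0.05))  # 3
```

With a large min_delta, the small gains at epochs 2 and 4 no longer count as improvements, so the run stops earlier. Raising patience has the opposite effect and tolerates more noisy epochs.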

Using Validation with tf.data Pipelines

If you train with tf.data.Dataset, ensure a separate validation dataset is passed to validation_data. Without it, validation metrics are not emitted and monitor keys with val_ prefix will remain unavailable. Keep train and validation pipelines deterministic when debugging callback behavior.

python
import tensorflow as tf

# Separate batched pipelines for training and validation
train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(32)
val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val)).batch(32)

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=30,
    callbacks=[es],
    verbose=0
)

This pattern is common in production training code and avoids ambiguity about where validation metrics originate.

Log the callback configuration at training start, and the available log keys once the first epoch finishes, to catch monitor mismatches early.

During experimentation, save the training history and the epoch that was finally selected so stopping behavior can be audited. This helps differentiate real convergence from accidental early stops due to misconfigured monitor settings.
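A record like that can be derived from the history dictionary alone. The sketch below uses plain Python; summarize_run is a hypothetical helper, the sample logs are made up for illustration, and it assumes a metric where lower is better. With a real run you would pass history.history instead.

```python
import json


def summarize_run(history_dict, monitor="val_loss", epochs_requested=None):
    """Summarize the best epoch and whether training ended early."""
    values = history_dict[monitor]  # KeyError here means the monitor was never logged
    best_epoch = min(range(len(values)), key=values.__getitem__)
    return {
        "monitor": monitor,
        "epochs_run": len(values),
        "epochs_requested": epochs_requested,
        "stopped_early": epochs_requested is not None
        and len(values) < epochs_requested,
        "best_epoch": best_epoch,  # 0-based index
        "best_value": float(values[best_epoch]),
    }


# Made-up logs standing in for history.history
logs = {"loss": [0.9, 0.6, 0.5, 0.45], "val_loss": [1.0, 0.7, 0.72, 0.75]}

summary = summarize_run(logs, monitor="val_loss", epochs_requested=50)
print(json.dumps(summary, indent=2))
```

Writing this summary to a file next to the model checkpoint makes it easy to tell later whether a run stopped because validation loss stopped improving or simply ran out of epochs.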

Common Pitfalls

  • Monitoring val_loss without providing any validation data in fit.
  • Using a monitor key that does not exist in history.history.
  • Assuming callbacks can infer validation metrics from training loss.
  • Running custom loops without logging metrics expected by callbacks.
  • Forgetting restore_best_weights=True and evaluating the final-epoch model instead of the best one.

Summary

  • val_loss appears only when validation is run during training.
  • Add validation_split or validation_data to enable validation metrics.
  • If no validation is used, monitor loss instead.
  • Verify available metric names from training history.
  • Tune early stopping parameters for stable and meaningful stopping behavior.
