Does EarlyStopping in Keras save the best model?

Introduction

Keras EarlyStopping stops training when a monitored metric stops improving, but that does not automatically mean your best model is saved anywhere. The exact behavior depends on one callback setting for in-memory weights and a different callback if you want the best model written to disk.

What `EarlyStopping` Actually Does

EarlyStopping monitors a metric such as val_loss or val_accuracy during model.fit(). When the metric stops improving for the configured number of epochs, training ends early.

The important default is this: restore_best_weights=False.

That means if you use EarlyStopping with its defaults, Keras stops training, but the model instance keeps the weights from the last epoch that ran, not necessarily the best epoch.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1)
])

model.compile(optimizer="adam", loss="mse")

callback = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3
)
```

This callback can stop training early, but it does not by itself restore the best validation-loss weights.
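To make the patience rule concrete, here is a minimal pure-Python sketch of the counting logic. It is a simplification and not the Keras implementation: the function name stopping_epoch and the sample losses are made up for illustration, and options such as min_delta are ignored.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified sketch of patience-based early stopping for a metric
    where lower is better. Not the actual Keras code.
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # patience never ran out


# Hypothetical validation losses: best at epoch 3, then three worse epochs.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
print(stopping_epoch(losses, patience=3))  # 6
```

Note that the improvement at epoch 7 is never seen, because training has already stopped at epoch 6.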

Restoring the Best Weights in Memory

If you want the model object in memory to end training at the best epoch, set restore_best_weights=True.

```python
import tensorflow as tf

callback = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    restore_best_weights=True  # roll back to the best epoch when stopping
)
```

With that setting, once training stops, Keras rolls the model weights back to the best observed epoch for the monitored quantity.

That solves an important part of the problem, but it still does not save a file to disk. It only changes the state of the model currently living in memory.

Saving the Best Model to Disk

If you want a durable saved artifact, use ModelCheckpoint, usually together with save_best_only=True.

```python
import tensorflow as tf

# Assumes `model` is already compiled and that x_train, y_train,
# x_val, and y_val are defined elsewhere.

# Stops training and restores the best weights in memory.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    patience=3,
    restore_best_weights=True
)

# Writes the model to disk each time val_loss improves.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath="best_model.keras",
    monitor="val_loss",
    save_best_only=True,
    mode="min"
)

history = model.fit(
    x_train,
    y_train,
    validation_data=(x_val, y_val),
    epochs=50,
    callbacks=[early_stopping, checkpoint]
)
```

This combination does two different jobs:

EarlyStopping decides when to halt training
ModelCheckpoint decides what gets written to disk

If you only care about the final in-memory model, restore_best_weights=True may be enough. If you need to reload the best model later, use checkpointing.
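As a rough end-to-end sketch, the example below trains a tiny model with both callbacks and then reloads the checkpoint with tf.keras.models.load_model. The synthetic data, the layer sizes, and the temporary checkpoint path are assumptions chosen only to keep the example self-contained, and it assumes a TensorFlow version that supports the .keras format.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Synthetic regression data, purely illustrative.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 4)).astype("float32")
y = x.sum(axis=1, keepdims=True).astype("float32")
x_train, y_train = x[:160], y[:160]
x_val, y_val = x[160:], y[160:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Write the checkpoint into a temporary directory (an assumption for this demo).
checkpoint_path = os.path.join(tempfile.mkdtemp(), "best_model.keras")

callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True
    ),
    tf.keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_path, monitor="val_loss",
        save_best_only=True, mode="min"
    ),
]

history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=20, verbose=0, callbacks=callbacks,
)

# Later, possibly in a different process: reload the best saved model.
best_model = tf.keras.models.load_model(checkpoint_path)
predictions = best_model.predict(x_val, verbose=0)
print(predictions.shape)
```

The reloaded model does not depend on the original model object, which is the point of checkpointing: the file survives after the training process exits.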

Why This Distinction Matters

It is easy to assume "best model" means one thing, but there are really two separate questions:

what weights should the live model object have after training ends
what file should be persisted for later use

EarlyStopping answers the first question only when restore_best_weights=True. ModelCheckpoint answers the second question.

A Small Demonstration

Suppose validation loss reaches its minimum at epoch 8, then gets slightly worse, and training stops at epoch 11 because patience expires.

With default EarlyStopping, your model keeps epoch 11 weights.

With restore_best_weights=True, your in-memory model is rolled back to epoch 8 weights.

With ModelCheckpoint(save_best_only=True), the file on disk tracks the best epoch independently of whether the training loop later continues for a few more epochs.

That is why many training setups use both callbacks together.
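The scenario above can be simulated without Keras. The following toy loop is only a sketch: the loss values are invented so that the minimum falls at epoch 8, a dictionary stands in for real model weights, and a variable stands in for the checkpoint file.

```python
import copy

# Hypothetical validation losses: minimum at epoch 8, then slightly worse.
val_losses = [0.90, 0.80, 0.72, 0.65, 0.60, 0.56, 0.53, 0.50, 0.51, 0.52, 0.53]
PATIENCE = 3


def run(restore_best_weights):
    """Return (epoch of in-memory weights, epoch of weights on 'disk')."""
    weights = {"epoch": 0}     # stand-in for the live model weights
    best_loss = float("inf")
    best_weights = None        # in-memory copy kept for restoring
    checkpoint_on_disk = None  # stand-in for the saved best-model file
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        weights = {"epoch": epoch}  # "training" updates the live weights
        if loss < best_loss:
            best_loss, wait = loss, 0
            best_weights = copy.deepcopy(weights)
            checkpoint_on_disk = copy.deepcopy(weights)
        else:
            wait += 1
            if wait >= PATIENCE:
                if restore_best_weights:
                    weights = best_weights
                break
    return weights["epoch"], checkpoint_on_disk["epoch"]


print(run(restore_best_weights=False))  # (11, 8)
print(run(restore_best_weights=True))   # (8, 8)
```

In both runs the simulated checkpoint holds epoch 8, while the in-memory weights depend on the restore setting.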

Choosing the Right Monitored Metric

Make sure monitor and mode match your objective:

use monitor="val_loss" with mode="min" when lower is better
use monitor="val_accuracy" with mode="max" when higher is better

If you monitor the wrong quantity, "best" will mean the wrong thing no matter how you configure the callbacks.
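One way to see the effect of monitor and mode is to pick the best epoch from a training history by hand. The sketch below assumes a dictionary shaped like Keras's History.history (metric names mapped to per-epoch lists); the helper best_epoch and the numbers are hypothetical.

```python
def best_epoch(history, monitor, mode):
    """Return (1-based best epoch, best value) for the monitored metric."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    values = history[monitor]
    pick = min if mode == "min" else max
    best_value = pick(values)
    return values.index(best_value) + 1, best_value


# Made-up metrics for five epochs.
history = {
    "val_loss": [0.70, 0.55, 0.48, 0.52, 0.60],
    "val_accuracy": [0.71, 0.78, 0.80, 0.83, 0.82],
}

print(best_epoch(history, "val_loss", "min"))      # (3, 0.48)
print(best_epoch(history, "val_accuracy", "max"))  # (4, 0.83)
print(best_epoch(history, "val_loss", "max"))      # (1, 0.7), the worst epoch
```

The last call shows what a mismatched mode does: the "best" epoch becomes the one with the highest loss.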

Common Pitfalls

The most common mistake is assuming EarlyStopping saves a checkpoint file. It does not.

Another mistake is forgetting that restore_best_weights defaults to False. Many people expect the best epoch to be restored automatically, but Keras only does that when you ask for it.

A third pitfall is using EarlyStopping without validation data while monitoring a validation metric such as val_loss. If the metric is not present in the training logs, Keras emits a warning and the callback never triggers, so training runs for the full number of epochs.

Finally, keep your monitor consistent across callbacks. If EarlyStopping watches val_loss and ModelCheckpoint watches val_accuracy, the "best" model in memory and on disk may not match.
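A small sanity check before training can catch the last two pitfalls. The helper below is hypothetical and not part of Keras: it takes the metric each callback monitors and the metric names you expect in the logs, and reports anything suspicious.

```python
def check_monitors(callback_monitors, logged_metrics):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name, monitor in callback_monitors.items():
        if monitor not in logged_metrics:
            problems.append(f"{name} monitors '{monitor}', which is not in the logs")
    if len(set(callback_monitors.values())) > 1:
        problems.append("callbacks monitor different metrics")
    return problems


# Example: no validation data was passed, so only training metrics are logged.
logged = ["loss", "accuracy"]
monitors = {"EarlyStopping": "val_loss", "ModelCheckpoint": "val_accuracy"}

for problem in check_monitors(monitors, logged):
    print(problem)
```

With consistent monitors and validation metrics present in the logs, the function returns an empty list.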

Summary

EarlyStopping does not automatically save the best model to disk
By default, it also does not restore the best weights in memory
Set restore_best_weights=True if you want the final model object to use the best epoch's weights
Use ModelCheckpoint(save_best_only=True) if you want the best model saved as a file
Many training pipelines use both callbacks together for clean stopping and reliable persistence
