Applying callbacks in a custom training loop in TensorFlow 2.0
Introduction
TensorFlow callbacks work automatically with model.fit, but they do not magically wire themselves into a hand-written training loop. In a custom loop, you have to call the callback hooks yourself and provide the metrics that the callbacks expect. Once that is clear, you can still use familiar tools such as early stopping, checkpointing, and progress logging without giving up low-level control.
What Changes in a Custom Loop
With model.fit, TensorFlow owns the training lifecycle. It knows when an epoch starts, when a batch ends, and what logs should be passed to each callback. In a custom loop, you own that lifecycle, so you must drive the callbacks explicitly.
The usual setup looks like this:
- Create the model, optimizer, and loss.
- Build a CallbackList.
- Attach the model with set_model.
- Call lifecycle methods such as on_train_begin, on_epoch_begin, and on_epoch_end.
- Pass metric values in the logs dictionary.
If you skip those steps, the callbacks may exist but they will not do anything useful.
A Minimal Working Example
The example below trains a tiny regression model with a custom loop and a normal Keras callback list.
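A minimal sketch of that pattern, assuming TensorFlow 2.x with eager execution. The synthetic data, the hyperparameters, and the variable names (epoch_losses, batch_size, and so on) are illustrative assumptions, and monitoring the training loss with EarlyStopping is just one possible choice.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# Synthetic data for y = 3x + 2 with a little noise (illustrative only).
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 1)).astype("float32")
y = (3.0 * x + 2.0 + 0.1 * rng.normal(size=(256, 1))).astype("float32")
batch_size = 32
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

# Wrap ordinary Keras callbacks and give them the context they expect.
epochs = 20
callbacks = tf.keras.callbacks.CallbackList(
    [tf.keras.callbacks.EarlyStopping(monitor="loss", mode="min", patience=2)]
)
callbacks.set_model(model)
callbacks.set_params(
    {"epochs": epochs, "steps": len(x) // batch_size, "verbose": 0}
)

epoch_losses = []
callbacks.on_train_begin()
for epoch in range(epochs):
    callbacks.on_epoch_begin(epoch)
    batch_losses = []
    for step, (x_batch, y_batch) in enumerate(dataset):
        callbacks.on_train_batch_begin(step)
        with tf.GradientTape() as tape:
            loss = loss_fn(y_batch, model(x_batch, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        batch_losses.append(float(loss))
        callbacks.on_train_batch_end(step, {"loss": batch_losses[-1]})

    # The key in this dict must match the name EarlyStopping monitors.
    epoch_losses.append(float(np.mean(batch_losses)))
    callbacks.on_epoch_end(epoch, {"loss": epoch_losses[-1]})

    # EarlyStopping only sets a flag; the loop has to honor it.
    if model.stop_training:
        break
callbacks.on_train_end()
```

Nothing stops the loop automatically: EarlyStopping sets model.stop_training, and the explicit check after on_epoch_end is what ends training.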
This is the core pattern. The callbacks remain standard Keras callbacks, but the loop decides when each event fires.
Feed the Right Metrics to the Callbacks
Callbacks such as EarlyStopping and ReduceLROnPlateau depend on metric names in logs. If you monitor val_loss but never compute or pass val_loss, the callback cannot make decisions.
For example, if you want validation-based early stopping, run a validation pass at the end of each epoch and include that value:
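One way this could look, as a sketch under assumptions: the train/validation split, the synthetic data, and the patience value are made up for illustration, and tf.keras.metrics.Mean is used only as a convenient way to average batch losses.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(1)

# Illustrative synthetic data, split into training and validation parts.
rng = np.random.default_rng(1)
x = rng.normal(size=(320, 1)).astype("float32")
y = (3.0 * x + 2.0 + 0.1 * rng.normal(size=(320, 1))).astype("float32")
train_ds = tf.data.Dataset.from_tensor_slices((x[:256], y[:256])).batch(32)
val_ds = tf.data.Dataset.from_tensor_slices((x[256:], y[256:])).batch(32)

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

callbacks = tf.keras.callbacks.CallbackList(
    [tf.keras.callbacks.EarlyStopping(monitor="val_loss", mode="min", patience=2)]
)
callbacks.set_model(model)

logs = {}
callbacks.on_train_begin()
for epoch in range(30):
    callbacks.on_epoch_begin(epoch)

    # Training pass.
    train_loss = tf.keras.metrics.Mean()
    for x_batch, y_batch in train_ds:
        with tf.GradientTape() as tape:
            loss = loss_fn(y_batch, model(x_batch, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        train_loss.update_state(loss)

    # Validation pass: no gradient updates, only the metric.
    val_loss = tf.keras.metrics.Mean()
    for x_batch, y_batch in val_ds:
        val_loss.update_state(loss_fn(y_batch, model(x_batch, training=False)))

    # "val_loss" here is the exact name EarlyStopping was told to monitor.
    logs = {
        "loss": float(train_loss.result()),
        "val_loss": float(val_loss.result()),
    }
    callbacks.on_epoch_end(epoch, logs)
    if model.stop_training:
        break
callbacks.on_train_end()
```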
The metric names must match the callback configuration exactly.
Custom Callbacks Still Work Well
You can also write your own callback class and use it in the same loop. That is often useful for printing domain-specific diagnostics or saving artifacts that model.fit does not understand.
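A small sketch of such a class follows. BestLossTracker is a hypothetical name, and the loss values passed in are placeholders standing in for numbers a real training loop would compute; they are not results from any actual run.

```python
import tensorflow as tf


class BestLossTracker(tf.keras.callbacks.Callback):
    """Hypothetical callback that remembers the best epoch loss seen so far."""

    def on_train_begin(self, logs=None):
        self.best_loss = float("inf")
        self.best_epoch = None

    def on_epoch_end(self, epoch, logs=None):
        loss = (logs or {}).get("loss")
        if loss is not None and loss < self.best_loss:
            self.best_loss = loss
            self.best_epoch = epoch
            print(f"epoch {epoch}: new best loss {loss:.4f}")


tracker = BestLossTracker()
callbacks = tf.keras.callbacks.CallbackList([tracker])

# Placeholder losses; in a real loop these come from the training step.
placeholder_losses = [0.9, 0.5, 0.6, 0.4]

callbacks.on_train_begin()
for epoch, loss in enumerate(placeholder_losses):
    callbacks.on_epoch_begin(epoch)
    callbacks.on_epoch_end(epoch, {"loss": loss})
callbacks.on_train_end()
```

The callback only overrides the hooks it cares about; every other lifecycle method falls back to the no-op defaults inherited from tf.keras.callbacks.Callback.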
Add it to the callbacks list and it will receive the same lifecycle events as built-in callbacks.
Know When a Custom Loop Is Worth It
Custom loops are best when you need manual gradient accumulation, multiple optimizers, reinforcement learning updates, or non-standard batch logic. If the training flow is ordinary supervised learning, model.fit is still simpler and gives you callback support for free. The goal is control where you actually need it, not replacing higher-level APIs by habit.
Common Pitfalls
- Creating callbacks but never wrapping them in a CallbackList and never calling lifecycle methods.
- Monitoring val_loss or another metric name that is never included in the logs dictionary.
- Forgetting to call set_model and set_params, which leaves some callbacks without required context.
- Ignoring model.stop_training after EarlyStopping requests termination.
- Rewriting callback behavior manually when standard Keras callbacks would already handle it cleanly.
Summary
- Callbacks work in custom TensorFlow loops, but you must drive them explicitly.
- Use CallbackList plus lifecycle hooks such as on_train_begin and on_epoch_end.
- Pass metric values through logs using names that match the callback configuration.
- Built-in and custom callbacks can both be reused in this pattern.
- Choose a custom loop only when you need training behavior that model.fit cannot express clearly.

