How to use TensorFlow metrics in Keras
Introduction
In Keras, metrics are the values you want to monitor during training and evaluation in addition to the loss. TensorFlow makes this straightforward through built-in metric names, metric classes, and custom metric implementations when the defaults are not enough.
Add Built-In Metrics in compile
The most common pattern is to pass metrics when compiling the model. Keras then reports them during fit and evaluate, and records their per-epoch values in the History object returned by fit.
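A minimal sketch of that pattern follows. The toy data, layer sizes, and metric names are assumptions chosen for illustration, not a recommended configuration.

```python
import numpy as np
import tensorflow as tf

# Toy data, invented for illustration: 64 samples, 8 features, binary labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8)).astype("float32")
y = rng.integers(0, 2, size=(64, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# The loss is what gets optimized; the metrics are only reported.
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[
        tf.keras.metrics.BinaryAccuracy(name="accuracy"),
        tf.keras.metrics.Precision(name="precision"),
        tf.keras.metrics.Recall(name="recall"),
        tf.keras.metrics.AUC(name="auc"),
    ],
)

history = model.fit(
    x, y, epochs=2, batch_size=16, validation_split=0.25, verbose=0
)

# Each metric appears under its name, plus a val_ copy for validation data.
print(sorted(history.history.keys()))
```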
This setup is typical for binary classification. The loss drives optimization, while the metrics help you interpret model behavior.
Choose Metrics That Match the Task
Metrics should align with the prediction problem:
- classification: accuracy, precision, recall, AUC
- regression: MAE, MSE, RMSE
- ranking or sequence tasks: more specialized metrics depending on the domain
It is common to start with one metric and then realize it hides important behavior. For example, accuracy can look good on imbalanced data even when recall is poor for the minority class.
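The accuracy example can be checked with a small worked case. The sketch below assumes a made-up dataset of 95 negatives and 5 positives and a degenerate model that always predicts the negative class; the metrics are used standalone, outside of any model.

```python
import tensorflow as tf

# Assumed toy labels: 95 negatives followed by 5 positives.
y_true = tf.constant([[0.0]] * 95 + [[1.0]] * 5)
# A degenerate "model" that predicts the negative class for every sample.
y_pred = tf.zeros_like(y_true)

accuracy = tf.keras.metrics.BinaryAccuracy()
recall = tf.keras.metrics.Recall()
accuracy.update_state(y_true, y_pred)
recall.update_state(y_true, y_pred)

acc_value = float(accuracy.result())     # 95 of 100 predictions are correct
recall_value = float(recall.result())    # 0 of 5 positives are found
print(f"accuracy={acc_value:.2f} recall={recall_value:.2f}")
```

Accuracy comes out at 0.95 while recall is 0, which is why accuracy alone is a weak signal on data like this.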
Metric Names Versus Metric Objects
Keras accepts simple string names for many standard metrics.
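As a rough illustration, the string form might look like the sketch below. The three-class toy data and the model are assumptions made for the example. Keras resolves the string "accuracy" to an accuracy variant that fits the loss and output shape.

```python
import numpy as np
import tensorflow as tf

# Toy 3-class data, invented for illustration: integer labels in {0, 1, 2}.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4)).astype("float32")
y = rng.integers(0, 3, size=(32,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Metrics given as plain strings; no configuration is possible in this form.
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

results = model.evaluate(x, y, verbose=0, return_dict=True)
print(results)
```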
That is fine for common cases. Metric objects are better when you want configuration or clear naming.
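A sketch of the object form, assuming a binary classifier and an arbitrary decision threshold of 0.3 picked only for illustration. The metric names are also made up; they show how explicit naming keeps the logs readable.

```python
import numpy as np
import tensorflow as tf

# Toy binary data, invented for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4)).astype("float32")
y = rng.integers(0, 2, size=(32, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[
        # Threshold and names are illustrative choices, not defaults.
        tf.keras.metrics.Precision(thresholds=0.3, name="precision_at_0_3"),
        tf.keras.metrics.Recall(thresholds=0.3, name="recall_at_0_3"),
        tf.keras.metrics.AUC(curve="PR", name="pr_auc"),
    ],
)

# Results are keyed by the names given to the metric objects.
results = model.evaluate(x, y, verbose=0, return_dict=True)
print(sorted(results.keys()))
```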
The object form is also easier to extend when thresholds or other parameters matter.
Example for Regression
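A regression setup along these lines might look like the following sketch. The synthetic linear data, the single Dense layer, and the choice of MSE as the loss are assumptions made for the example.

```python
import numpy as np
import tensorflow as tf

# Synthetic linear data with a little noise, invented for illustration.
rng = np.random.default_rng(0)
x = rng.normal(size=(128, 3)).astype("float32")
true_w = np.array([[1.5], [-2.0], [0.5]], dtype="float32")
y = x @ true_w + 0.1 * rng.normal(size=(128, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1),
])

model.compile(
    optimizer="adam",
    loss="mse",  # assumed loss; MAE and RMSE below are only monitored
    metrics=[
        tf.keras.metrics.MeanAbsoluteError(name="mae"),
        tf.keras.metrics.RootMeanSquaredError(name="rmse"),
    ],
)

history = model.fit(x, y, epochs=3, batch_size=32, verbose=0)
print(history.history["mae"], history.history["rmse"])
```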
This produces training logs with both MAE and RMSE, which are often easier to interpret than the loss alone.
Writing a Custom Metric
If the built-in metrics do not match your business requirement, subclass tf.keras.metrics.Metric.
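The sketch below shows one possible shape for such a subclass. WithinTolerance is a hypothetical metric invented for this example: it reports the fraction of predictions whose absolute error is at most a given tolerance. For brevity it ignores sample_weight.

```python
import tensorflow as tf


class WithinTolerance(tf.keras.metrics.Metric):
    """Hypothetical metric: share of predictions with |error| <= tolerance."""

    def __init__(self, tolerance=1.0, name="within_tolerance", **kwargs):
        super().__init__(name=name, **kwargs)
        self.tolerance = tolerance
        # State variables accumulate across batches within an epoch.
        self.hits = self.add_weight(name="hits", initializer="zeros")
        self.total = self.add_weight(name="total", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        # sample_weight is accepted but ignored in this sketch.
        y_true = tf.cast(y_true, tf.float32)
        y_pred = tf.cast(y_pred, tf.float32)
        hit = tf.cast(tf.abs(y_true - y_pred) <= self.tolerance, tf.float32)
        self.hits.assign_add(tf.reduce_sum(hit))
        self.total.assign_add(tf.cast(tf.size(hit), tf.float32))

    def result(self):
        # divide_no_nan returns 0 when no samples have been seen yet.
        return tf.math.divide_no_nan(self.hits, self.total)

    def reset_state(self):
        self.hits.assign(0.0)
        self.total.assign(0.0)


# Standalone check: errors are 0.2, 0.9, 0.0, 6.0 with tolerance 0.5.
metric = WithinTolerance(tolerance=0.5)
metric.update_state([[1.0], [2.0], [3.0], [4.0]], [[1.2], [2.9], [3.0], [10.0]])
value = float(metric.result())
print(value)
```

An instance can then be passed to compile like any built-in metric object, for example metrics=[WithinTolerance(tolerance=0.5)].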
Custom metrics are helpful when you need reporting logic that is meaningful to the application rather than just to the optimizer.
Common Pitfalls
- Expecting metrics to influence gradient updates is a conceptual mistake. The loss drives optimization; metrics are primarily for monitoring.
- Choosing accuracy alone for imbalanced classification can hide poor minority-class performance. Add precision, recall, AUC, or task-specific metrics when appropriate.
- Using metrics that do not match the model's output shape or label format causes confusing logs or errors. Make sure each metric fits both.
- Writing a custom metric as a plain Python calculation can break TensorFlow graph execution or state management. Use tf.keras.metrics.Metric for stateful custom metrics.
- Comparing models only by one visible metric can lead to bad decisions. Review loss, validation metrics, and domain-specific constraints together.
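The label-format pitfall can be illustrated with the two built-in categorical accuracy metrics. In this sketch the probabilities and labels are made up: SparseCategoricalAccuracy expects integer class ids, while CategoricalAccuracy expects one-hot labels, and each gives the same answer when paired with the format it expects.

```python
import tensorflow as tf

# Assumed model output: class probabilities for 3 samples and 3 classes.
y_prob = tf.constant([
    [0.1, 0.8, 0.1],   # predicted class 1
    [0.7, 0.2, 0.1],   # predicted class 0
    [0.2, 0.2, 0.6],   # predicted class 2
])
labels_int = tf.constant([1, 0, 1])              # integer class ids
labels_onehot = tf.one_hot(labels_int, depth=3)  # same labels, one-hot

# Integer labels pair with the sparse variant.
sparse_acc = tf.keras.metrics.SparseCategoricalAccuracy()
sparse_acc.update_state(labels_int, y_prob)

# One-hot labels pair with the non-sparse variant.
onehot_acc = tf.keras.metrics.CategoricalAccuracy()
onehot_acc.update_state(labels_onehot, y_prob)

sparse_value = float(sparse_acc.result())
onehot_value = float(onehot_acc.result())
print(sparse_value, onehot_value)  # two of three predictions are correct
```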
Summary
- In Keras, metrics are monitoring signals reported during training and evaluation.
- You can pass built-in metrics by string or as metric objects in compile.
- Choose metrics that match the prediction task and class balance.
- Custom metrics are possible by subclassing tf.keras.metrics.Metric.
- Metrics help you understand model behavior, but the loss is what training actually optimizes.

