How to test a custom loss function in Keras?

Introduction

Testing a custom Keras loss function should happen before you trust any training result that uses it. A loss can compile successfully and still be mathematically wrong, numerically unstable, or incompatible with the tensor shapes your model actually produces.

The safest approach is to test the loss in layers: exact values on tiny tensors, shape behavior, gradients, and then one small integration run through model.compile and fit. That gives you fast feedback before a bug burns hours of training time.

Start With a Tiny Deterministic Example

Before involving a model, call the loss directly on a case where you can compute the answer by hand:

python
import tensorflow as tf


def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred) + 0.1)


y_true = tf.constant([[1.0], [2.0]])
y_pred = tf.constant([[1.5], [1.5]])

value = custom_loss(y_true, y_pred).numpy()
print(value)

For this example, the squared errors are 0.25 and 0.25. After adding 0.1 to each, the mean is 0.35. That makes the expected answer obvious.

Turn that into an assertion:

python
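import numpy as np
import tensorflow as tf

# custom_loss is the function defined in the previous example.


def test_custom_loss_value():
    y_true = tf.constant([[1.0], [2.0]])
    y_pred = tf.constant([[1.5], [1.5]])

    got = custom_loss(y_true, y_pred).numpy()
    expected = 0.35

    # isclose tolerates float32 rounding in the computed value.
    assert np.isclose(got, expected)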

This simple test catches a surprising number of mistakes immediately.
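Beyond a single hand-computed case, the same loss can be checked against an independent reference on random inputs. The sketch below assumes a NumPy re-implementation of the same formula; the helper name reference_loss, the seed, and the tolerances are illustrative choices, not part of the original example.

```python
import numpy as np
import tensorflow as tf


def custom_loss(y_true, y_pred):
    # Squared error plus a constant 0.1 per element, averaged over all elements.
    return tf.reduce_mean(tf.square(y_true - y_pred) + 0.1)


def reference_loss(y_true, y_pred):
    # Hypothetical NumPy reference, written independently of TensorFlow.
    return np.mean((y_true - y_pred) ** 2 + 0.1)


def test_custom_loss_matches_reference():
    rng = np.random.default_rng(0)
    y_true = rng.normal(size=(8, 1)).astype(np.float32)
    y_pred = rng.normal(size=(8, 1)).astype(np.float32)

    got = custom_loss(tf.constant(y_true), tf.constant(y_pred)).numpy()
    expected = reference_loss(y_true, y_pred)

    # float32 arithmetic can differ slightly between libraries, so allow a tolerance.
    assert np.isclose(got, expected, rtol=1e-5, atol=1e-6)


test_custom_loss_matches_reference()
```

A reference comparison like this is most useful when the reference is written in a deliberately different style from the TensorFlow code, so the two are unlikely to share the same bug.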

Check Shape and Reduction Behavior

Many custom loss bugs come from reducing along the wrong axis or returning a scalar when Keras expects a per-sample value. Test the shape explicitly:

python
import tensorflow as tf


def per_sample_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)


y_true = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y_pred = tf.constant([[1.0, 1.0], [5.0, 4.0]])

result = per_sample_loss(y_true, y_pred)
print(result.shape)
print(result.numpy())

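For these inputs the result has shape (2,) with values 0.5 and 2.0, one loss per sample. If the result shape is not what you intended, the model may still run, but the training semantics can be off: for example, per-sample weights cannot single out individual examples once the loss has already been reduced to a scalar.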
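The shape check can be turned into an assertion in the same way as the value check. This is a minimal sketch using the per_sample_loss function from the example above; the test name is an assumption.

```python
import numpy as np
import tensorflow as tf


def per_sample_loss(y_true, y_pred):
    # Average over the feature axis only, leaving one value per sample.
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)


def test_per_sample_loss_shape_and_values():
    y_true = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    y_pred = tf.constant([[1.0, 1.0], [5.0, 4.0]])

    result = per_sample_loss(y_true, y_pred)

    # The batch axis is kept and the feature axis is reduced.
    assert tuple(result.shape) == (2,)
    # Row 0: mean(0, 1) = 0.5; row 1: mean(4, 0) = 2.0.
    np.testing.assert_allclose(result.numpy(), [0.5, 2.0])


test_per_sample_loss_shape_and_values()
```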

Verify Gradients Exist and Are Finite

A mathematically reasonable loss is still useless if TensorFlow cannot differentiate through it correctly. GradientTape is the fastest way to verify that:

python
import tensorflow as tf


def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))


y_true = tf.constant([[1.0], [2.0]])
# A Variable is watched by the tape automatically.
y_pred = tf.Variable([[1.5], [1.5]], dtype=tf.float32)

with tf.GradientTape() as tape:
    loss_value = custom_loss(y_true, y_pred)

grads = tape.gradient(loss_value, y_pred)
# None means the loss is not connected to y_pred by differentiable ops.
assert grads is not None, "Gradient is None"
tf.debugging.assert_all_finite(grads, "Gradient contains NaN or Inf")
print(grads.numpy())

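A None gradient means the loss is not connected to the predictions through differentiable operations; NaN or Inf means the math broke down for these inputs. In any of these cases, training will either fail or behave unpredictably.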
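When the derivative is simple enough to work out by hand, the gradient values can be tested too, not just their finiteness. The sketch below assumes the plain mean-squared-error loss from the example above, whose gradient with respect to each prediction is 2 * (y_pred - y_true) / N, where N is the number of elements; the test name is illustrative.

```python
import numpy as np
import tensorflow as tf


def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))


def test_custom_loss_gradient():
    y_true = tf.constant([[1.0], [2.0]])
    y_pred = tf.Variable([[1.5], [1.5]], dtype=tf.float32)

    with tf.GradientTape() as tape:
        loss_value = custom_loss(y_true, y_pred)
    grads = tape.gradient(loss_value, y_pred)

    # None would mean the loss is disconnected from y_pred.
    assert grads is not None

    # Analytic gradient of mean squared error: 2 * (y_pred - y_true) / N.
    n = y_pred.numpy().size
    expected = 2.0 * (y_pred.numpy() - y_true.numpy()) / n
    np.testing.assert_allclose(grads.numpy(), expected, rtol=1e-6)


test_custom_loss_gradient()
```

For these inputs the expected gradient is 0.5 for the first prediction and -0.5 for the second, which matches the intuition that the first prediction is too high and the second too low.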

Add a Small Integration Test

After unit-testing the loss in isolation, prove that it works inside a real Keras training loop:

python
import tensorflow as tf


def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))


model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])

model.compile(optimizer="adam", loss=custom_loss)

x = tf.constant([[0.0], [1.0], [2.0], [3.0]])
y = tf.constant([[0.0], [2.0], [4.0], [6.0]])

history = model.fit(x, y, epochs=2, verbose=0)
print(history.history["loss"])

This test does not prove the loss is scientifically correct, but it does prove Keras can compile, backpropagate, and train with it.
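The integration run can also carry assertions, so that a test runner fails when the loss stops being finite or stops decreasing. The following is a sketch under stated assumptions: it swaps in plain SGD with a learning rate of 0.05, uses 20 epochs, and feeds the whole dataset as one batch, so that the loss on this tiny linear problem should decrease steadily. None of those settings come from the original example.

```python
import numpy as np
import tensorflow as tf


def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))


def test_loss_trains_in_keras():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(1,)),
        tf.keras.layers.Dense(1),
    ])
    # Plain SGD with a small step (an assumption for this sketch) keeps the
    # run easy to reason about on this tiny linear problem.
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.05),
        loss=custom_loss,
    )

    x = tf.constant([[0.0], [1.0], [2.0], [3.0]])
    y = tf.constant([[0.0], [2.0], [4.0], [6.0]])

    # batch_size=4 puts all samples in one batch, so each epoch is one step.
    history = model.fit(x, y, epochs=20, batch_size=4, verbose=0)
    losses = history.history["loss"]

    assert np.all(np.isfinite(losses))
    # The data is exactly linear, so the loss should go down over training.
    assert losses[-1] < losses[0]


test_loss_trains_in_keras()
```

Asserting only on the trend (last loss below first loss) rather than on an exact value keeps the test robust to random weight initialization.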

Test Numerical Edge Cases

If your loss uses division, logarithms, exponentials, or square roots, add tests for boundary values:

  • zeros
  • tiny positive values
  • very large values
  • invalid negative values if the math forbids them

These cases are where silent instability usually lives. An epsilon term may be necessary, but test that the stabilized version still behaves the way you intend.
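As an illustration of such boundary tests, the sketch below uses a hypothetical logarithm-based loss that is not part of the article's examples. The names naive_log_loss, stable_log_loss, and EPS, as well as the epsilon value, are assumptions chosen for the demonstration.

```python
import numpy as np
import tensorflow as tf

EPS = 1e-7  # hypothetical stabilizing constant; choose one that suits your loss


def naive_log_loss(y_true, y_pred):
    # Unprotected logarithm: log(0) is -inf, and 0 * -inf is NaN.
    return -tf.reduce_mean(y_true * tf.math.log(y_pred))


def stable_log_loss(y_true, y_pred):
    # Clip predictions away from zero before taking the logarithm.
    y_pred = tf.clip_by_value(y_pred, EPS, 1.0)
    return -tf.reduce_mean(y_true * tf.math.log(y_pred))


def test_log_loss_edge_cases():
    y_true = tf.constant([[1.0], [0.0], [1.0]])
    # Boundary inputs: exact zeros and a tiny positive value.
    y_pred = tf.Variable([[0.0], [0.0], [1e-30]])

    # The naive version breaks on these inputs.
    assert not np.isfinite(naive_log_loss(y_true, y_pred).numpy())

    with tf.GradientTape() as tape:
        loss_value = stable_log_loss(y_true, y_pred)
    grads = tape.gradient(loss_value, y_pred)

    assert np.isfinite(loss_value.numpy())
    assert np.all(np.isfinite(grads.numpy()))

    # The stabilized loss should still agree with the naive one on ordinary inputs.
    ordinary = tf.constant([[0.9], [0.2], [0.6]])
    np.testing.assert_allclose(
        stable_log_loss(y_true, ordinary).numpy(),
        naive_log_loss(y_true, ordinary).numpy(),
        rtol=1e-6,
    )


test_log_loss_edge_cases()
```

The last assertion is the one that checks the stabilized version "still behaves the way you intend": away from the boundary, clipping should not change the result.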

Common Pitfalls

The biggest mistake is testing only model.compile and assuming the loss math is correct because Keras accepted it. Compilation is a very weak test.

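Another common problem is forgetting to test gradients. A loss that returns a scalar number is not automatically differentiable in a useful way: operations such as tf.argmax or casts to integer types cut the gradient path even though the forward value looks fine.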

People also mix up per-sample and globally reduced losses. That can change optimization behavior even when the model appears to train normally.
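One place the difference becomes visible is per-sample weighting. The sketch below does not call Keras itself; it emulates a weighted batch average by hand to show that weights can only exclude an individual example while the loss still has one value per sample. The helper weighted_mean and the weights are assumptions for illustration, and the exact reduction a given Keras version applies may differ.

```python
import tensorflow as tf


def per_sample_loss(y_true, y_pred):
    # One loss value per sample.
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)


def scalar_loss(y_true, y_pred):
    # A single value for the whole batch.
    return tf.reduce_mean(tf.square(y_true - y_pred))


def weighted_mean(loss_values, weights):
    # Hypothetical stand-in for a framework's weighting step:
    # multiply by the weights, then average over the batch.
    return tf.reduce_mean(loss_values * weights)


y_true = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y_pred = tf.constant([[1.0, 1.0], [5.0, 4.0]])
weights = tf.constant([1.0, 0.0])  # intended to ignore the second sample

per_sample = weighted_mean(per_sample_loss(y_true, y_pred), weights)
collapsed = weighted_mean(scalar_loss(y_true, y_pred), weights)

print(per_sample.numpy())  # only the first sample's error (0.5) contributes
print(collapsed.numpy())   # the second sample's error is still mixed in
```

With the per-sample loss the weighted average is 0.25, while the already-reduced loss gives 0.625, because the error of the supposedly ignored sample was folded into the scalar before the weights were applied.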

Finally, do not ignore numerical edge cases. If the loss can explode on a rare batch, it will eventually do so in training.

Summary

  • Test custom losses directly on tiny tensors with hand-computed expected answers.
  • Verify shape and reduction behavior explicitly.
  • Check gradients with GradientTape and reject non-finite results.
  • Add one small compile-and-fit integration test.
  • Include numerical edge-case tests for losses that use unstable operations.
