Learning Rate in a Custom Training Loop for TensorFlow 2.0

Introduction

In a TensorFlow 2 custom training loop, the learning rate is not controlled by model.fit, but by the optimizer you create and use inside the loop. That gives you more flexibility, but it also means you must manage the value deliberately, whether it stays constant, changes by schedule, or is updated manually during training.

The Learning Rate Lives on the Optimizer

In a custom loop, the optimizer is the object that applies gradients, so the learning rate belongs there as well.

python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1)
])

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
loss_fn = tf.keras.losses.MeanSquaredError()

Every time you call optimizer.apply_gradients(...), that learning rate influences the update size.
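
To make "influences the update size" concrete, here is a minimal sketch in plain Python that uses the vanilla gradient-descent rule, w = w - lr * g. It rests on a simplifying assumption: Adam, the optimizer used above, rescales gradients with running moment estimates before stepping, although the learning rate still multiplies the final update. The function name `sgd_update` is hypothetical and not part of TensorFlow.

```python
def sgd_update(weight, gradient, learning_rate):
    """Return the weight after one plain gradient-descent step (illustrative only)."""
    return weight - learning_rate * gradient


weight, gradient = 1.0, 2.0

# Same weight and gradient, two different learning rates.
small_step = sgd_update(weight, gradient, learning_rate=1e-3)
large_step = sgd_update(weight, gradient, learning_rate=1e-1)

# The larger rate moves the weight roughly 100 times further.
print("move with lr=1e-3:", weight - small_step)
print("move with lr=1e-1:", weight - large_step)
```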

A Minimal Custom Training Step

A typical TensorFlow 2 custom loop uses GradientTape.

python
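import tensorflow as tf


# model, optimizer, and loss_fn are the objects created in the previous snippet.
@tf.function
def train_step(x, y):
    # Record the forward pass so gradients can be computed afterwards.
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_fn(y, predictions)

    gradients = tape.gradient(loss, model.trainable_variables)
    # The optimizer applies its own learning rate here.
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss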

The important point is that train_step never receives a learning rate: the optimizer object already holds it. You do not pass it separately into train_step unless you are intentionally managing it yourself.

You Can Read or Change It During Training

If you want to inspect or modify the learning rate, use the optimizer directly.

python
print("initial lr:", float(optimizer.learning_rate.numpy()))
optimizer.learning_rate.assign(5e-4)
print("updated lr:", float(optimizer.learning_rate.numpy()))

This is useful when you want to reduce the step size after a certain number of epochs or after validation stops improving.
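
The "after validation stops improving" case can be sketched as a small bookkeeping helper. This is an illustration in plain Python, not a TensorFlow API: the class name `PlateauTracker` and its parameters `factor`, `patience`, and `min_lr` are hypothetical, and the validation losses are made-up numbers chosen only to trigger one reduction.

```python
class PlateauTracker:
    """Hypothetical helper: suggests a lower learning rate when validation loss stalls."""

    def __init__(self, factor=0.5, patience=2, min_lr=1e-6):
        self.factor = factor        # multiplier applied on each reduction
        self.patience = patience    # non-improving epochs tolerated before reducing
        self.min_lr = min_lr        # floor for the learning rate
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss, current_lr):
        """Return the learning rate to use for the next epoch."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
            return current_lr
        self.bad_epochs += 1
        if self.bad_epochs >= self.patience:
            self.bad_epochs = 0
            return max(current_lr * self.factor, self.min_lr)
        return current_lr


tracker = PlateauTracker(factor=0.5, patience=2)
lr = 1e-3
history = []
for val_loss in [0.9, 0.7, 0.71, 0.72, 0.6]:  # illustrative values only
    lr = tracker.step(val_loss, lr)
    history.append(lr)

print(history)
```

In a real loop, the value returned by `step` would be written back with optimizer.learning_rate.assign(lr), as in the snippet above.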

Schedules Work in Custom Loops Too

A cleaner way to vary the learning rate is to pass a schedule into the optimizer.

python
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,
    decay_steps=1000,
    decay_rate=0.96,
    staircase=True,
)

optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)

Now the learning rate changes automatically as the optimizer step count increases, even though you are not using model.fit.

This is often preferable to writing manual if epoch == ... logic in the loop.
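
To see what the schedule above produces, the documented exponential-decay formula, initial_learning_rate * decay_rate ^ (step / decay_steps) with the exponent floored when staircase=True, can be mirrored in plain Python. The function `exponential_decay` below is a hypothetical stand-in written for illustration; it is not the TensorFlow implementation and works in float64 rather than float32.

```python
import math


def exponential_decay(step, initial_learning_rate=1e-3, decay_steps=1000,
                      decay_rate=0.96, staircase=True):
    """Plain-Python mirror of the exponential-decay formula (illustrative only)."""
    exponent = step / decay_steps
    if staircase:
        # Staircase mode holds the rate constant within each block of decay_steps.
        exponent = math.floor(exponent)
    return initial_learning_rate * decay_rate ** exponent


# The rate stays flat for the first 1000 steps, then drops by 4% per block.
for step in (0, 999, 1000, 2500):
    print(step, exponential_decay(step))
```

With the real optimizer, the schedule object is itself callable on a step count, so schedule(optimizer.iterations) gives the value currently in effect.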

Manual Epoch-Level Control Is Still Fine

Sometimes you do want direct control because the schedule depends on custom metrics or a training phase transition.

python
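# train_dataset is assumed to be a tf.data.Dataset yielding (x, y) batches.
for epoch in range(20):
    # Drop the learning rate once, at the start of epoch 10.
    if epoch == 10:
        optimizer.learning_rate.assign(1e-4)

    for x_batch, y_batch in train_dataset:
        loss = train_step(x_batch, y_batch)

    # loss here is the value from the last batch of the epoch.
    print("epoch", epoch, "lr", float(optimizer.learning_rate.numpy()), "loss", float(loss.numpy()))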

That pattern is simple and perfectly reasonable when the learning-rate change is tied to coarse training milestones.

Choose the Value Based on Stability, Not Habit

The right learning rate depends on the model, optimizer, normalization, batch size, and loss landscape. There is no universally correct number.

A good debugging pattern is:

  • start with a commonly reasonable value for the chosen optimizer
  • watch whether the loss explodes, stalls, or oscillates
  • lower the value if training is unstable
  • consider a schedule if early progress is good but later convergence is noisy
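
The symptoms in the list above can be reproduced on a toy problem. The sketch below runs plain gradient descent on f(w) = w**2, where each step multiplies w by (1 - 2 * lr), so the behavior for each rate is easy to reason about. It is an assumption-laden illustration: real models are not quadratic, and the function name `final_loss` and the chosen rates are hypothetical.

```python
def final_loss(learning_rate, steps=50, start=1.0):
    """Run gradient descent on f(w) = w**2 and return the loss after the last step."""
    w = start
    for _ in range(steps):
        gradient = 2.0 * w              # derivative of w**2
        w -= learning_rate * gradient
    return w ** 2


# 1e-4 barely moves (stall), 1e-1 converges smoothly,
# 0.9 overshoots on every step but still shrinks (oscillation),
# 1.1 grows on every step (explosion).
for lr in (1e-4, 1e-1, 0.9, 1.1):
    print(f"lr={lr:g}  final loss={final_loss(lr):.3e}")
```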

In other words, the custom loop changes how you apply the learning rate, not the basic tuning principles behind it.

Common Pitfalls

The most common mistake is looking for a separate TensorFlow 2 custom-loop API just for learning rate when the optimizer already owns that setting.

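Another common issue is forgetting that nothing schedules the learning rate for you in a custom loop. Callbacks that model.fit would run, such as LearningRateScheduler or ReduceLROnPlateau, are not invoked, so any decay has to come from a schedule object passed to the optimizer or from your own code. Developers also often change a plain Python variable that once held the learning rate and assume the optimizer follows it. It does not: update the value with optimizer.learning_rate.assign(...) and read it back to confirm the change reached the optimizer you are actually training with.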

Summary

  • In a TensorFlow 2 custom training loop, the learning rate belongs to the optimizer.
  • GradientTape computes gradients, and optimizer.apply_gradients uses the configured learning rate.
  • You can inspect or assign optimizer.learning_rate directly.
  • Learning-rate schedules work in custom loops just as they do with high-level training APIs.
  • Tune the value based on training stability and convergence behavior, not by fixed habit.
