Getting reproducible results using tensorflow-gpu

Introduction

Reproducibility with TensorFlow on GPUs is possible, but it is not automatic. Random seeds matter, deterministic kernels matter, input pipelines matter, and even then you should think in terms of "more deterministic under controlled conditions" rather than "identical under every environment change."

Start with Modern TensorFlow Controls

The historical tensorflow-gpu package name reflected an older packaging model. In current TensorFlow, GPU support is part of the main TensorFlow package, and reproducibility controls are exposed through the normal API.

The most important baseline steps are:

set Python, NumPy, and TensorFlow seeds
enable TensorFlow op determinism
keep library and hardware versions fixed

python

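import os
import random
import numpy as np
import tensorflow as tf

seed = 42
# PYTHONHASHSEED is read at interpreter startup, so setting it here only
# affects child processes; export it in the shell to cover this process.
os.environ["PYTHONHASHSEED"] = str(seed)
# set_random_seed already seeds Python's random module, NumPy, and TensorFlow;
# the two explicit calls below are redundant but make the intent visible.
random.seed(seed)
np.random.seed(seed)
tf.keras.utils.set_random_seed(seed)
# Ask TensorFlow to run deterministic op implementations.
tf.config.experimental.enable_op_determinism()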

That does not solve every source of variation, but it closes the biggest gaps.

Determinism Is More Than Random Initialization

Many developers think reproducibility is only about seeding the random number generator. That is necessary, but not sufficient. GPU execution can still vary because some operations have nondeterministic implementations or because parallel execution order changes the floating-point accumulation path.
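The accumulation-order effect can be seen without a GPU. The following sketch uses plain Python floats, not TensorFlow, to show that floating-point addition is not associative. A parallel reduction that groups the same terms differently can therefore return a slightly different sum.

```python
# Floating-point addition is not associative: the grouping of terms
# changes the intermediate rounding, and therefore the final result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)      # False
print(abs(left_to_right - right_to_left))  # tiny, but not zero
```

On a GPU the grouping is decided by how threads happen to be scheduled, so the same effect can appear between two otherwise identical runs.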

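Enabling op determinism tells TensorFlow to run deterministic implementations of its ops. An op that has no deterministic implementation raises an error (tf.errors.UnimplementedError) instead of silently running nondeterministically. The tradeoff is that this can reduce performance.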

Keep the Input Pipeline Stable

A reproducible model also needs a reproducible data pipeline. Shuffling, parallel mapping, and prefetch behavior can all introduce variation if not configured deliberately.

python

import tensorflow as tf

seed = 42

dataset = tf.data.Dataset.range(10)
dataset = dataset.shuffle(10, seed=seed, reshuffle_each_iteration=False)
dataset = dataset.batch(2)

Using reshuffle_each_iteration=False makes the shuffled order repeat across epochs for the same run configuration.
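One way to check the pipeline on its own is to iterate it twice and compare the element order. The sketch below also adds a parallel map step with deterministic=True, since parallel mapping is one of the places where order can drift. The double function and the dataset size are illustrative assumptions, not part of any particular pipeline.

```python
import tensorflow as tf

seed = 42

def double(x):
    # Hypothetical per-element transformation, used only for illustration.
    return x * 2

dataset = tf.data.Dataset.range(10)
dataset = dataset.shuffle(10, seed=seed, reshuffle_each_iteration=False)
# deterministic=True keeps the output order stable even with parallel calls.
dataset = dataset.map(double, num_parallel_calls=tf.data.AUTOTUNE, deterministic=True)
dataset = dataset.batch(2)

# Two full passes over the dataset, collected as plain Python lists.
first_pass = [batch.numpy().tolist() for batch in dataset]
second_pass = [batch.numpy().tolist() for batch in dataset]

print(first_pass == second_pass)  # both passes should yield the same order
```

If the comparison ever fails, the pipeline is a source of variation before the model is even involved.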

Control the Environment

Even with the same code, reproducibility can break if these change:

TensorFlow version
CUDA and cuDNN versions
GPU model
driver version
operating system

If exact repeatability matters, pin the environment in a container or reproducible build setup. Code-level determinism cannot compensate for an uncontrolled platform.
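Pinning is easier to verify when each run records what it actually ran on. The following is a minimal sketch that uses only the standard library. The function names and the default package list are assumptions made for illustration. It does not capture the CUDA, cuDNN, driver, or GPU model details listed above; on GPU builds those would typically come from tf.sysconfig.get_build_info() and the nvidia-smi tool.

```python
import importlib.metadata
import json
import platform
import sys

def package_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        return None

def environment_fingerprint(packages=("tensorflow", "numpy")):
    """Collect platform details that often explain a changed result."""
    return {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "machine": platform.machine(),
        "packages": {name: package_version(name) for name in packages},
    }

fingerprint = environment_fingerprint()
print(json.dumps(fingerprint, indent=2))
```

Storing this output next to the results makes it possible to tell a real regression from a changed platform.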

Measure the Right Expectation

There are different levels of reproducibility:

same run, same machine, same software stack
same code, different runs on the same machine
same code, different machines and GPU models

The first two are realistic goals. The last one is much harder because hardware and low-level libraries may produce small floating-point differences even when the logic is equivalent.
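These expectations translate into different comparison rules: bitwise equality for runs on one machine, and a tolerance for runs across hardware. The sketch below uses NumPy only. The compare_weights helper, the sample arrays, and the tolerance value are assumptions chosen for illustration, not recommended thresholds.

```python
import numpy as np

def compare_weights(reference, candidate, rtol=0.0, atol=0.0):
    """Compare two lists of arrays, exactly by default or within a tolerance."""
    if len(reference) != len(candidate):
        return False
    return all(
        a.shape == b.shape and np.allclose(a, b, rtol=rtol, atol=atol)
        for a, b in zip(reference, candidate)
    )

# Made-up weights that differ only in the last bits of one value.
run_a = [np.array([[1.0, 2.0]], dtype=np.float32), np.array([0.5], dtype=np.float32)]
run_b = [np.array([[1.0, 2.0000002]], dtype=np.float32), np.array([0.5], dtype=np.float32)]

print(compare_weights(run_a, run_a))             # True: exact match
print(compare_weights(run_a, run_b))             # False: exact comparison fails
print(compare_weights(run_a, run_b, rtol=1e-6))  # True: equal within tolerance
```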

A Small Deterministic Training Example

python

import tensorflow as tf
import numpy as np

seed = 42
tf.keras.utils.set_random_seed(seed)
tf.config.experimental.enable_op_determinism()

# Four points on the line y = 2x.
x = np.array([[0.0], [1.0], [2.0], [3.0]], dtype=np.float32)
y = np.array([[0.0], [2.0], [4.0], [6.0]], dtype=np.float32)

# An explicit Input layer replaces the older input_shape argument on Dense.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])

model.compile(optimizer="sgd", loss="mse")
# shuffle=False keeps the batch order fixed across epochs.
model.fit(x, y, epochs=5, batch_size=2, verbose=0, shuffle=False)

print(model.get_weights())

This kind of tiny setup is useful for confirming that your determinism controls are working before you move back to a larger pipeline.
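To turn that confirmation into an actual check, the same training can be run twice in one process and the final weights compared. This is a sketch under the assumption that reseeding with tf.keras.utils.set_random_seed before each build is enough to reset initialization. The train_once helper is a hypothetical name introduced here.

```python
import numpy as np
import tensorflow as tf

tf.config.experimental.enable_op_determinism()

def train_once(seed=42):
    """Train the tiny model from scratch and return its final weights."""
    tf.keras.utils.set_random_seed(seed)  # reseed before building the model
    x = np.array([[0.0], [1.0], [2.0], [3.0]], dtype=np.float32)
    y = np.array([[0.0], [2.0], [4.0], [6.0]], dtype=np.float32)
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="sgd", loss="mse")
    model.fit(x, y, epochs=5, batch_size=2, verbose=0, shuffle=False)
    return model.get_weights()

first = train_once()
second = train_once()

# Bitwise comparison of every weight array from the two runs.
identical = all(np.array_equal(a, b) for a, b in zip(first, second))
print("bitwise identical:", identical)
```

If the two runs disagree here, the cause is in the seeding or determinism settings rather than in the larger pipeline.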

Common Pitfalls

Setting only one seed and assuming everything else becomes deterministic is not enough.
Forgetting to enable deterministic ops leaves some GPU kernels free to vary.
Using shuffled datasets without a fixed seed breaks repeatability across runs, and leaving reshuffling enabled changes the order from epoch to epoch.
Comparing results across changed driver or CUDA stacks can make deterministic code look inconsistent.
Expecting determinism without any performance tradeoff is unrealistic.

Summary

Reproducibility on TensorFlow with GPUs requires both seeding and deterministic execution settings.
Use tf.keras.utils.set_random_seed(...) and tf.config.experimental.enable_op_determinism() as the baseline.
Keep the input pipeline deterministic, especially around shuffle behavior.
Pin the software and hardware environment when exact repeatability matters.
Think of reproducibility as a controlled-system property, not just a one-line seed setting.