Keras with TensorFlow backend not using GPU

Introduction

When Keras with a TensorFlow backend runs on CPU instead of GPU, the cause is almost always a missing or misconfigured component in the CUDA/cuDNN stack. TensorFlow requires a specific combination of NVIDIA drivers, CUDA toolkit, and cuDNN library versions to detect and use the GPU. The most common fixes are installing tensorflow-gpu (for TensorFlow 1.x), installing the correct CUDA/cuDNN versions, and verifying the GPU is visible with tf.config.list_physical_devices('GPU').

Checking GPU Availability

python
import tensorflow as tf

# Check if TensorFlow sees any GPUs
gpus = tf.config.list_physical_devices('GPU')
print(f"GPUs available: {len(gpus)}")
for gpu in gpus:
    print(f"  {gpu}")

# Check if TensorFlow was built with CUDA support
print(f"Built with CUDA: {tf.test.is_built_with_cuda()}")

# Detailed device listing
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

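If this prints no GPU devices, TensorFlow is not detecting your GPU. If "Built with CUDA" is also False, the installed package is a CPU-only build (see Fix 1); otherwise the problem is in the driver/CUDA/cuDNN stack.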
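The same checks can be bundled into one helper that also behaves sensibly when TensorFlow is not installed at all. This is a minimal sketch: the function name gpu_report and its dictionary keys are illustrative and not part of TensorFlow.

```python
import importlib.util


def gpu_report():
    """Collect basic facts about TensorFlow's view of the GPU (hypothetical helper)."""
    report = {
        "tensorflow_installed": False,
        "version": None,
        "built_with_cuda": None,
        "gpus": [],
    }
    # Avoid an ImportError if TensorFlow is missing from this environment.
    if importlib.util.find_spec("tensorflow") is None:
        return report

    import tensorflow as tf

    report["tensorflow_installed"] = True
    report["version"] = tf.__version__
    report["built_with_cuda"] = tf.test.is_built_with_cuda()
    report["gpus"] = [gpu.name for gpu in tf.config.list_physical_devices("GPU")]
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

A report with built_with_cuda set to False points at the installed package, while True with an empty gpus list points at the driver, CUDA, or cuDNN.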

Fix 1: Install the Correct TensorFlow Package

bash
# TensorFlow 2.1 and later: single package handles both CPU and GPU
pip install tensorflow

# TensorFlow 1.x: separate GPU package required
pip install tensorflow-gpu==1.15.0

# Check installed version
python -c "import tensorflow as tf; print(tf.__version__)"

Starting with TensorFlow 2.1, the tensorflow pip package includes GPU support by default. For TensorFlow 1.x, you must install tensorflow-gpu separately. If you have the CPU-only tensorflow package installed alongside tensorflow-gpu, remove the CPU-only version to avoid conflicts.
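To spot the conflicting-package situation described above, one option is to ask Python's packaging metadata which TensorFlow distributions are installed. The sketch below uses only the standard library; the helper names are hypothetical, and including tensorflow-cpu in the candidate list is an assumption that goes beyond the two packages named above.

```python
from importlib import metadata

# Distribution names that can each provide the `tensorflow` module.
# `tensorflow-cpu` is included as an assumption; adjust the list as needed.
CANDIDATES = ("tensorflow", "tensorflow-gpu", "tensorflow-cpu")


def installed_tf_packages(candidates=CANDIDATES):
    """Return {distribution name: version} for every candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # this distribution is not installed
    return found


def has_conflict(found):
    """More than one distribution providing `tensorflow` is a likely conflict."""
    return len(found) > 1


if __name__ == "__main__":
    packages = installed_tf_packages()
    print(packages)
    if has_conflict(packages):
        print("Multiple TensorFlow distributions found; uninstall all but one.")
```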

Fix 2: Match CUDA and cuDNN Versions

bash
# Check NVIDIA driver version (also shows the highest CUDA version the driver supports)
nvidia-smi

# Check the installed CUDA toolkit version
nvcc --version

# Check cuDNN version
grep -A 2 CUDNN_MAJOR /usr/local/cuda/include/cudnn_version.h

TensorFlow requires specific CUDA and cuDNN versions. Mismatched versions are the most common cause of GPU detection failure.

TensorFlow    CUDA    cuDNN
2.15          12.2    8.9
2.14          11.8    8.7
2.13          11.8    8.6
2.12          11.8    8.6
2.10-2.11     11.2    8.1
1.15          10.0    7.4

Check the official TensorFlow build configurations page for the exact version matrix.
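The table can also be encoded as a small lookup so a script can compare the expected versions with what the installed TensorFlow build reports. This is a sketch under stated assumptions: the dictionary only contains the rows listed above, the helper names are hypothetical, and tf.sysconfig.get_build_info() is only present in more recent TensorFlow 2.x releases, so missing keys are treated as unknown.

```python
# Versions copied from the table above; other releases are not covered.
TESTED_BUILDS = {
    (2, 15): ("12.2", "8.9"),
    (2, 14): ("11.8", "8.7"),
    (2, 13): ("11.8", "8.6"),
    (2, 12): ("11.8", "8.6"),
    (2, 11): ("11.2", "8.1"),
    (2, 10): ("11.2", "8.1"),
    (1, 15): ("10.0", "7.4"),
}


def expected_cuda_cudnn(tf_version):
    """Map a version string such as '2.14.0' to (CUDA, cuDNN), or None if unlisted."""
    major, minor = (int(part) for part in tf_version.split(".")[:2])
    return TESTED_BUILDS.get((major, minor))


def built_against():
    """Return (CUDA, cuDNN) the installed TensorFlow was built with, or None."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    info = tf.sysconfig.get_build_info()
    # CPU-only builds do not carry these keys, hence .get().
    return info.get("cuda_version"), info.get("cudnn_version")


print(expected_cuda_cudnn("2.14.0"))
```

The build info may report cuDNN with less precision than the table (for example only the major version), so a loose prefix comparison is likely more useful than strict equality.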

Fix 3: Set CUDA Environment Variables

bash
# Add to ~/.bashrc or ~/.zshrc
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# For multiple CUDA installations, point to the correct one
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

# Reload
source ~/.bashrc

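TensorFlow loads the CUDA and cuDNN shared libraries through the dynamic linker, which searches LD_LIBRARY_PATH. If the paths are wrong, TensorFlow logs a warning (for example, "Could not load dynamic library ...") and falls back to CPU without raising an error.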
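A quick way to see whether LD_LIBRARY_PATH actually points at the libraries is to scan each directory on it. The following sketch uses only the standard library; the function name and the two file patterns are illustrative choices, not an exhaustive list of what TensorFlow loads.

```python
import glob
import os

# Library name patterns to look for on Linux (illustrative, not exhaustive).
PATTERNS = ("libcudart.so*", "libcudnn.so*")


def find_cuda_libs(search_path=None, patterns=PATTERNS):
    """Return {pattern: [matching files]} across the directories on search_path."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    found = {pattern: [] for pattern in patterns}
    # Skip empty entries produced by leading, trailing, or doubled separators.
    for directory in filter(None, search_path.split(os.pathsep)):
        for pattern in patterns:
            matches = glob.glob(os.path.join(directory, pattern))
            found[pattern].extend(sorted(matches))
    return found


if __name__ == "__main__":
    for pattern, matches in find_cuda_libs().items():
        print(pattern, "->", matches or "not found on LD_LIBRARY_PATH")
```

An empty result is a hint rather than proof of a problem: the dynamic linker also consults the system library cache, so libraries installed system-wide can load without appearing here.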

Fix 4: Use conda for Automatic CUDA Management

bash
# conda handles CUDA/cuDNN installation automatically
conda create -n tf-gpu python=3.10
conda activate tf-gpu
conda install tensorflow-gpu -c conda-forge

# Verify
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

conda installs the correct CUDA and cuDNN libraries into the environment, avoiding system-wide version conflicts. This is the easiest path for most users.

Fix 5: GPU Memory Configuration

python
import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')

# Option A: allow memory growth (prevents TensorFlow from allocating all GPU memory).
# Must run before the GPUs are initialized.
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

# Option B: limit GPU memory to a fixed amount.
# Use this instead of Option A; the two cannot be combined on the same GPU.
# if gpus:
#     tf.config.set_logical_device_configuration(
#         gpus[0],
#         [tf.config.LogicalDeviceConfiguration(memory_limit=4096)]  # 4096 MB = 4 GB
#     )

By default, TensorFlow maps nearly all of the memory on every visible GPU. If another process already holds that memory, TensorFlow may fail to initialize the GPU. Enabling memory growth allocates memory incrementally instead, and it must be set before the GPU is first used.
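Because the setting only takes effect before the GPU is initialized, a defensive wrapper can help in notebooks or larger programs where the call order is hard to guarantee. This is a minimal sketch: the helper name enable_memory_growth is hypothetical, and the TF_FORCE_GPU_ALLOW_GROWTH environment variable is shown as an alternative for cases where the code cannot be changed.

```python
import os

# Alternative without code changes inside the model script:
# set this before TensorFlow is imported anywhere in the process.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

try:
    import tensorflow as tf
except ImportError:  # keeps the sketch importable without TensorFlow
    tf = None


def enable_memory_growth():
    """Turn on memory growth for every visible GPU; return how many were set."""
    if tf is None:
        return 0
    configured = 0
    for gpu in tf.config.list_physical_devices("GPU"):
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Raised when the GPU has already been initialized.
            print(f"Could not change memory growth for {gpu.name}: {err}")
    return configured


print(f"Memory growth enabled on {enable_memory_growth()} GPU(s)")
```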

Fix 6: Verify GPU Usage During Training

python
import tensorflow as tf

# Enable device placement logging first, before any operations are created
tf.debugging.set_log_device_placement(True)

# Force placement on GPU
with tf.device('/GPU:0'):
    a = tf.random.normal([1000, 1000])
    b = tf.random.normal([1000, 1000])
    c = tf.matmul(a, b)
    print(c.device)  # Should show /job:localhost/replica:0/task:0/device:GPU:0

# Build a small model to train
model = tf.keras.Sequential([
    tf.keras.layers.Dense(512, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Train: the log output shows which device each operation runs on
# model.fit(x_train, y_train, epochs=5)

tf.debugging.set_log_device_placement(True) prints which device (CPU or GPU) each operation runs on. This confirms whether your model is actually using the GPU during training.
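Placement logs show where operations are assigned; actual load can be cross-checked from outside the process by polling nvidia-smi while training runs. The sketch below is one way to do that, assuming nvidia-smi is on the PATH; the helper names and dictionary keys are illustrative.

```python
import shutil
import subprocess

# Fields requested from nvidia-smi, in this order.
QUERY = "utilization.gpu,memory.used,memory.total"


def _to_int(field):
    """Convert a CSV field to int; return None for values such as '[N/A]'."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_stats(csv_text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one dict per GPU."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, used, total = (_to_int(field) for field in line.split(","))
        stats.append({
            "util_percent": util,
            "memory_used_mib": used,
            "memory_total_mib": total,
        })
    return stats


def query_gpu_stats():
    """Run nvidia-smi once; return [] if the tool is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_stats(result.stdout)
```

Calling query_gpu_stats() from a second terminal, or from a Keras callback, while model.fit runs should show non-zero utilization and growing memory use if the GPU is really doing the work.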

Docker with GPU Support

bash
# Requires nvidia-container-toolkit on the host
# (the package comes from NVIDIA's apt repository)
sudo apt-get install nvidia-container-toolkit
sudo systemctl restart docker

# Run the official TensorFlow GPU image (includes CUDA/cuDNN)
docker run --gpus all -it tensorflow/tensorflow:latest-gpu python -c \
    "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Docker containers need --gpus all and the nvidia-container-toolkit to pass GPU access into the container. The official TensorFlow GPU image includes pre-configured CUDA and cuDNN.

Windows-Specific Issues

python
# Windows: Check if TensorFlow finds the CUDA DLLs
import os
print(os.environ.get('PATH', ''))

# Common DLL locations on Windows
# C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin
# C:\tools\cuda\bin
# C:\Program Files\NVIDIA\CUDNN\v8.6\bin

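On Windows, CUDA DLLs must be on the PATH. The CUDA installer usually adds them, but cuDNN files (cudnn64_8.dll) must be manually copied to the CUDA bin directory or added to PATH. Note that TensorFlow 2.10 was the last release with GPU support on native Windows; for later versions, run TensorFlow inside WSL2.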

Common Pitfalls

  • Installing tensorflow instead of tensorflow-gpu for TF 1.x: TensorFlow 1.x has separate CPU and GPU packages. The CPU-only tensorflow package cannot use the GPU regardless of your CUDA setup. Install tensorflow-gpu for TF 1.x.
  • CUDA/cuDNN version mismatch: TensorFlow 2.14 is built against CUDA 11.8, not CUDA 12.x. Installing the newest CUDA release does not help; match the version listed in the compatibility matrix.
  • Missing cuDNN: CUDA alone is not enough. TensorFlow requires cuDNN (the deep neural network library) in addition to the CUDA toolkit. Install cuDNN from the NVIDIA developer site and place the files in the CUDA directory.
  • Another process holding GPU memory: If PyTorch, another TensorFlow session, or a game is using all GPU memory, TensorFlow may fail to initialize the GPU or abort with an out-of-memory error. Use nvidia-smi to check GPU memory usage and stop competing processes.
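  • Virtual environment not seeing system CUDA: A virtual environment inherits LD_LIBRARY_PATH from the shell that activates it, but a process started elsewhere (an IDE, a Jupyter kernel, a background service) may not have the variable set. Set it in the environment that actually launches Python, or use conda, which bundles its own CUDA libraries.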

Summary

  • Check GPU visibility with tf.config.list_physical_devices('GPU') first
  • For TensorFlow 1.x, install tensorflow-gpu (TF 2.x includes GPU support in the main package)
  • Match CUDA and cuDNN versions exactly to your TensorFlow version
  • Set LD_LIBRARY_PATH (Linux) or PATH (Windows) to include CUDA directories
  • Use conda for automatic CUDA/cuDNN management without version conflicts
  • Enable tf.debugging.set_log_device_placement(True) to verify operations run on GPU
  • Use tf.config.experimental.set_memory_growth(gpu, True) to prevent memory allocation failures
