Change default GPU in TensorFlow
Introduction
In TensorFlow, changing the "default GPU" usually means one of two things: hide every GPU except the one you want, or keep all GPUs visible and place specific operations on a chosen device. The first option is the cleaner answer when you want the whole process to behave as if only one GPU exists.
Hide every GPU except the one you want
The most reliable approach is to restrict visible devices before TensorFlow initializes the runtime.
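A minimal sketch of that restriction, assuming a machine with at least two GPUs. The helper names pick_gpu and restrict_to_gpu and the index passed in are illustrative and not part of TensorFlow; the TensorFlow calls used are tf.config.list_physical_devices, tf.config.set_visible_devices, and tf.config.list_logical_devices.

```python
def pick_gpu(gpus, index):
    """Return the GPU at `index`, or None if that index is out of range."""
    return gpus[index] if 0 <= index < len(gpus) else None


def restrict_to_gpu(index):
    """Expose one physical GPU to TensorFlow and return the logical GPUs it now sees."""
    # Imported inside the function so nothing touches the GPU runtime earlier.
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    chosen = pick_gpu(gpus, index)
    if chosen is not None:
        # Must run before tensors, models, or other GPU state are created.
        tf.config.set_visible_devices([chosen], "GPU")
    # Listing logical devices initializes them, so visibility is fixed from here on.
    return tf.config.list_logical_devices("GPU")
```

Calling restrict_to_gpu(1) at the very start of a script is the intended use.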
After tf.config.set_visible_devices is called with a single GPU, TensorFlow only sees the selected GPU. An important detail is that the visible device is usually renumbered logically. If you expose only physical GPU 1, TensorFlow typically treats it as logical GPU:0 inside the process.
That is why this method is better thought of as "change what TensorFlow can see" rather than "keep the original numbering and pick a default."
Use an environment variable for process-level control
If you want to select the device before Python even imports TensorFlow, use the environment.
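One way to do this from Python is to set CUDA_VISIBLE_DEVICES through os.environ before the first TensorFlow import. The sketch below assumes physical GPU 1 is the one you want; select_gpu and visible_gpus_after_selecting are hypothetical helper names.

```python
import os


def select_gpu(physical_index):
    """Expose only one physical GPU to CUDA-based libraries in this process."""
    # Has to run before TensorFlow is imported anywhere in the process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)


def visible_gpus_after_selecting(physical_index):
    """Set the variable, then import TensorFlow and list the GPUs it can see."""
    select_gpu(physical_index)
    import tensorflow as tf  # deliberately imported after the variable is set

    return tf.config.list_physical_devices("GPU")
```

The same variable can also be set in the shell that launches the script, for example by prefixing the command with CUDA_VISIBLE_DEVICES=1.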
This is especially useful in shared servers, notebooks launched by wrappers, or shell scripts that coordinate multiple training jobs.
Use tf.device for targeted placement
If you want all GPUs to remain visible but need one block of code on a particular device, use a device scope.
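A device scope might look like the sketch below. It assumes the requested GPU index exists on the machine; gpu_device_name and matmul_on_gpu are illustrative names, and the tensor shapes are arbitrary.

```python
def gpu_device_name(index):
    """Build a device string such as "/GPU:1" for tf.device."""
    return f"/GPU:{index}"


def matmul_on_gpu(index):
    """Run one small matrix multiplication inside a device scope."""
    import tensorflow as tf

    with tf.device(gpu_device_name(index)):
        # Operations created inside the scope are requested on that device.
        a = tf.random.uniform((2, 3))
        b = tf.random.uniform((3, 2))
        product = tf.matmul(a, b)
    # Operations created outside the scope follow the normal placement rules.
    return product
```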
This does not change the default behavior of the entire runtime. It only places the operations created inside that context on the requested device when placement is possible.
Timing matters
The device visibility configuration must happen before TensorFlow creates logical devices or allocates GPU memory. In practice, that means you should call set_visible_devices immediately after importing TensorFlow and before building models, creating tensors, or calling other GPU-related APIs.
If you wait too long, TensorFlow may raise a runtime error saying the visible devices cannot be modified after initialization.
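If a script cannot guarantee that it runs first, for example inside a long-lived notebook, it can catch that error and report it. This is a sketch only: restrict_or_report is a hypothetical helper, and it takes the setter as an argument (in practice tf.config.set_visible_devices) so the logic can be exercised without a GPU.

```python
def restrict_or_report(set_visible, gpu):
    """Try to restrict visibility; return an error message instead of crashing."""
    try:
        set_visible([gpu], "GPU")
    except RuntimeError as err:
        # TensorFlow raises RuntimeError once the runtime is already initialized.
        return f"Too late to change visible devices: {err}"
    return None
```

A returned message usually means the process has to be restarted before visibility can be changed.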
It is also worth verifying placement instead of assuming it worked. Listing logical devices after configuration, or inspecting the .device field on a tensor created inside a device scope, gives a quick sanity check before you launch a long training job. That small verification step can save a lot of wasted runtime on a crowded machine.
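Such a check could be sketched as follows. The helper names are illustrative, and the sketch assumes that a tensor's .device string ends in a suffix like device:GPU:0, which is how eager tensors typically report their placement.

```python
def is_on_gpu(device_string, index):
    """Check whether a tensor's .device string ends in the given logical GPU."""
    return device_string.endswith(f":GPU:{index}")


def check_placement(index=0):
    """Return the logical GPU names and whether a probe tensor landed on GPU `index`."""
    import tensorflow as tf

    names = [d.name for d in tf.config.list_logical_devices("GPU")]
    with tf.device(f"/GPU:{index}"):
        probe = tf.constant([1.0, 2.0])
    return names, is_on_gpu(probe.device, index)
```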
In shared training environments, this also improves job isolation. One process can expose only one GPU to TensorFlow, while a second process exposes a different GPU, and neither job needs to know about the other device at all. That is often simpler than managing placement rules inside a single script that can see every accelerator on the host.
Common Pitfalls
- Calling set_visible_devices after TensorFlow has already initialized the GPU runtime.
- Assuming that physical GPU 1 will still be called GPU:1 after visibility is restricted.
- Using tf.device when you actually wanted to hide the other GPUs from the whole process.
- Forgetting that notebooks may have imported TensorFlow earlier in the session.
- Skipping memory-growth settings in shared environments and then fighting unnecessary allocation pressure.
Summary
- To change the effective default GPU, hide all other GPUs with set_visible_devices or CUDA_VISIBLE_DEVICES.
- Apply that configuration before TensorFlow initializes any GPU state.
- After filtering visibility, the chosen physical GPU is often exposed as logical GPU:0.
- Use tf.device only for scoped placement of specific operations.
- Pick one strategy deliberately: process-wide visibility control or local device placement.

