How Does TensorFlow Indexing Work?

Introduction

TensorFlow indexing looks familiar if you already know NumPy, but it is not identical. Simple slices work much the same way, while more advanced selection and update patterns use TensorFlow-specific operations such as tf.gather, tf.gather_nd, tf.boolean_mask, and scatter functions.

Basic Slicing Works Like NumPy

For fixed positions and slice ranges, normal bracket syntax is usually enough.

python
import tensorflow as tf

x = tf.constant([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
])

print(x[0])       # first row: [10 20 30]
print(x[:, 1])    # second column: [20 50 80]
print(x[1:, :2])  # rows 1 onward, first two columns: [[40 50] [70 80]]

This is the right tool for straightforward row and column selection. The result is always a new tensor because TensorFlow tensors are immutable.
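As a small illustrative sketch, the same bracket syntax also accepts negative indices and strides, much as NumPy does. The variable names below are only for illustration.

```python
import tensorflow as tf

x = tf.constant([
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
])

# Negative indices count from the end.
last_row = x[-1]

# A step of 2 keeps every other row and every other column.
every_other = x[::2, ::2]

# A one-element slice keeps the dimension; a plain index drops it.
col_kept = x[:, 1:2]   # shape (3, 1)
col_dropped = x[:, 1]  # shape (3,)

print(last_row.numpy())     # [70 80 90]
print(every_other.numpy())  # [[10 30] [70 90]]
print(col_kept.shape, col_dropped.shape)
```

The difference between `x[:, 1:2]` and `x[:, 1]` is worth noticing early, since a dropped dimension is a frequent cause of later shape errors.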

Use tf.gather for Index Lists

When the positions you want are themselves stored in another tensor, tf.gather is usually clearer than trying to force everything into slice syntax.

python
import tensorflow as tf

values = tf.constant([5, 10, 15, 20, 25])
indices = tf.constant([0, 3, 4])

# Pick the elements at positions 0, 3, and 4.
selected = tf.gather(values, indices)
print(selected)  # values: [5 20 25]

For matrices and higher-rank tensors, supply axis to say which dimension should be indexed.

python
import tensorflow as tf

matrix = tf.constant([
    [1, 2, 3],
    [4, 5, 6],
])

# axis=1 selects columns: column 2 first, then column 0.
print(tf.gather(matrix, [2, 0], axis=1))  # values: [[3 1] [6 4]]

This is especially useful when the index values are computed dynamically during model execution.
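To make the dynamic case concrete, here is a minimal sketch in which the indices come from a computation rather than a literal. The `features` and `scores` tensors are hypothetical data invented for this example.

```python
import tensorflow as tf

# Hypothetical data: one score per row of `features`.
features = tf.constant([
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
    [7.0, 8.0],
])
scores = tf.constant([0.1, 0.9, 0.4, 0.7])

# The indices are computed at run time: positions of the two highest scores.
top_indices = tf.argsort(scores, direction="DESCENDING")[:2]

# Gather the matching rows (axis=0 is the default).
top_rows = tf.gather(features, top_indices)

print(top_indices.numpy())  # [1 3]
print(top_rows.numpy())     # [[3. 4.] [7. 8.]]
```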

Use tf.gather_nd for Coordinate Lookups

If you need arbitrary coordinates instead of one-dimensional positions, use tf.gather_nd.

python
import tensorflow as tf

matrix = tf.constant([
    [11, 12, 13],
    [21, 22, 23],
    [31, 32, 33],
])

# Each row is one [row, column] coordinate.
coords = tf.constant([
    [0, 2],
    [2, 1],
    [1, 0],
])

print(tf.gather_nd(matrix, coords))  # values: [13 32 21]

Each row in coords identifies one position in the source tensor. This is the TensorFlow counterpart of NumPy fancy indexing with paired index arrays, such as a[[0, 2, 1], [2, 1, 0]].
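The sketch below compares the two forms side by side and also shows that coordinates shorter than the tensor's rank select whole slices. It assumes the same 3x3 matrix as above; the other names are illustrative.

```python
import tensorflow as tf

matrix = tf.constant([
    [11, 12, 13],
    [21, 22, 23],
    [31, 32, 33],
])

# Full coordinates: each [row, col] pair selects one scalar.
coords = tf.constant([[0, 2], [2, 1], [1, 0]])
picked = tf.gather_nd(matrix, coords)

# NumPy form of the same lookup: separate row and column index lists.
np_picked = matrix.numpy()[[0, 2, 1], [2, 1, 0]]

# Partial coordinates: each [row] entry selects an entire row.
row_coords = tf.constant([[2], [0]])
rows = tf.gather_nd(matrix, row_coords)

print(picked.numpy())  # [13 32 21]
print(np_picked)       # [13 32 21]
print(rows.numpy())    # [[31 32 33] [11 12 13]]
```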

Boolean Selection Uses tf.boolean_mask

TensorFlow does not support every NumPy advanced-indexing pattern directly with bracket syntax. For mask-based filtering, use tf.boolean_mask.

python
import tensorflow as tf

x = tf.constant([10, 20, 30, 40, 50])
mask = tf.constant([True, False, True, False, True])

# Keep only the elements where the mask is True.
print(tf.boolean_mask(x, mask))  # values: [10 30 50]

The same idea works for filtering rows in higher-dimensional tensors.

python
import tensorflow as tf

matrix = tf.constant([
    [1, 2],
    [3, 4],
    [5, 6],
])
row_mask = tf.constant([True, False, True])

# A 1-D mask over a 2-D tensor keeps or drops whole rows.
print(tf.boolean_mask(matrix, row_mask))  # values: [[1 2] [5 6]]

This is often easier to reason about than translating boolean logic into explicit index arrays.
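In practice the mask is usually computed from a condition. The sketch below builds one with a comparison and, for contrast, shows the explicit-index route through tf.where and tf.gather_nd. The threshold of 25 is arbitrary.

```python
import tensorflow as tf

x = tf.constant([10, 20, 30, 40, 50])

# Build the mask from a condition instead of writing it by hand.
mask = x > 25

filtered = tf.boolean_mask(x, mask)

# With a single argument, tf.where returns the coordinates of the True
# entries, shaped [num_true, rank]; those coordinates can feed tf.gather_nd.
true_coords = tf.where(mask)
same_values = tf.gather_nd(x, true_coords)

print(filtered.numpy())     # [30 40 50]
print(true_coords.numpy())  # [[2] [3] [4]]
print(same_values.numpy())  # [30 40 50]
```

Both routes give the same values here; the mask version simply skips the intermediate index tensor.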

Updates Require Scatter Operations or Variables

A common source of confusion is trying to assign into a tensor like you would in NumPy. Plain tensors cannot be modified in place, so update-style logic must build a new tensor.

python
import tensorflow as tf

base = tf.constant([0, 0, 0, 0, 0])
indices = tf.constant([[1], [3]])  # one coordinate per update
updates = tf.constant([9, 7])

# Returns a new tensor; base itself is unchanged.
result = tf.tensor_scatter_nd_update(base, indices, updates)
print(result)  # values: [0 9 0 7 0]

If you are maintaining mutable model state, use a tf.Variable instead of repeatedly pretending an immutable tensor is mutable.
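For comparison, a minimal sketch of the tf.Variable route is shown below. It uses sliced assign and the variable's scatter_nd_update method; the variable name `state` and the values are made up for illustration.

```python
import tensorflow as tf

# A variable holds mutable state; float values are used here for simplicity.
state = tf.Variable([0.0, 0.0, 0.0, 0.0, 0.0])

# Overwrite a slice in place.
state[0:2].assign([1.0, 2.0])
after_slice = state.numpy().tolist()

# Overwrite selected positions in place; each index row is one coordinate.
state.scatter_nd_update(indices=[[1], [3]], updates=[9.0, 7.0])
after_scatter = state.numpy().tolist()

print(after_slice)    # [1.0, 2.0, 0.0, 0.0, 0.0]
print(after_scatter)  # [1.0, 9.0, 0.0, 7.0, 0.0]
```

Unlike tf.tensor_scatter_nd_update, these calls change the variable itself, so later reads of `state` see the new values.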

Watch Shapes Closely

Most indexing bugs in TensorFlow are really shape bugs. The code may look correct while one dimension is off by one or while a batch dimension is missing.

It helps to assert shapes near the indexing operation:

python
import tensorflow as tf

x = tf.ones((3, 4))
mask = tf.constant([True, False, True])

# x must be a matrix, and the mask needs one entry per row of x.
tf.debugging.assert_rank(x, 2)
tf.debugging.assert_equal(tf.shape(mask)[0], tf.shape(x)[0])

These checks fail early and save time when debugging model code or tf.data pipelines.
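As a rough illustration of what failing early looks like, the sketch below passes a deliberately mismatched mask. It assumes eager execution, where the check raises tf.errors.InvalidArgumentError immediately; the `bad_mask` name and the message text are invented for the example.

```python
import tensorflow as tf

x = tf.ones((3, 4))
bad_mask = tf.constant([True, False])  # length 2, but x has 3 rows

try:
    tf.debugging.assert_equal(
        tf.shape(bad_mask)[0],
        tf.shape(x)[0],
        message="mask length must match the number of rows",
    )
    check_failed = False
except tf.errors.InvalidArgumentError:
    # The assertion fires before any masking is attempted.
    check_failed = True

print(check_failed)  # True
```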

Common Pitfalls

The first pitfall is assuming NumPy and TensorFlow indexing behave identically in every advanced case. They do not. TensorFlow expects you to use explicit ops for gathers, masks, and scatter-style updates.

Another issue is forgetting that tensors are immutable. If you need stateful updates, use tf.Variable or functions that return a new tensor.

Shape mismatches are also common, especially when batching is involved. A mask or index tensor that is valid for one example may fail once batched input is introduced.
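One way this shows up is with tf.gather on batched input. The sketch below, using a hypothetical two-example batch, contrasts the default behavior with the batch_dims argument, which pairs each example with its own row of indices.

```python
import tensorflow as tf

# Hypothetical batch: 2 examples, 4 values each.
batch = tf.constant([
    [10, 11, 12, 13],
    [20, 21, 22, 23],
])

# One set of positions per example.
per_example_idx = tf.constant([
    [0, 3],
    [1, 2],
])

# Without batch_dims, every index row is applied to every example,
# so the result has shape (2, 2, 2) instead of one selection per example.
unbatched = tf.gather(batch, per_example_idx, axis=1)

# batch_dims=1 pairs example i with index row i, giving shape (2, 2).
batched = tf.gather(batch, per_example_idx, axis=1, batch_dims=1)

print(unbatched.shape)  # (2, 2, 2)
print(batched.numpy())  # [[10 13] [21 22]]
```

The first call runs without error, which is exactly why this kind of mistake can go unnoticed until a later shape check or a wrong result.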

Finally, keep indexing logic covered by tests with realistic tensor shapes. Silent feature-selection bugs can be expensive in training pipelines.

Summary

  • Use normal slice syntax for simple fixed-position indexing.
  • Use tf.gather for index lists and tf.gather_nd for coordinate-based selection.
  • Use tf.boolean_mask for boolean filtering.
  • Use scatter functions or tf.Variable when you need update-like behavior.
  • Add shape checks around dynamic indexing code to catch mistakes early.
