In TensorFlow, what is the argument 'axis' in the function 'tf.one_hot'?

Introduction

tf.one_hot creates a new dimension whose size is depth, and the axis argument controls where that new dimension is inserted in the output tensor. If that sounds abstract, the easiest way to understand it is to look at the input shape and then see where the one-hot dimension lands after encoding.

What tf.one_hot Produces

Suppose you start with category indices such as 0, 1, and 2, and a depth of 3. One-hot encoding turns each index into a vector:

text
0 -> [1, 0, 0]
1 -> [0, 1, 0]
2 -> [0, 0, 1]
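
The mapping above can be sketched in plain Python before bringing in TensorFlow. The helper name one_hot below is hypothetical and exists only for illustration; it is not how TensorFlow implements the operation.

```python
def one_hot(index, depth):
    """Return a list of length `depth` with a 1 at position `index`.

    Hypothetical helper for illustration only.
    """
    return [1 if position == index else 0 for position in range(depth)]


for index in [0, 1, 2]:
    print(index, "->", one_hot(index, depth=3))
```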

In TensorFlow:

python
import tensorflow as tf

indices = tf.constant([0, 2, 1])
encoded = tf.one_hot(indices, depth=3)

print(encoded)        # 3x3 float32 tensor, one row per index
print(encoded.shape)  # (3, 3)

By default, the result shape is (3, 3). The input had shape (3,), and TensorFlow inserted a new one-hot dimension of length 3.

The Meaning of axis

axis tells TensorFlow where to put that new dimension. For a one-dimensional input, the two most useful cases are:

  • axis=-1, the default, appends the one-hot dimension at the end
  • axis=0 inserts the one-hot dimension at the front

Example:

python
import tensorflow as tf

indices = tf.constant([0, 2, 1])

end_axis = tf.one_hot(indices, depth=3, axis=-1)
front_axis = tf.one_hot(indices, depth=3, axis=0)

print(end_axis.shape)   # (3, 3)
print(front_axis.shape) # (3, 3)

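For this one-dimensional example, both shapes are (3, 3) only because the number of indices happens to equal depth. The layouts still differ: with axis=-1 each row is one example's one-hot vector, while with axis=0 each column is, so one result is the transpose of the other. The distinction becomes much clearer with non-square or higher-rank inputs.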
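
A non-square case makes the difference visible. The sketch below assumes TensorFlow 2.x with eager execution and uses four labels with three classes; the variable names are illustrative.

```python
import tensorflow as tf

indices = tf.constant([0, 2, 1, 1])  # four labels, three classes

end_axis = tf.one_hot(indices, depth=3, axis=-1)
front_axis = tf.one_hot(indices, depth=3, axis=0)

print(end_axis.shape)    # (4, 3): one row per label
print(front_axis.shape)  # (3, 4): one column per label

# The one-hot vector for indices[1] == 2 is a row in one layout
# and a column in the other.
print(end_axis[1].numpy())
print(front_axis[:, 1].numpy())
```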

A Better Example with a Matrix Input

Take a rank-2 input:

python
import tensorflow as tf

indices = tf.constant([
    [0, 1],
    [2, 1]
])

Its shape is (2, 2). Now use different axis values:

python
# Continues from the block above: indices has shape (2, 2).
encoded_last = tf.one_hot(indices, depth=3, axis=-1)
encoded_zero = tf.one_hot(indices, depth=3, axis=0)
encoded_one = tf.one_hot(indices, depth=3, axis=1)

print(encoded_last.shape)  # (2, 2, 3)
print(encoded_zero.shape)  # (3, 2, 2)
print(encoded_one.shape)   # (2, 3, 2)

That is the core idea:

  • axis=-1 appends the new dimension
  • axis=0 prepends it
  • axis=1 inserts it between existing dimensions

The one-hot axis always has size depth.
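
The placement rule can be written as a small shape calculation. This is a plain-Python sketch, not TensorFlow code: the function name one_hot_output_shape is hypothetical, and it only handles axis=-1 and non-negative positions up to the input rank.

```python
def one_hot_output_shape(input_shape, depth, axis=-1):
    """Predict the output shape of a one-hot encoding.

    Hypothetical helper: inserts `depth` into `input_shape` at `axis`.
    This sketch handles only axis=-1 or 0 <= axis <= len(input_shape).
    """
    rank = len(input_shape)
    if axis == -1:
        axis = rank  # -1 means "append after the last existing dimension"
    if not 0 <= axis <= rank:
        raise ValueError(f"this sketch does not handle axis={axis}")
    return tuple(input_shape[:axis]) + (depth,) + tuple(input_shape[axis:])


print(one_hot_output_shape((2, 2), depth=3, axis=-1))  # (2, 2, 3)
print(one_hot_output_shape((2, 2), depth=3, axis=0))   # (3, 2, 2)
print(one_hot_output_shape((2, 2), depth=3, axis=1))   # (2, 3, 2)
```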

Why Placement Matters

The output layout matters because downstream TensorFlow operations expect tensors in particular shapes. For example:

  • a dense classifier label matrix often uses one-hot vectors on the last axis
  • some custom operations may expect the class dimension earlier
  • broadcasting and transposition become easier or harder depending on where the dimension lands

In practice, axis=-1 is the most common choice because machine learning code often treats the class dimension as the last dimension.
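
As one illustration of why the last-axis layout is convenient, the sketch below picks out the probability a model assigned to each true class by multiplying with the one-hot labels and summing over the class axis. The probs values are made-up numbers standing in for model output, and TensorFlow 2.x eager execution is assumed.

```python
import tensorflow as tf

labels = tf.constant([0, 2, 1])
one_hot_labels = tf.one_hot(labels, depth=3)  # (batch, classes) = (3, 3)

# Made-up class probabilities standing in for a model's output.
probs = tf.constant([
    [0.7, 0.2, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.5, 0.3],
])

# With classes on the last axis, an elementwise product followed by a
# sum over axis=-1 keeps only the probability of each true class.
true_class_prob = tf.reduce_sum(one_hot_labels * probs, axis=-1)
print(true_class_prob.numpy())  # approximately [0.7, 0.8, 0.5]
```

If the labels had been built with axis=0, the product would pair rows of classes with rows of examples, so the labels would need to be transposed first.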

Inspecting Values, Not Just Shapes

To see the difference concretely:

python
import tensorflow as tf

indices = tf.constant([[0, 1]])

encoded_last = tf.one_hot(indices, depth=3, axis=-1)  # shape (1, 2, 3)
encoded_zero = tf.one_hot(indices, depth=3, axis=0)   # shape (3, 1, 2)

print(encoded_last.numpy())
print(encoded_zero.numpy())

With axis=-1, each original scalar becomes a length-3 vector at the end. With axis=0, the class dimension is the outermost dimension, so the same data is arranged differently.

The encoded values are still representing the same categories. Only the tensor layout changes.
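
One way to check that only the layout differs is to move the class axis of the axis=-1 result to the front and compare it with the axis=0 result. This is a minimal sketch assuming TensorFlow 2.x eager execution; the variable moved is illustrative.

```python
import tensorflow as tf

indices = tf.constant([[0, 1]])

encoded_last = tf.one_hot(indices, depth=3, axis=-1)  # shape (1, 2, 3)
encoded_zero = tf.one_hot(indices, depth=3, axis=0)   # shape (3, 1, 2)

# Move the class axis to the front: (1, 2, 3) -> (3, 1, 2).
moved = tf.transpose(encoded_last, perm=[2, 0, 1])

print(moved.shape)                                   # (3, 1, 2)
print(tf.reduce_all(moved == encoded_zero).numpy())  # True
```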

Negative Axes

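Unlike many array libraries, tf.one_hot does not accept arbitrary negative axes. The only negative value it takes is axis=-1, which means "insert at the end." Other negative values such as axis=-2 are rejected with an error, so to insert before the last existing dimension of a rank-2 input, use the non-negative position axis=1 instead.

Valid non-negative values run from 0 up to the rank of the input, where axis equal to the rank is the same position as axis=-1. That rule is worth knowing when you are working with higher-rank tensors.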
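
To make the meaning of axis=-1 concrete, the sketch below compares it with the explicit non-negative position for a rank-2 input. It assumes TensorFlow 2.x eager execution.

```python
import tensorflow as tf

indices = tf.constant([[0, 1], [2, 1]])  # rank 2, shape (2, 2)

appended = tf.one_hot(indices, depth=3, axis=-1)
explicit = tf.one_hot(indices, depth=3, axis=2)  # 2 is the rank of indices

print(appended.shape, explicit.shape)
print(tf.reduce_all(appended == explicit).numpy())
```

Both calls place the one-hot dimension after the two existing dimensions, so the results have the same shape and the same values.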

When to Leave axis Alone

If you are using tf.one_hot for labels in a typical model pipeline, the default is usually correct:

python
import tensorflow as tf

labels = tf.constant([0, 2, 1])
one_hot_labels = tf.one_hot(labels, depth=3)  # shape (3, 3), classes on the last axis

You only need to specify axis when another part of the pipeline expects a different layout.

That is why many examples omit axis entirely: the argument matters, but its default is usually the right choice.

Common Pitfalls

The biggest mistake is thinking axis changes which class gets the 1. It does not. Class encoding is determined by the index values and depth. axis only changes where the new one-hot dimension appears in the output shape.

Another issue is looking at a square result such as (3, 3) and assuming axis=0 and axis=-1 are identical. The shapes may match numerically while the tensor layout still differs.

Developers also sometimes forget to account for the extra dimension when feeding the output into later layers. A model expecting (batch, classes) will not accept (classes, batch) without a transpose. A plain reshape is not a substitute: it changes the shape but scrambles which values belong to which example.
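
A minimal sketch of recovering the (batch, classes) layout from a class-first tensor, assuming TensorFlow 2.x eager execution. The class-first tensor here stands in for whatever upstream step produced that layout.

```python
import tensorflow as tf

labels = tf.constant([0, 2, 1, 1])  # batch of four labels, three classes

# Stand-in for an upstream step that produced a class-first layout.
classes_first = tf.one_hot(labels, depth=3, axis=0)  # (classes, batch) = (3, 4)

# Swap the two axes to get the (batch, classes) layout.
batch_first = tf.transpose(classes_first)            # (4, 3)

print(classes_first.shape, batch_first.shape)

# The transposed tensor matches the default axis=-1 encoding.
print(tf.reduce_all(batch_first == tf.one_hot(labels, depth=3)).numpy())  # True
```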

Finally, if depth is wrong, the output may still have the requested axis placement but represent the wrong class space entirely.
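
The sketch below shows one way a wrong depth can go unnoticed, assuming TensorFlow 2.x eager execution: an index outside the range 0 to depth - 1 does not raise an error but is encoded as a vector of all zeros.

```python
import tensorflow as tf

labels = tf.constant([0, 2, 3])  # label 3 needs depth >= 4

too_small = tf.one_hot(labels, depth=3)
correct = tf.one_hot(labels, depth=4)

print(too_small.numpy())  # the row for label 3 is all zeros
print(correct.numpy())    # every row contains exactly one 1
```

Summing each row, for example with tf.reduce_sum(too_small, axis=-1), is a quick check: any row that sums to 0 came from an out-of-range index.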

Summary

  • tf.one_hot adds a new dimension of length depth.
  • The axis argument decides where that new dimension is inserted.
  • axis=-1 puts the one-hot vectors on the last axis, which is the most common layout.
  • Different axis values change tensor shape and layout, not the underlying class identity.
  • Check both the output shape and the downstream model expectations before choosing a non-default axis.
