In TensorFlow, what is the argument 'axis' in the function 'tf.one_hot'
Introduction
tf.one_hot creates a new dimension whose size is depth, and the axis argument controls where that new dimension is inserted in the output tensor. If that sounds abstract, the easiest way to understand it is to look at the input shape and then see where the one-hot dimension lands after encoding.
What tf.one_hot Produces
Suppose you start with category indices such as 0, 1, and 2, and a depth of 3. One-hot encoding turns each index into a vector with a single 1 at that index's position:
0 -> [1, 0, 0]
1 -> [0, 1, 0]
2 -> [0, 0, 1]
In TensorFlow:
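A minimal sketch of that encoding, assuming TensorFlow 2.x with eager execution enabled (the variable names are illustrative):

```python
import tensorflow as tf

# Category indices for three examples (illustrative values).
indices = tf.constant([0, 1, 2])

# depth is the number of classes; axis is left at its default of -1.
encoded = tf.one_hot(indices, depth=3)

print(encoded.shape)    # (3, 3)
print(encoded.numpy())  # one row per input index
```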
By default, the result shape is (3, 3). The input had shape (3,), and TensorFlow inserted a new one-hot dimension of length 3.
The Meaning of axis
axis tells TensorFlow where to put that new dimension. For a one-dimensional input, the two most useful cases are:
- axis=-1, the default, appends the one-hot dimension at the end
- axis=0 inserts the one-hot dimension at the front
Example:
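A small sketch comparing the two settings on a one-dimensional input, again assuming TensorFlow 2.x in eager mode; the names `last` and `first` are only illustrative:

```python
import tensorflow as tf

indices = tf.constant([0, 1, 2])

# axis=-1: the one-hot dimension comes last -> (examples, classes)
last = tf.one_hot(indices, depth=3, axis=-1)

# axis=0: the one-hot dimension comes first -> (classes, examples)
first = tf.one_hot(indices, depth=3, axis=0)

print(last.shape, first.shape)
```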
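For this one-dimensional example, both outputs have shape (3, 3) because the number of inputs happens to equal depth, and with the indices 0, 1, 2 both even contain the same identity matrix. The layout is still conceptually different: axis=-1 gives (examples, classes), while axis=0 gives its transpose, (classes, examples). The distinction becomes much clearer with non-square or higher-rank inputs.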
A Better Example with a Matrix Input
Take a rank-2 input of class indices, for example [[0, 1], [2, 0]].
Its shape is (2, 2). Now use different axis values:
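The sketch below tries each axis value on an illustrative (2, 2) input with depth 3, assuming TensorFlow 2.x; the specific index values are an assumption chosen only for demonstration:

```python
import tensorflow as tf

# Illustrative rank-2 input of class indices, shape (2, 2).
indices = tf.constant([[0, 1],
                       [2, 0]])
depth = 3

shape_last = tf.one_hot(indices, depth, axis=-1).shape   # depth appended
shape_first = tf.one_hot(indices, depth, axis=0).shape   # depth prepended
shape_middle = tf.one_hot(indices, depth, axis=1).shape  # depth in between

print(shape_last)    # (2, 2, 3)
print(shape_first)   # (3, 2, 2)
print(shape_middle)  # (2, 3, 2)
```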
That is the core idea:
- axis=-1 appends the new dimension
- axis=0 prepends it
- axis=1 inserts it between existing dimensions
The one-hot axis always has size depth.
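The placement rule can be written down without TensorFlow at all. The helper below is hypothetical (`one_hot_shape` is not a TensorFlow API) and only models axis=-1 and non-negative positions; it is a sketch for predicting output shapes, not a description of the library's internals:

```python
def one_hot_shape(input_shape, depth, axis=-1):
    """Hypothetical helper: predict the shape produced by one-hot encoding."""
    rank = len(input_shape)
    # axis=-1 is treated as "after the last existing dimension".
    position = rank if axis == -1 else axis
    if not 0 <= position <= rank:
        raise ValueError(f"this sketch only models axis=-1 or 0..{rank}, got {axis}")
    shape = list(input_shape)
    shape.insert(position, depth)  # the new dimension always has size depth
    return tuple(shape)


print(one_hot_shape((2, 2), depth=3))          # (2, 2, 3)
print(one_hot_shape((2, 2), depth=3, axis=0))  # (3, 2, 2)
print(one_hot_shape((2, 2), depth=3, axis=1))  # (2, 3, 2)
```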
Why Placement Matters
The output layout matters because downstream TensorFlow operations expect tensors in particular shapes. For example:
- a dense classifier label matrix often uses one-hot vectors on the last axis
- some custom operations may expect the class dimension earlier
- broadcasting and transposition become easier or harder depending on where the dimension lands
In practice, axis=-1 is the most common choice because machine learning code often treats the class dimension as the last dimension.
Inspecting Values, Not Just Shapes
To see the difference concretely:
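A minimal sketch that prints the actual values, assuming TensorFlow 2.x. The input [0, 2] is an illustrative choice that keeps the output non-square, so the two layouts are easy to tell apart:

```python
import tensorflow as tf

indices = tf.constant([0, 2])  # two examples, illustrative values

last = tf.one_hot(indices, depth=3, axis=-1)   # shape (2, 3)
first = tf.one_hot(indices, depth=3, axis=0)   # shape (3, 2)

print(last.numpy())
# [[1. 0. 0.]
#  [0. 0. 1.]]

print(first.numpy())
# [[1. 0.]
#  [0. 0.]
#  [0. 1.]]
```

In the first result each row is one example; in the second each row is one class.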
With axis=-1, each original scalar becomes a length-3 vector at the end. With axis=0, the class dimension is the outermost dimension, so the same data is arranged differently.
The encoded values still represent the same categories; only the tensor layout changes.
Negative Axes
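tf.one_hot is stricter about negative axes than most array-indexing APIs. axis=-1 means "insert at the end" and, as far as the op's argument checks go, it is the only negative value accepted; other negative values such as axis=-2 are rejected rather than counted from the end. To insert before the last existing dimension of a rank-2 input, pass the equivalent non-negative position instead, which is axis=1.
Most code never needs anything beyond the default, but the rule is worth knowing when you work with higher-rank tensors: valid positions run from 0 up to the rank of the input, and -1 is shorthand for the last of them.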
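As a quick check of what axis=-1 stands for, the sketch below (assuming TensorFlow 2.x, with illustrative input values) compares it against the explicit position equal to the input's rank:

```python
import tensorflow as tf

indices = tf.constant([[0, 1], [2, 0]])  # rank-2 input
rank = len(indices.shape)                # 2

default_axis = tf.one_hot(indices, depth=3, axis=-1)
explicit_axis = tf.one_hot(indices, depth=3, axis=rank)  # same as axis=2 here

# Both calls should place the one-hot dimension last.
same = bool(tf.reduce_all(tf.equal(default_axis, explicit_axis)))
print(default_axis.shape, explicit_axis.shape, same)
```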
When to Leave axis Alone
If you are using tf.one_hot for labels in a typical model pipeline, the default is usually correct:
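A sketch of that typical case, assuming TensorFlow 2.x. The class count, labels, and all-zero logits are placeholders standing in for a real dataset and model:

```python
import tensorflow as tf

num_classes = 4                         # assumed number of classes
labels = tf.constant([1, 3, 0])         # integer labels, shape (batch,)
logits = tf.zeros([3, num_classes])     # stand-in for model outputs

# Default axis=-1 gives the (batch, classes) layout.
one_hot_labels = tf.one_hot(labels, depth=num_classes)

# This loss reads the class dimension from the last axis by default.
loss = tf.nn.softmax_cross_entropy_with_logits(
    labels=one_hot_labels, logits=logits
)

print(one_hot_labels.shape)  # (3, 4)
print(loss.shape)            # (3,)
```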
You only need to specify axis when another part of the pipeline expects a different layout.
That is why many examples omit axis entirely: the argument matters, but its default is usually the right choice.
Common Pitfalls
The biggest mistake is thinking axis changes which class gets the 1. It does not. Class encoding is determined by the index values and depth. axis only changes where the new one-hot dimension appears in the output shape.
Another issue is looking at a square result such as (3, 3) and assuming axis=0 and axis=-1 are identical. The shapes may match numerically while the tensor layout still differs.
Developers also sometimes forget to account for the extra dimension when feeding the output into later layers. A model expecting (batch, classes) will not accept (classes, batch) without transposing; a reshape alone would produce the right shape but scramble the values.
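The sketch below converts a (classes, batch) encoding back to (batch, classes), assuming TensorFlow 2.x and illustrative labels. It also shows why a bare reshape is not a substitute for a transpose:

```python
import tensorflow as tf

labels = tf.constant([0, 2, 1, 2])                    # batch of 4
classes_first = tf.one_hot(labels, depth=3, axis=0)   # (classes, batch) = (3, 4)

batch_first = tf.transpose(classes_first)             # (batch, classes) = (4, 3)
scrambled = tf.reshape(classes_first, [4, 3])         # right shape, wrong values

expected = tf.one_hot(labels, depth=3)                # default layout
transpose_ok = bool(tf.reduce_all(tf.equal(batch_first, expected)))
reshape_ok = bool(tf.reduce_all(tf.equal(scrambled, expected)))

print(transpose_ok)  # True
print(reshape_ok)    # False
```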
Finally, if depth is wrong, the output may still have the requested axis placement but represent the wrong class space entirely.
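One way a too-small depth shows up, sketched with illustrative labels and assuming TensorFlow 2.x: an index outside the range [0, depth) produces a row with no 1 in it at all.

```python
import tensorflow as tf

labels = tf.constant([0, 3])             # label 3 would need depth >= 4

too_small = tf.one_hot(labels, depth=3)  # depth is one short

print(too_small.numpy())
# [[1. 0. 0.]
#  [0. 0. 0.]]
```

The shape looks plausible, so the all-zero row for label 3 is easy to miss.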
Summary
- tf.one_hot adds a new dimension of length depth.
- The axis argument decides where that new dimension is inserted.
- axis=-1 puts the one-hot vectors on the last axis, which is the most common layout.
- Different axis values change tensor shape and layout, not the underlying class identity.
- Check both the output shape and the downstream model expectations before choosing a non-default axis.

