How to use to_categorical when using ImageDataGenerator

Introduction

to_categorical and ImageDataGenerator are related only because they both touch training input, but they solve different problems. to_categorical converts integer class labels into one-hot vectors, while ImageDataGenerator augments images and yields batches during training.

The right answer depends on how you create the generator. If you use flow_from_directory with class_mode="categorical", Keras already gives you one-hot labels and you do not need to call to_categorical yourself.

Understand the Label Formats

For multiclass classification, the two common label encodings are:

  • integer labels such as 0, 1, 2
  • one-hot labels such as [1, 0, 0]
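To make the two encodings concrete, here is a minimal NumPy sketch of the conversion between them. The helper name one_hot is hypothetical and exists only for illustration; it is not the Keras implementation of to_categorical.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Illustrative helper (hypothetical): integer labels -> one-hot rows."""
    labels = np.asarray(labels, dtype="int64")
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]

y_int = np.array([0, 2, 1])
y_one_hot = one_hot(y_int, num_classes=3)
print(y_one_hot)        # one row per sample, one column per class
print(y_one_hot.shape)  # (3, 3)
```
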

Those labels must match your model configuration.

Typical pairings are:

  • categorical_crossentropy with one-hot labels
  • sparse_categorical_crossentropy with integer labels

If the label encoding and loss function do not match, training typically either fails with a shape error or runs while reporting misleading loss and accuracy values.
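The two losses compute the same quantity from different label formats. The NumPy sketch below works through the per-sample math on made-up probabilities; it ignores Keras details such as clipping and batch reduction, so treat it as an illustration rather than the library's code.

```python
import numpy as np

# Hypothetical softmax outputs for a batch of two samples and three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])

y_int = np.array([0, 2])      # integer labels
y_one_hot = np.eye(3)[y_int]  # the same labels, one-hot encoded

# Categorical form: sum over classes of -label * log(probability).
loss_categorical = -np.sum(y_one_hot * np.log(probs), axis=1)

# Sparse form: pick the log-probability of the true class by index.
loss_sparse = -np.log(probs[np.arange(len(y_int)), y_int])

print(np.allclose(loss_categorical, loss_sparse))  # True
```

Both forms give the same numbers only when each loss receives the label format it expects, which is why the pairing matters.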

Case 1: flow_from_directory

If you use flow_from_directory and set class_mode="categorical", the generator already returns one-hot labels.

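```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Only rescale pixel values; no augmentation in this example.
train_gen = ImageDataGenerator(rescale=1.0 / 255)

# Expects one subdirectory per class under data/train.
train_data = train_gen.flow_from_directory(
    "data/train",
    target_size=(128, 128),
    batch_size=16,
    class_mode="categorical",  # yield one-hot labels
)

images, labels = next(train_data)
print(images.shape)  # (batch_size, 128, 128, 3)
print(labels.shape)  # (batch_size, num_classes)
print(train_data.class_indices)  # mapping from class name to label index
```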

In this setup, labels is already shaped like (batch_size, num_classes). Calling to_categorical again would be wrong because the labels are already categorical.
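A naive encoder that treats every entry as a class index shows what goes wrong with double encoding: the 0s and 1s inside each one-hot row are themselves encoded, so the array picks up an extra axis. The one_hot function below is a hypothetical stand-in written in NumPy, and the exact behavior of to_categorical on such input may differ between Keras versions.

```python
import numpy as np

def one_hot(values, num_classes=None):
    """Hypothetical stand-in that treats every entry as a class index."""
    values = np.asarray(values, dtype="int64")
    if num_classes is None:
        num_classes = int(values.max()) + 1
    return np.eye(num_classes, dtype="float32")[values]

# Pretend this batch came from flow_from_directory with class_mode="categorical".
labels = np.array([[1, 0, 0],
                   [0, 0, 1]], dtype="float32")

double_encoded = one_hot(labels)
print(labels.shape)          # (2, 3)
print(double_encoded.shape)  # (2, 3, 2)
```

A label array with three dimensions no longer matches a softmax output of shape (batch_size, num_classes).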

Case 2: flow With In-Memory Arrays

If you already have arrays x and integer labels y, then to_categorical can make sense before calling flow.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

# Dummy data: 32 RGB images of 64x64 pixels with integer labels in {0, 1, 2}.
x = np.random.rand(32, 64, 64, 3).astype("float32")
y_int = np.random.randint(0, 3, size=(32,))

# Convert the integer labels to one-hot vectors of length 3.
y_one_hot = to_categorical(y_int, num_classes=3)

gen = ImageDataGenerator(rotation_range=10, horizontal_flip=True)
train_data = gen.flow(x, y_one_hot, batch_size=8)

batch_x, batch_y = next(train_data)
print(batch_x.shape)  # (8, 64, 64, 3)
print(batch_y.shape)  # (8, 3)
```

That is the right use of to_categorical: converting integer labels that you already own before feeding them into a generator that expects you to supply the labels yourself.

Match the Model to the Generator Output

A consistent model for three classes looks like this:

```python
import tensorflow as tf

# Input shape matches the 64x64 RGB arrays from Case 2; adjust it to your images.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),  # one unit per class
])

model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",  # expects one-hot labels
    metrics=["accuracy"],
)
```

If you would rather keep integer labels, skip to_categorical and compile with sparse_categorical_crossentropy in place of categorical_crossentropy. The output layer stays the same.
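If the labels are already one-hot and you want to switch to the sparse loss, one option is to recover the integer labels with argmax. This is a small NumPy sketch on made-up labels, not a required step.

```python
import numpy as np

y_one_hot = np.array([[1, 0, 0],
                      [0, 0, 1],
                      [0, 1, 0]], dtype="float32")

# The position of the 1 in each row is the integer class label.
y_int = np.argmax(y_one_hot, axis=1)
print(y_int)  # [0 2 1]
```

With flow_from_directory, setting class_mode="sparse" requests integer labels directly, so no conversion is needed in that case.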

Validate One Batch Before Training

Before launching a long run, inspect one batch and confirm that the labels look the way you think they do.

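```python
import numpy as np

# Pull one batch from the generator built earlier and inspect its labels.
_, yb = next(train_data)
print("batch label shape:", yb.shape)
print("row sums:", np.unique(np.sum(yb, axis=1)))  # expect [1.] for one-hot labels
print("first row:", yb[0])
```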

For one-hot labels, each row should sum to 1. This quick check catches many silent preprocessing mistakes before they waste GPU time.
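The same check can be wrapped in a small reusable function. The sketch below is one possible version; the name looks_one_hot is hypothetical, and it also verifies that every entry is 0 or 1, because a row-sum test alone would accept soft labels such as [0.5, 0.5, 0].

```python
import numpy as np

def looks_one_hot(labels):
    """Hypothetical helper: True if labels is 2-D with a single 1 per row."""
    labels = np.asarray(labels)
    if labels.ndim != 2:
        return False
    only_zeros_and_ones = np.isin(labels, (0, 1)).all()
    one_per_row = np.allclose(labels.sum(axis=1), 1)
    return bool(only_zeros_and_ones and one_per_row)

print(looks_one_hot(np.array([[1, 0, 0], [0, 1, 0]])))  # True
print(looks_one_hot(np.array([0, 2, 1])))               # False
```
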

Common Pitfalls

A common mistake is applying to_categorical to labels that are already one-hot because flow_from_directory(..., class_mode="categorical") already did the conversion.

Another issue is using categorical_crossentropy with integer labels or sparse_categorical_crossentropy with one-hot labels. The model, loss, and generator output all have to agree.

Developers also sometimes forget that class_indices is derived from directory names when using flow_from_directory. If class ordering matters, inspect it explicitly instead of assuming the order.
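The Keras documentation describes the inferred class order as alphanumeric. The sketch below imitates that with sorted() on hypothetical directory names and then inverts the mapping, which is handy for turning a predicted index back into a class name. It is an approximation for illustration, so still print train_data.class_indices on your own data.

```python
# Hypothetical class subdirectory names under data/train.
class_dirs = ["dog", "cat", "bird"]

# Sketch of the mapping: sort the names, then number them from 0.
class_indices = {name: index for index, name in enumerate(sorted(class_dirs))}
print(class_indices)  # {'bird': 0, 'cat': 1, 'dog': 2}

# Invert the mapping to look up a class name from a predicted index.
index_to_class = {index: name for name, index in class_indices.items()}
print(index_to_class[2])  # dog
```
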

Finally, do not debug only the model architecture when training looks wrong. Many classification failures are really label-format mismatches.

Summary

  • to_categorical converts integer labels to one-hot vectors.
  • With flow_from_directory(..., class_mode="categorical"), Keras already does that for you.
  • Use to_categorical mainly when you supply integer label arrays yourself to flow.
  • Keep label encoding, output layer shape, and loss function aligned.
  • Inspect one batch early to verify that the labels match your intended training setup.
