How to Use to_categorical When Using ImageDataGenerator
Introduction
to_categorical and ImageDataGenerator are related only because they both touch training input, but they solve different problems. to_categorical converts integer class labels into one-hot vectors, while ImageDataGenerator augments images and yields batches during training.
The right answer depends on how you create the generator. If you use flow_from_directory with class_mode="categorical", Keras already gives you one-hot labels and you do not need to call to_categorical yourself.
Understand the Label Formats
For multiclass classification, the two common label encodings are:
- integer labels such as 0, 1, 2
- one-hot labels such as [1, 0, 0]
Those labels must match your model configuration.
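As a rough illustration of the two encodings, the sketch below builds both from the same class IDs using plain NumPy. The array names are made up for this example, and indexing into np.eye stands in for to_categorical only so the snippet runs without Keras; for integer inputs like these it yields the same one-hot values.

```python
import numpy as np

num_classes = 3

# Integer (sparse) labels: one class ID per sample, shape (4,)
y_int = np.array([0, 2, 1, 2])

# One-hot labels: one row per sample, shape (4, 3).
# Row i of the identity matrix is the one-hot vector for class i.
y_one_hot = np.eye(num_classes)[y_int]

print(y_int.shape)      # (4,)
print(y_one_hot.shape)  # (4, 3)
print(y_one_hot[1])     # [0. 0. 1.]

# argmax along the class axis recovers the integer labels
y_back = y_one_hot.argmax(axis=1)
```

The shape is the quickest way to tell the formats apart: integer labels are one-dimensional, while one-hot labels have a second axis of length num_classes.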
Typical pairings are:
- categorical_crossentropy with one-hot labels
- sparse_categorical_crossentropy with integer labels
If the label encoding and loss function do not match, training fails or produces misleading results.
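To see why each loss expects its own label format, it can help to write both out by hand. The following is a simplified NumPy sketch of the two formulas, not Keras's actual implementation; the probabilities and labels are invented for the example.

```python
import numpy as np

# Hypothetical softmax outputs for two samples and three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

y_int = np.array([0, 1])        # integer labels
y_one_hot = np.eye(3)[y_int]    # the same labels, one-hot encoded

# Categorical cross-entropy multiplies the one-hot row by log(p)
# and sums over classes, so it needs labels of shape (batch, classes).
cce = -(y_one_hot * np.log(probs)).sum(axis=1)

# The sparse variant uses the integer label as an index into the
# probability row, so it needs labels of shape (batch,).
scce = -np.log(probs[np.arange(len(y_int)), y_int])

# With correctly matched labels, both give the same per-sample loss.
same_loss = np.allclose(cce, scce)
print(same_loss)  # True
```

The two losses compute the same quantity; only the way they read the true class differs, which is why the label format has to match the loss.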
Case 1: flow_from_directory
If you use flow_from_directory and set class_mode="categorical", the generator already returns one-hot labels.
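A minimal sketch of this case follows. It writes a throwaway dataset of random images into a temporary directory so that it runs on its own; the class names cat, dog, and fox, the image size, and the batch size are all assumptions, and in practice you would point flow_from_directory at your own image folder. It also assumes TensorFlow's bundled Keras with Pillow and SciPy installed.

```python
import os
import tempfile

import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Build a tiny placeholder dataset: one subdirectory per class.
root = tempfile.mkdtemp()
rng = np.random.default_rng(0)
for class_name in ("cat", "dog", "fox"):  # hypothetical class names
    class_dir = os.path.join(root, class_name)
    os.makedirs(class_dir)
    for i in range(4):
        pixels = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
        Image.fromarray(pixels).save(os.path.join(class_dir, f"{i}.png"))

datagen = ImageDataGenerator(rescale=1.0 / 255, horizontal_flip=True)

# class_mode="categorical" makes the generator emit one-hot labels itself.
train_gen = datagen.flow_from_directory(
    root,
    target_size=(32, 32),
    batch_size=6,
    class_mode="categorical",
)

images, labels = next(train_gen)
print(images.shape)  # (6, 32, 32, 3)
print(labels.shape)  # (6, 3)
```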
In this setup, labels is already shaped like (batch_size, num_classes). Calling to_categorical again would be wrong because the labels are already categorical.
Case 2: flow With In-Memory Arrays
If you already have arrays x and integer labels y, then to_categorical can make sense before calling flow.
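A short sketch of that workflow, using random arrays as stand-ins for real data (the shapes, class count, and batch size are assumptions for illustration, and TensorFlow's bundled Keras with SciPy is assumed to be installed):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import to_categorical

# Placeholder data: 10 RGB images of 32x32 and one integer label each.
rng = np.random.default_rng(0)
x = rng.random((10, 32, 32, 3)).astype("float32")
y = rng.integers(0, 3, size=10)

# Convert the integer labels once, before handing them to the generator.
y_one_hot = to_categorical(y, num_classes=3)

datagen = ImageDataGenerator(horizontal_flip=True)

# flow() passes the label array through unchanged, batch by batch.
train_gen = datagen.flow(x, y_one_hot, batch_size=5)

images, labels = next(train_gen)
print(images.shape)  # (5, 32, 32, 3)
print(labels.shape)  # (5, 3)
```

Passing num_classes explicitly avoids a narrower one-hot width if the highest class happens to be absent from the label array.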
That is the right use of to_categorical: converting integer labels that you already own before feeding them into a generator that expects you to supply the labels yourself.
Match the Model to the Generator Output
A consistent model for three classes looks like this:
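The sketch below is one minimal possibility, not a recommended architecture: the 32x32 RGB input size and the layer sizes are assumptions chosen for illustration. What matters is that the final Dense layer has three softmax units and the loss is categorical_crossentropy, which together match one-hot labels of shape (batch_size, 3).

```python
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 3

model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),          # assumed image size
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    # One output unit per class; softmax turns them into probabilities.
    layers.Dense(num_classes, activation="softmax"),
])

# categorical_crossentropy pairs with one-hot labels.
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

print(model.output_shape)  # (None, 3)
```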
If you decide to keep integer labels instead, skip to_categorical and compile with sparse_categorical_crossentropy instead of categorical_crossentropy.
Validate One Batch Before Training
Before launching a long run, inspect one batch and confirm that the labels look the way you think they do.
For one-hot labels, each row should sum to 1. This quick check catches many silent preprocessing mistakes before they waste GPU time.
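A check along those lines might look like the following. The helper name describe_labels is hypothetical, and it uses only NumPy so it can be applied to the label array of any batch, for example the second element returned by next(train_gen).

```python
import numpy as np

def describe_labels(labels, num_classes):
    """Return 'one-hot', 'integer', or 'unknown' for a batch of labels."""
    labels = np.asarray(labels)
    if labels.size == 0:
        return "unknown"
    # One-hot: shape (batch, num_classes), only 0s and 1s, rows sum to 1.
    if labels.ndim == 2 and labels.shape[1] == num_classes:
        only_zeros_and_ones = np.isin(labels, (0, 1)).all()
        rows_sum_to_one = np.allclose(labels.sum(axis=1), 1.0)
        if only_zeros_and_ones and rows_sum_to_one:
            return "one-hot"
        return "unknown"
    # Integer: shape (batch,), whole numbers in [0, num_classes).
    if labels.ndim == 1:
        whole_numbers = np.all(labels == np.round(labels))
        in_range = labels.min() >= 0 and labels.max() < num_classes
        if whole_numbers and in_range:
            return "integer"
    return "unknown"

print(describe_labels(np.eye(3)[[0, 2, 1]], num_classes=3))  # one-hot
print(describe_labels(np.array([0, 2, 1]), num_classes=3))   # integer
```

An "unknown" result is the signal to stop and look at the preprocessing before compiling the model.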
Common Pitfalls
A common mistake is applying to_categorical to labels that are already one-hot because flow_from_directory(..., class_mode="categorical") already did the conversion.
Another issue is using categorical_crossentropy with integer labels or sparse_categorical_crossentropy with one-hot labels. The model, loss, and generator output all have to agree.
Developers also sometimes forget that class_indices is derived from the class directory names, in sorted (alphanumeric) order, when using flow_from_directory. If class ordering matters, inspect it explicitly instead of assuming the order.
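The ordering can be surprising when directory names contain numbers. The helper below is a hypothetical stand-in that mimics the sorted-name mapping in plain Python; with a real generator, read its class_indices attribute rather than recomputing it.

```python
def class_indices_from_names(dir_names):
    """Map class directory names to indices in sorted order (illustrative only)."""
    return {name: index for index, name in enumerate(sorted(dir_names))}

# String sorting puts "class_10" before "class_2".
indices = class_indices_from_names(["class_2", "class_10", "class_1"])
print(indices)  # {'class_1': 0, 'class_10': 1, 'class_2': 2}

# Invert the mapping to turn a predicted index back into a class name.
index_to_name = {index: name for name, index in indices.items()}
print(index_to_name[1])  # class_10
```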
Finally, do not debug only the model architecture when training looks wrong. Many classification failures are really label-format mismatches.
Summary
- to_categorical converts integer labels to one-hot vectors.
- With flow_from_directory(..., class_mode="categorical"), Keras already does that for you.
- Use to_categorical mainly when you supply integer label arrays yourself to flow.
- Keep label encoding, output layer shape, and loss function aligned.
- Inspect one batch early to verify that the labels match your intended training setup.

