Keras Image data generator throwing no files found error?
Introduction
The no files found message from Keras image generators usually means the directory layout or file discovery rules do not match what Keras expects. The model code can be correct, yet training fails before the first batch. Fixing this issue is mostly about validating paths, folder structure, and extensions in a systematic way.
How flow_from_directory Discovers Images
ImageDataGenerator.flow_from_directory expects one folder per class under a root directory. It does not treat the root as a class itself. If class folders are missing, the generator reports zero files.
Expected layout:
data/train/cats/*.jpg
data/train/dogs/*.jpg
If images sit directly inside data/train, the sample count is zero because no class directories exist.
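As a minimal sketch of a call against that layout, the function below builds a generator and prints what Keras discovered. The function name build_train_generator, the 224x224 target size, and the rescale factor are illustrative assumptions, and the import assumes a TensorFlow release that still ships ImageDataGenerator.

```python
def build_train_generator(train_dir, target_size=(224, 224), batch_size=32):
    """Create a directory-based generator and print what Keras discovered."""
    # Imported lazily so this helper can be defined without TensorFlow installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(rescale=1.0 / 255)
    generator = datagen.flow_from_directory(
        train_dir,                 # root that CONTAINS the class folders, e.g. data/train
        target_size=target_size,
        batch_size=batch_size,
        class_mode="categorical",
    )
    # Zero samples means no images were found under any class subfolder.
    print("samples:", generator.samples)
    print("class_indices:", generator.class_indices)
    return generator
```

Passing data/train/cats instead of data/train is a common slip: the generator then looks for class folders inside cats and finds none.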
Verify Paths Early
Relative paths often fail when scripts run from a different working directory. Use absolute paths and quick checks before creating the generator.
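One possible preflight, using only the standard library, is sketched here. The function name preflight and its error messages are assumptions made for illustration, not part of Keras.

```python
import os


def preflight(data_dir):
    """Resolve data_dir and return the class subfolders found directly under it."""
    root = os.path.abspath(data_dir)  # resolved against the current working directory
    print("Working directory:", os.getcwd())
    print("Resolved data root:", root)

    if not os.path.isdir(root):
        raise FileNotFoundError(f"Data root does not exist: {root}")

    # flow_from_directory treats each immediate subfolder as one class.
    class_dirs = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    if not class_dirs:
        raise ValueError(f"No class subfolders under {root}")

    print("Class subfolders:", class_dirs)
    return class_dirs
```

Calling preflight("data/train") before building the generator shows immediately whether the script is looking in the directory you intended.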
A short preflight like this prevents long debugging sessions later.
Confirm Supported Extensions and Corrupt Files
Keras only loads known image formats. If your dataset contains unsupported or corrupted files, sample counts can be lower than expected. Validate files once during dataset preparation.
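A rough way to flag suspect files without extra dependencies is to compare each file's extension and leading bytes against known signatures. The sketch below makes several assumptions: the extension set is limited to JPEG and PNG for brevity, the names find_suspect_files, IMAGE_EXTENSIONS, and SIGNATURES are hypothetical, and a header check only catches obviously wrong files. A full decode, for example with Pillow's Image.open(path).verify(), would be more thorough.

```python
import os

# Assumed extension list for this sketch; adjust it to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

# Leading bytes ("magic numbers") of the formats checked here.
SIGNATURES = {
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
}


def find_suspect_files(root):
    """Return (unsupported, bad_header) lists of paths found under root."""
    unsupported, bad_header = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            ext = os.path.splitext(name)[1].lower()
            if ext not in IMAGE_EXTENSIONS:
                unsupported.append(path)  # also catches hidden files such as .DS_Store
                continue
            with open(path, "rb") as fh:
                header = fh.read(8)
            if not header.startswith(SIGNATURES[ext]):
                bad_header.append(path)  # extension and content disagree
    return sorted(unsupported), sorted(bad_header)
```

Files in the second list have an image extension but do not start like an image of that type, which often points to a truncated download or a renamed file.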
Cleaning bad files avoids silent dataset quality problems.
Use flow_from_dataframe for Custom Label Sources
If labels come from CSV or database rows, directory-based loading can be awkward. flow_from_dataframe provides explicit file and label mapping.
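The sketch below shows one way this could look when labels live in a CSV file. The column names filename and label, both helper names, and the image size are assumptions for illustration. The second function assumes pandas and a TensorFlow release that still ships ImageDataGenerator.

```python
import csv
import os


def load_label_rows(csv_path, image_root):
    """Read filename/label rows from a CSV and split them by whether the file exists."""
    present, missing = [], []
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):  # expects 'filename' and 'label' columns
            record = {"filename": row["filename"], "label": row["label"]}
            if os.path.isfile(os.path.join(image_root, row["filename"])):
                present.append(record)
            else:
                missing.append(record)
    return present, missing


def make_dataframe_generator(rows, image_root, target_size=(224, 224), batch_size=32):
    """Build a Keras generator from explicit filename/label rows."""
    # Imported here so the CSV check above runs without pandas or TensorFlow.
    import pandas as pd
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    df = pd.DataFrame(rows)
    datagen = ImageDataGenerator(rescale=1.0 / 255)
    return datagen.flow_from_dataframe(
        dataframe=df,
        directory=image_root,      # filenames in x_col are resolved relative to this
        x_col="filename",
        y_col="label",
        target_size=target_size,
        batch_size=batch_size,
        class_mode="categorical",
    )
```

Checking the missing list first makes it obvious when the CSV and the image folder have drifted apart, before Keras reports fewer validated filenames than expected.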
This reduces dependence on folder naming conventions.
Practical Debug Script
A short standalone script can verify dataset health before any model code runs.
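A minimal version of such a script might count files per class and per extension, then turn the result into an exit code. The names dataset_report and exit_code are hypothetical, and the health rule (at least one class folder, none of them empty) is an assumption you may want to tighten.

```python
import os
from collections import Counter


def dataset_report(root):
    """Map each class subfolder under root to a Counter of file extensions."""
    root = os.path.abspath(root)
    report = {}
    for name in sorted(os.listdir(root)):  # raises FileNotFoundError if root is missing
        class_dir = os.path.join(root, name)
        if not os.path.isdir(class_dir):
            continue  # loose files in the root do not belong to any class
        counts = Counter()
        for _dirpath, _dirnames, filenames in os.walk(class_dir):
            for fname in filenames:
                ext = os.path.splitext(fname)[1].lower() or "(none)"
                counts[ext] += 1
        report[name] = counts
    return report


def exit_code(report):
    """Return 0 when every class folder holds at least one file, otherwise 1."""
    for class_name, counts in report.items():
        print(f"{class_name}: {dict(counts)}")
    healthy = bool(report) and all(sum(c.values()) > 0 for c in report.values())
    return 0 if healthy else 1
```

To use it as a script, pass the result of exit_code(dataset_report(path)) to sys.exit so that an unhealthy dataset fails the calling job.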
Use this in CI or pre-training checks so invalid datasets are caught early.
Debug Checklist Before Training
Run this checklist when the generator reports zero images:
- Print resolved directory path.
- List class subfolders.
- Count files by extension.
- Verify image readability.
- Print samples and class_indices.
With these checks, the root cause is usually visible within minutes.
Batch Sanity Check Before model.fit
A fast dry run helps validate generator behavior before expensive training. Pull one batch and inspect shape, label distribution, and value ranges.
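A dry run of this kind could be sketched as follows. The helper names are hypothetical, and the code assumes the generator yields (images, labels) pairs, which is the case for class modes such as categorical, binary, or sparse but not for class_mode=None.

```python
from collections import Counter


def label_histogram(labels):
    """Count samples per class index for one-hot or integer label batches."""
    rows = labels.tolist() if hasattr(labels, "tolist") else list(labels)
    counts = Counter()
    for row in rows:
        if isinstance(row, (list, tuple)):
            counts[row.index(max(row))] += 1  # one-hot row: position of the largest value
        else:
            counts[int(row)] += 1             # sparse or binary label
    return dict(counts)


def check_first_batch(generator):
    """Pull one batch from a Keras-style generator and print basic statistics."""
    images, labels = next(generator)
    print("image batch shape:", images.shape)
    print("label batch shape:", labels.shape)
    print("pixel range:", float(images.min()), "to", float(images.max()))
    print("labels per class index:", label_histogram(labels))
    return images, labels
```

With rescale=1/255 the pixel range should fall between 0 and 1. A range up to 255 suggests rescaling was not applied, and a histogram with a single class in every batch suggests shuffling is off or the labels are wrong.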
If shapes or class counts look wrong, fix data loading first. Training with bad batches can waste hours and produce misleading metrics.
Common Pitfalls
- Keeping all images in one directory without class subfolders when using flow_from_directory.
- Passing relative paths while running code from another working directory.
- Assuming every image file is valid and readable.
- Mixing hidden system files with real images in dataset folders.
- Forgetting that a wrong class_mode can break training even after files are found.
Summary
- no files found is typically a dataset discovery issue, not a model architecture issue.
- flow_from_directory requires class subfolders under the root path.
- Absolute path checks and directory prints catch most mistakes quickly.
- Validate extensions and image integrity before training.
- Use flow_from_dataframe when labels are not encoded in folder names.

