Keras fit_generator - How does batch for time series work?

Introduction

For time series models, a batch is usually a collection of fixed-length windows, not one long uninterrupted timeline. In older Keras examples this was often fed through fit_generator(), but in current Keras the same idea is usually handled with model.fit() on a generator, Sequence, or tf.data.Dataset.

What a Time Series Batch Actually Contains

Suppose you have a univariate sequence:

[10, 12, 13, 15, 18, 21, 20, 22]

If your look-back window is 3 and you want to predict the next value, your supervised samples become:

  • input [10, 12, 13] target 15
  • input [12, 13, 15] target 18
  • input [13, 15, 18] target 21
  • and so on

A batch is then just several of these windows grouped together.

So when Keras trains on time series, it is not slicing the original stream arbitrarily. It is training on many window-target pairs.
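The windowing described above can be sketched in a few lines of NumPy. This is only an illustration, assuming NumPy 1.20 or newer for `sliding_window_view`; the names `frames` and `batches` are made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

series = np.array([10, 12, 13, 15, 18, 21, 20, 22], dtype=np.float32)
look_back = 3
batch_size = 2

# Each frame holds look_back inputs followed by the one value to predict.
frames = sliding_window_view(series, look_back + 1)
X = frames[:, :look_back]   # inputs, shape (5, 3)
y = frames[:, look_back]    # targets, shape (5,)

# A batch is just a group of consecutive window-target pairs.
batches = [
    (X[i:i + batch_size], y[i:i + batch_size])
    for i in range(0, len(X), batch_size)
]

for X_b, y_b in batches:
    print(X_b.tolist(), "->", y_b.tolist())
```

With 8 points and a look-back of 3 there are 5 windows, so a batch size of 2 gives two full batches and one partial batch.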

Legacy fit_generator() Versus Current fit()

Historically, Keras had a separate fit_generator() API. It was deprecated in TensorFlow 2.1, and fit() now handles generators and Sequence objects directly, so fit_generator() is mostly legacy vocabulary.

The mental model is still the same:

  • your generator yields one batch at a time
  • each batch contains arrays shaped for the model
  • Keras treats each yielded batch as one training step

Here is a simple generator for a univariate series:

```python
import numpy as np


def series_generator(series, look_back, batch_size):
    """Yield (X, y) batches of sliding windows forever."""
    X_batch, y_batch = [], []

    while True:  # Keras ends each epoch after steps_per_epoch batches
        for i in range(len(series) - look_back):
            X_batch.append(series[i:i + look_back])  # input window
            y_batch.append(series[i + look_back])    # next value is the target

            if len(X_batch) == batch_size:
                # Recurrent layers expect (batch, timesteps, features)
                X = np.array(X_batch).reshape(batch_size, look_back, 1)
                y = np.array(y_batch)
                yield X, y
                X_batch, y_batch = [], []
        # A leftover partial batch is carried into the next pass over the series.


series = np.array([10, 12, 13, 15, 18, 21, 20, 22, 23, 25], dtype=np.float32)
```

Each yielded X has shape (batch, timesteps, features), which in this example is (batch_size, look_back, 1).

Training an RNN with Generated Batches

Once the generator produces correctly shaped windows, the model sees each window as one independent training example.

```python
import tensorflow as tf

look_back = 3
batch_size = 2

# Reuses series and series_generator from the previous block
generator = series_generator(series, look_back=look_back, batch_size=batch_size)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(look_back, 1)),
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(1)
])

model.compile(optimizer="adam", loss="mse")
# The generator never ends, so steps_per_epoch tells Keras where an epoch stops
model.fit(generator, steps_per_epoch=3, epochs=5)
```

The important point is that the LSTM still sees temporal order inside each window. Batching does not destroy sequence information because each sample preserves its internal timestep order.
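Because a generator like this loops forever, fit() needs steps_per_epoch to know where an epoch ends. One common choice, sketched below, is the number of full batches in a single pass over the windows. The helper name `steps_per_epoch_for` is hypothetical and not part of Keras.

```python
def steps_per_epoch_for(n_points, look_back, batch_size):
    """Number of full batches in one pass over the sliding windows."""
    n_windows = n_points - look_back   # one window per valid start index
    return n_windows // batch_size     # ignore a trailing partial batch


# 10 points with look_back=3 give 7 windows; batch_size=2 gives 3 full batches.
steps = steps_per_epoch_for(n_points=10, look_back=3, batch_size=2)
print(steps)
```

This matches the steps_per_epoch=3 used in the training call. The seventh window does not fill a batch, so with the generator shown earlier it is carried into the next pass.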

Shuffling and Order

A common source of confusion is whether batches should be shuffled. For time series forecasting, you usually preserve the order inside each window, but you may or may not shuffle the collection of windows depending on the task.

  • for many forecasting tasks, windows can be shuffled safely after construction
  • for stateful RNN training, ordering rules are stricter
  • for walk-forward validation, you should never mix future and past across the train-validation boundary

What must be avoided is time leakage. Training windows should not peek into the future relative to the target period you are evaluating.
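As a rough sketch of what "shuffling windows" means, the example below permutes the window start positions while leaving the values inside each window in their original order. It assumes a stateless model and uses a fixed seed only to make the run repeatable.

```python
import numpy as np

series = np.array([10, 12, 13, 15, 18, 21, 20, 22, 23, 25], dtype=np.float32)
look_back = 3

# One start index per window; only these indices are shuffled.
starts = np.arange(len(series) - look_back)
rng = np.random.default_rng(0)
shuffled = rng.permutation(starts)

# Each window is still a contiguous, in-order slice of the series.
X = np.stack([series[s:s + look_back] for s in shuffled])[..., np.newaxis]
y = series[shuffled + look_back]

print(X.shape, y.shape)
```

The order of samples changes between runs with different seeds, but every sample still pairs three consecutive values with the value that follows them.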

Sequence Is Usually Better Than a Bare Generator

For production code, keras.utils.Sequence is often better than a raw generator because it is indexable, has a known length, is safer to use with multiple workers, and is easier to reason about.

```python
import numpy as np
import tensorflow as tf


class WindowSequence(tf.keras.utils.Sequence):
    def __init__(self, series, look_back, batch_size):
        super().__init__()  # newer Keras versions warn if this is skipped
        self.series = np.asarray(series, dtype=np.float32)
        self.look_back = look_back
        self.batch_size = batch_size
        # One start index per window
        self.indices = list(range(len(self.series) - look_back))

    def __len__(self):
        # Batches per epoch, rounded up so the last partial batch is kept
        return (len(self.indices) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, idx):
        batch_ids = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        X, y = [], []
        for i in batch_ids:
            X.append(self.series[i:i + self.look_back])
            y.append(self.series[i + self.look_back])
        # len(batch_ids) may be smaller than batch_size for the last batch
        X = np.array(X).reshape(len(batch_ids), self.look_back, 1)
        y = np.array(y)
        return X, y
```

You can pass this directly to model.fit(). Because the Sequence reports its own length, steps_per_epoch is not needed.
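The ceiling division in `__len__` is easy to get wrong, so here is the same arithmetic worked through for the 10-point example series. It is plain Python with no Keras dependency, and the variable names are illustrative only.

```python
import math

n_points, look_back, batch_size = 10, 3, 2

n_windows = n_points - look_back                         # 7 windows
n_batches = (n_windows + batch_size - 1) // batch_size   # rounds up to 4

# Size of each batch: every batch is full except possibly the last.
sizes = [
    min(batch_size, n_windows - b * batch_size)
    for b in range(n_batches)
]

print(n_batches, sizes)
```

The integer expression is equivalent to `math.ceil(n_windows / batch_size)`, and the final batch holds the single leftover window.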

A More Modern Option: timeseries_dataset_from_array

For many current Keras workflows, the easiest approach is not to hand-roll a generator at all.

```python
import tensorflow as tf
import numpy as np

series = np.array([10, 12, 13, 15, 18, 21, 20, 22, 23, 25], dtype=np.float32)

# Add a feature axis so batches come out as (batch, 3, 1)
X = series[:-1].reshape(-1, 1)
y = series[1:]

dataset = tf.keras.utils.timeseries_dataset_from_array(
    data=X,
    targets=y[2:],  # targets[i] is the value right after the window starting at i
    sequence_length=3,
    batch_size=2,
)
```

This is easier to maintain and reduces off-by-one mistakes.
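The slice `y[2:]` is the part most likely to cause an off-by-one error. The check below reproduces only the pairing rule in plain NumPy (window `i` goes with `targets[i]`); it does not call Keras, and the `pairs` list is a stand-in for what the dataset would yield.

```python
import numpy as np

series = np.array([10, 12, 13, 15, 18, 21, 20, 22, 23, 25], dtype=np.float32)
sequence_length = 3

X = series[:-1]
y = series[1:]
targets = y[sequence_length - 1:]   # same as y[2:] in the example above

# Window i covers X[i : i + sequence_length] and is paired with targets[i].
n_windows = len(X) - sequence_length + 1
pairs = [(X[i:i + sequence_length], targets[i]) for i in range(n_windows)]

for i, (window, target) in enumerate(pairs):
    # The target must be the value immediately after the window.
    assert target == series[i + sequence_length]
    print(window.tolist(), "->", float(target))
```

If the assertion fails after changing `sequence_length` or the slices, the inputs and targets are misaligned.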

Common Pitfalls

A common mistake is thinking a batch is a contiguous chunk of the full timeline rather than a stack of windows. For supervised sequence learning, the model trains on windows.

Another issue is getting the shapes wrong. Recurrent layers usually expect input shaped (batch, timesteps, features), so a univariate series needs an explicit feature axis of size 1.
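A small sketch of the usual fix: windows built from a univariate series typically come out 2-D, and a trailing axis turns them into the 3-D layout. The helper `to_rnn_input` is a hypothetical name used only for this illustration.

```python
import numpy as np


def to_rnn_input(windows):
    """Return windows shaped (batch, timesteps, features)."""
    windows = np.asarray(windows, dtype=np.float32)
    if windows.ndim == 2:
        # (batch, timesteps) -> (batch, timesteps, 1) for a univariate series
        return windows[..., np.newaxis]
    if windows.ndim == 3:
        return windows  # already has a feature axis
    raise ValueError(f"expected 2-D or 3-D input, got {windows.ndim}-D")


X = to_rnn_input([[10, 12, 13], [12, 13, 15]])
print(X.shape)
```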

Developers also often leak future information by building windows incorrectly or by splitting train and validation data after window creation instead of before it.
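One way to split before windowing is sketched below. It assumes a simple chronological hold-out; `split` and `make_windows` are illustrative names. The validation segment starts look_back points before the boundary so its first window has inputs, but every validation target lies after the boundary and no training window contains a value from that period.

```python
import numpy as np


def make_windows(values, look_back):
    """Build (X, y) windows from one contiguous segment (needs > look_back points)."""
    X = np.stack([values[i:i + look_back] for i in range(len(values) - look_back)])
    y = values[look_back:]
    return X[..., np.newaxis], y


series = np.array([10, 12, 13, 15, 18, 21, 20, 22, 23, 25], dtype=np.float32)
look_back = 3
split = 7  # indices 0..6 are training data, 7..9 are the validation period

# Split the raw series first, then window each part separately.
train_part = series[:split]
val_part = series[split - look_back:]  # past context as inputs only

X_train, y_train = make_windows(train_part, look_back)
X_val, y_val = make_windows(val_part, look_back)

print(y_train.tolist(), y_val.tolist())
```

Windowing the full series first and splitting the windows afterwards would instead let training windows near the boundary contain values from the validation period.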

Finally, fit_generator() examples found online may still work conceptually, but current Keras code should usually prefer model.fit().

Summary

  • A time series batch is usually a group of sliding windows plus targets.
  • fit_generator() is legacy terminology; modern Keras typically uses model.fit().
  • Each sample preserves its internal time order even when samples are batched.
  • Sequence and tf.data style inputs are usually cleaner than raw generators.
  • The biggest risks are shape mistakes and time leakage.
