How can I implement a custom `RNN`, specifically an ESN, in TensorFlow?
Introduction
An Echo State Network, or ESN, is a recurrent model with a fixed reservoir and a trainable readout. In TensorFlow, that means the custom recurrent logic should update hidden state over time while keeping the input and recurrent reservoir weights non-trainable.
What Makes an ESN Different from a Usual RNN
A standard RNN learns recurrent weights with backpropagation through time. An ESN deliberately does not. The reservoir is randomly initialized, scaled so its dynamics remain useful, and then left fixed while only the final readout is trained.
That changes the implementation goals:
- the reservoir weights are created once and frozen
- the sequence still updates state recurrently at each time step
- the output layer is the part that learns from data
So the main question is not "how do I train a custom RNN end to end?" It is "how do I express a fixed recurrent reservoir inside TensorFlow cleanly?"
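In symbols, one common leaky-integrator formulation of the update looks like the sketch below. The notation is an assumption made here for illustration, not something taken from a specific library: u_t is the input at step t, x_t the reservoir state, alpha the leak rate, W_in and W the fixed input and recurrent matrices, and W_out the trained readout.

```latex
\begin{aligned}
x_t &= (1 - \alpha)\, x_{t-1} + \alpha \tanh\!\left(W_{\mathrm{in}}\, u_t + W\, x_{t-1}\right) \\
y_t &= W_{\mathrm{out}}\, x_t
\end{aligned}
```

Only W_out receives gradient updates. Setting alpha to 1 recovers the plain, non-leaky update.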
A Practical Custom ESN Cell
The Keras RNN wrapper can run any cell-like layer that implements call(inputs, states) and exposes state_size (output_size is optional). That makes a custom ESN cell a good fit.
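A minimal sketch of such a cell follows. The class name ESNCell, its parameters (spectral_radius, leak_rate, input_scale, seed), and the uniform initialization are assumptions chosen for illustration. The reservoir is dense here, whereas many ESNs use a sparse one.

```python
import numpy as np
import tensorflow as tf


class ESNCell(tf.keras.layers.Layer):
    """Leaky ESN cell with a fixed, randomly initialized reservoir (sketch)."""

    def __init__(self, units, spectral_radius=0.9, leak_rate=1.0,
                 input_scale=1.0, seed=0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.spectral_radius = spectral_radius
        self.leak_rate = leak_rate
        self.input_scale = input_scale
        self.seed = seed
        # Attributes that tf.keras.layers.RNN reads from a cell.
        self.state_size = units
        self.output_size = units

    def build(self, input_shape):
        input_dim = int(input_shape[-1])
        rng = np.random.default_rng(self.seed)
        w_in = rng.uniform(-self.input_scale, self.input_scale,
                           size=(input_dim, self.units))
        w_res = rng.uniform(-1.0, 1.0, size=(self.units, self.units))
        # Rescale so the largest absolute eigenvalue equals spectral_radius.
        radius = np.max(np.abs(np.linalg.eigvals(w_res)))
        w_res *= self.spectral_radius / radius
        # trainable=False keeps both matrices out of gradient updates.
        self.w_in = self.add_weight(
            name="w_in", shape=w_in.shape,
            initializer=tf.keras.initializers.Constant(w_in),
            trainable=False)
        self.w_res = self.add_weight(
            name="w_res", shape=w_res.shape,
            initializer=tf.keras.initializers.Constant(w_res),
            trainable=False)

    def call(self, inputs, states):
        # Keras may pass the state as a list/tuple or as a bare tensor.
        prev = states[0] if isinstance(states, (list, tuple)) else states
        candidate = tf.tanh(
            tf.matmul(inputs, self.w_in) + tf.matmul(prev, self.w_res))
        new_state = (1.0 - self.leak_rate) * prev + self.leak_rate * candidate
        return new_state, [new_state]


cell = ESNCell(units=16, spectral_radius=0.8)
layer = tf.keras.layers.RNN(cell)
final_state = layer(tf.random.normal((2, 5, 3)))  # shape (2, 16)
```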
This cell has the key ESN property: the reservoir weights are part of the layer, but they are not trainable.
Wrap the Cell in a Keras Model
Once the cell exists, you can use the normal tf.keras.layers.RNN wrapper and add a trainable readout layer on top.
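The wiring might look like the sketch below. The helper name build_esn_model and the layer sizes are assumptions. So that the snippet runs by itself, a frozen SimpleRNNCell stands in for the reservoir; any custom ESN cell that exposes state_size can be passed in its place.

```python
import tensorflow as tf


def build_esn_model(cell, timesteps, features, outputs):
    """Wrap a fixed reservoir cell in RNN and add a trainable Dense readout."""
    inputs = tf.keras.Input(shape=(timesteps, features))
    # The RNN wrapper steps the cell through time and returns the final state.
    states = tf.keras.layers.RNN(cell)(inputs)
    # The Dense readout is the only part meant to learn.
    predictions = tf.keras.layers.Dense(outputs)(states)
    return tf.keras.Model(inputs, predictions)


# Stand-in reservoir (assumption): a frozen SimpleRNNCell, not a tuned ESN.
reservoir = tf.keras.layers.SimpleRNNCell(32)
reservoir.trainable = False

model = build_esn_model(reservoir, timesteps=20, features=3, outputs=1)
model.compile(optimizer="adam", loss="mse")
```

Checking model.trainable_weights after building is a quick way to confirm that only the readout kernel and bias will be updated.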
In this model, the Dense layer is the trainable readout. The ESN reservoir transforms the sequence into a state representation, and the readout learns how to map that state to the target.
Why Spectral Radius Matters
ESNs are sensitive to reservoir dynamics. If the spectral radius of the recurrent matrix (its largest absolute eigenvalue) is too small, the state washes out quickly and stops carrying useful temporal information. If it is too large, the dynamics can become unstable.
That is why many ESN implementations scale the recurrent matrix by spectral radius. The code above does that during build. It is a simplified version, but it captures the standard idea: generate random weights, compute the current radius, then rescale the matrix to the radius you want.
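Taken in isolation, the rescaling step can be sketched with NumPy alone. The function name scale_to_spectral_radius and the 50-unit uniform reservoir are assumptions for illustration.

```python
import numpy as np


def scale_to_spectral_radius(weights, target_radius):
    """Rescale a square matrix so its largest |eigenvalue| is target_radius."""
    current_radius = np.max(np.abs(np.linalg.eigvals(weights)))
    return weights * (target_radius / current_radius)


rng = np.random.default_rng(0)
raw = rng.uniform(-1.0, 1.0, size=(50, 50))
reservoir = scale_to_spectral_radius(raw, target_radius=0.9)

# Eigenvalues scale linearly with the matrix, so the radius now matches.
scaled_radius = np.max(np.abs(np.linalg.eigvals(reservoir)))
assert np.isclose(scaled_radius, 0.9)
```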
In real experiments, reservoir size, sparsity, input scaling, and leak rate often need to be tuned together. The TensorFlow implementation is only one part of getting a good ESN.
Add Washout When Training
Many ESN workflows ignore the earliest states in each sequence because they are overly influenced by the initial zero state. This is usually called washout.
One simple way to expose later states is to return the full state sequence and slice off the first part before the readout.
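A minimal sketch of that slicing, assuming a washout of 10 steps and using a frozen SimpleRNN as a stand-in reservoir so the snippet runs by itself (a custom ESN cell wrapped in RNN with return_sequences=True would take its place):

```python
import tensorflow as tf

timesteps, features, units, washout = 30, 3, 32, 10

inputs = tf.keras.Input(shape=(timesteps, features))
# Stand-in reservoir (assumption): frozen SimpleRNN returning every state.
reservoir = tf.keras.layers.SimpleRNN(
    units, return_sequences=True, trainable=False)
states = reservoir(inputs)
# Drop the first `washout` steps along the time axis before the readout.
settled = tf.keras.layers.Cropping1D(cropping=(washout, 0))(states)
# Dense acts on the last axis, so the readout is applied at each kept step.
outputs = tf.keras.layers.Dense(1)(settled)
washout_model = tf.keras.Model(inputs, outputs)
```

With this layout the targets need the same crop, for example y[:, washout:, :], so that predictions and labels stay aligned in time.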
This is not the only washout strategy, but it shows the idea clearly: do not force the readout to learn from unstable early reservoir states.
A Custom Layer Is Also Acceptable
You do not have to implement an ESN as an RNN cell. A custom layer that loops over the time axis can also work. The cell-based approach is simply more aligned with Keras conventions and integrates better with masking, return_sequences, and other recurrent-model features.
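For comparison, a layer-based version might look like the sketch below. The class name LoopedReservoir and its parameters are hypothetical. It assumes the number of time steps is known statically, and it returns only the final state.

```python
import numpy as np
import tensorflow as tf


class LoopedReservoir(tf.keras.layers.Layer):
    """Fixed reservoir that loops over the time axis explicitly (sketch)."""

    def __init__(self, units, spectral_radius=0.9, seed=0, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        self.spectral_radius = spectral_radius
        self.seed = seed

    def build(self, input_shape):
        rng = np.random.default_rng(self.seed)
        w_in = rng.uniform(-1.0, 1.0, size=(int(input_shape[-1]), self.units))
        w_res = rng.uniform(-1.0, 1.0, size=(self.units, self.units))
        w_res *= self.spectral_radius / np.max(np.abs(np.linalg.eigvals(w_res)))
        # Both matrices are stored as non-trainable weights.
        self.w_in = self.add_weight(
            name="w_in", shape=w_in.shape,
            initializer=tf.keras.initializers.Constant(w_in),
            trainable=False)
        self.w_res = self.add_weight(
            name="w_res", shape=w_res.shape,
            initializer=tf.keras.initializers.Constant(w_res),
            trainable=False)

    def call(self, inputs):
        # Start from a zero state and update it once per time step.
        state = tf.zeros((tf.shape(inputs)[0], self.units), dtype=inputs.dtype)
        for step_input in tf.unstack(inputs, axis=1):
            state = tf.tanh(
                tf.matmul(step_input, self.w_in) + tf.matmul(state, self.w_res))
        return state


looped = LoopedReservoir(units=8)
last_state = looped(tf.random.normal((2, 5, 3)))  # shape (2, 8)
```

The Python loop is unrolled once per time step, which is one reason the cell-based approach tends to scale better to long sequences.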
That said, the real ESN requirement is architectural, not inheritance-based. If the reservoir is fixed and only the readout learns, the design is still ESN-like even if you do not subclass a formal RNN cell type.
Common Pitfalls
One common mistake is leaving reservoir weights trainable. Once the recurrent or input reservoir weights start training normally, the model is no longer behaving like a classic ESN.
Another pitfall is ignoring spectral radius. Random recurrent weights without scaling can produce reservoir dynamics that are too weak or too unstable.
A third issue is skipping washout entirely and training on states dominated by initialization artifacts.
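Finally, do not assume core TensorFlow has a built-in ESN layer. The separate TensorFlow Addons package did provide one (tfa.layers.ESN), but Addons is no longer actively developed, so you usually need to express the reservoir behavior yourself with a custom cell or layer.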
Summary
- An ESN keeps the reservoir fixed and trains only the readout layer.
- In TensorFlow, a custom cell wrapped by tf.keras.layers.RNN is a clean implementation strategy.
- Mark the reservoir weights as non-trainable so the model keeps ESN behavior.
- Scale the recurrent matrix to a sensible spectral radius and consider washout during training.
- The important part is the fixed-reservoir design, not a specific inheritance hierarchy.

