Analysis of the Output from TensorFlow's tf.nn.dynamic_rnn Function

Introduction

tf.nn.dynamic_rnn is a TensorFlow 1.x API for running recurrent cells over a sequence without manually unrolling the loop. The most important thing to understand is that it returns two related but different results: outputs, which contains per-time-step values, and state, which contains the final recurrent state after the whole sequence has been processed.

What dynamic_rnn Returns

The function returns a pair:

  • outputs
  • state

For a basic RNN or GRU cell:

  • outputs has shape [batch_size, max_time, num_units] (with the default time_major=False)
  • state has shape [batch_size, num_units]
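The shape relationship can be illustrated without TensorFlow. The following is a minimal NumPy sketch of a tanh RNN loop, not the dynamic_rnn implementation; the function name simple_rnn and the weight names w_x, w_h, and b are hypothetical and chosen only for illustration.

```python
import numpy as np

def simple_rnn(inputs, w_x, w_h, b):
    """Run a tanh RNN over inputs of shape (batch, time, features)."""
    batch, max_time, _ = inputs.shape
    units = w_h.shape[0]
    state = np.zeros((batch, units))
    outputs = np.zeros((batch, max_time, units))
    for t in range(max_time):
        state = np.tanh(inputs[:, t, :] @ w_x + state @ w_h + b)
        outputs[:, t, :] = state  # for this simple cell, output and state coincide
    return outputs, state

rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 5, 3))  # batch=2, time=5, features=3
w_x = rng.normal(size=(3, 4))        # features -> units
w_h = rng.normal(size=(4, 4))        # units -> units
b = np.zeros(4)

outputs, state = simple_rnn(inputs, w_x, w_h, b)
print(outputs.shape)  # (2, 5, 4)
print(state.shape)    # (2, 4)
```

Because no sequence is padded here, the final state matches the last time slice of outputs.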

For an LSTM cell, the final state is more complex because an LSTM tracks both memory and hidden output. In TensorFlow 1.x that final state is commonly an LSTMStateTuple containing:

  • c, the cell state
  • h, the hidden state

A Minimal Example

Here is a small tf.compat.v1 example showing the structure clearly.

python
import tensorflow as tf

# dynamic_rnn is a graph-mode API, so turn off eager execution first.
tf.compat.v1.disable_eager_execution()

# Shape is [batch_size, max_time, features]; None marks a dynamic dimension.
inputs = tf.compat.v1.placeholder(tf.float32, shape=[None, None, 3])
sequence_lengths = tf.compat.v1.placeholder(tf.int32, shape=[None])

cell = tf.compat.v1.nn.rnn_cell.LSTMCell(num_units=4)
outputs, state = tf.compat.v1.nn.dynamic_rnn(
    cell,
    inputs,
    sequence_length=sequence_lengths,
    dtype=tf.float32
)

# In graph mode these print symbolic tensors (names and shapes), not values.
print(outputs)
print(state)

The placeholder shape means:

  • batch size is dynamic
  • time dimension is dynamic
  • each time step has 3 input features

Since the LSTM has 4 units, each output vector has width 4.

Interpreting outputs

outputs contains one vector for every time step of every sequence in the batch.

If the shape is:

[batch_size, max_time, num_units]

then outputs[i, t, :] is the recurrent output for batch item i at time step t.

That makes outputs useful when the model needs information from all time steps, such as:

  • sequence labeling
  • attention over hidden states
  • pooling over recurrent outputs

In other words, outputs is the time-resolved view.
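As one illustration of pooling over recurrent outputs, here is a hedged NumPy sketch of mean pooling that ignores padded time steps. The helper name masked_mean_pool is hypothetical, and the sketch assumes every sequence has at least one valid step.

```python
import numpy as np

def masked_mean_pool(outputs, lengths):
    """Average outputs over valid time steps only.

    outputs: (batch, max_time, units); lengths: (batch,) counts of valid steps.
    """
    max_time = outputs.shape[1]
    # mask[i, t] is 1.0 where t < lengths[i], else 0.0
    mask = (np.arange(max_time)[None, :] < lengths[:, None]).astype(outputs.dtype)
    summed = (outputs * mask[:, :, None]).sum(axis=1)
    return summed / lengths[:, None]

outputs = np.arange(2 * 3 * 2, dtype=np.float64).reshape(2, 3, 2)
lengths = np.array([3, 1])  # the second sequence has a single valid step
pooled = masked_mean_pool(outputs, lengths)
print(pooled)  # rows: [2. 3.] and [6. 7.]
```

Dividing by the true length rather than max_time keeps short sequences from being diluted by their padding.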

Interpreting state

state is the final recurrent state after the last valid time step for each sequence.

For an LSTM, you often access it like this:

python
final_c = state.c
final_h = state.h

final_h is often used as a compact representation of the entire sequence, for example in:

  • sequence classification
  • encoder-decoder models
  • downstream dense layers

state is the summary view, while outputs is the full timeline.
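To make the "summary view" concrete, the sketch below feeds a final hidden state through one dense layer with a softmax, as a sequence classifier might. It is plain NumPy with random stand-in values; classify_from_final_h, weights, and bias are hypothetical names, not part of any TensorFlow API.

```python
import numpy as np

def classify_from_final_h(final_h, weights, bias):
    """Map (batch, units) final hidden states to (batch, classes) probabilities."""
    logits = final_h @ weights + bias
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
final_h = rng.normal(size=(2, 4))  # stand-in for state.h: batch=2, units=4
weights = rng.normal(size=(4, 3))  # units -> 3 classes
bias = np.zeros(3)

probs = classify_from_final_h(final_h, weights, bias)
print(probs.shape)  # (2, 3)
```

Each sequence yields exactly one probability vector, regardless of how many time steps it had.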

Effect Of sequence_length

The sequence_length argument is important when sequences in the batch have different lengths. It tells TensorFlow which time steps are real and which are just padding.

If you pass:

python
sequence_lengths = [5, 3]

then the second sequence is considered valid only through time step 2 (counting from zero). For that item, outputs at time steps 3 and later are filled with zeros, and the final state is copied through from the last valid step rather than computed on the padded tail.

Without sequence_length, padded values can accidentally influence the final state.
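The documented behavior (zeroed outputs and copied-through state past each sequence's length) can be emulated in a few lines of NumPy. This is an illustrative sketch under that assumption, not TensorFlow's actual code; rnn_with_lengths and the weight names are hypothetical.

```python
import numpy as np

def rnn_with_lengths(inputs, lengths, w_x, w_h, b):
    """Tanh RNN that emulates sequence_length handling for padded batches."""
    batch, max_time, _ = inputs.shape
    units = w_h.shape[0]
    state = np.zeros((batch, units))
    outputs = np.zeros((batch, max_time, units))
    for t in range(max_time):
        new_state = np.tanh(inputs[:, t, :] @ w_x + state @ w_h + b)
        valid = (t < lengths)[:, None]                      # (batch, 1) boolean
        state = np.where(valid, new_state, state)           # copy state through padding
        outputs[:, t, :] = np.where(valid, new_state, 0.0)  # zero the padded outputs
    return outputs, state

rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 5, 3))
lengths = np.array([5, 3])
w_x, w_h, b = rng.normal(size=(3, 4)), rng.normal(size=(4, 4)), np.zeros(4)

outputs, state = rnn_with_lengths(inputs, lengths, w_x, w_h, b)
print(outputs[1, 3:])                        # all zeros: steps 3 and 4 are padding
print(np.allclose(state[1], outputs[1, 2]))  # True
```

In this emulation, changing the padded inputs of the second sequence leaves its final state untouched, which is the protection that sequence_length provides.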

Common Source Of Confusion

A very common misunderstanding is thinking the last slice of outputs, outputs[:, -1, :], is always identical to state. That holds only when every sequence in the batch fills the full time axis, and for an LSTM the match is with state.h rather than with the whole state tuple.

With variable sequence lengths, the last tensor position along the time axis may correspond to padding for some items, while state still corresponds to the last valid sequence element.

So if batching uses padding, state is usually the safer representation for "final sequence encoding".
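When only outputs is available, the last valid step for each item can be gathered with the sequence lengths. The NumPy sketch below uses toy values with zeroed padding; last_valid_output is a hypothetical helper name, and it assumes every length is at least 1.

```python
import numpy as np

def last_valid_output(outputs, lengths):
    """Pick outputs[i, lengths[i] - 1, :] for every batch item i."""
    batch_idx = np.arange(outputs.shape[0])
    return outputs[batch_idx, lengths - 1, :]

# Toy outputs of shape (batch=2, max_time=3, units=2); padded steps are zero.
outputs = np.array([
    [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    [[4.0, 4.0], [0.0, 0.0], [0.0, 0.0]],
])
lengths = np.array([3, 1])

print(outputs[:, -1, :])                    # second row is padding: [0. 0.]
print(last_valid_output(outputs, lengths))  # rows: [3. 3.] and [4. 4.]
```

The naive last slice returns padding for the short sequence, while the gathered version returns its real final output.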

When To Use Which Output

Use outputs when you need every time step.

Use state when you need one final recurrent summary.

That distinction is one of the key design choices in RNN models. If you are building token-by-token prediction or alignment, use outputs. If you are building one prediction per sequence, state is often the better starting point.

Common Pitfalls

A common mistake is ignoring sequence_length when batching padded sequences. That can make the RNN treat padding as real data.

Another issue is assuming state has the same structure for every cell type. GRU, vanilla RNN, and LSTM do not all package state identically.
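One way to cope with the differing state structures is a small accessor that returns the hidden vector in either case. The sketch below is pure Python: FakeLSTMState is a hypothetical stand-in that only mirrors the c and h fields of LSTMStateTuple, and the helper assumes a single-layer cell (stacked cells return a tuple of states, which it does not handle).

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the (c, h) fields of TensorFlow 1.x's LSTMStateTuple.
FakeLSTMState = namedtuple("FakeLSTMState", ["c", "h"])

def final_hidden(state):
    """Return the hidden vector from an LSTM-style tuple or from a plain state."""
    if hasattr(state, "h"):
        return state.h  # LSTM-style: the hidden state lives in the h field
    return state        # GRU or vanilla RNN: the state is the hidden vector

gru_state = [0.1, 0.2]
lstm_state = FakeLSTMState(c=[0.5, 0.6], h=[0.3, 0.4])
print(final_hidden(gru_state))   # [0.1, 0.2]
print(final_hidden(lstm_state))  # [0.3, 0.4]
```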

Developers also often confuse outputs[:, -1, :] with the true final state for every item in a padded batch. That is not safe when sequences have different lengths.

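Finally, remember that tf.nn.dynamic_rnn is a TensorFlow 1.x style API. In TensorFlow 2.x it is reachable only as tf.compat.v1.nn.dynamic_rnn and is deprecated in favor of Keras RNN layers. When reading older code, focus on the tensor semantics rather than expecting modern Keras layer ergonomics.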

Summary

  • tf.nn.dynamic_rnn returns outputs for every time step and state for the final recurrent state.
  • outputs is shaped like batch x time x units.
  • For LSTMs, state contains both c and h.
  • sequence_length is essential when inputs are padded to different lengths.
  • Use outputs for time-step-level tasks and state for sequence-level summaries.
