Get the last output of a dynamic_rnn in TensorFlow

Introduction

When you use tf.nn.dynamic_rnn (a TensorFlow 1.x API, exposed as tf.compat.v1.nn.dynamic_rnn in TensorFlow 2), you get back two closely related values: the full sequence of outputs and the final state. Which one gives you the "last output" for each sequence depends on whether you want the final recurrent state or the output at the last valid time step of a padded batch.

What `dynamic_rnn` returns

dynamic_rnn returns:

outputs, usually shaped [batch, time, hidden_size]
state, the final RNN state

For a basic RNN cell, the final state often matches the last valid output. For LSTM and other cells with structured state, the state holds more than the output (an LSTM state carries both a cell state and a hidden state), and using it directly is often simpler than slicing outputs.
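To make the relationship between the two return values concrete, here is a toy pure-Python recurrence. It imitates the documented behaviour when sequence_length is given: outputs past a sequence's length are zeroed and the state is copied through. The function name toy_dynamic_rnn, the scalar weights w and u, and the tanh update are illustrative assumptions, not TensorFlow's implementation.

```python
import math

def toy_dynamic_rnn(inputs, sequence_lengths, w=0.5, u=0.8):
    """Toy scalar RNN over a padded batch (illustrative, not TensorFlow code).

    inputs: list of equal-length lists of floats (a padded batch).
    Returns (outputs, state): outputs has the same shape as inputs,
    state holds one final hidden value per sequence.
    """
    outputs, state = [], []
    for seq, length in zip(inputs, sequence_lengths):
        h = 0.0
        row = []
        for t, x in enumerate(seq):
            if t < length:
                h = math.tanh(w * x + u * h)  # basic RNN update
                row.append(h)                 # the output equals the new state
            else:
                row.append(0.0)               # padded step: zero output, state unchanged
        outputs.append(row)
        state.append(h)
    return outputs, state

padded_inputs = [[1.0, 2.0, 0.0, 0.0],   # real length 2
                 [0.5, -1.0, 0.3, 2.0]]  # real length 4
lengths = [2, 4]
outputs, state = toy_dynamic_rnn(padded_inputs, lengths)

# For this basic cell, the final state equals the last valid output.
last_valid = [outputs[b][n - 1] for b, n in enumerate(lengths)]
```

In this sketch, state and last_valid hold the same values, while the last column of outputs is zero for the shorter sequence.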

If you only need the final state, use `state`

For many classification models, the cleanest answer is to use the returned final state instead of slicing outputs.

python

outputs, state = tf.nn.dynamic_rnn(
    cell,
    inputs,
    sequence_length=sequence_lengths,
    dtype=tf.float32
)

If cell is an LSTMCell, state is typically an LSTMStateTuple, and the hidden state component is what many downstream layers use.

python

last_hidden = state.h

That is usually preferable to trying to guess the last output manually.

If you need the last valid output from `outputs`

When sequences are padded to a common maximum length, the last time index in the tensor is not necessarily the last real element for every example. You need to gather using the actual sequence lengths.

python

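import tensorflow as tf

# outputs: [batch, time, hidden_size]; sequence_lengths: [batch]
batch_size = tf.shape(outputs)[0]
max_length = tf.shape(outputs)[1]
hidden_size = int(outputs.shape[2])  # static size of the last dimension

# Row of each sequence's last valid step in the flattened [batch * time, hidden_size] tensor
lengths = tf.cast(sequence_lengths, tf.int32)  # match the int32 dtype of tf.range
index = tf.range(0, batch_size) * max_length + (lengths - 1)
flat = tf.reshape(outputs, [-1, hidden_size])
last_outputs = tf.gather(flat, index)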

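This pattern extracts the last valid output for each sequence in the batch. It assumes every sequence length is at least 1; a length of 0 would produce an index that belongs to a different example or falls out of range.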
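The index arithmetic is easier to check on a small worked example. The sketch below reproduces the flatten-then-gather idea with plain Python lists; the helper name last_valid_flat_indices and the toy values are assumptions made for illustration.

```python
def last_valid_flat_indices(sequence_lengths, max_length):
    """Row indices into outputs reshaped to [batch * max_length, hidden]."""
    return [b * max_length + (n - 1) for b, n in enumerate(sequence_lengths)]

# Toy outputs: batch=3, max_length=4, hidden_size=2; padded steps are zeros.
outputs = [
    [[0.1, 0.2], [0.3, 0.4], [0.0, 0.0], [0.0, 0.0]],  # length 2
    [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6], [1.7, 1.8]],  # length 4
    [[2.1, 2.2], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]],  # length 1
]
sequence_lengths = [2, 4, 1]
max_length = 4

# Flatten batch and time into one axis, like tf.reshape(outputs, [-1, hidden_size]).
flat = [step for seq in outputs for step in seq]
index = last_valid_flat_indices(sequence_lengths, max_length)
# Pick one row per sequence, like tf.gather(flat, index).
last_outputs = [flat[i] for i in index]

print(index)         # [1, 7, 8]
print(last_outputs)  # [[0.3, 0.4], [1.7, 1.8], [2.1, 2.2]]
```

Each index is the example's offset in the flattened tensor (b * max_length) plus the position of its last real step (length - 1).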

Why you cannot just take `outputs[:, -1, :]`

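That only works if every sequence has the same length and there is no padding. With variable-length sequences, outputs[:, -1, :] lands on a padded timestep for every shorter sequence. When sequence_length is passed, dynamic_rnn zeroes the outputs past each sequence's length, so the slice returns zeros for those sequences; without it, the slice returns values computed on padding.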

So the safe rule is:

fixed-length sequences: slicing the last timestep is fine
variable-length padded sequences: gather using sequence_length
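The difference between the two rules can be seen on a tiny padded batch. This is a sketch with one scalar per timestep and made-up values; zeros mark the padded steps, as dynamic_rnn produces when sequence_length is supplied. The helper safe_to_slice_last is a hypothetical name, not a TensorFlow function.

```python
def safe_to_slice_last(sequence_lengths, max_length):
    """True only when no sequence is padded, so the last timestep is valid for all."""
    return all(n == max_length for n in sequence_lengths)

# Toy padded outputs, one scalar per timestep.
outputs = [
    [0.7, 0.9, 0.0, 0.0],  # real length 2
    [0.2, 0.4, 0.6, 0.8],  # real length 4
]
sequence_lengths = [2, 4]

# Analogous to outputs[:, -1]: always the final column.
naive_last = [seq[-1] for seq in outputs]
# Analogous to gathering with sequence_length: the last real step per row.
valid_last = [seq[n - 1] for seq, n in zip(outputs, sequence_lengths)]

print(naive_last)  # [0.0, 0.8]
print(valid_last)  # [0.9, 0.8]
print(safe_to_slice_last(sequence_lengths, max_length=4))  # False
```

The two results agree only for the sequence that fills the whole time axis.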

LSTM detail: state versus output

For an LSTM cell, the final state has two parts:

cell state c
hidden state h

If you want what most people informally call the "last output", state.h is often the most useful quantity.

python

outputs, state = tf.nn.dynamic_rnn(lstm_cell, inputs, sequence_length=sequence_lengths, dtype=tf.float32)
final_representation = state.h

That is cleaner than gathering from outputs, unless you specifically need a value taken from the outputs tensor itself.
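To see why state.h plays the role of the "last output", consider one step of a standard LSTM: the value the cell emits at each step is its new hidden state h, while c is carried separately. The sketch below is a scalar toy version; ToyLSTMState is a hypothetical stand-in for LSTMStateTuple, and the shared weights w and u are simplifying assumptions (a real cell has separate weights per gate).

```python
import math
from collections import namedtuple

# Hypothetical stand-in for TensorFlow's LSTMStateTuple (fields c and h).
ToyLSTMState = namedtuple("ToyLSTMState", ["c", "h"])

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def toy_lstm_step(x, state, w=0.5, u=0.5):
    """One scalar LSTM step with shared toy weights; returns (output, new_state)."""
    pre = w * x + u * state.h
    i, f, o = sigmoid(pre), sigmoid(pre), sigmoid(pre)  # input, forget, output gates
    g = math.tanh(pre)                                   # candidate cell value
    c = f * state.c + i * g                              # new cell state
    h = o * math.tanh(c)                                 # new hidden state
    return h, ToyLSTMState(c=c, h=h)                     # the step output is h

state = ToyLSTMState(c=0.0, h=0.0)
for x in [1.0, -0.5, 2.0]:
    output, state = toy_lstm_step(x, state)
```

After the loop, output and state.h are the same value, while state.c differs from both.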

A practical decision rule

Ask which one you actually need:

use state when you want the model's final recurrent representation
use gathered outputs when you specifically need the last valid timestep output tensor

Those are related, but they are not always interchangeable across cell types.

Common Pitfalls

Using outputs[:, -1, :] on padded variable-length sequences.
Ignoring the returned state and doing manual indexing when the final state would be simpler.
Forgetting that LSTM state is structured, not just a single tensor.
Assuming the final output and final state are always identical across all RNN cell types.
Mixing fixed-length and variable-length sequence logic in the same code path.

Summary

dynamic_rnn returns both the full output sequence and the final state.
For many models, the returned final state is the easiest way to get the last representation.
For variable-length padded sequences, gather the last valid timestep using sequence_length.
Do not blindly use outputs[:, -1, :] unless every sequence is truly the same length.
In LSTM models, state.h is often the most useful "last output" equivalent.
