Returning multiple values in the input function for tf.py_func
Introduction
Yes, tf.py_func can return multiple values. The key is that Tout must be a list or tuple of TensorFlow dtypes, and the wrapped Python function must return the same number of NumPy values in the same order.
In TensorFlow 2, the modern replacement is tf.numpy_function. The idea is the same, but you still need to be careful about shapes, gradients, and portability.
Return Multiple Outputs by Matching Tout
In TensorFlow 1 style code, the essential pattern is to wrap a Python function that returns two values and to pass a two-element Tout to tf.py_func.
There are two outputs because:
- the Python function returns two values
- Tout declares two TensorFlow output dtypes
The order must match. If the function returns (features, label), then Tout must describe the feature dtype first and the label dtype second.
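The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the original listing: load_example and its toy data are hypothetical, and the code assumes a TensorFlow 2 installation where the TensorFlow 1 API is reached through tf.compat.v1.

```python
import numpy as np
import tensorflow as tf


def load_example(index):
    # Hypothetical callback: builds toy data from the index.
    # Returns two NumPy values, in the same order as Tout below.
    features = np.full((3,), index, dtype=np.float32)
    label = np.int64(index % 2)
    return features, label


graph = tf.Graph()
with graph.as_default():
    index = tf.constant(4, dtype=tf.int64)
    # Tout is a list with one dtype per returned value: features, then label.
    features, label = tf.compat.v1.py_func(
        load_example, [index], [tf.float32, tf.int64]
    )

with tf.compat.v1.Session(graph=graph) as sess:
    features_value, label_value = sess.run([features, label])
```

Swapping the two entries in Tout without swapping the return values would declare the wrong dtype for each output, which is why the order matters.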
The TensorFlow 2 Equivalent
In TensorFlow 2, use tf.numpy_function instead, and call set_shape on each returned tensor afterwards.
Those extra set_shape calls are important. TensorFlow often loses static shape information across Python callbacks.
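A minimal sketch of the TensorFlow 2 version, assuming a hypothetical make_example callback whose output shapes are known in advance (three features and a scalar label):

```python
import numpy as np
import tensorflow as tf


def make_example(index):
    # Hypothetical callback returning (features, label) as NumPy values.
    features = np.full((3,), index, dtype=np.float32)
    label = np.int64(index % 2)
    return features, label


@tf.function(input_signature=[tf.TensorSpec([], tf.int64)])
def load_tensors(index):
    features, label = tf.numpy_function(
        make_example, [index], [tf.float32, tf.int64]
    )
    # Inside a traced function the outputs carry no static shape,
    # so the shapes we know from make_example are restored by hand.
    features.set_shape([3])
    label.set_shape([])
    return features, label


features, label = load_tensors(tf.constant(5, dtype=tf.int64))
```

The shapes passed to set_shape are a promise made by the caller. TensorFlow does not derive them from the Python function, so they have to agree with what the callback really returns.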
Use It Inside an Input Pipeline
This pattern often appears in a tf.data pipeline, usually inside Dataset.map and followed by a shape repair step.
If you skip the shape repair step, later model code may fail because the tensors have unknown rank or unknown dimensions.
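As an illustration, here is a small pipeline sketch. The names make_example and load_with_shapes are hypothetical, and the dataset is a toy range of indices rather than real records.

```python
import numpy as np
import tensorflow as tf


def make_example(index):
    # Hypothetical callback: toy features and label derived from the index.
    features = np.full((3,), index, dtype=np.float32)
    label = np.int64(index % 2)
    return features, label


def load_with_shapes(index):
    features, label = tf.numpy_function(
        make_example, [index], [tf.float32, tf.int64]
    )
    # Shape repair step: without it, element_spec reports unknown shapes.
    features.set_shape([3])
    label.set_shape([])
    # Return a tuple; tf.data would try to pack a list into a single tensor.
    return features, label


dataset = tf.data.Dataset.range(6).map(load_with_shapes).batch(2)
first_features, first_labels = next(iter(dataset))
```

With the set_shape calls in place, dataset.element_spec describes the features as a batch of length-3 vectors, which is what downstream model code usually needs.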
You should also make sure the NumPy dtypes produced by the Python function really match Tout. If Tout says tf.float32 but the callback returns float64, TensorFlow may insert conversions or fail in harder-to-diagnose ways later.
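One way to guard against dtype drift is to cast the callback's outputs to the NumPy dtypes that correspond to Tout before returning them. The sketch below assumes hypothetical helpers raw_example and typed_example, and uses the as_numpy_dtype property of TensorFlow dtypes.

```python
import numpy as np
import tensorflow as tf

# Declared output dtypes, shared by the callback and the TensorFlow call.
TOUT = (tf.float32, tf.int64)


def raw_example(index):
    # Hypothetical loader that produces float64 features and a Python int,
    # which is easy to do by accident with NumPy defaults.
    features = np.linspace(0.0, 1.0, 3) * index
    label = int(index) % 2
    return features, label


def typed_example(index):
    # Cast every output to the NumPy dtype that matches its Tout entry.
    values = raw_example(index)
    return tuple(
        np.asarray(value, dtype=dtype.as_numpy_dtype)
        for value, dtype in zip(values, TOUT)
    )


features, label = tf.numpy_function(
    typed_example, [tf.constant(3, dtype=tf.int64)], TOUT
)
```

Keeping the dtypes in a single TOUT constant means the declaration and the cast cannot silently diverge.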
Know the Limitations
Both tf.py_func and tf.numpy_function call back into Python. That creates important limitations:
- no automatic gradients through the Python code
- weaker portability to SavedModel export, TFLite, or TPU workflows
- slower execution than native TensorFlow ops
So this is useful as an escape hatch, but it should not be the first choice when a native TensorFlow operation can do the same job.
It also means these functions are awkward in highly parallel or distributed pipelines because each call has to cross the TensorFlow-to-Python boundary. That can become a major performance cost if it happens for every small record.
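If the callback cannot be avoided, one possible mitigation is to batch first and call into Python once per batch rather than once per record. This is a sketch under that assumption; make_batch and load_batch are hypothetical, and the callback has to be written to accept a whole array of indices.

```python
import numpy as np
import tensorflow as tf


def make_batch(indices):
    # Hypothetical vectorized callback: handles a whole batch per call.
    features = np.repeat(indices.astype(np.float32)[:, None], 3, axis=1)
    labels = (indices % 2).astype(np.int64)
    return features, labels


def load_batch(indices):
    features, labels = tf.numpy_function(
        make_batch, [indices], [tf.float32, tf.int64]
    )
    # The batch dimension stays unknown; the feature width is fixed.
    features.set_shape([None, 3])
    labels.set_shape([None])
    return features, labels


# Batch before map so the Python boundary is crossed once per batch.
dataset = tf.data.Dataset.range(8).batch(4).map(load_batch)
batches = list(dataset.as_numpy_iterator())
```

Here eight records cost two Python calls instead of eight. The callback still runs in Python, so this reduces the overhead rather than removing it.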
For large datasets, moving the preprocessing into native TensorFlow ops is often the real long-term fix.
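For comparison, here is a sketch of the same kind of toy preprocessing written with native TensorFlow ops only. The function make_example_native is hypothetical; real preprocessing may or may not have a native equivalent.

```python
import tensorflow as tf


def make_example_native(index):
    # Same toy logic as a NumPy callback, expressed with TensorFlow ops.
    features = tf.fill([3], tf.cast(index, tf.float32))
    label = index % 2
    return features, label


dataset = (
    tf.data.Dataset.range(6)
    .map(make_example_native, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(2)
)
first_features, first_labels = next(iter(dataset))
```

No set_shape calls are needed in this version because TensorFlow can infer the static shapes from the ops themselves, and the map runs without calling back into Python.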
Common Pitfalls
- Returning multiple Python values but declaring only one dtype in Tout.
- Forgetting to set shapes after tf.py_func or tf.numpy_function.
- Returning Python objects instead of NumPy arrays or scalar values with clear dtypes.
- Using Python callbacks in code that later needs gradients or portable graph export.
- Choosing py_func when native TensorFlow preprocessing would have been simpler and faster.
Summary
- tf.py_func can return multiple values as long as Tout lists matching output dtypes.
- In TensorFlow 2, the modern equivalent is tf.numpy_function.
- The order of returned values must match the order of dtypes in Tout.
- After the call, restore shapes explicitly because static shape information is often lost.
- Use this technique sparingly because Python callbacks limit gradients, performance, and portability.

