Keras Custom Layer Without Inputs

Introduction

Yes, you can build a Keras layer whose output comes from internal trainable state instead of directly transforming an external input tensor. The pattern is valid, but it fits subclassed models better than the Functional API, because graph connectivity and batch handling get more subtle when the layer does not consume a normal symbolic input.

When a No-Input Layer Makes Sense

A no-input layer is useful when the layer represents trainable model state rather than sample-dependent computation. Examples include:

  • a trainable bias vector shared across all examples
  • a learnable global embedding or template
  • calibration offsets applied elsewhere in the model
  • other trainable constants

If the output should vary directly with the incoming example, then a no-input layer is probably the wrong abstraction.

Create the Weights in build

Even with a no-input layer, weights should still be created in build, not inside call.

python
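import tensorflow as tf


class TrainableVector(tf.keras.layers.Layer):
    """Layer whose output is its own trainable vector, not a function of an input."""

    def __init__(self, dim, **kwargs):
        super().__init__(**kwargs)
        self.dim = dim

    def build(self, input_shape=None):
        # Create the weight once, here, so repeated calls reuse the same variable.
        self.vector = self.add_weight(
            name="vector",
            shape=(self.dim,),
            initializer="zeros",
            trainable=True,
        )

    def call(self, inputs=None):
        # inputs is unused. Whether a zero-argument call is accepted
        # can depend on the Keras version in use.
        return self.vector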

This is the right pattern because variable creation remains stable across tracing, saving, and repeated calls.
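One way to see what build contributes is to inspect the layer's weights before and after it runs. The sketch below repeats the layer definition so it is self-contained, and it calls build directly purely for inspection; in normal use the first call to the layer triggers it. The variable names weights_before, weights_after, and vector_shape are illustrative.

```python
import tensorflow as tf


class TrainableVector(tf.keras.layers.Layer):
    def __init__(self, dim, **kwargs):
        super().__init__(**kwargs)
        self.dim = dim

    def build(self, input_shape=None):
        self.vector = self.add_weight(
            name="vector", shape=(self.dim,), initializer="zeros", trainable=True
        )

    def call(self, inputs=None):
        return self.vector


layer = TrainableVector(4)
weights_before = len(layer.trainable_weights)  # nothing has been created yet

# Normally triggered by the first call; invoked directly here only to inspect state.
layer.build(None)
weights_after = len(layer.trainable_weights)
vector_shape = tuple(layer.vector.shape)

print(weights_before, weights_after, vector_shape)
```

The weight does not exist until build has run, which is why code that reads layer.vector earlier than that would fail.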

Subclassed Models Are Usually the Best Fit

A subclassed model is the cleanest place to use a true no-input layer.

python
class BiasOnlyHead(tf.keras.Model):
    def __init__(self, num_classes):
        super().__init__()
        self.bias = TrainableVector(num_classes)

    def call(self, x):
        # Only the batch size is read from x; the values come from the bias layer.
        batch_size = tf.shape(x)[0]
        bias = self.bias()
        bias = tf.expand_dims(bias, axis=0)  # (num_classes,) -> (1, num_classes)
        # Repeat along the batch axis -> (batch_size, num_classes)
        return tf.repeat(bias, repeats=batch_size, axis=0)


model = BiasOnlyHead(num_classes=3)
out = model(tf.ones((2, 5)))
print(out.shape)

The layer itself has no external input, but the model as a whole still has a proper input and output flow.
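The batch expansion in call is the part most likely to cause shape bugs, so it can help to check it in isolation. The following is a minimal sketch that uses a plain tf.Variable as a stand-in for the layer's weight (an assumption made to keep the example independent of how the layer is called) and compares the expand-and-repeat approach with tf.broadcast_to.

```python
import tensorflow as tf

bias = tf.Variable([0.1, 0.2, 0.3])  # stand-in for the layer's weight, shape (3,)
x = tf.ones((2, 5))                  # only the batch size of x is used

batch_size = tf.shape(x)[0]

# Explicit expansion: add a batch axis, then repeat along it.
repeated = tf.repeat(tf.expand_dims(bias, axis=0), repeats=batch_size, axis=0)

# Alternative: broadcast the vector to the target (batch_size, num_classes) shape.
target_shape = tf.stack([batch_size, tf.shape(bias)[0]])
broadcast = tf.broadcast_to(bias, target_shape)

same = bool(tf.reduce_all(tf.equal(repeated, broadcast)))
print(repeated.shape, same)
```

Both forms produce one copy of the vector per example. When the vector is only going to be added to a batched tensor, ordinary broadcasting in the addition is usually enough and no explicit expansion is needed.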

Functional API Needs Real Connectivity

A fully disconnected tensor does not fit the Functional API naturally, because Functional models expect outputs to be derived from symbolic inputs. If you want to use the internal state inside a Functional model, it usually needs to combine with a real input path.

python
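inp = tf.keras.Input(shape=(8,))
# Zero-argument call returns the (8,) vector; support can depend on the Keras version.
offset = TrainableVector(8)()
out = inp + offset  # the vector broadcasts across the batch axis

model = tf.keras.Model(inputs=inp, outputs=out)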

This works because the graph still has a proper symbolic input. The no-input layer alone is not standing completely outside the model graph.
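If the zero-argument call proves awkward in a Functional graph, one alternative is to let the layer consume the real input and own the offset internally, so that it is an ordinary node in the graph. The sketch below is an assumption about how that could look, not the pattern from the text above; AddOffset is a hypothetical name introduced only for illustration.

```python
import tensorflow as tf


class AddOffset(tf.keras.layers.Layer):
    """Hypothetical layer: adds a trainable per-feature offset to its input."""

    def build(self, input_shape):
        # One offset per feature, shared across every example in the batch.
        self.offset = self.add_weight(
            name="offset",
            shape=(input_shape[-1],),
            initializer="zeros",
            trainable=True,
        )

    def call(self, inputs):
        return inputs + self.offset


inp = tf.keras.Input(shape=(8,))
out = AddOffset()(inp)
model = tf.keras.Model(inputs=inp, outputs=out)

result = model(tf.ones((2, 8)))
n_trainable = len(model.trainable_weights)
print(tuple(result.shape), n_trainable)
```

Checking model.trainable_weights is a quick way to confirm that the model actually tracks the trainable state, whichever variant is used.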

Do Not Forget Serialization

Custom layers whose constructors take arguments should implement get_config so that the layer, and any model containing it, can be rebuilt from its configuration later.

python
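class TrainableVector(tf.keras.layers.Layer):
    def __init__(self, dim, **kwargs):
        super().__init__(**kwargs)
        self.dim = dim

    # build and call are unchanged from the earlier definition.

    def get_config(self):
        # Start from the base config (name, dtype, ...) and add constructor arguments.
        config = super().get_config()
        config.update({"dim": self.dim})
        return config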

A custom layer that trains correctly but cannot be saved or reloaded cleanly is still incomplete.
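A quick way to test this is a configuration round trip through get_config and from_config. The sketch below repeats the full layer so it runs on its own; the layer name "global_bias" is an arbitrary choice for the example.

```python
import tensorflow as tf


class TrainableVector(tf.keras.layers.Layer):
    def __init__(self, dim, **kwargs):
        super().__init__(**kwargs)
        self.dim = dim

    def build(self, input_shape=None):
        self.vector = self.add_weight(
            name="vector", shape=(self.dim,), initializer="zeros", trainable=True
        )

    def call(self, inputs=None):
        return self.vector

    def get_config(self):
        config = super().get_config()
        config.update({"dim": self.dim})
        return config


layer = TrainableVector(4, name="global_bias")
config = layer.get_config()

# from_config passes the stored entries back to __init__ as keyword arguments.
restored = TrainableVector.from_config(config)
print(restored.name, restored.dim)
```

This round trip rebuilds the layer's configuration only. Weight values are stored and restored separately when the model itself is saved and loaded.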

Check Gradient Flow Explicitly

No-input layers can feel unusual enough that it is worth checking whether gradients really reach the trainable state.

python
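layer = TrainableVector(4)

with tf.GradientTape() as tape:
    # The first call builds the layer; its weight is then watched by the tape.
    y = tf.reduce_sum(layer())

grads = tape.gradient(y, layer.trainable_variables)
# The derivative of a sum with respect to each element is 1.
print(grads[0].numpy())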

If gradients are None, the issue is usually not the lack of inputs itself. It is more likely that the layer output never influenced the optimized loss.
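The difference between a connected and a disconnected loss can be reproduced with a few lines. This sketch uses a plain tf.Variable as a stand-in for the layer's weight, an assumption made so the example does not depend on how the layer is called.

```python
import tensorflow as tf

vector = tf.Variable(tf.zeros((4,)))  # stand-in for the layer's trainable weight
x = tf.ones((2, 4))

# Connected: the loss depends on the vector, so a gradient comes back.
with tf.GradientTape() as tape:
    loss_connected = tf.reduce_sum(x + vector)
grads_connected = tape.gradient(loss_connected, [vector])

# Disconnected: the vector never influences the loss, so its gradient is None.
with tf.GradientTape() as tape:
    loss_disconnected = tf.reduce_sum(x)
grads_disconnected = tape.gradient(loss_disconnected, [vector])

print(grads_connected[0].numpy(), grads_disconnected[0])
```

In the connected case each element of the vector is added to two rows of x, so each gradient entry is 2. In the disconnected case the result is None rather than zeros, which is the signal to look for when debugging.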

Common Pitfalls

  • Creating weights inside call instead of build can create a new variable on every call or fail once the call is traced as a graph.
  • Trying to force a completely disconnected no-input layer into the Functional API usually makes graph connectivity awkward.
  • Forgetting to define how the layer output should broadcast across batch dimensions leads to shape bugs.
  • Treating trainable constants as ordinary layers without checking gradient flow can hide training issues.
  • Skipping get_config makes saving and reloading custom layers harder than necessary.

Summary

  • No-input custom layers are valid when they represent trainable model state rather than input-dependent computation.
  • Define weights in build, not in call.
  • Subclassed models are usually the cleanest place for truly no-input layers.
  • Functional models still need a connected symbolic input path.
  • Test gradient flow and serialization before treating the layer as production-ready.
