Keras: How to feed input directly into hidden layers other than the first?

Introduction

Yes, Keras can feed the original input into deeper hidden layers, but you cannot do it cleanly with the basic Sequential API. You need the Functional API so you can branch the computation graph and merge the original input back in later.

Why Sequential Is the Wrong Tool

Sequential models assume one straight chain:

text
input -> layer1 -> layer2 -> layer3

Once you want a later layer to also receive the raw input, the graph is no longer linear. That makes the Functional API the correct abstraction.

This is the same mechanism behind:

  • skip connections
  • residual connections
  • dense connections
  • raw-feature bypasses in tabular models

So the idea is standard, even if the wording of the question sounds unusual.

Concatenate the Input into a Deeper Layer

The most direct pattern is to concatenate the original input with an intermediate hidden representation:

python
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = tf.keras.Input(shape=(8,))

# First hidden layer sees only the raw input
x1 = layers.Dense(16, activation="relu")(inputs)
# Merge the hidden features with the raw input (16 + 8 = 24 features)
x2 = layers.Concatenate()([x1, inputs])
# Second hidden layer sees both the learned and the raw features
x3 = layers.Dense(16, activation="relu")(x2)
outputs = layers.Dense(1)(x3)

model = Model(inputs=inputs, outputs=outputs)
model.summary()

Here the second hidden block receives both:

  • learned features from x1
  • the original eight input features

That is usually what people mean when they want to "feed input directly into another hidden layer."
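One way to confirm that the raw input really reaches the second hidden block is to look at the weight matrix of the layer that follows the merge. The sketch below rebuilds the same architecture, gives that layer a name (`post_merge`, a label chosen here for illustration), and feeds a random batch through the model. The batch contents are synthetic.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = tf.keras.Input(shape=(8,))
x1 = layers.Dense(16, activation="relu")(inputs)
x2 = layers.Concatenate()([x1, inputs])  # 16 learned + 8 raw features

# "post_merge" is an illustrative name, not part of the original example.
post_merge = layers.Dense(16, activation="relu", name="post_merge")
x3 = post_merge(x2)
outputs = layers.Dense(1)(x3)
model = Model(inputs=inputs, outputs=outputs)

# The kernel has one row per incoming feature, so 24 rows means the
# layer receives both the hidden features and the raw input.
kernel_shape = tuple(post_merge.get_weights()[0].shape)
print("post_merge kernel shape:", kernel_shape)

# Synthetic batch of 4 samples with 8 features each.
batch = np.random.rand(4, 8).astype("float32")
predictions = model.predict(batch, verbose=0)
print("prediction shape:", predictions.shape)
```

A kernel with 24 input rows rather than 16 is the visible sign that the bypass is wired in.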

Use Addition for Residual-Style Shortcuts

If shapes match, you can use addition instead of concatenation:

python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Input width matches the hidden width so the tensors can be added
inputs = tf.keras.Input(shape=(16,))

x1 = layers.Dense(16, activation="relu")(inputs)
x2 = layers.Dense(16)(x1)
# Element-wise sum of the transformed features and the raw input
skip = layers.Add()([x2, inputs])
outputs = layers.Dense(1)(skip)

model = Model(inputs=inputs, outputs=outputs)

This is closer to a residual connection. It works well when the two tensors represent compatible feature spaces and have the same shape.

Use:

  • Concatenate when you want to preserve both feature sets explicitly
  • Add when you want a residual shortcut
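If the same shortcut pattern is needed more than once, it can be wrapped in a small helper. The function below, `residual_block`, is a hypothetical name introduced here, not a Keras API. It is a minimal sketch that assumes the block keeps the incoming width unchanged so that `Add` is valid.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model


def residual_block(x):
    """Two Dense layers plus a shortcut that adds the block's own input."""
    units = int(x.shape[-1])  # keep the width so Add sees equal shapes
    h = layers.Dense(units, activation="relu")(x)
    h = layers.Dense(units)(h)
    return layers.Add()([h, x])


inputs = tf.keras.Input(shape=(16,))
x = residual_block(inputs)  # shortcut from the raw input
x = residual_block(x)       # shortcut from the previous block's output
outputs = layers.Dense(1)(x)

model = Model(inputs=inputs, outputs=outputs)
add_layers = [layer for layer in model.layers if isinstance(layer, layers.Add)]
print("number of Add layers:", len(add_layers))
```

Note that only the first block adds the raw input itself. The second block adds its own input, which is the usual residual-network arrangement.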

Shape Matching Is the Main Constraint

The Keras part is easy. The real technical constraint is tensor shape compatibility.

For Concatenate:

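  • every dimension except the concatenation axis must match, including the batch dimension
  • sizes along the concatenation axis (the last axis by default) can differ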

For Add:

  • the entire shape must match
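These rules can be checked on symbolic tensors before any model is built. The sketch below uses placeholder inputs with made-up widths. It assumes that Keras reports the mismatch as a `ValueError`, which is the behavior at the time of writing.

```python
import tensorflow as tf
from tensorflow.keras import layers

wide = tf.keras.Input(shape=(16,))
narrow = tf.keras.Input(shape=(8,))

# Concatenate joins along the last axis by default, so 16 + 8 = 24 features.
merged = layers.Concatenate()([wide, narrow])
concat_width = int(merged.shape[-1])
print("concatenated width:", concat_width)

# Add is element-wise, so a 16-wide and an 8-wide tensor cannot be combined.
try:
    layers.Add()([wide, narrow])
    add_failed = False
except ValueError as err:
    add_failed = True
    print("Add rejected the inputs:", err)

# For 3-D tensors, only the size along the concatenation axis may differ.
seq_a = tf.keras.Input(shape=(10, 16))
seq_b = tf.keras.Input(shape=(10, 8))
seq_merged = layers.Concatenate(axis=-1)([seq_a, seq_b])
seq_shape = (int(seq_merged.shape[1]), int(seq_merged.shape[2]))
print("sequence merge shape (steps, features):", seq_shape)
```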

If shapes do not align for addition, project one tensor first:

python
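# Assumes x2 has 16 units and inputs has a different width.
# The Dense layer maps the raw input to 16 units so the shapes match.
projected = layers.Dense(16)(inputs)
skip = layers.Add()([x2, projected])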

That is a normal modeling decision, not a hack.
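For completeness, here is how the projection might look inside a full model. This is a sketch under the assumption that the raw input has 8 features while the hidden path has 16. The layer name `input_projection` is chosen here for readability.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = tf.keras.Input(shape=(8,))  # 8 raw features (assumed width)

x1 = layers.Dense(16, activation="relu")(inputs)
x2 = layers.Dense(16)(x1)

# Linear projection of the raw input from 8 to 16 units so Add is valid.
projected = layers.Dense(16, name="input_projection")(inputs)
skip = layers.Add()([x2, projected])
outputs = layers.Dense(1)(skip)

model = Model(inputs=inputs, outputs=outputs)
projected_width = int(projected.shape[-1])
print("projected width:", projected_width)
```

The projection adds trainable weights, so the shortcut is no longer a pure identity path. Whether that matters depends on the model.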

Why You Might Do This

Feeding raw input into deeper layers can make sense when:

  • original features remain important even after earlier transformations
  • you want a shortcut path for optimization
  • you want to preserve low-level information
  • you are implementing an architecture inspired by skip-connected networks

In tabular models, this can sometimes help because deeper layers do not have to reconstruct basic feature information from compressed hidden states alone.
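A tabular bypass of this kind might look like the sketch below: a deep path squeezes the features through a narrow layer, and the raw features are concatenated back in just before the output. The data is synthetic, and the layer sizes and training settings are arbitrary choices made for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Synthetic stand-in for a tabular dataset: 64 rows, 8 numeric features.
rng = np.random.default_rng(0)
features = rng.normal(size=(64, 8)).astype("float32")
targets = features[:, :1] * 2.0 + 0.5  # target depends linearly on feature 0

inputs = tf.keras.Input(shape=(8,))

# Deep path: compresses the features through a 4-unit bottleneck.
deep = layers.Dense(16, activation="relu")(inputs)
deep = layers.Dense(4, activation="relu")(deep)

# Bypass: the output layer sees the bottleneck and the raw features.
combined = layers.Concatenate()([deep, inputs])
outputs = layers.Dense(1)(combined)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam", loss="mse")
history = model.fit(features, targets, epochs=2, batch_size=16, verbose=0)

combined_width = int(combined.shape[-1])
print("features seen by the output layer:", combined_width)
```

Here the output layer has direct access to feature 0, so it does not depend on the bottleneck preserving that signal.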

Multi-Input Thinking Helps

A useful mental model is that a later hidden layer is simply receiving more than one tensor. One tensor came from an earlier hidden layer, and the other came directly from the input.

Keras does not care whether the tensors originate from:

  • the original input
  • a side branch
  • another subnet
  • a residual path

As long as you define the graph explicitly, the model is valid.
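To illustrate that view, the sketch below merges three tensors with different origins in a single layer: a hidden representation, the original input, and a side branch fed by a second input. The input names, widths, and batch data are assumptions made for the example.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

main_input = tf.keras.Input(shape=(8,), name="main_input")
aux_input = tf.keras.Input(shape=(3,), name="aux_input")

hidden = layers.Dense(16, activation="relu")(main_input)  # earlier hidden layer
side = layers.Dense(4, activation="relu")(aux_input)      # side branch

# One merge, three origins: 16 + 8 + 4 = 28 features.
merged = layers.Concatenate()([hidden, main_input, side])
x = layers.Dense(16, activation="relu")(merged)
outputs = layers.Dense(1)(x)

model = Model(inputs=[main_input, aux_input], outputs=outputs)

# Synthetic batches, one array per model input, in the same order.
batch_main = np.random.rand(5, 8).astype("float32")
batch_aux = np.random.rand(5, 3).astype("float32")
predictions = model.predict([batch_main, batch_aux], verbose=0)

merged_width = int(merged.shape[-1])
print("merged width:", merged_width)
```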

Keep the Graph Intentional

Because the Functional API is flexible, it is easy to overconnect everything and create a graph that is hard to reason about. A good merge should express a real modeling idea:

  • preserve raw information
  • help gradients flow
  • combine complementary representations

If the only justification is "more connections must be better," the architecture usually becomes harder to debug without a clear gain.

Common Pitfalls

  • Trying to express skip-style input wiring with the Sequential API.
  • Using Add on tensors whose shapes do not match.
  • Concatenating repeatedly until the feature dimension becomes unnecessarily large.
  • Feeding raw input everywhere without a clear reason.
  • Forgetting that deeper layers receive tensors, so the computation graph has to be wired explicitly.
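The growth from repeated concatenation is easy to underestimate. The sketch below stacks three densely connected layers, each one concatenating its output onto everything before it, and records the feature width after every merge. The layer count and sizes are arbitrary.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = tf.keras.Input(shape=(8,))

features = inputs
widths = []
for _ in range(3):
    new = layers.Dense(16, activation="relu")(features)
    # Each merge keeps everything seen so far and adds 16 new features.
    features = layers.Concatenate()([features, new])
    widths.append(int(features.shape[-1]))

outputs = layers.Dense(1)(features)
model = Model(inputs=inputs, outputs=outputs)

print("feature width after each merge:", widths)  # [24, 40, 56]
print("trainable parameters:", model.count_params())
```

Every following Dense layer has one weight row per incoming feature, so the parameter count grows with the width as well.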

Summary

  • Keras can feed the original input into deeper hidden layers by using the Functional API.
  • Concatenate is useful when you want the later layer to see both learned and raw features.
  • Add is useful for residual-style shortcuts when shapes match.
  • Shape compatibility is the main technical issue, not Keras support.
  • Build these connections intentionally so the model stays understandable.
