Keras: How to feed the input directly into hidden layers other than the first?
Introduction
Yes, Keras can feed the original input into deeper hidden layers, but you cannot do it cleanly with the basic Sequential API. You need the Functional API so you can branch the computation graph and merge the original input back in later.
Why Sequential Is the Wrong Tool
Sequential models assume one straight chain, in which each layer receives only the output of the layer before it.
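As a minimal sketch of such a chain, assuming eight input features and layer widths chosen purely for illustration (none of these sizes come from the original question):

```python
from tensorflow import keras
from tensorflow.keras import layers

# A strictly linear stack: every layer sees only the previous layer's output.
# The input width (8) and layer sizes are illustrative assumptions.
sequential_model = keras.Sequential([
    keras.Input(shape=(8,)),
    layers.Dense(16, activation="relu"),
    layers.Dense(16, activation="relu"),
    layers.Dense(1),
])
```

There is no place in this list-of-layers form to say "also pass the raw input to the second Dense layer."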
Once you want a later layer to also receive the raw input, the graph is no longer linear. That makes the Functional API the correct abstraction.
This is the same mechanism behind:
- skip connections
- residual connections
- dense connections
- raw-feature bypasses in tabular models
So the idea is standard, even if the wording of the question sounds unusual.
Concatenate the Input into a Deeper Layer
The most direct pattern is to concatenate the original input with an intermediate hidden representation and pass the merged tensor to the next hidden layer.
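A minimal sketch of this pattern with the Functional API might look as follows. The eight input features and the name x1 follow the description in this section; the layer widths, activations, and single-unit output are assumptions for illustration.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(8,))                   # eight raw input features
x1 = layers.Dense(16, activation="relu")(inputs)   # first hidden block

# Merge learned features with the raw input: 16 + 8 = 24 features.
merged = layers.Concatenate()([x1, inputs])

x2 = layers.Dense(16, activation="relu")(merged)   # second hidden block
outputs = layers.Dense(1)(x2)                      # assumed single-unit output

model = keras.Model(inputs=inputs, outputs=outputs)
```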
With this wiring, the second hidden block receives both:
- learned features from x1, the output of the first hidden block
- the original eight input features
That is usually what people mean when they want to "feed input directly into another hidden layer."
Use Addition for Residual-Style Shortcuts
If shapes match, you can use addition instead of concatenation.
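A sketch of the additive variant, assuming the first hidden layer is given the same width as the input (8 units) so that the two tensors can be summed element-wise:

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(8,))
# Width 8 is chosen deliberately so x1 has the same shape as the input.
x1 = layers.Dense(8, activation="relu")(inputs)

# Element-wise sum: the feature dimension stays at 8.
shortcut = layers.Add()([x1, inputs])

x2 = layers.Dense(16, activation="relu")(shortcut)
outputs = layers.Dense(1)(x2)                      # assumed single-unit output

model = keras.Model(inputs=inputs, outputs=outputs)
```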
This is closer to a residual connection. It works well when the two tensors represent compatible feature spaces and have the same shape.
Use:
- Concatenate when you want to preserve both feature sets explicitly
- Add when you want a residual shortcut
Shape Matching Is the Main Constraint
The Keras part is easy. The real technical constraint is tensor shape compatibility.
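For Concatenate:
- every dimension except the concatenation axis must match, including the batch dimension
- the size along the concatenation axis (the last, feature axis by default) can differ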
For Add:
- the entire shape must match
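The two rules can be checked while building the graph. The sketch below assumes a 16-unit hidden layer on top of an 8-feature input, and assumes that the installed Keras version reports the Add mismatch as a ValueError at graph-construction time:

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(8,))
hidden = layers.Dense(16, activation="relu")(inputs)

# Concatenate accepts different feature widths: 16 + 8 features.
concatenated = layers.Concatenate()([hidden, inputs])
print("Concatenate output shape:", concatenated.shape)

# Add requires identical shapes, so (None, 16) + (None, 8) is rejected.
try:
    layers.Add()([hidden, inputs])
    add_failed = False
except ValueError as err:
    add_failed = True
    print("Add rejected the inputs:", err)
```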
If shapes do not align for addition, project one tensor to the other's shape first, for example with a linear Dense layer.
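One way to sketch the projection, assuming a 16-unit hidden layer and an activation-free Dense layer that maps the 8 raw features to 16 (the widths are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(8,))
x1 = layers.Dense(16, activation="relu")(inputs)   # shape (None, 16)

# Linear projection of the raw input from 8 to 16 features so it can be added.
projected_input = layers.Dense(16)(inputs)

shortcut = layers.Add()([x1, projected_input])     # shapes now match
outputs = layers.Dense(1)(shortcut)                # assumed single-unit output

model = keras.Model(inputs=inputs, outputs=outputs)
```

The projection layer has its own trainable weights, so the shortcut is no longer a pure identity path.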
That is a normal modeling decision, not a hack.
Why You Might Do This
Feeding raw input into deeper layers can make sense when:
- original features remain important even after earlier transformations
- you want a shortcut path for optimization
- you want to preserve low-level information
- you are implementing an architecture inspired by skip-connected networks
In tabular models, this can sometimes help because deeper layers do not have to reconstruct basic feature information from compressed hidden states alone.
Multi-Input Thinking Helps
A useful mental model is that a later hidden layer is simply receiving more than one tensor. One tensor came from an earlier hidden layer, and the other came directly from the input.
Keras does not care whether the tensors originate from:
- the original input
- a side branch
- another subnet
- a residual path
As long as you define the graph explicitly, the model is valid.
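To illustrate, here is a sketch in which one hidden layer receives three tensors at once: the main path, the raw input, and a side branch fed by a second input. The second input, its width, and all layer names are hypothetical and exist only for this example.

```python
from tensorflow import keras
from tensorflow.keras import layers

raw = keras.Input(shape=(8,), name="raw_features")
extra = keras.Input(shape=(4,), name="extra_features")  # hypothetical second input

main_path = layers.Dense(16, activation="relu")(raw)    # earlier hidden layer
side_branch = layers.Dense(4, activation="relu")(extra) # separate small subnet

# The later layer sees all three tensors: 16 + 8 + 4 = 28 features.
merged = layers.Concatenate()([main_path, raw, side_branch])
hidden = layers.Dense(16, activation="relu")(merged)
outputs = layers.Dense(1)(hidden)                       # assumed single-unit output

model = keras.Model(inputs=[raw, extra], outputs=outputs)
```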
Keep the Graph Intentional
Because the Functional API is flexible, it is easy to overconnect everything and create a graph that is hard to reason about. A good merge should express a real modeling idea:
- preserve raw information
- help gradients flow
- combine complementary representations
If the answer is just "more connections must be better," the architecture usually becomes harder to debug without a clear gain.
Common Pitfalls
- Trying to express skip-style input wiring with the Sequential API.
- Using Add on tensors whose shapes do not match.
- Concatenating repeatedly until the feature dimension becomes unnecessarily large.
- Feeding raw input everywhere without a clear reason.
- Forgetting that deeper layers receive tensors, so the computation graph has to be wired explicitly.
Summary
- Keras can feed the original input into deeper hidden layers by using the Functional API.
- Concatenate is useful when you want the later layer to see both learned and raw features.
- Add is useful for residual-style shortcuts when shapes match.
- Shape compatibility is the main technical issue, not Keras support.
- Build these connections intentionally so the model stays understandable.

