How to predict values with a trained TensorFlow model

Introduction

Running predictions with a trained TensorFlow model sounds simple, but most failures come from mismatched preprocessing and shape assumptions. A model that performed well in training can produce meaningless output if inference input differs even slightly from what it saw in training. The safe approach is to version the full inference contract (model artifact, preprocessing, feature order, input shape, and dtype), then enforce it before every prediction call.

Load the Correct Artifact and Verify Its Signature

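If your model is a Keras artifact, load it with tf.keras.models.load_model. If it is an exported SavedModel, load it with tf.saved_model.load, inspect the available signatures, and call the correct one. This matters when teams serve multiple model versions or custom signatures, because the signature you get by default may not be the one your service was written against.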

python

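import tensorflow as tf

# With Keras 3 (TensorFlow 2.16+), load_model expects a .keras or .h5 file.
MODEL_PATH = "./artifacts/fraud_model"
model = tf.keras.models.load_model(MODEL_PATH)

# Inspect the symbolic input and output tensors to confirm shape and dtype.
print("inputs:", model.inputs)
print("outputs:", model.outputs)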

For a generic SavedModel:

python

import tensorflow as tf

saved = tf.saved_model.load("./artifacts/fraud_savedmodel")
print(list(saved.signatures.keys()))

# Each signature is a concrete function with fixed input and output specs.
serve_fn = saved.signatures["serving_default"]
print(serve_fn.structured_input_signature)
print(serve_fn.structured_outputs)

Do this once at startup and fail fast if the signature is not what your service expects.
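
The fail-fast check can be sketched as a small function that compares the loaded signature against what the service expects. This is a minimal sketch, not a definitive implementation: the input name "features", the helper check_signature, and the stand-in TinyModule (exported to a temporary directory only so the example runs on its own) are all assumptions for illustration.

```python
import tempfile

import tensorflow as tf

# Hypothetical contract: one float32 input named "features", shape (batch, 3).
EXPECTED_INPUTS = {"features": ([None, 3], tf.float32)}


def check_signature(serve_fn, expected):
    """Raise RuntimeError if the signature's inputs differ from `expected`."""
    # structured_input_signature is a (positional_args, keyword_args) pair;
    # exported signatures take their inputs as keyword arguments.
    _, kwargs = serve_fn.structured_input_signature
    if set(kwargs) != set(expected):
        raise RuntimeError(f"input names {sorted(kwargs)} != {sorted(expected)}")
    for name, (shape, dtype) in expected.items():
        spec = kwargs[name]
        if spec.shape.as_list() != shape or spec.dtype != dtype:
            raise RuntimeError(f"input {name!r} is {spec}, expected {shape} {dtype}")


class TinyModule(tf.Module):
    """Stand-in for a real exported model, used only to make the sketch runnable."""

    @tf.function(input_signature=[tf.TensorSpec([None, 3], tf.float32, name="features")])
    def serve(self, features):
        return {"score": tf.reduce_mean(features, axis=1, keepdims=True)}


export_dir = tempfile.mkdtemp()
module = TinyModule()
tf.saved_model.save(module, export_dir, signatures=module.serve)

saved = tf.saved_model.load(export_dir)
serve_fn = saved.signatures["serving_default"]
check_signature(serve_fn, EXPECTED_INPUTS)  # returns silently when the contract holds
```

In a real service, only check_signature and the expected-inputs table would remain, and a RuntimeError at startup would stop the process before it serves any traffic.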

Reuse Training Preprocessing Exactly

A trained model expects the same feature ordering, scaling, categorical encoding, and missing value handling used during training. Never rebuild this from memory. Store preprocessing code in a shared module that both training and inference import.

python

import numpy as np
import tensorflow as tf

# Per-feature statistics from the training set, in training column order.
FEATURE_MEAN = np.array([42.7, 0.18, 900.0], dtype=np.float32)
FEATURE_STD = np.array([11.2, 0.09, 310.0], dtype=np.float32)


def preprocess(raw_batch: np.ndarray) -> tf.Tensor:
    # Reject anything that is not a 2-D batch with exactly three features.
    if raw_batch.ndim != 2 or raw_batch.shape[1] != 3:
        raise ValueError(f"expected shape (batch, 3), got {raw_batch.shape}")

    standardized = (raw_batch - FEATURE_MEAN) / FEATURE_STD
    return tf.convert_to_tensor(standardized, dtype=tf.float32)
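
Feature ordering is the part of this contract that a shape check cannot catch, since a batch with swapped columns still has the right shape. One way to guard against it, sketched below, is to accept named records at the service boundary and convert them to rows in a fixed order. The feature names and the helper records_to_rows are hypothetical; the real names and order must come from the training code.

```python
from typing import List, Mapping, Sequence

# Hypothetical feature names; the order must match the training column order.
FEATURE_ORDER = ("account_age_days", "chargeback_rate", "amount")


def records_to_rows(records: Sequence[Mapping[str, float]]) -> List[List[float]]:
    """Convert named records to rows in training order, rejecting bad keys."""
    rows = []
    for i, record in enumerate(records):
        missing = [name for name in FEATURE_ORDER if name not in record]
        unexpected = sorted(set(record) - set(FEATURE_ORDER))
        if missing or unexpected:
            raise ValueError(f"record {i}: missing={missing}, unexpected={unexpected}")
        rows.append([float(record[name]) for name in FEATURE_ORDER])
    return rows


# Keys arrive in arbitrary order; the output row is always in training order.
rows = records_to_rows([
    {"amount": 1200.0, "account_age_days": 50.0, "chargeback_rate": 0.2},
])
print(rows)  # [[50.0, 0.2, 1200.0]]
```

The resulting rows can be turned into a float32 NumPy array and passed to the standardization step as usual.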

If training used StringLookup, Normalization, or TextVectorization, save those layers as part of the model so inference stays consistent.
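
As an illustration of the Normalization case, the sketch below adapts the layer on synthetic data, places it inside the model, saves to a temporary .keras file, and confirms that the reloaded model gives the same output on raw, unscaled input. The data, the one-layer architecture, and the file name are assumptions made so the example is self-contained, and it assumes a TensorFlow version that supports the .keras format.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Synthetic stand-in for training data with three numeric features.
rng = np.random.default_rng(0)
train_x = rng.normal(
    loc=[42.7, 0.18, 900.0], scale=[11.2, 0.09, 310.0], size=(256, 3)
).astype(np.float32)

# adapt() computes per-feature mean and variance and stores them in the layer.
norm = tf.keras.layers.Normalization(axis=-1)
norm.adapt(train_x)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    norm,
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Saving the model also saves the adapted statistics held by the layer.
path = os.path.join(tempfile.mkdtemp(), "fraud_with_norm.keras")
model.save(path)
reloaded = tf.keras.models.load_model(path)

raw = np.array([[50.0, 0.2, 1200.0]], dtype=np.float32)  # raw, unscaled features
before = model(raw, training=False).numpy()
after = reloaded(raw, training=False).numpy()
print("outputs match after reload:", np.allclose(before, after, atol=1e-6))
```

With this layout the serving code passes raw features and has no scaling constants of its own to keep in sync.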

Predict Single and Batch Inputs

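Use training=False when calling the model directly. This keeps dropout disabled and makes batch normalization use its stored moving statistics rather than those of the current batch, so the same input always yields the same output.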

python

import numpy as np

# Assumes `model` and `preprocess` from the earlier snippets are in scope.
sample = np.array([[50.0, 0.2, 1200.0]], dtype=np.float32)
x = preprocess(sample)

probs = model(x, training=False).numpy()
print("probability:", float(probs[0, 0]))
print("label:", int(probs[0, 0] >= 0.5))
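
To see what the training flag changes, the sketch below uses a throwaway model whose only layer is dropout, so any difference between calls comes from that flag alone. The model and its sizes are illustrative assumptions, not part of the fraud example.

```python
import numpy as np
import tensorflow as tf

# A deliberately tiny model: dropout is the only source of randomness.
inputs = tf.keras.Input(shape=(8,))
outputs = tf.keras.layers.Dropout(0.5)(inputs)
demo = tf.keras.Model(inputs, outputs)

x = tf.ones((1, 8), dtype=tf.float32)

# Inference mode: dropout passes values through, so repeated calls agree.
a = demo(x, training=False).numpy()
b = demo(x, training=False).numpy()
print("inference calls agree:", np.array_equal(a, b))

# Training mode: each unit is zeroed with probability 0.5 and the survivors
# are scaled by 1 / (1 - 0.5), so every value is either 0.0 or 2.0.
c = demo(x, training=True).numpy()
print("training-mode output:", c)
```

The training-mode pattern of zeros usually differs from one call to the next, which is the instability a service would see if the flag were set wrongly.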

Batch prediction is the same API with more rows:

python

import numpy as np

# One row per example, columns in training order.
batch = np.array([
    [50.0, 0.2, 1200.0],
    [31.0, 0.1, 450.0],
    [60.0, 0.3, 1500.0],
], dtype=np.float32)

x_batch = preprocess(batch)
out = model.predict(x_batch, verbose=0)
print(out.squeeze())

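Use model.predict for convenience in scripts: it splits the input into batches internally and returns NumPy arrays. Use direct model calls in services where you want stricter control: a direct call returns tensors, is faster for small inputs that fit in one batch, and lets you pass training=False explicitly.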
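
One consequence of calling the model directly is that the whole input is processed in a single pass, whereas model.predict batches it for you (batch_size defaults to 32). If a service may receive large batches, a small helper can bound the work per call. The sketch below is written against any callable so it runs without a model; predict_in_chunks and the stand-in fake_predict are hypothetical names.

```python
from typing import Callable

import numpy as np


def predict_in_chunks(
    predict_fn: Callable[[np.ndarray], np.ndarray],
    data: np.ndarray,
    chunk_size: int = 256,
) -> np.ndarray:
    """Apply predict_fn to consecutive row slices and stitch the results together."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(data) == 0:
        raise ValueError("data must contain at least one row")
    chunks = [
        predict_fn(data[start:start + chunk_size])
        for start in range(0, len(data), chunk_size)
    ]
    return np.concatenate(chunks, axis=0)


def fake_predict(rows: np.ndarray) -> np.ndarray:
    # Stand-in for a model call; returns one value per row, shape (n, 1).
    return rows.sum(axis=1, keepdims=True)


data = np.arange(30, dtype=np.float32).reshape(10, 3)
chunked = predict_in_chunks(fake_predict, data, chunk_size=4)
print(chunked.shape)  # (10, 1)
```

In a service, predict_fn would wrap the real call, for example a function that runs preprocessing, calls the model with training=False, and converts the result with .numpy().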

Build a Small Inference Wrapper

Wrap loading, preprocessing, prediction, and postprocessing in one class. This avoids duplicated logic across notebooks, APIs, and background jobs.

python

from dataclasses import dataclass
import numpy as np
import tensorflow as tf

# Assumes `preprocess` from the shared preprocessing module is in scope.


@dataclass
class PredictionResult:
    score: float
    label: int


class FraudPredictor:
    def __init__(self, path: str, threshold: float = 0.5):
        # Load once at construction so every call reuses the same model.
        self.model = tf.keras.models.load_model(path)
        self.threshold = threshold

    def predict_one(self, features: np.ndarray) -> PredictionResult:
        # Reshape a single feature vector into a batch of one.
        x = preprocess(features.reshape(1, -1))
        score = float(self.model(x, training=False).numpy()[0, 0])
        return PredictionResult(score=score, label=int(score >= self.threshold))


predictor = FraudPredictor("./artifacts/fraud_model", threshold=0.65)
result = predictor.predict_one(np.array([45.0, 0.17, 1050.0], dtype=np.float32))
print(result)

This structure is easy to test and easy to swap when model versions change.
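
Swapping versions is safer when the wrapper can tell whether an artifact matches the preprocessing code it was built against. One possible approach, sketched here, is a small manifest file stored next to the model and checked at load time. The file name manifest.json, its fields, and the version strings are assumptions for illustration, not a TensorFlow convention.

```python
import json
import os
import tempfile

# Version of the preprocessing code this service was built with (hypothetical).
EXPECTED_PREPROCESSING_VERSION = "2"


def load_manifest(artifact_dir: str) -> dict:
    """Read manifest.json and refuse artifacts built for other preprocessing."""
    manifest_path = os.path.join(artifact_dir, "manifest.json")
    with open(manifest_path, encoding="utf-8") as fh:
        manifest = json.load(fh)
    found = manifest.get("preprocessing_version")
    if found != EXPECTED_PREPROCESSING_VERSION:
        raise RuntimeError(
            f"artifact expects preprocessing {found!r}, "
            f"service provides {EXPECTED_PREPROCESSING_VERSION!r}"
        )
    return manifest


# Write an example manifest to a temporary directory so the sketch runs alone.
artifact_dir = tempfile.mkdtemp()
with open(os.path.join(artifact_dir, "manifest.json"), "w", encoding="utf-8") as fh:
    json.dump({"model_version": "v3", "preprocessing_version": "2"}, fh)

manifest = load_manifest(artifact_dir)
print(manifest["model_version"])  # v3
```

A predictor class could call load_manifest in its constructor, before loading the model, so an incompatible pairing fails at startup rather than during a request.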

Validate Inference Before Deployment

Run a test with known samples from your training set and compare the output against expected ranges. Include one edge case and one malformed input case. Example checks:

Known positive sample returns score above threshold.
Known negative sample returns score below threshold.
Wrong shape raises a clear error message.

You can automate this in CI so model artifact updates are blocked if contract checks fail.
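
The three checks above might look like the following when written as plain test functions that a runner such as pytest can collect. To keep the sketch runnable without a model file, score is a stand-in with made-up weights; in a real suite it would call the loaded model, and the sample rows would be taken from the training set.

```python
import numpy as np

FEATURE_MEAN = np.array([42.7, 0.18, 900.0], dtype=np.float32)
FEATURE_STD = np.array([11.2, 0.09, 310.0], dtype=np.float32)
THRESHOLD = 0.5


def preprocess(raw_batch: np.ndarray) -> np.ndarray:
    if raw_batch.ndim != 2 or raw_batch.shape[1] != 3:
        raise ValueError("expected shape (batch, 3)")
    return (raw_batch - FEATURE_MEAN) / FEATURE_STD


def score(raw_batch: np.ndarray) -> np.ndarray:
    """Stand-in scorer with hypothetical weights; replace with the real model call."""
    z = preprocess(raw_batch) @ np.array([0.5, 1.0, 1.5], dtype=np.float32)
    return 1.0 / (1.0 + np.exp(-z))


def test_known_positive_scores_above_threshold():
    assert score(np.array([[60.0, 0.3, 1500.0]], dtype=np.float32))[0] > THRESHOLD


def test_known_negative_scores_below_threshold():
    assert score(np.array([[31.0, 0.1, 450.0]], dtype=np.float32))[0] < THRESHOLD


def test_wrong_shape_is_rejected():
    try:
        score(np.array([50.0, 0.2, 1200.0], dtype=np.float32))  # 1-D, not (batch, 3)
    except ValueError as exc:
        assert "expected shape" in str(exc)
    else:
        raise AssertionError("malformed input was accepted")


# Run the checks directly when no test runner is used.
for check in (
    test_known_positive_scores_above_threshold,
    test_known_negative_scores_below_threshold,
    test_wrong_shape_is_rejected,
):
    check()
print("all contract checks passed")
```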

Common Pitfalls

Reordering features at inference time and silently corrupting predictions.
Applying different normalization parameters than training.
Forgetting training=False and getting unstable output.
Mixing model versions and preprocessing versions without compatibility checks.
Treating raw logits as probabilities when the final layer is not sigmoid or softmax.
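
The last pitfall is easy to illustrate with NumPy alone. If the final layer has no activation, the model returns logits, and they need a sigmoid (single-output binary case) or softmax (multi-class case) before a probability threshold means anything. The logit values below are made up for the example.

```python
import numpy as np


def sigmoid(logits: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-logits))


def softmax(logits: np.ndarray) -> np.ndarray:
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Binary case: a logit of 0.0 corresponds to a probability of exactly 0.5,
# so thresholding raw logits at 0.5 would silently shift the decision boundary.
binary_logits = np.array([[2.0], [-1.0], [0.0]])
binary_probs = sigmoid(binary_logits)
print(binary_probs.ravel())

# Multi-class case: softmax turns each row of logits into values that sum to 1.
multi_logits = np.array([[2.0, 0.5, -1.0]])
multi_probs = softmax(multi_logits)
print(multi_probs, multi_probs.sum(axis=-1))
```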

Summary

Load the exact model artifact intended for inference.
Enforce the input contract for shape, order, dtype, and preprocessing.
Use a shared preprocessing pipeline between training and serving.
Wrap prediction logic in a reusable class for consistency.
Add regression checks with known samples before each release.