Deploying a Keras Deep Learning Model to Android

Introduction

Deploying a Keras model on Android typically means converting it to TensorFlow Lite and running inference inside the mobile app. The conversion itself is usually easy, while most production failures come from preprocessing mismatches, shape assumptions, or label-order drift. A reliable deployment pipeline treats training and Android inference as one integrated system.

Train and Export a Keras Model

Start with a model that trains and saves correctly in Python.

python
import tensorflow as tf

# Small CNN for 28x28 grayscale images with 10 output classes.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# model.fit(train_ds, epochs=3)

# Writes a SavedModel directory when TensorFlow bundles Keras 2.
# With Keras 3 (TensorFlow 2.16+), use model.export("saved_model") instead.
model.save("saved_model")

Document input shape, dtype, and class order now, not later.
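One way to make that documentation hard to lose is to write it to a small JSON file that ships next to the exported model. The sketch below is only an illustration: the field names, the file layout, the 1/255 pixel scaling, and the digit labels are all assumptions, not something the model above dictates.

```python
import json

# Hypothetical contract for the 28x28 model above; every field name is illustrative.
INPUT_CONTRACT = {
    "input_shape": [1, 28, 28, 1],            # NHWC with a batch of one
    "input_dtype": "float32",
    "pixel_scale": 1.0 / 255.0,               # assumption: training scaled pixels to [0, 1]
    "labels": [str(d) for d in range(10)],    # assumption: class index i means digit i
}


def write_contract(path, contract=INPUT_CONTRACT):
    """Write the contract as JSON so training and app code read the same values."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(contract, f, indent=2, sort_keys=True)


def read_contract(path):
    """Load a contract previously written by write_contract."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

The same file can be copied into the app's assets, so the Android side reads the shape and label order instead of hard-coding them.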

Convert to TensorFlow Lite

Use the TFLite converter to generate a model for the mobile runtime.

python
import tensorflow as tf

# Convert the SavedModel directory to a TFLite FlatBuffer.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
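A cheap sanity check after conversion is to confirm the written file really is a TFLite FlatBuffer. TFLite files carry the file identifier `TFL3` at byte offset 4. The helper names below are hypothetical, and this check only catches empty, truncated, or wrong files. It says nothing about whether the graph computes the right values, so it complements loading the model in an interpreter rather than replacing that step.

```python
def looks_like_tflite(data: bytes) -> bool:
    """Return True if the bytes carry the TFLite FlatBuffer file identifier."""
    # FlatBuffers place the 4-byte file identifier right after the 4-byte root offset.
    return len(data) >= 8 and data[4:8] == b"TFL3"


def check_tflite_file(path: str) -> bool:
    """Read only the first 8 bytes of the file and check the identifier."""
    with open(path, "rb") as f:
        return looks_like_tflite(f.read(8))
```

A call such as `check_tflite_file("model.tflite")` can run in the export script or in CI before the file is copied into the app.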

Optional optimization examples:

  • dynamic range quantization for smaller size
  • full integer quantization for faster inference on some devices

Evaluate quality impact before adopting aggressive quantization.
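Dynamic range quantization is requested by setting `converter.optimizations = [tf.lite.Optimize.DEFAULT]` before calling `convert()`. Full integer quantization additionally needs a representative dataset. To build intuition for where the quality impact comes from, the sketch below simulates affine int8 quantization in plain Python: a float is mapped to an integer through a scale and zero point, then mapped back. The function names and the example scale are illustrative assumptions. The converter chooses real scales from the weights and calibration data.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to int8 with affine quantization, clamping to the int8 range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to the float it represents."""
    return (q - zero_point) * scale


def max_roundtrip_error(values, scale, zero_point):
    """Largest absolute error introduced by quantizing and dequantizing each value."""
    return max(
        abs(v - dequantize(quantize(v, scale, zero_point), scale, zero_point))
        for v in values
    )
```

Values inside the representable range lose at most about half a scale step, while values outside it are clamped. That clamping is one way a poorly calibrated model can lose accuracy without any visible error.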

Integrate Model Into Android App

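Place model.tflite in app/src/main/assets, then add the TensorFlow Lite dependency. The model has to be stored uncompressed in the APK so it can be memory-mapped; Android Gradle Plugin 4.1 and later do this for .tflite files by default.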

gradle
dependencies {
    implementation "org.tensorflow:tensorflow-lite:2.14.0"
}

If you use delegates or the support library, add their artifacts explicitly.

Run Inference in Kotlin

Load the model from assets and run the interpreter with correctly shaped buffers.

kotlin
import android.content.res.AssetFileDescriptor
import org.tensorflow.lite.Interpreter
import java.io.FileInputStream
import java.nio.ByteBuffer
import java.nio.channels.FileChannel

// Memory-map the model bytes from the asset file descriptor.
fun loadModelFile(afd: AssetFileDescriptor): ByteBuffer {
    FileInputStream(afd.fileDescriptor).use { input ->
        val channel = input.channel
        return channel.map(FileChannel.MapMode.READ_ONLY, afd.startOffset, afd.declaredLength)
    }
}

// `context` is an Android Context available at the call site.
val afd = context.assets.openFd("model.tflite")
val modelBuffer = loadModelFile(afd)
val interpreter = Interpreter(modelBuffer)

// Input is [1, 28, 28, 1] float32: the model's (28, 28, 1) input plus a batch dimension.
val input = Array(1) { Array(28) { Array(28) { FloatArray(1) } } }
// Output is [1, 10]: one score per class.
val output = Array(1) { FloatArray(10) }

interpreter.run(input, output)

The input shape and dtype must match the training contract exactly. For the model above, that is a float32 tensor of shape [1, 28, 28, 1].
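If the app feeds the interpreter a direct byte buffer instead of nested arrays, the buffer length has to equal the tensor's byte size. A small helper, sketched here with a hypothetical name and a deliberately short dtype table, makes the expected size explicit so it can be recorded alongside the model.

```python
from math import prod

# Bytes per element for the dtypes most often used as TFLite inputs (not exhaustive).
DTYPE_BYTES = {"float32": 4, "uint8": 1, "int8": 1}


def expected_input_bytes(shape, dtype):
    """Number of bytes a dense tensor of this shape and dtype occupies."""
    return prod(shape) * DTYPE_BYTES[dtype]
```

For the [1, 28, 28, 1] float32 input this gives 3136 bytes. A quantized uint8 variant of the same shape would need 784, so a size mismatch on device is a quick hint that the app and the model disagree on dtype.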

Keep Preprocessing Identical Across Platforms

Most deployment issues come from preprocessing drift:

  • training used normalization but Android did not
  • training used one resize method, app used another
  • channel order assumptions differ

Keep preprocessing documented and shared where possible.

A practical check is running the same sample through Python and Android and comparing class predictions.
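Comparing only the predicted class can hide small drift that flips predictions on harder inputs, so it may help to compare raw scores as well. The sketch below assumes the Android app can dump its output vector for a known sample, for example to a log or file, so it can be pasted next to the Python output. The function names and the tolerance value are illustrative choices, not recommended thresholds.

```python
def argmax(scores):
    """Index of the largest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


def compare_outputs(python_scores, android_scores, atol=1e-3):
    """Compare two score vectors produced for the same input sample."""
    if len(python_scores) != len(android_scores):
        raise ValueError("output lengths differ")
    max_diff = max(abs(p - a) for p, a in zip(python_scores, android_scores))
    return {
        "same_class": argmax(python_scores) == argmax(android_scores),
        "max_abs_diff": max_diff,
        "within_tolerance": max_diff <= atol,
    }
```

A result with `same_class` true but `within_tolerance` false usually points to a preprocessing difference such as a different resize method, rather than a broken model.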

Manage Labels and Postprocessing

Store labels in an asset file and map the argmax index to the label text consistently.

kotlin
val predictedIndex = output[0].indices.maxByOrNull { output[0][it] } ?: -1

If the class order changes during retraining, ship the updated labels file together with the model.
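A check at export time can catch a stale labels file before it reaches the app. The sketch below assumes a plain-text labels file with one label per line in class-index order. That format and the helper names are assumptions for illustration.

```python
def load_labels(text):
    """Parse one label per line, ignoring blank lines and surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def label_for(scores, labels):
    """Map the highest score to its label, refusing to guess if the sizes disagree."""
    if len(scores) != len(labels):
        raise ValueError(
            f"model has {len(scores)} outputs but labels file has {len(labels)} entries"
        )
    index = max(range(len(scores)), key=lambda i: scores[i])
    return labels[index]
```

The length check only detects added or removed classes. A reordering with the same class count still needs a known sample per class to verify.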

Performance on Real Devices

Optimize the inference path for mobile constraints:

  • reuse one interpreter instance
  • avoid repeated allocations per frame
  • benchmark on target devices, not emulator only
  • profile preprocessing time, not only model execution

In many apps, image preprocessing is the real bottleneck.

Model Update Strategy

Define how model updates are delivered:

  • app-bundled models updated through app releases
  • remote model delivery with version control and rollback

Whichever strategy you choose, add compatibility checks between model version and app preprocessing logic.
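Such a compatibility check can be as small as comparing a few fields in a model manifest against what the app was built for. The sketch below is written in Python to match the export tooling. On device the same logic would live in the app. The manifest fields `preprocessing_version` and `input_shape` are hypothetical names, not part of any TensorFlow Lite format.

```python
def is_compatible(manifest, app_preprocessing_version, app_input_shape):
    """True if the model was built for the preprocessing and input shape the app uses."""
    return (
        manifest.get("preprocessing_version") == app_preprocessing_version
        and list(manifest.get("input_shape", [])) == list(app_input_shape)
    )


def choose_model(candidate, bundled, app_preprocessing_version, app_input_shape):
    """Use the downloaded candidate only if it is compatible, else fall back to the bundled model."""
    if is_compatible(candidate, app_preprocessing_version, app_input_shape):
        return candidate
    return bundled
```

Falling back to the bundled model keeps an older app version working when a newer remote model expects preprocessing that the app does not implement yet.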

Common Pitfalls

A common pitfall is a successful TFLite conversion followed by an incorrect input shape on Android. Another is a normalization mismatch between training and app inference. Teams often forget to keep the label mapping in sync after retraining. Recreating the interpreter for every request also hurts latency and battery life. Finally, skipping model quality checks after quantization leads to silent accuracy drops.

Summary

  • Train and export a stable Keras model with a documented input contract.
  • Convert to TensorFlow Lite and validate the conversion output.
  • Integrate the model in Android with the correct buffer shape and dtype.
  • Keep preprocessing identical between Python and mobile runtime.
  • Reuse the interpreter and benchmark on real target hardware.
  • Treat model conversion, labels, and app inference as one end-to-end pipeline.