Keras: What Is the Difference Between model.predict and model.predict_proba?
Introduction
In Keras, model.predict() and model.predict_proba() return the same output; there is no difference. predict_proba() was a scikit-learn compatibility wrapper that simply called predict() internally. It was deprecated in earlier TensorFlow 2.x releases and removed in TensorFlow 2.6. For classification models with softmax or sigmoid output, model.predict() already returns probabilities. For class labels, use np.argmax(model.predict(x), axis=1) (multi-class) or (model.predict(x) > 0.5).astype(int) (binary).
model.predict() Returns Probabilities
With a sigmoid output layer, model.predict() returns values between 0 and 1. Each value is already the predicted probability of the positive class, so no further conversion is needed to obtain probabilities.
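A minimal sketch of the binary case, assuming TensorFlow 2.x with tf.keras is installed. The tiny untrained model, the 4-feature input, and the random data are illustrative assumptions; only the shape and range of the output matter here.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model: 4 input features, one sigmoid output unit.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Random inputs stand in for real data; the model is untrained.
x = np.random.default_rng(0).normal(size=(5, 4)).astype("float32")

# predict() returns one probability per sample, with shape (5, 1).
probs = model.predict(x, verbose=0)
print(probs.shape)
print(probs.min() >= 0.0, probs.max() <= 1.0)
```

Because the model is untrained, the probabilities themselves are meaningless; the point is that the sigmoid output is already bounded to the range 0 to 1.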
Multi-Class Classification
With a softmax output layer, each row contains probabilities for all classes that sum to 1.
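The multi-class case might look like the sketch below, again assuming TensorFlow 2.x. The 3-class model and the random inputs are made up for illustration.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy model: 4 input features, 3 classes, softmax output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Random inputs stand in for real data; the model is untrained.
x = np.random.default_rng(0).normal(size=(5, 4)).astype("float32")

# One row per sample, one column per class.
probs = model.predict(x, verbose=0)
row_sums = probs.sum(axis=1)

print(probs.shape)
print(np.allclose(row_sums, 1.0, atol=1e-5))
```

Each row is a probability distribution over the three classes, so the row sums come out as 1 up to floating-point rounding.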
Converting Probabilities to Class Labels
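The conversion only needs NumPy. In the sketch below, the hard-coded arrays are stand-ins for what model.predict() would return; the values are invented for illustration, and the 0.5 threshold is the conventional default rather than a requirement.

```python
import numpy as np

# Stand-in for model.predict(x) from a softmax model: 3 samples, 3 classes.
multi_probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])

# Multi-class: pick the index of the largest probability in each row.
multi_labels = np.argmax(multi_probs, axis=1)
print(multi_labels)  # [1 0 2]

# Stand-in for model.predict(x) from a sigmoid model: shape (3, 1).
binary_probs = np.array([[0.9], [0.3], [0.6]])

# Binary: threshold at 0.5, then flatten (3, 1) to (3,).
binary_labels = (binary_probs > 0.5).astype(int).ravel()
print(binary_labels)  # [1 0 1]
```

The threshold can be moved away from 0.5 when false positives and false negatives carry different costs.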
What predict_proba Was
If you see code using predict_proba(), replace it with predict() — the output is identical.
Regression Models
For regression, there is no concept of "predict_proba" — the output is the predicted target value.
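A short sketch of the regression case, assuming TensorFlow 2.x. The model is a hypothetical one whose final Dense layer has no activation, so its output is a plain real number per sample.

```python
import numpy as np
import tensorflow as tf

# Hypothetical regression model: the final layer has a linear output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),  # no activation: unbounded real values
])

# Random inputs stand in for real data; the model is untrained.
x = np.random.default_rng(0).normal(size=(5, 4)).astype("float32")

# One predicted target value per sample, with shape (5, 1).
preds = model.predict(x, verbose=0)
print(preds.shape)
print(preds.ravel())
```

The values are not constrained to the range 0 to 1 and should not be thresholded or passed to argmax.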
Scikit-Learn Wrapper Compatibility
In scikit-learn's API, predict() returns class labels and predict_proba() returns probabilities. The KerasClassifier wrapper follows this convention, whereas native Keras predict() returns the raw model output. That mismatch is the origin of the confusion.
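The scikit-learn convention itself can be seen with any native scikit-learn classifier. The sketch below uses LogisticRegression on synthetic data as a stand-in. It does not use the Keras wrapper, and the dataset is invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary dataset: the label depends on the sum of two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X, y)

labels = clf.predict(X)       # class labels, shape (40,)
proba = clf.predict_proba(X)  # per-class probabilities, shape (40, 2)

# Labels can be recovered from the probabilities via argmax.
labels_from_proba = clf.classes_[np.argmax(proba, axis=1)]

print(labels[:5])
print(proba[:5].round(3))
print(np.array_equal(labels, labels_from_proba))
```

Here predict() behaves like the argmax step that has to be applied manually to the output of a native Keras model.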
Comparison Table
| Method | Keras (native) | Scikit-Learn | KerasClassifier wrapper |
|---|---|---|---|
| predict() | Raw output (probabilities for classification) | Class labels | Class labels |
| predict_proba() | Deprecated (same as predict) | Class probabilities | Class probabilities |
| predict_classes() | Removed in TF 2.6 | N/A | N/A |
Batch Prediction and Performance
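predict() processes its input in batches, which suits large arrays. For a handful of samples that fit in one batch, calling the model directly is usually faster. The sketch below assumes TensorFlow 2.x; the model, the array sizes, and batch_size=256 are arbitrary illustrative choices.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy classifier: 4 input features, 3 classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Random inputs stand in for a large dataset; the model is untrained.
x = np.random.default_rng(0).normal(size=(1000, 4)).astype("float32")

# Large input: predict() iterates over the data in chunks of batch_size.
batched = model.predict(x, batch_size=256, verbose=0)

# Small input: call the model directly to skip predict()'s per-call setup.
# training=False selects inference behaviour for layers such as Dropout.
direct = np.asarray(model(x[:2], training=False))

print(batched.shape)
print(direct.shape)
print(np.allclose(batched[:2], direct, atol=1e-5))
```

Both paths give the same values; the difference is overhead, which matters most when predictions are requested inside a loop.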
Common Pitfalls
- Using predict_proba() on modern TF: predict_proba() was removed in recent TensorFlow versions. Replace all predict_proba() calls with predict(). The output is identical for Keras models.
- Expecting class labels from predict(): Native Keras predict() returns raw model output (probabilities for classification). Use np.argmax() or thresholding to convert to class labels. Do not confuse this with scikit-learn's predict(), which returns labels.
- Forgetting the activation function: If the last layer has no activation (Dense(3) without softmax), predict() returns raw logits, not probabilities. Apply softmax manually: tf.nn.softmax(model.predict(X)).
- Using predict_classes(): model.predict_classes() was also removed in TF 2.6. Use np.argmax(model.predict(x), axis=1) instead.
- Predicting on a single sample: model.predict(single_sample) expects a batch dimension. Reshape with np.expand_dims(sample, 0) or sample.reshape(1, -1) before calling predict.
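Two of these pitfalls, logits from a layer without activation and the missing batch dimension, can be handled together. The sketch below assumes TensorFlow 2.x; the model and the sample values are hypothetical.

```python
import numpy as np
import tensorflow as tf

# Hypothetical model whose last layer has no activation: it outputs logits.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3),
])

# A single sample has shape (4,); predict() needs shape (1, 4).
sample = np.array([0.1, -0.2, 0.3, 0.4], dtype="float32")
batch = np.expand_dims(sample, 0)

# Raw logits: they can be negative and do not sum to 1.
logits = model.predict(batch, verbose=0)

# Convert logits to probabilities, then to a class label.
probs = tf.nn.softmax(logits, axis=-1).numpy()
label = int(np.argmax(probs, axis=1)[0])

print(batch.shape)
print(probs.sum())
print(label)
```

Taking argmax of the logits directly gives the same label, since softmax preserves ordering; the softmax step only matters when the probabilities themselves are needed.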
Summary
- model.predict() and model.predict_proba() return the same output in Keras
- predict_proba() was deprecated and removed; always use predict()
- For classification with softmax/sigmoid, predict() already returns probabilities
- Convert to class labels with np.argmax() (multi-class) or thresholding (binary)
- The scikit-learn KerasClassifier wrapper has different semantics where predict() returns labels and predict_proba() returns probabilities
- For regression models, predict() returns continuous values, not probabilities

