Is it possible to extract the formulas of trained machine learning models in Python?

Introduction

Yes, but the answer depends entirely on the model family. Some trained models have a compact formula you can print directly, while others only have a procedural definition made of thousands or millions of learned parameters. In those cases, "extracting the formula" usually means exporting coefficients, rules, or an interpretable approximation.

Models with Direct Formulas

Linear and logistic models are the easiest to inspect. A trained linear regression model is just an intercept plus one coefficient per feature.

Example with scikit-learn:

python
from sklearn.linear_model import LinearRegression
import numpy as np

# Four samples with two features each
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([5.0, 5.0, 11.0, 11.0])

model = LinearRegression().fit(X, y)

# The learned parameters are the whole formula
print("intercept:", model.intercept_)
print("coefficients:", model.coef_)

If the coefficients are [a, b] and the intercept is c, the formula is:

text
y = c + a*x1 + b*x2
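
To make that mapping concrete, here is a minimal sketch that turns the fitted parameters into a printable equation. The helper name `linear_formula` and the feature names `x1` and `x2` are illustrative choices, not part of scikit-learn.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def linear_formula(model, feature_names, precision=3):
    """Render a fitted linear model as a readable equation string (hypothetical helper)."""
    terms = [f"{model.intercept_:.{precision}f}"]
    for coef, name in zip(model.coef_, feature_names):
        sign = "+" if coef >= 0 else "-"
        terms.append(f"{sign} {abs(coef):.{precision}f}*{name}")
    return "y = " + " ".join(terms)


# Same toy data as the example above
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([5.0, 5.0, 11.0, 11.0])
model = LinearRegression().fit(X, y)

formula = linear_formula(model, ["x1", "x2"])
print(formula)
```

On this toy data the fit is exact (intercept 0.5, both coefficients 1.5), so the printed string should read `y = 0.500 + 1.500*x1 + 1.500*x2`.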

Logistic regression is similar, except the linear expression is passed through a sigmoid to get a probability.
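
As a rough check of that claim, the sketch below rebuilds the probability by hand from `coef_` and `intercept_` and compares it with `predict_proba`. It assumes a binary problem (two of the three iris classes), which is the case where a single sigmoid applies.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Binary problem: keep only two of the three iris classes
X, y = load_iris(return_X_y=True)
mask = y < 2
X, y = X[mask], y[mask]

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Linear part of the formula: z = intercept + sum(coef_i * x_i)
z = clf.intercept_[0] + X @ clf.coef_[0]

# The sigmoid turns the linear score into P(class 1)
manual_proba = 1.0 / (1.0 + np.exp(-z))

# Compare with scikit-learn's own probability for the positive class
library_proba = clf.predict_proba(X)[:, 1]
print(np.allclose(manual_proba, library_proba))
```

With more than two classes the model uses one linear expression per class and a softmax instead, so the hand-written version would need to change accordingly.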

Tree Models Give Rules, Not One Small Equation

Decision trees and random forests do not collapse into one neat algebraic expression. What you can extract is a set of branching rules.

python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Print the learned splits as indented if/else rules
print(export_text(tree))

That output is often the best "formula" available because the model is literally a hierarchy of threshold tests.
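
If you need the rules as data rather than text, the fitted tree also exposes its nodes through the `tree_` attribute. The following is a minimal sketch that replays the threshold tests by hand and checks the result against `predict`. The function `predict_one` is a hypothetical helper written for this illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(iris.data, iris.target)
t = tree.tree_  # low-level arrays describing every node


def predict_one(x):
    """Follow the threshold tests from the root down to a leaf (hypothetical helper)."""
    node = 0
    # A leaf stores the same sentinel index for both children
    while t.children_left[node] != t.children_right[node]:
        if x[t.feature[node]] <= t.threshold[node]:
            node = t.children_left[node]
        else:
            node = t.children_right[node]
    # The leaf's value row holds per-class weights; the largest one wins
    return tree.classes_[t.value[node][0].argmax()]


# scikit-learn casts inputs to float32 internally, so do the same here
X32 = iris.data.astype(np.float32)
manual = np.array([predict_one(x) for x in X32])
matches = bool(np.array_equal(manual, tree.predict(iris.data)))
print(matches)
```

The same traversal could emit nested `if`/`else` source code instead of a prediction, which is one way to port a small tree to another language.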

Neural Networks Usually Do Not Have a Human-Sized Formula

A neural network is still a mathematical function, but the resulting expression is rarely useful in raw symbolic form. A two-layer network is already:

text
output = activation(W2 * activation(W1 * x + b1) + b2)

You can export the weights:

python
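import tensorflow as tf

# Small untrained network: 4 inputs -> 8 hidden units (ReLU) -> 1 output
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Each Dense layer holds a kernel (weight matrix) and a bias vector
for layer in model.layers:
    weights = layer.get_weights()
    print(layer.name, [w.shape for w in weights])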

But printing every coefficient rarely improves understanding. For complex networks, interpretation tools are usually more valuable than a raw symbolic dump.
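
To show that the exported weights really are the whole function, here is a minimal sketch that recomputes a small network's output with plain NumPy. It swaps in scikit-learn's `MLPRegressor` for Keras to keep the example lightweight, and the synthetic data is an assumption made purely for illustration. Note that the weight matrices are stored as (inputs, outputs), so the code multiplies as `X @ W1` rather than `W1 * x`.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic regression data (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] * X[:, 3]

net = MLPRegressor(hidden_layer_sizes=(8,), activation="relu",
                   max_iter=2000, random_state=0).fit(X, y)

W1, W2 = net.coefs_        # shapes (4, 8) and (8, 1)
b1, b2 = net.intercepts_   # shapes (8,) and (1,)

# output = W2 * relu(W1 * x + b1) + b2, written for row-vector inputs
hidden = np.maximum(0.0, X @ W1 + b1)
manual = (hidden @ W2 + b2).ravel()

print(np.allclose(manual, net.predict(X)))
```

Even this tiny model already has 49 parameters, which illustrates why the expanded symbolic form stops being readable long before networks reach production size.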

Better Alternatives for Complex Models

If your real goal is interpretability, use the tool that matches the model:

  • Coefficients for linear models
  • Rule export for tree models
  • Feature importance, SHAP values, or partial dependence for ensembles
  • Saliency maps, integrated gradients, or surrogate models for neural networks

For example, a surrogate decision tree can approximate a black-box model's behavior on a bounded dataset, even though it is not the original formula.
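
A minimal sketch of that idea follows, assuming a random forest as the black box and synthetic data with hypothetical feature names `x0` to `x2`. The surrogate is trained on the black box's predictions, and its fidelity is measured as R^2 against those predictions rather than against the true labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic data: mostly a step in x0, plus a small linear effect of x1
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(500, 3))
y = np.where(X[:, 0] > 0.0, 2.0, -1.0) + 0.1 * X[:, 1]

black_box = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Train the surrogate on the black box's outputs, not on the true labels
bb_pred = black_box.predict(X)
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, bb_pred)

# Fidelity: how well the surrogate reproduces the black box on this data
fidelity = surrogate.score(X, bb_pred)
print(f"fidelity R^2: {fidelity:.3f}")
print(export_text(surrogate, feature_names=["x0", "x1", "x2"]))
```

The fidelity score only describes agreement on the data you measured it on. Outside that region the surrogate and the black box may diverge.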

When Symbolic Extraction Is Reasonable

There are a few cases where exact or near-exact symbolic extraction is practical:

  • Small linear or polynomial models
  • Symbolic regression models
  • Tiny neural networks used for research or teaching
  • Models intentionally trained with sparsity or monotonic structure

In ordinary production deep learning, the model definition plus its learned tensors are the real representation. There may not be a concise closed-form equation worth reading.

Common Pitfalls

  • Expecting every model to produce a short algebraic formula like a textbook regression.
  • Confusing learned parameters with interpretability. A giant weight matrix is technically the formula, but not a useful explanation.
  • Ignoring preprocessing. The effective model often includes scaling, encoding, or feature generation before the estimator itself.
  • Treating a surrogate explanation as the same thing as the original model.

Summary

  • Some models, especially linear ones, have a direct printable formula.
  • Tree-based models are best described as rule systems rather than one equation.
  • Neural networks can expose weights, but usually not a useful human-sized symbolic formula.
  • For complex models, interpretation tools are often better than raw parameter dumps.
  • Always include preprocessing in whatever "formula" you claim to have extracted.