Is it possible to extract the formulas of trained machine learning models in Python?
Introduction
Yes, but the answer depends entirely on the model family. Some trained models have a compact formula you can print directly, while others only have a procedural definition made of thousands or millions of learned parameters. In those cases, "extracting the formula" usually means exporting coefficients, rules, or an interpretable approximation.
Models with Direct Formulas
Linear and logistic models are the easiest to inspect. A trained linear regression model is just an intercept plus one coefficient per feature.
Example with scikit-learn:
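The following is a minimal sketch, assuming a small noise-free toy dataset with two features; the data and the names `a`, `b`, and `c` are illustrative choices. It fits a `LinearRegression` and reads the learned parameters from `coef_` and `intercept_`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data generated from y = 3*x1 - 2*x2 + 5 (noise-free, for illustration only).
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [4.0, 5.0], [5.0, 3.0]])
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0

model = LinearRegression().fit(X, y)

# coef_ holds one weight per feature; intercept_ is the constant term.
a, b = model.coef_
c = model.intercept_

# Build a printable formula string from the learned parameters.
formula = f"y = {a:.3f}*x1 + {b:.3f}*x2 + {c:.3f}"
print(formula)
```

Because the toy data is noise-free, the recovered coefficients should match the generating values up to floating-point error.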
If the coefficients are [a, b] and the intercept is c, the formula is y = a*x1 + b*x2 + c, where x1 and x2 are the two input features.
Logistic regression is similar, except the linear expression is passed through a sigmoid to get a probability.
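As a rough check of that claim, the sketch below rebuilds the predicted probability by hand for a binary `LogisticRegression` and compares it with `predict_proba`. The synthetic dataset from `make_classification` and the helper `sigmoid` are assumptions made for illustration, and the comparison applies to the default binary case only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data with two features (illustrative only).
X, y = make_classification(
    n_samples=200, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
clf = LogisticRegression().fit(X, y)

# For a binary problem, coef_ has shape (1, n_features) and intercept_ has shape (1,).
w = clf.coef_[0]
b = clf.intercept_[0]

def sigmoid(z):
    """Logistic function mapping a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Probability of the positive class: sigmoid(w . x + b).
manual = sigmoid(X @ w + b)
library = clf.predict_proba(X)[:, 1]
print(np.allclose(manual, library))
```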
Tree Models Give Rules, Not One Small Equation
Decision trees and random forests do not collapse into one neat algebraic expression. What you can extract is a set of branching rules.
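One way to get those rules out of scikit-learn is `export_text`. This is a minimal sketch, assuming the bundled iris dataset and a deliberately shallow tree so the printout stays short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the rule listing readable (max_depth=2 is an arbitrary choice).
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# export_text renders the fitted tree as nested threshold tests.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

For a random forest, the same call can be applied to each fitted tree in `estimators_`, although the combined listing quickly becomes too long to read.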
That output is often the best "formula" available because the model is literally a hierarchy of threshold tests.
Neural Networks Usually Do Not Have a Human-Sized Formula
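A neural network is still a mathematical function, but the resulting expression is rarely useful in raw symbolic form. A two-layer network is already y = W2 * f(W1 * x + b1) + b2, where W1 and W2 are weight matrices, b1 and b2 are bias vectors, and f is a nonlinear activation applied elementwise.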
You can export the weights:
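The original does not say which framework it has in mind, so the sketch below assumes scikit-learn's `MLPRegressor` with one small hidden layer. It reads the learned arrays from `coefs_` and `intercepts_` and confirms that a hand-written forward pass reproduces `predict`. The toy data and layer size are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy regression data with three features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

net = MLPRegressor(
    hidden_layer_sizes=(4,), activation="tanh", solver="lbfgs",
    max_iter=500, random_state=0,
).fit(X, y)

# coefs_ and intercepts_ hold one weight matrix and one bias vector per layer.
W1, W2 = net.coefs_
b1, b2 = net.intercepts_
print(W1.shape, b1.shape, W2.shape, b2.shape)

# Manual forward pass. scikit-learn stores weights for row-vector inputs,
# so the hidden layer is tanh(X @ W1 + b1) and the output layer is linear.
manual = (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()
print(np.allclose(manual, net.predict(X)))
```

Even this tiny network has 21 learned numbers, which illustrates how quickly the written-out expression grows.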
But printing every coefficient rarely improves understanding. For complex networks, interpretation tools are usually more valuable than a raw symbolic dump.
Better Alternatives for Complex Models
If your real goal is interpretability, use the tool that matches the model:
- Coefficients for linear models
- Rule export for tree models
- Feature importance, SHAP values, or partial dependence for ensembles
- Saliency maps, integrated gradients, or surrogate models for neural networks
For example, a surrogate decision tree can approximate a black-box model's behavior on a bounded dataset, even though it is not the original formula.
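A surrogate of this kind can be sketched as follows, assuming a random forest as the stand-in black box and synthetic data. The shallow tree is trained on the forest's predictions rather than on the true labels, and its fidelity is measured as R^2 against those predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic data on a bounded region (illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(500, 2))
y = X[:, 0] ** 2 + X[:, 1]

# Stand-in for the black-box model.
black_box = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
bb_pred = black_box.predict(X)

# The surrogate learns to imitate the black box, not the original target.
surrogate = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, bb_pred)

# Fidelity: how much of the black box's behavior the surrogate reproduces.
fidelity = r2_score(bb_pred, surrogate.predict(X))
print(f"fidelity R^2: {fidelity:.3f}")
print(export_text(surrogate, feature_names=["x1", "x2"]))
```

The fidelity score only describes agreement on the sampled region, so the printed rules should not be read as valid outside it.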
When Symbolic Extraction Is Reasonable
There are a few cases where exact or near-exact symbolic extraction is practical:
- Small linear or polynomial models
- Symbolic regression models
- Tiny neural networks used for research or teaching
- Models intentionally trained with sparsity or monotonic structure
In ordinary production deep learning, the model definition plus its learned tensors are the real representation. There may not be a concise closed-form equation worth reading.
Common Pitfalls
- Expecting every model to produce a short algebraic formula like a textbook regression.
- Confusing learned parameters with interpretability. A giant weight matrix is technically the formula, but not a useful explanation.
- Ignoring preprocessing. The effective model often includes scaling, encoding, or feature generation before the estimator itself.
- Treating a surrogate explanation as the same thing as the original model.
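The preprocessing pitfall can be made concrete with a short sketch, assuming a `StandardScaler` followed by `LinearRegression` in a pipeline. The coefficients stored on the estimator apply to the scaled features, so the scaler's mean and scale have to be folded in to get a formula in raw input units. The toy data and variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data generated from y = 4*x1 + 2*x2 + 1 with very different feature scales.
rng = np.random.default_rng(0)
X = rng.normal(loc=[10.0, -5.0], scale=[3.0, 0.5], size=(100, 2))
y = 4.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0

pipe = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)
scaler = pipe.named_steps["standardscaler"]
lin = pipe.named_steps["linearregression"]

# The estimator sees z = (x - mean) / scale, so substitute that into
# y = w . z + b and collect terms to express the model in raw features.
raw_coef = lin.coef_ / scaler.scale_
raw_intercept = lin.intercept_ - np.sum(lin.coef_ * scaler.mean_ / scaler.scale_)

print("coefficients on scaled features:", lin.coef_)
print("coefficients on raw features:   ", raw_coef)
print("intercept on raw features:      ", raw_intercept)

# The folded formula should reproduce the pipeline's predictions.
print(np.allclose(X @ raw_coef + raw_intercept, pipe.predict(X)))
```

Reporting `lin.coef_` alone would describe a model over standardized inputs, which is not the formula most readers expect.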
Summary
- Some models, especially linear ones, have a direct printable formula.
- Tree-based models are best described as rule systems rather than one equation.
- Neural networks can expose weights, but usually not a useful human-sized symbolic formula.
- For complex models, interpretation tools are often better than raw parameter dumps.
- Always include preprocessing in whatever "formula" you claim to have extracted.

