gplearn
sympy
expression conversion
machine learning
code export

How to export the output of gplearn as a sympy expression or some other readable format?

Introduction

gplearn models are symbolic programs, so the main challenge is not getting a prediction out of them but making the evolved program readable and reusable elsewhere. In practice, the export path usually starts with the estimator's best program object, then moves either to a string representation, a Graphviz tree, or a custom conversion into a sympy expression.

Start with the Best Program Object

After fitting a SymbolicRegressor, the best evolved program is stored on the estimator's _program attribute.

python
from gplearn.genetic import SymbolicRegressor
import numpy as np

# Toy data that follows y = 2*x + 1
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

est = SymbolicRegressor(
    population_size=200,
    generations=10,
    random_state=0,
)
est.fit(X, y)

# Printing the program object gives its prefix-notation string form
print(est._program)

That string form is often the first readable output you need. It is not yet a native sympy object, but it is a clear serialization of the evolved expression tree.

One caveat matters here: the leading underscore marks _program as internal by Python naming convention. It is widely used in examples, but it is not guaranteed to stay stable across gplearn releases, so check it when you upgrade.
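If the nested prefix string is hard to scan, it can be rendered as infix text using only the standard library. The sketch below is illustrative rather than part of gplearn: `to_infix`, `_render`, and the `INFIX` table are hypothetical names, the input is a hand-written stand-in for `str(est._program)`, and it assumes Python 3.9+ (for `ast.unparse`) and feature names that are valid Python identifiers.

```python
import ast

# Infix symbols for gplearn's arithmetic function names; any other
# function (sqrt, log, ...) is left in call form.
INFIX = {"add": "+", "sub": "-", "mul": "*", "div": "/"}


def _render(node):
    """Recursively turn one parsed node into infix text."""
    if isinstance(node, ast.Call):
        name = node.func.id
        args = [_render(arg) for arg in node.args]
        if name in INFIX and len(args) == 2:
            return f"({args[0]} {INFIX[name]} {args[1]})"
        return f"{name}({', '.join(args)})"
    # Terminals are re-emitted by ast.unparse, so 2.000 becomes 2.0.
    return ast.unparse(node)


def to_infix(program_text):
    """Render a gplearn-style prefix string as parenthesised infix text."""
    return _render(ast.parse(program_text, mode="eval").body)


# Hand-written stand-in for str(est._program)
print(to_infix("add(mul(X0, 2.000), sub(X1, -0.500))"))
# -> ((X0 * 2.0) + (X1 - -0.5))
```

This only changes how the program is displayed. In particular, writing `div` as `/` hides the fact that gplearn's division is protected against near-zero denominators.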

Use Built-In Readable Formats First

Before writing a symbolic converter, check whether a simpler export format already solves your problem.

For visual inspection, gplearn exposes Graphviz output on the program object:

python
dot = est._program.export_graphviz()
print(dot)

That is often enough when you want:

  • a readable tree
  • documentation for a report
  • quick manual inspection of the evolved expression

If your goal is human-readable output, the string form plus Graphviz may already be sufficient.
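To turn the DOT text into an image, one option is to write it to a file and render it with the Graphviz command-line tool. The sketch below uses only the standard library; `save_dot` is a hypothetical helper, and `sample_dot` is a hand-written stand-in for the string returned by `est._program.export_graphviz()`.

```python
import tempfile
from pathlib import Path


def save_dot(dot_text, path):
    """Write DOT source to `path` and return the path that was written."""
    target = Path(path)
    target.write_text(dot_text, encoding="utf-8")
    return target


# Hand-written stand-in; in practice pass est._program.export_graphviz().
sample_dot = 'digraph program {\n0 [label="add"];\n1 [label="X0"];\n0 -> 1;\n}'

# The temp directory is used here only to keep the example side-effect free.
saved = save_dot(sample_dot, Path(tempfile.gettempdir()) / "program.dot")
print(saved)
```

With Graphviz installed, `dot -Tpng program.dot -o program.png` then renders the tree as a PNG.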

Convert to sympy with a Small Parser

If you want algebraic manipulation, simplification, or code generation, convert the exported form into sympy. The exact parser depends on which function set your model uses, but the general pattern is:

  1. map gplearn function names to sympy equivalents
  2. parse terminals such as X0, X1, and constants
  3. recursively build a sympy expression tree

Here is a minimal example for a small subset:

python
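import sympy as sp

x0, x1 = sp.symbols("x0 x1")

# Map gplearn function and terminal names to sympy-compatible objects
namespace = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    # Plain division; gplearn's own div is protected near zero denominators
    "div": lambda a, b: a / b,
    "X0": x0,
    "X1": x1,
}

# Hand-built equivalent of the program add(mul(2, X0), 3)
expr = namespace["add"](namespace["mul"](2, x0), 3)
print(sp.simplify(expr))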

A full converter needs to parse the program string rather than build the tree manually, but this mapping is the central idea. Once the expression is in sympy, you can simplify it, pretty-print it, differentiate it, or generate code.
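One common way to do the parsing step is to hand the program string to `sympy.sympify` with a `locals` mapping, so that sympy's own parser evaluates the nested calls. The sketch below assumes a model restricted to the arithmetic functions shown; `program_to_sympy` and `GP_LOCALS` are hypothetical names, and the input string is a hand-written stand-in for `str(est._program)`.

```python
import sympy as sp

# Assumed mapping for a model that uses only these gplearn functions.
# Extend it to cover every function in your estimator's function set.
GP_LOCALS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    # Plain division: gplearn's protected behaviour near zero is not modelled.
    "div": lambda a, b: a / b,
    "neg": lambda a: -a,
}


def program_to_sympy(program_text):
    """Parse a gplearn-style prefix string into a sympy expression."""
    # Names not in the mapping (X0, X1, ...) become sympy Symbols.
    return sp.sympify(program_text, locals=dict(GP_LOCALS))


# Hand-written stand-in for str(est._program)
expr = program_to_sympy("add(mul(X0, 2.000), sub(X1, -0.500))")
print(sp.simplify(expr))
print(sp.diff(expr, sp.Symbol("X0")))
```

The symbols produced this way are named `X0`, `X1`, and so on, matching gplearn's default terminal names. If the estimator was fitted with custom `feature_names`, those names will appear instead.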

Validate the Converted Expression

Always verify that the converted expression matches the original gplearn model numerically on sample inputs.

python
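import numpy as np
import sympy as sp

x0 = sp.symbols("x0")
expr = 2 * x0 + 1  # stand-in for the converted expression

# Compile the sympy expression into a NumPy-aware function
fn = sp.lambdify(x0, expr, "numpy")

# Evaluate on sample inputs; compare these values with est.predict on the same inputs
values = np.array([1.0, 2.0, 3.0])
print(fn(values))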

This validation step matters because symbolic export bugs are often subtle. A wrong function mapping or argument order can produce a plausible-looking expression that does not actually match the evolved program.
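A comparison like this can be wrapped in a small helper that reports the largest gap between the converted expression and reference predictions. This is a sketch under stated assumptions: `max_abs_error` is a hypothetical name, `symbols` must list one symbol per column of `X` in column order, and the `reference` array is a hand-written stand-in for `est.predict(X)`.

```python
import numpy as np
import sympy as sp


def max_abs_error(expr, symbols, X, reference):
    """Largest absolute gap between expr evaluated on X and reference values."""
    fn = sp.lambdify(symbols, expr, "numpy")
    columns = [X[:, i] for i in range(X.shape[1])]
    # broadcast_to covers constant expressions, which return a scalar
    predicted = np.broadcast_to(fn(*columns), reference.shape)
    return float(np.max(np.abs(predicted - reference)))


x0 = sp.symbols("X0")
expr = 2 * x0 + 1
X = np.array([[1.0], [2.0], [3.0]])
reference = np.array([3.0, 5.0, 7.0])  # stand-in for est.predict(X)

print(max_abs_error(expr, [x0], X, reference))  # -> 0.0
```

When the expression was parsed from the program's string form, compare against a small tolerance rather than exact zero, since the printed constants may be rounded relative to the values the model actually uses.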

Common Pitfalls

  • Assuming gplearn already returns a ready-made sympy object by default.
  • Building a custom parser without matching every function in the estimator's function set.
  • Treating _program as a guaranteed stable public API without checking library changes.
  • Skipping numeric validation after conversion and trusting the symbolic form blindly.

Summary

  • The learned expression lives on the estimator's best program object, commonly exposed as est._program.
  • The simplest exports are the program's string representation and Graphviz tree.
  • Converting to sympy requires mapping gplearn functions and terminals into symbolic equivalents.
  • Numeric validation is essential after conversion.
  • The best export format depends on whether you want readability, visualization, or symbolic manipulation.
