Decision Tree in Matlab

Introduction

Decision trees in MATLAB are usually built with fitctree for classification or fitrtree for regression. They are useful because the fitted model is easy to inspect, needs no feature scaling (splits depend only on the ordering of each predictor's values), and gives you a strong baseline for tabular data before you move to more complex models.

Core Sections

Classification trees with `fitctree`

A classification tree learns rules that split input features into increasingly pure groups. In MATLAB, each row of the feature matrix is an observation and the target vector contains the class labels.

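```matlab
load fisheriris                % provides meas (features) and species (labels)

X = meas;
Y = species;

tree = fitctree(X, Y);         % fit a classification tree with default settings
view(tree, 'Mode', 'graph')    % open the fitted tree in a graphical viewer
```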

This trains a classifier on the iris dataset and opens a graphical tree view. fitctree chooses each split automatically from the training data; its default split criterion is Gini's diversity index.
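To make "increasingly pure groups" concrete, the sketch below computes Gini's diversity index by hand for the iris labels, first at the root and then after one candidate split. The helper `gini` and the threshold 2.45 on petal length (column 3 of `meas`) are illustrative assumptions, not values read from a fitted tree.

```matlab
load fisheriris
labels = categorical(species);

% Hypothetical helper: Gini impurity of a categorical label vector.
% 0 means the node is pure; larger values mean more class mixing.
gini = @(y) 1 - sum((countcats(y) / numel(y)) .^ 2);

giniRoot = gini(labels);                 % three balanced classes

% Candidate split (assumed threshold): petal length < 2.45
goLeft = meas(:, 3) < 2.45;
nLeft  = sum(goLeft);
nRight = sum(~goLeft);

% Impurity after the split is the size-weighted average of the children
giniSplit = (nLeft * gini(labels(goLeft)) + nRight * gini(labels(~goLeft))) ...
            / numel(labels);

fprintf('Gini at root: %.4f, after split: %.4f\n', giniRoot, giniSplit);
```

A split is attractive when it lowers the weighted impurity; here the left branch contains only one species, so the impurity drops from about 0.67 to about 0.33.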

To make predictions:

```matlab
predictedLabels = predict(tree, X);            % predicted class for every row
accuracy = mean(strcmp(predictedLabels, Y));   % fraction of matching labels
disp(accuracy)
```

That gives you a quick training-set sanity check, though a held-out set or cross-validation is better for real evaluation.
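A held-out evaluation could look like the following sketch. The 30% test fraction and the fixed random seed are arbitrary choices made for illustration.

```matlab
load fisheriris
rng(1);                                        % fixed seed so the split is repeatable

% Stratified hold-out split: about 30% of each class goes to the test set
cv = cvpartition(species, 'HoldOut', 0.3);
isTrain = training(cv);
isTest  = test(cv);

tree = fitctree(meas(isTrain, :), species(isTrain));

% Accuracy on rows the tree never saw during fitting
testPred = predict(tree, meas(isTest, :));
testAccuracy = mean(strcmp(testPred, species(isTest)));
fprintf('Held-out accuracy: %.3f\n', testAccuracy);
```

Because the test rows play no part in fitting, this number is a more honest estimate than the training-set accuracy above, although it still varies with the random split.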

Regression trees with `fitrtree`

If the target is numeric instead of categorical, use fitrtree.

```matlab
load carsmall

X = [Weight Horsepower];
Y = MPG;

% Keep only rows with no missing predictor or response values
validRows = all(~isnan(X), 2) & ~isnan(Y);
X = X(validRows, :);
Y = Y(validRows);

rtree = fitrtree(X, Y);
predictedMPG = predict(rtree, X);
rmse = sqrt(mean((predictedMPG - Y).^2));   % training-set RMSE
disp(rmse)
```

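A regression tree predicts continuous values by choosing splits that reduce the mean squared error of the response within each branch. Each leaf then predicts the mean response of the training rows that fall into it.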
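One way to see this is to fit a tree limited to a single split and compare each leaf's prediction with the average response of the rows routed to it. This is a minimal sketch; the variable name `stump` and the choice of one split are for illustration only.

```matlab
load carsmall
X = [Weight Horsepower];
Y = MPG;
validRows = all(~isnan(X), 2) & ~isnan(Y);
X = X(validRows, :);
Y = Y(validRows);

% A one-split tree has two leaves, so it can emit only two distinct values
stump = fitrtree(X, Y, 'MaxNumSplits', 1);
pred = predict(stump, X);
leafValues = unique(pred);

for v = leafValues'
    inLeaf = (pred == v);                     % training rows routed to this leaf
    fprintf('leaf predicts %.3f, mean MPG of its rows is %.3f\n', ...
            v, mean(Y(inLeaf)));
end
```

The two printed numbers on each line should agree, which is why deeper trees with tiny leaves can track individual noisy observations.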

Control tree growth

Decision trees can overfit badly if you let them grow without constraints. MATLAB lets you control complexity through parameters such as MaxNumSplits, MinLeafSize, and pruning settings.

```matlab
tree = fitctree(X, Y, ...
    'MaxNumSplits', 10, ...
    'MinLeafSize', 5);
```

These settings limit how detailed the tree can become. Smaller leaves and more splits make the model more flexible, but also more likely to memorize noise.
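Pruning can also be applied after a tree has been grown. The sketch below prunes a default iris tree by one level and compares node counts; the level of 1 is an arbitrary illustration, not a recommended setting.

```matlab
load fisheriris
fullTree = fitctree(meas, species);

% PruneList holds the pruning level of each node; its maximum is the
% deepest level available for this particular tree
maxLevel = max(fullTree.PruneList);

% Remove the weakest subtree(s) by pruning one level (illustrative choice)
prunedTree = prune(fullTree, 'Level', 1);

fprintf('Prune levels available: %d\n', maxLevel);
fprintf('Nodes before pruning: %d, after: %d\n', ...
        fullTree.NumNodes, prunedTree.NumNodes);
```

Higher levels give smaller trees; choosing the level is best done against validation data rather than by eye.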

Evaluate with cross-validation

Training accuracy is not enough. MATLAB supports cross-validation directly on tree models.

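```matlab
load fisheriris

X = meas;
Y = species;

% 'CrossVal','on' returns a partitioned (10-fold by default) model, not a single tree
cvTree = fitctree(X, Y, 'CrossVal', 'on');
loss = kfoldLoss(cvTree);    % average misclassification rate over the folds
disp(loss)
```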

kfoldLoss reports an estimate of generalization error: by default, the misclassification rate averaged over the validation folds. For quick model comparison, this is much more useful than checking only the resubstitution error on the same data used for fitting.
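As a sketch of that kind of model comparison, the loop below scores several MinLeafSize values on the same folds. The candidate values, the 5-fold partition, and the seed are assumptions chosen for illustration.

```matlab
load fisheriris
rng(1);                                        % repeatable fold assignment

% Share one partition so every candidate is scored on identical folds
cv = cvpartition(species, 'KFold', 5);

leafSizes = [1 5 10 20];                       % illustrative candidates
cvLoss = zeros(size(leafSizes));

for i = 1:numel(leafSizes)
    cvTree = fitctree(meas, species, ...
        'MinLeafSize', leafSizes(i), ...
        'CVPartition', cv);
    cvLoss(i) = kfoldLoss(cvTree);             % mean misclassification rate
end

disp(table(leafSizes', cvLoss', 'VariableNames', {'MinLeafSize', 'CVLoss'}))
```

On a dataset this small the differences between candidates may be within noise, so treat the ranking as a rough guide.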

Inspect feature importance and structure

Trees are popular partly because they are interpretable. MATLAB exposes predictor importance estimates and visualization tools.

```matlab
tree = fitctree(X, Y);
importance = predictorImportance(tree);   % one estimate per predictor column
disp(importance)
```

The graph view and importance scores help answer two practical questions:

which variables drive the splits
whether the tree is too deep or too noisy to trust

Interpretability is not perfect, but it is far better than with many black-box models.
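Importance scores are easier to read when paired with predictor names. The sketch below labels the four iris measurements and prints them in ranked order; the short names are assumptions supplied here, since `meas` itself carries no column labels.

```matlab
load fisheriris

% Assumed labels for the four columns of meas
names = {'SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'};

tree = fitctree(meas, species, 'PredictorNames', names);
importance = predictorImportance(tree);        % 1-by-4 row vector

% Rank predictors from most to least important
[~, order] = sort(importance, 'descend');
for k = order
    fprintf('%-12s %.4f\n', names{k}, importance(k));
end

% Text view of the same tree, convenient for logs or headless sessions
view(tree, 'Mode', 'text')
```

A predictor with an importance of zero was never used in a split of this particular tree, which does not by itself mean it is uninformative.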

When a single tree is enough

A single decision tree is often a good choice when:

you need a fast baseline
interpretability matters
the feature set is tabular and mixed-scale
you want simple if-then style rules

If performance plateaus, ensembles such as bagged trees or boosted trees often outperform a single tree. Still, a plain tree is a good place to start because it makes debugging data issues easier.
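A quick way to check whether an ensemble is worth the extra complexity is to score both models on the same folds. This is a rough sketch that uses bagging through fitcensemble; the 50 learning cycles, the 5 folds, and the seed are arbitrary illustrative choices.

```matlab
load fisheriris
rng(1);                                        % repeatable folds and bootstrap samples
cv = cvpartition(species, 'KFold', 5);

% Baseline: a single tree with default settings
singleCV = fitctree(meas, species, 'CVPartition', cv);
singleLoss = kfoldLoss(singleCV);

% Bagged trees, scored on the same partition
baggedCV = fitcensemble(meas, species, ...
    'Method', 'Bag', ...
    'NumLearningCycles', 50, ...
    'CVPartition', cv);
baggedLoss = kfoldLoss(baggedCV);

fprintf('Single tree CV loss: %.3f\n', singleLoss);
fprintf('Bagged trees CV loss: %.3f\n', baggedLoss);
```

On an easy dataset such as iris the two losses may be nearly identical, in which case the single tree's readability is a reasonable tiebreaker.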

Common Pitfalls

Evaluating the model only on the training data and mistaking that for real performance.
Letting the tree grow too deep and overfit noise in the dataset.
Ignoring missing values or invalid rows before fitting the model.
Expecting a single tree to outperform ensemble methods on harder tabular problems.
Treating predictor importance scores as proof of causality rather than as model-specific heuristics.

Summary

Use fitctree for classification and fitrtree for regression in MATLAB.
Decision trees are easy to train, inspect, and use as a baseline for tabular data.
Control tree size with options such as MaxNumSplits and MinLeafSize.
Prefer cross-validation or a held-out test set over training-set accuracy.
Start with a single tree for interpretability, then move to ensembles if you need more predictive power.
