Calculate Confusion Matrix of a FastText Classifier model

Introduction

FastText can report precision and recall with built-in test commands, but it does not directly print a confusion matrix for you. To build one, you need to run predictions on labeled evaluation data, collect the true and predicted labels, and then compute the matrix yourself.

That sounds more complicated than it is. Once you have the labels in two Python lists, sklearn.metrics.confusion_matrix does the heavy lifting.

Predict On A Labeled Test Set

FastText supervised data typically stores one or more labels at the start of each line with the __label__ prefix. For a single-label classifier, you can parse the first label as the ground truth, strip it from the text, and then ask the trained model for a prediction.

python

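import fasttext

# Train a supervised model; train.txt uses the __label__ prefix format.
model = fasttext.train_supervised("train.txt")

def load_examples(path):
    """Yield (true_label, text) pairs from a single-label FastText file."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            parts = line.strip().split()
            if not parts:
                continue  # skip blank lines instead of failing on parts[0]
            true_label = parts[0]       # first token, e.g. "__label__sports"
            text = " ".join(parts[1:])  # the rest of the line is the text
            yield true_label, text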

This assumes single-label classification. If your training data is multi-label, confusion matrices need more careful interpretation because a simple one-label-per-row table no longer captures the full situation cleanly.
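
To check whether a file really is single-label, it helps to separate every leading label token from the text instead of assuming there is exactly one. The following is a minimal sketch, assuming labels are whitespace-separated tokens at the start of the line with the default `__label__` prefix. The helper name `split_labels` is hypothetical and not part of FastText.

```python
LABEL_PREFIX = "__label__"  # assumed: FastText's default label prefix

def split_labels(line, prefix=LABEL_PREFIX):
    """Split one line into (labels, text).

    Hypothetical helper: leading tokens that start with the prefix are
    treated as labels; everything from the first other token onward is text.
    """
    tokens = line.strip().split()
    labels = []
    index = 0
    while index < len(tokens) and tokens[index].startswith(prefix):
        labels.append(tokens[index])
        index += 1
    return labels, " ".join(tokens[index:])

# Made-up lines, for illustration only.
single = split_labels("__label__sports the match ended in a draw")
multi = split_labels("__label__sports __label__news late goal decides final")
print(single)
print(multi)
```

If any line yields more than one label, the file is multi-label and the one-label-per-row workflow does not apply as written.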

Collect True And Predicted Labels

Once you can load the evaluation set, gather predictions into parallel lists.

python

from sklearn.metrics import confusion_matrix, classification_report

y_true = []
y_pred = []

for true_label, text in load_examples("test.txt"):
    # k=1 returns only the top prediction; labels keep the __label__ prefix.
    predicted_labels, scores = model.predict(text, k=1)
    y_true.append(true_label)
    y_pred.append(predicted_labels[0])

# Fix the row and column order explicitly.
labels = sorted(set(y_true) | set(y_pred))
cm = confusion_matrix(y_true, y_pred, labels=labels)

print(labels)
print(cm)
print(classification_report(y_true, y_pred, labels=labels))

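Passing labels explicitly fixes the row and column order, and it gives you the names needed to annotate the matrix. A list derived from y_true and y_pred only covers classes that occur in this evaluation run, so if you need the same matrix shape across runs, pass a fixed list instead, such as the model's full label set (model.labels).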
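
For intuition about what the matrix contains, the same counts can be reproduced in plain Python: cell (i, j) is the number of examples whose true label is `labels[i]` and whose predicted label is `labels[j]`. This is an illustrative sketch with made-up data, not a replacement for scikit-learn; the function name `count_confusions` is hypothetical.

```python
from collections import Counter

def count_confusions(y_true, y_pred, labels):
    """Return a nested list: rows are true labels, columns are predicted labels."""
    pair_counts = Counter(zip(y_true, y_pred))
    return [[pair_counts[(t, p)] for p in labels] for t in labels]

# Made-up example data, for illustration only.
y_true = ["__label__a", "__label__a", "__label__b", "__label__c"]
y_pred = ["__label__a", "__label__b", "__label__b", "__label__b"]
labels = ["__label__a", "__label__b", "__label__c"]

matrix = count_confusions(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

Because the label list is passed explicitly, `__label__c` still gets its own column even though it is never predicted.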

Turn The Matrix Into Something Readable

A raw NumPy array is fine for quick debugging, but it helps to present the matrix as a DataFrame so the class names are visible.

python

import pandas as pd

cm_df = pd.DataFrame(cm, index=labels, columns=labels)
print(cm_df)

In this layout, which follows scikit-learn's convention, rows represent true labels and columns represent predicted labels. Values on the diagonal count correct classifications. Off-diagonal values show where the model confuses one class with another.

This is where confusion matrices become genuinely useful. Overall accuracy may look fine, but the matrix can reveal that two specific classes are regularly mistaken for each other.
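
One way to surface those pairs is to list the largest off-diagonal cells. The sketch below assumes rows are true labels and columns are predicted labels, and it works on any matrix that supports `cm[i][j]` indexing. The function name `top_confusions` and the example numbers are invented for illustration.

```python
def top_confusions(cm, labels, n=3):
    """Return the n largest off-diagonal cells as (true, predicted, count)."""
    cells = []
    for i, true_label in enumerate(labels):
        for j, predicted_label in enumerate(labels):
            if i != j and cm[i][j] > 0:
                cells.append((true_label, predicted_label, int(cm[i][j])))
    # Sort by count, largest first.
    cells.sort(key=lambda cell: cell[2], reverse=True)
    return cells[:n]

# Made-up 3x3 matrix (rows = true, columns = predicted), for illustration only.
example_labels = ["__label__billing", "__label__refund", "__label__shipping"]
example_cm = [
    [50, 12, 1],
    [9, 40, 0],
    [2, 1, 60],
]
for true_label, predicted_label, count in top_confusions(example_cm, example_labels):
    print(f"{true_label} predicted as {predicted_label}: {count}")
```

In this invented example, billing and refund are confused in both directions far more often than either is confused with shipping.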

Compare With FastText's Built-In Metrics

FastText already has a quick evaluation method:

python

result = model.test("test.txt")
print(result)

That returns aggregate metrics: the number of examples, precision at one, and recall at one. It does not tell you which labels are being confused. The confusion matrix fills that gap by exposing the error structure.
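
To see why an aggregate score is not enough, consider two made-up matrices with the same overall accuracy. The sketch assumes a single-label test set with one prediction per example, so the share of correct predictions is the diagonal sum divided by the total. All numbers are invented for illustration.

```python
def accuracy_from_matrix(cm):
    """Share of examples that fall on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

# Errors spread evenly across all classes.
spread_out = [
    [80, 10, 10],
    [10, 80, 10],
    [10, 10, 80],
]
# Same number of errors, concentrated between the first two classes.
concentrated = [
    [70, 30, 0],
    [30, 70, 0],
    [0, 0, 100],
]

print(accuracy_from_matrix(spread_out))
print(accuracy_from_matrix(concentrated))
```

Both matrices score the same in aggregate, but only the second one points to a specific pair of classes that needs attention.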

Handle Multi-Label Data Carefully

If each example can have multiple labels, a standard single confusion matrix is not always the right tool. You may need one-vs-rest evaluation for each label, for example with scikit-learn's multilabel_confusion_matrix, which produces one 2x2 matrix per label. The key point is that you must decide how to compare predicted and true sets of labels before calling a metric designed for one class per example.

For many FastText tutorials, the data is single-label classification, so the simple workflow above is enough. Just do not assume the same logic works unchanged when the dataset format allows multiple labels on one line.
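
As a rough illustration of the one-vs-rest idea, each label can be treated as its own yes/no task and counted separately. This sketch assumes the true and predicted labels are already available as Python sets per example; how the predicted sets are produced (for example, by choosing a probability threshold) is a decision left to you. The function name `one_vs_rest_counts` and the data are hypothetical.

```python
def one_vs_rest_counts(true_sets, predicted_sets, labels):
    """Return {label: {"tp", "fp", "fn", "tn"}}, one binary tally per label."""
    counts = {label: {"tp": 0, "fp": 0, "fn": 0, "tn": 0} for label in labels}
    for true_set, predicted_set in zip(true_sets, predicted_sets):
        for label in labels:
            in_true = label in true_set
            in_pred = label in predicted_set
            if in_true and in_pred:
                key = "tp"   # label present and predicted
            elif in_pred:
                key = "fp"   # predicted but not present
            elif in_true:
                key = "fn"   # present but missed
            else:
                key = "tn"   # absent and not predicted
            counts[label][key] += 1
    return counts

# Made-up multi-label examples, for illustration only.
true_sets = [{"__label__a", "__label__b"}, {"__label__b"}, {"__label__c"}]
predicted_sets = [{"__label__a"}, {"__label__b", "__label__c"}, {"__label__c"}]
labels = ["__label__a", "__label__b", "__label__c"]

for label, cell in one_vs_rest_counts(true_sets, predicted_sets, labels).items():
    print(label, cell)
```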

Common Pitfalls

The most common mistake is building the matrix from the training set instead of a held-out test set, which makes the model look much better than it really is. Another is forgetting that FastText predictions include the __label__ prefix, so your true labels and predicted labels must use the same format. Developers also sometimes compare labels in inconsistent orders, which leads to a matrix that is technically correct but hard to read. Finally, for imbalanced datasets, do not rely only on the confusion matrix counts. Pair it with precision, recall, and support so you understand whether rare classes are being ignored.
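
The last point can be made concrete by deriving precision, recall, and support for each class directly from the matrix. This is a minimal sketch, assuming rows are true labels and columns are predicted labels. The function name `per_class_metrics` and the numbers are invented for illustration, and `classification_report` in scikit-learn reports the same quantities.

```python
def per_class_metrics(cm, labels):
    """Return {label: (precision, recall, support)} from a confusion matrix.

    Rows are true labels and columns are predicted labels.
    Precision or recall is None when its denominator is zero.
    """
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = cm[i][i]
        support = sum(cm[i])                   # examples whose true label is this class
        predicted = sum(row[i] for row in cm)  # examples predicted as this class
        precision = true_positive / predicted if predicted else None
        recall = true_positive / support if support else None
        metrics[label] = (precision, recall, support)
    return metrics

# Made-up imbalanced matrix: the rare class is never predicted.
example_labels = ["__label__common", "__label__rare"]
example_cm = [
    [95, 0],
    [5, 0],
]
for label, values in per_class_metrics(example_cm, example_labels).items():
    print(label, values)
```

In this invented case, 95 of 100 examples are classified correctly, yet recall for the rare class is zero, which the overall count alone would hide.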

Summary

FastText does not print a confusion matrix directly, so you must compute it from predictions.
Run the model on a labeled evaluation set and collect y_true and y_pred.
Use sklearn.metrics.confusion_matrix and keep the label order explicit.
Convert the matrix to a pandas DataFrame for easier reading.
Be careful with multi-label datasets because they need a different evaluation strategy.