AttributeError: 'SMOTE' object has no attribute 'fit_sample'

Introduction

The error AttributeError: 'SMOTE' object has no attribute 'fit_sample' usually means your code is using an old API against a newer version of imbalanced-learn. In current versions, the method name is fit_resample, not fit_sample. The fix is usually straightforward, but it is also a good moment to verify that SMOTE is being used in the correct part of the training pipeline.

Replace fit_sample with fit_resample

The direct code change is:

python
from imblearn.over_sampling import SMOTE

smote = SMOTE(random_state=42)
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)

If your code still says fit_sample, update it:

python
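# old style (raises AttributeError on current imbalanced-learn)
# X_resampled, y_resampled = smote.fit_sample(X_train, y_train)
# new style
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)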

That is the main reason for the attribute error.

Why the method name changed

The fit_resample name is consistent across the library's resampling API. SMOTE fits to the training data and returns a resampled dataset, so the newer name describes what the call actually does and matches the broader library design.

In practice, the important part is that your installed imbalanced-learn version and your code example must agree.
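If the same code has to run in environments where the versions do not yet agree, one option is a small helper that picks whichever method the sampler exposes. This is only a sketch: resample_compat and the two stand-in sampler classes are hypothetical names introduced here, not part of imbalanced-learn. Updating the code to fit_resample remains the simpler fix.

```python
def resample_compat(sampler, X, y):
    """Call fit_resample when available, otherwise fall back to fit_sample.

    Hypothetical helper for illustration; not part of imbalanced-learn.
    """
    method = getattr(sampler, "fit_resample", None)
    if method is None:
        method = getattr(sampler, "fit_sample", None)
    if method is None:
        raise AttributeError(
            f"{type(sampler).__name__} has neither fit_resample nor fit_sample"
        )
    return method(X, y)


# Stand-in samplers (hypothetical) that mimic the new and the old method name.
class NewStyleSampler:
    def fit_resample(self, X, y):
        return X, y


class OldStyleSampler:
    def fit_sample(self, X, y):
        return X, y


print(resample_compat(NewStyleSampler(), [[1.0]], [0]))
print(resample_compat(OldStyleSampler(), [[1.0]], [0]))
```

In real code the sampler argument would be a SMOTE instance; the stand-ins are used here only so the sketch runs without depending on a particular installed version.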

Apply SMOTE only to the training data

Even after fixing the method name, make sure you are using SMOTE in the right place. The correct pattern is to resample the training set only:

python
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

smote = SMOTE(random_state=42)
X_train_resampled, y_train_resampled = smote.fit_resample(X_train, y_train)

Do not apply SMOTE to the test set. That would put synthetic samples into the evaluation data and change its class balance, so the scores would no longer describe performance on real data.

Use pipelines when resampling is part of model training

If the workflow includes cross-validation or repeated fitting, an imblearn pipeline is often the cleaner design:

python
from imblearn.pipeline import Pipeline
from imblearn.over_sampling import SMOTE
from sklearn.linear_model import LogisticRegression

pipeline = Pipeline([
    ("smote", SMOTE(random_state=42)),
    ("model", LogisticRegression(max_iter=1000))
])

pipeline.fit(X_train, y_train)

This keeps resampling tied to the training process and makes evaluation workflows easier to manage correctly.

Check your installed package version

If you copied code from an old blog post or notebook, the API mismatch is likely version-related. The fix is usually to update the code, not to downgrade the library blindly.

You can inspect the installed version with:

python
import imblearn
print(imblearn.__version__)

That helps explain why one tutorial uses fit_sample while your environment expects fit_resample.
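When importing the package is not desirable, for example in a quick environment check, the version can also be read from the installed package metadata with the standard library. Note that the distribution name on PyPI is imbalanced-learn, while the import name is imblearn. The installed_version helper below is a hypothetical name used for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names as published on PyPI, not import names.
for name in ("imbalanced-learn", "scikit-learn"):
    print(name, installed_version(name))
```

Printing the scikit-learn version alongside is useful because the two libraries are released with compatibility ranges in mind.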

Treat the API change as a maintenance signal

This error is also a reminder to check neighboring code. If one call still uses an old API, the rest of the notebook or project may also assume older library behavior. Updating one method name may fix the immediate exception, but it is worth scanning the rest of the imbalance-handling pipeline for version drift too.
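One way to do that scan is a short script that searches source files for remaining fit_sample calls. This is a rough sketch using only the standard library; find_old_calls and scan_project are hypothetical helper names, and the regular expression is a simple text match, so it also reports commented-out calls and does not look inside notebooks.

```python
import re
from pathlib import Path

# Matches ".fit_sample(" but not ".fit_resample(".
OLD_CALL = re.compile(r"\.fit_sample\s*\(")


def find_old_calls(source):
    """Return (line_number, stripped_line) pairs for lines calling .fit_sample(...)."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if OLD_CALL.search(line)
    ]


def scan_project(root):
    """Map each .py file under root to its old-style calls (files without hits are omitted)."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        found = find_old_calls(path.read_text(encoding="utf-8", errors="ignore"))
        if found:
            hits[str(path)] = found
    return hits


example = """\
smote = SMOTE(random_state=42)
X_a, y_a = smote.fit_sample(X_train, y_train)
X_b, y_b = smote.fit_resample(X_train, y_train)
"""
print(find_old_calls(example))
# → [(2, 'X_a, y_a = smote.fit_sample(X_train, y_train)')]
```

Calling scan_project(".") from the project root would list every Python file that still contains the old method name.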

Common Pitfalls

The most common mistake is changing the method name but still applying SMOTE before the train-test split, which causes data leakage.

Another common issue is mixing examples from old imbalanced-learn tutorials with a newer installed package version.

People also assume SMOTE should be applied to the test set for balance, but evaluation data should reflect the real target distribution you want to measure.

Finally, if you are doing cross-validation, use a proper pipeline so resampling happens inside the training folds rather than once globally.

Summary

  • fit_sample was replaced by fit_resample in newer imbalanced-learn APIs.
  • The direct fix is to call smote.fit_resample(X_train, y_train).
  • Apply SMOTE only to training data, not to the test set.
  • Prefer a pipeline when resampling is part of repeated model fitting or cross-validation.
  • Treat the error as an API-version mismatch first, not as a problem with SMOTE itself.
