What is the difference between RepeatedStratifiedKFold and StratifiedKFold in sklearn?
Introduction
StratifiedKFold and RepeatedStratifiedKFold both solve the same core problem: evaluating a classifier while preserving class balance in every fold. The difference is that one produces a single stratified split plan, while the other repeats that process multiple times with different shuffles so your estimate is less dependent on one lucky or unlucky partition.
What StratifiedKFold does
Use StratifiedKFold when you want standard k-fold cross-validation for a classification task. It keeps the ratio of labels in each fold close to the ratio in the full dataset. That matters when the target is imbalanced, because plain KFold can accidentally create folds with very few positive examples.
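To make the ratio-preserving behavior concrete, here is a minimal sketch using a hypothetical label vector that is sorted by class (90 negatives followed by 10 positives). The data and variable names are illustrative assumptions, not from a real dataset. It compares the share of positives in each validation fold for plain KFold and for StratifiedKFold.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Hypothetical labels: 90 negatives followed by 10 positives (10% positive overall).
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((len(y), 1))  # feature values do not affect how folds are built

def positive_rates(splitter):
    """Return the fraction of positive labels in each validation fold."""
    return [float(y[val_idx].mean()) for _, val_idx in splitter.split(X, y)]

plain_rates = positive_rates(KFold(n_splits=5))
stratified_rates = positive_rates(StratifiedKFold(n_splits=5))

# Without shuffling, KFold cuts contiguous blocks, so the first four folds
# contain no positives and the last fold is half positive.
print("KFold:          ", plain_rates)

# StratifiedKFold spreads the 10 positives evenly: 2 per fold, 10% each.
print("StratifiedKFold:", stratified_rates)
```

The sorted labels are a deliberately extreme case; on shuffled data plain KFold usually does better than this, but it still gives no guarantee about the composition of each fold.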
If you set n_splits=5, the estimator is trained and evaluated five times, and each sample appears in the validation set exactly once. When shuffle=False, the split is deterministic and based on input order. When shuffle=True, the samples within each class are shuffled before folds are created, which is usually safer unless the ordering is already random. Set random_state along with shuffle=True if you need the same folds on every run.
A single run with n_splits=5 gives you one set of five validation scores. It is efficient and easy to reason about, so it is often the default choice for model selection and baseline experiments.
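A single stratified five-fold run might look like the sketch below. The synthetic dataset, the LogisticRegression model, and the accuracy metric are assumptions chosen for illustration; substitute your own data and estimator.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic, imbalanced binary problem (roughly 90% / 10%); illustrative only.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.9, 0.1], random_state=0
)

# shuffle=True with a fixed random_state gives shuffled but reproducible folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
model = LogisticRegression(max_iter=1000)

# One fit and one score per fold.
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print("number of scores:", len(scores))
print("mean:", scores.mean(), "std:", scores.std())
```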
What RepeatedStratifiedKFold adds
RepeatedStratifiedKFold runs stratified k-fold more than once. Each repetition reshuffles the dataset and creates a new set of folds, so the model is evaluated across more train and validation combinations.
That matters because a single cross-validation run can still be noisy. If your dataset is small, borderline imbalanced, or sensitive to sampling, the average from one five-fold split can move around more than you would like. Repeating the process reduces the chance that your conclusion depends on one particular split.
With five folds repeated three times, you get fifteen evaluation scores instead of five. The mean is often more stable, and the standard deviation gives a better sense of score variability.
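As a sketch of the five-folds-by-three-repeats case, the example below again assumes a synthetic dataset and a LogisticRegression model. Balanced accuracy is used as the metric here only because the illustrative data is imbalanced; it is not a recommendation from the original text.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Small synthetic dataset (roughly 85% / 15%); illustrative only.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.85, 0.15], random_state=0
)

# 5 folds x 3 repeats = 15 train/validation splits.
rcv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=42)
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=rcv, scoring="balanced_accuracy")

# Splits are yielded repeat by repeat, so each row below is one full 5-fold run.
per_repeat_means = scores.reshape(3, 5).mean(axis=1)

print("number of scores:", scores.size)
print("overall mean:", scores.mean(), "std:", scores.std())
print("mean of each repeat:", per_repeat_means)
```

How far the per-repeat means differ from one another gives a rough sense of how much a single five-fold run depends on the particular partition.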
When to use each one
Choose StratifiedKFold when:
- the dataset is large enough that one cross-validation run is already stable
- training is expensive and you want to keep evaluation time under control
- you need a simple, reproducible benchmark
Choose RepeatedStratifiedKFold when:
- the dataset is small or moderately noisy
- class imbalance makes fold composition more sensitive
- you want a more reliable comparison between similar models
The trade-off is straightforward: repeated evaluation gives a more stable estimate, but it multiplies training cost by the number of repeats. A model that takes five minutes to evaluate with five folds will take roughly fifteen minutes with three repeats.
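The cost multiplier can be read directly from the splitters. This short sketch uses get_n_splits() to count how many fits each strategy requires; the variable names are illustrative.

```python
from sklearn.model_selection import RepeatedStratifiedKFold, StratifiedKFold

single = StratifiedKFold(n_splits=5)
repeated = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)

# Each split means one model fit plus one evaluation.
fits_single = single.get_n_splits()      # 5
fits_repeated = repeated.get_n_splits()  # 5 * 3 = 15

cost_ratio = fits_repeated / fits_single
print(fits_single, fits_repeated, cost_ratio)  # 5 15 3.0
```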
Interpreting the results
A common mistake is to assume repeated cross-validation produces a fundamentally different metric. It does not. You are still measuring the same thing, but with more resampling. Think of it as spending more compute to reduce sampling noise.
For example, if two models differ by only a tiny amount, a single StratifiedKFold run may not be enough to trust the ranking. Repeating the folds gives you more evidence that the observed difference is consistent.
Evaluating several classifiers under the same repeated split strategy is a useful pattern for this kind of comparison.
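One way to compare several classifiers on identical repeated splits is sketched below. The two models, the synthetic data, and the ROC AUC metric are assumptions for illustration. Because the splitter has an integer random_state, it yields the same splits each time it is used, so every model is scored on exactly the same partitions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic dataset (roughly 85% / 15%); illustrative only.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.85, 0.15], random_state=0
)

# One shared splitter; the fixed integer seed makes the splits repeatable.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=42)

# Hypothetical candidates; swap in the models you actually want to compare.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

results = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    results[name] = (scores.mean(), scores.std())

for name, (mean, std) in results.items():
    print(f"{name}: mean={mean:.3f} std={std:.3f}")
```

Reporting the standard deviation next to the mean makes it easier to judge whether a small gap between models is larger than the split-to-split variation.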
Common Pitfalls
- Using these splitters for regression tasks. Stratification is designed for classification labels, not continuous targets.
- Forgetting shuffle=True with StratifiedKFold when input rows are ordered by class or time. Ordered data can distort the folds.
- Comparing models with different random seeds or different splitters. Use the same cross-validation object for a fair comparison.
- Treating repeated cross-validation as free. It improves stability, but it can become expensive for large datasets or heavy models.
- Relying only on mean score. The spread of scores matters, especially when model performance is close.
Summary
- StratifiedKFold creates one stratified set of k folds for classification.
- RepeatedStratifiedKFold repeats stratified k-fold multiple times with different shuffles.
- Repetition usually gives a more stable estimate of model performance.
- The benefit of repetition comes with a proportional increase in compute cost.
- For fast baseline work, StratifiedKFold is often enough.
- For noisy or small datasets, RepeatedStratifiedKFold is often the safer choice.

