K-fold Cross Validation and GridSearchCV

Introduction

K-fold cross validation estimates how well a model generalizes by training and evaluating it on multiple train-validation splits. GridSearchCV builds on that idea by trying many hyperparameter combinations and scoring each one with cross validation, then returning the best configuration.

What K-fold cross validation does

In K-fold cross validation, the dataset is split into k folds. The model trains on k - 1 folds and validates on the remaining fold. This repeats until every fold has served as the validation fold once.

For example, with k = 5:

  • fold 1 is validation, folds 2 to 5 are training,
  • then fold 2 is validation, folds 1, 3, 4, and 5 are training,
  • and so on until all five validations are complete.
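The rotation above can be sketched with scikit-learn's KFold splitter. This is a minimal illustration on ten toy samples (the toy array is an assumption chosen for readability); shuffling is left off so each fold is a contiguous block of indices.

```python
import numpy as np
from sklearn.model_selection import KFold

# Ten toy samples indexed 0..9, so fold membership is easy to read.
X = np.arange(10).reshape(-1, 1)

kf = KFold(n_splits=5)  # no shuffling: folds are contiguous blocks

folds = []
for fold, (train_idx, val_idx) in enumerate(kf.split(X), start=1):
    folds.append((train_idx.tolist(), val_idx.tolist()))
    print(f"fold {fold}: validation={val_idx.tolist()} training={train_idx.tolist()}")
```

Each sample index appears in exactly one validation fold and in the training portion of the other four.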

The final score is usually the mean of the five validation scores.
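As a rough sketch of that averaging step, cross_val_score returns one score per fold, and the mean is taken afterwards. The choice of LogisticRegression and the iris dataset here is an assumption for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Iris is ordered by class, so shuffle before splitting.
# random_state fixes the shuffle so the split is repeatable.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
model = LogisticRegression(max_iter=1000)

# One accuracy value per validation fold.
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
mean_score = scores.mean()

print("per-fold scores:", scores)
print("mean:", mean_score, "std:", scores.std())
```

The standard deviation across folds gives a quick sense of how much the estimate depends on the particular split.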

This gives a more stable estimate of performance than a single train-test split, because every sample is used for validation exactly once and the result does not depend on one particular split.

What GridSearchCV adds

Hyperparameters are settings you choose before training, such as:

  • C for logistic regression,
  • max_depth for a decision tree,
  • or n_neighbors for k-nearest neighbors.

GridSearchCV takes a parameter grid and evaluates every combination with cross validation. In scikit-learn, it looks like this:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Every combination of these values is evaluated: 3 x 2 = 6 candidates.
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
}

grid = GridSearchCV(
    estimator=SVC(),
    param_grid=param_grid,
    cv=5,                # 5-fold cross validation per candidate
    scoring='accuracy',
)

grid.fit(X, y)

print(grid.best_params_)  # best hyperparameter combination
print(grid.best_score_)   # its mean cross-validated accuracy
```

This tries all six combinations of C and kernel (3 × 2), evaluates each with 5-fold cross validation, and reports the best parameters along with their mean validation score.

Why they are often used together

Cross validation by itself tells you how well, and how consistently, one fixed model configuration performs across repeated splits. GridSearchCV adds systematic hyperparameter tuning. Together, they answer two useful questions at once:

  • how well does the model generalize,
  • and which hyperparameter setting performs best under that evaluation scheme.

That is why GridSearchCV is such a common baseline in scikit-learn workflows. It gives you a reproducible, inspectable search process instead of a one-off manual tuning guess. That makes experiments easier to explain and repeat.
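To make the "inspectable" part concrete, here is a small sketch that reads the per-candidate results from the fitted search object's cv_results_ attribute. The grid mirrors the C and kernel values used earlier; the sorting and printing are only one possible way to present them.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

grid = GridSearchCV(
    estimator=SVC(),
    param_grid={"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X, y)

results = grid.cv_results_

# Order candidates by rank (1 = best mean validation score).
order = sorted(
    range(len(results["params"])),
    key=lambda i: results["rank_test_score"][i],
)
for i in order:
    print(
        results["rank_test_score"][i],
        results["params"][i],
        round(results["mean_test_score"][i], 3),
        round(results["std_test_score"][i], 3),
    )
```

Looking at the whole table, not just best_params_, shows whether the winner is clearly ahead or only marginally better than its neighbors.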

Use stratification when classes are imbalanced

For classification, it is usually better to use stratified folds so each fold preserves the approximate class distribution. In scikit-learn, GridSearchCV does this by default: when cv is an integer and the estimator is a classifier with binary or multiclass targets, it uses StratifiedKFold. It is still worth understanding why it matters.

If one fold accidentally contains too few examples of a minority class, the score becomes noisier and the parameter search becomes less trustworthy.
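The effect is easy to see on a synthetic label vector. The sketch below assumes a toy dataset of 90 negatives followed by 10 positives (made up for illustration) and counts how many positives land in each validation fold under plain and stratified splitting.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Hypothetical imbalanced labels: 90 negatives, then 10 positives.
y = np.array([0] * 90 + [1] * 10)
X = np.zeros((100, 1))  # features are irrelevant for the split itself

def positives_per_fold(splitter):
    """Count minority-class samples in each validation fold."""
    return [int(y[val_idx].sum()) for _, val_idx in splitter.split(X, y)]

plain_counts = positives_per_fold(KFold(n_splits=5))
stratified_counts = positives_per_fold(
    StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
)

print("KFold:          ", plain_counts)
print("StratifiedKFold:", stratified_counts)
```

With unshuffled KFold on this ordered data, four of the five validation folds contain no positives at all, while StratifiedKFold spreads them evenly.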

Common Pitfalls

The biggest mistake is tuning hyperparameters on the whole dataset and then reporting the best cross-validation score as if it were a final test-set result. Because that score is the maximum over all candidates, it is optimistically biased. You still need a truly held-out test set if you want an unbiased final evaluation.
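One way to set this up, sketched under the assumption of an 80/20 split on iris, is to hold out the test set first, run the search on the training portion only, and score the refitted best model on the untouched test data once at the end.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out 20% before any tuning; stratify keeps class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

grid = GridSearchCV(
    estimator=SVC(),
    param_grid={"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train, y_train)  # the search never sees X_test

# With the default refit=True, score() uses the best estimator
# retrained on all of X_train.
test_accuracy = grid.score(X_test, y_test)

print("best CV score (used for selection):", grid.best_score_)
print("held-out test accuracy:", test_accuracy)
```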

Another issue is building a parameter grid that is much too large. GridSearchCV is exhaustive by design, so every extra parameter combination increases training cost.
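A quick way to gauge the cost before fitting is to count the candidates with ParameterGrid. In the sketch below, the larger grid and its gamma values are hypothetical, added only to show how quickly the count grows.

```python
from sklearn.model_selection import ParameterGrid

cv = 5

small_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
# Hypothetical extension: one more parameter with four values.
large_grid = {**small_grid, "gamma": [0.001, 0.01, 0.1, 1]}

small_candidates = len(ParameterGrid(small_grid))  # 3 * 2
large_candidates = len(ParameterGrid(large_grid))  # 3 * 2 * 4

# Each candidate is fitted once per fold (a final refit on the full
# training data adds one more fit when refit=True).
small_fits = small_candidates * cv
large_fits = large_candidates * cv

print(small_candidates, small_fits)
print(large_candidates, large_fits)
```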

Be careful with preprocessing too. If scaling or encoding is needed, put it in a Pipeline so each cross-validation fold applies preprocessing correctly without leaking information from validation data into training.
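A minimal sketch of that pattern, assuming StandardScaler as the preprocessing step: the step names "scaler" and "svc" are arbitrary labels chosen here, and grid keys use the step name followed by a double underscore to address parameters inside the pipeline.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# The scaler is refitted on the training part of every fold, so
# validation data never influences the scaling statistics.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", SVC()),
])

param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__kernel": ["linear", "rbf"],
}

grid = GridSearchCV(pipe, param_grid=param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)

print(grid.best_params_)
print(grid.best_score_)
```

Scaling X once up front and then cross-validating would let statistics from the validation folds leak into training, which is exactly what the pipeline avoids.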

Finally, choose a scoring metric that matches the problem. Accuracy is common, but for imbalanced classification, F1, recall, precision, or ROC AUC may be more meaningful.
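As an illustration, the scoring argument can name a different metric or several at once. The sketch below assumes a synthetic imbalanced dataset from make_classification and a LogisticRegression estimator (both chosen for the example); it tracks F1 and ROC AUC and selects the best candidate by F1.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic binary problem with roughly 10% positives (assumed for illustration).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.9, 0.1], random_state=0
)

grid = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1, 10]},
    cv=5,
    # Several metrics are recorded; refit names the one used for selection.
    scoring={"f1": "f1", "roc_auc": "roc_auc"},
    refit="f1",
)
grid.fit(X, y)

print("best params by F1:", grid.best_params_)
print("mean F1 per candidate:", grid.cv_results_["mean_test_f1"])
print("mean ROC AUC per candidate:", grid.cv_results_["mean_test_roc_auc"])
```

On data like this, a model that always predicts the majority class would reach about 90% accuracy, which is why accuracy alone can be misleading.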

Summary

  • K-fold cross validation evaluates a model across multiple train-validation splits.
  • GridSearchCV tests many hyperparameter combinations using cross validation.
  • Together they provide a practical way to tune and evaluate a model.
  • Use pipelines and appropriate scoring metrics to avoid misleading results.
  • Keep a separate test set if you need an unbiased final performance estimate.
