How to estimate the progress of a GridSearchCV from verbose output in Scikit-Learn?

Introduction

GridSearchCV can take a long time, especially when the grid is large and cross-validation uses many folds. The verbose output gives enough information to estimate progress, but you have to translate "candidates," "folds," and parallel jobs into something closer to a real percentage.

The First Line Gives You the Total Number of Fits

When verbose is set to 1 or higher, scikit-learn prints a line like this near the start:

text
Fitting 5 folds for each of 12 candidates, totalling 60 fits

That line is the key. The total work is:

python
total_fits = number_of_candidates * number_of_folds

So in the example above:

python
total_fits = 12 * 5
print(total_fits)  # 60

If you know how many individual fit jobs have finished, you can estimate percentage complete as:

python
progress = completed_fits / total_fits
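If the verbose output is being captured to a file or a string, the totals can be pulled out of that first line automatically. The following is a minimal sketch, assuming the header wording matches the example above; `HEADER_RE` and `parse_header` are hypothetical names, not part of scikit-learn.

```python
import re

# Pattern for the header line shown above. The exact wording is assumed
# to match the installed scikit-learn version.
HEADER_RE = re.compile(
    r"Fitting (\d+) folds for each of (\d+) candidates, totalling (\d+) fits"
)


def parse_header(line):
    """Return (folds, candidates, total_fits), or None if the line does not match."""
    match = HEADER_RE.search(line)
    if match is None:
        return None
    folds, candidates, total_fits = (int(group) for group in match.groups())
    return folds, candidates, total_fits


header = "Fitting 5 folds for each of 12 candidates, totalling 60 fits"
folds, candidates, total_fits = parse_header(header)
print(folds, candidates, total_fits)  # 5 12 60
```

Returning None for a non-matching line lets the caller scan a log line by line until the header turns up.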

How to Count Completed Fits from Verbose Output

In recent scikit-learn releases, setting verbose to 2 or higher prints a line as each fit finishes, and 3 or higher adds the fold index and the score to that line. A typical example looks like this:

text
[CV 3/5] END max_depth=10, n_estimators=200;, score=0.842 total time=2.1s

Each of those lines represents one completed fit. If the run has 60 total fits and you have seen 15 completed-fit lines, then the rough progress estimate is:

python
completed_fits = 15
total_fits = 60

print(f"{completed_fits / total_fits:.0%}")  # 25%

This is the simplest mental model, and it works surprisingly well as a first approximation.
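Counting by eye gets tedious on a long run, so the same count can be done over captured output. This is a rough sketch that assumes each finished fit produces one line starting with a `[CV ...] END` prefix, as in the example above; the sample log lines and their scores are made up for illustration, and `count_completed_fits` is a hypothetical helper.

```python
import re

# Matches prefixes such as "[CV 3/5] END " and "[CV] END ".
END_RE = re.compile(r"^\[CV[^\]]*\] END ")


def count_completed_fits(log_text):
    """Count the lines that report a finished fit."""
    return sum(1 for line in log_text.splitlines() if END_RE.match(line.strip()))


# Made-up sample output, for illustration only.
sample_log = """\
Fitting 5 folds for each of 12 candidates, totalling 60 fits
[CV 1/5] END max_depth=10, n_estimators=200;, score=0.851 total time=2.0s
[CV 2/5] END max_depth=10, n_estimators=200;, score=0.839 total time=2.2s
[CV 3/5] END max_depth=10, n_estimators=200;, score=0.842 total time=2.1s
"""

completed_fits = count_completed_fits(sample_log)
total_fits = 60
print(f"{completed_fits}/{total_fits} fits = {completed_fits / total_fits:.0%}")  # 3/60 fits = 5%
```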

Estimating Remaining Time

If verbose output includes per-fit timings, you can also estimate the remaining wall-clock time. Suppose the average completed fit has taken 2.1 seconds so far and 45 fits remain.

With one worker:

python
remaining_seconds = 45 * 2.1
print(remaining_seconds)  # 94.5

With parallel execution, divide by the number of workers as a rough approximation:

python
remaining_seconds = (45 * 2.1) / 4
print(remaining_seconds)  # about 23.6

This is only approximate because fit times are rarely uniform. Some parameter combinations train much faster than others, and the last batch of jobs may not use all workers evenly.
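One way to account for an uneven last batch is to round up to whole "waves" of workers instead of dividing exactly. The sketch below assumes every fit takes the mean time and that workers start and finish in lockstep, which real runs do not do, so treat the result as a slightly more conservative guess rather than a prediction. `estimate_remaining_seconds` is a hypothetical helper.

```python
import math


def estimate_remaining_seconds(remaining_fits, mean_fit_seconds, n_workers=1):
    """Rough ETA: fits run in waves of n_workers, each wave taking one mean fit time."""
    waves = math.ceil(remaining_fits / n_workers)
    return waves * mean_fit_seconds


# One worker: 45 waves of one fit each.
print(f"{estimate_remaining_seconds(45, 2.1):.1f}")  # 94.5

# Four workers: ceil(45 / 4) = 12 waves, the last one only partly full.
print(f"{estimate_remaining_seconds(45, 2.1, n_workers=4):.1f}")  # 25.2
```

With four workers this gives about 25.2 seconds instead of 23.6, because the final wave still takes a full fit time even though only one worker is busy.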

Parallel Jobs Make Progress Look Uneven

n_jobs changes the shape of the progress, not just the speed. If four workers are running fits simultaneously, the console may print results in bursts. That can make the search feel stalled and then suddenly jump forward.

So do not assume that silent output means nothing is happening. It may simply mean that several long-running fits are still in progress and none has finished printing yet.

This is also why ETA estimates improve later in the run. Early timings are based on a small sample and can be very misleading.

You Can Compute the Grid Size Up Front

Before running the search, you can calculate the number of parameter candidates directly from the grid:

python
from sklearn.model_selection import ParameterGrid

param_grid = {
    "max_depth": [5, 10, None],
    "n_estimators": [100, 200],
}

# ParameterGrid supports len() directly, so no list() is needed.
candidate_count = len(ParameterGrid(param_grid))
print(candidate_count)  # 6

Then multiply by your cross-validation fold count. That gives you the same total number of fits that verbose mode later reports, and it helps you sanity-check the run before you start.
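For a quick check without importing scikit-learn, the same arithmetic can be written by hand. This is a dependency-free sketch, assuming the grid is either a single dict or a list of dicts (the two shapes `param_grid` accepts); `count_candidates` is a hypothetical helper, and `len(ParameterGrid(param_grid))` should report the same candidate count.

```python
from math import prod


def count_candidates(param_grid):
    """Count parameter combinations in a dict grid or a list of dict grids."""
    grids = [param_grid] if isinstance(param_grid, dict) else param_grid
    # Each sub-grid contributes the product of its value-list lengths.
    return sum(prod(len(values) for values in grid.values()) for grid in grids)


param_grid = [
    {"max_depth": [5, 10, None], "n_estimators": [100, 200]},  # 3 * 2 = 6
    {"max_depth": [3], "n_estimators": [50, 100, 200]},  # 1 * 3 = 3
]
n_folds = 5

candidate_count = count_candidates(param_grid)
total_fits = candidate_count * n_folds
print(candidate_count, total_fits)  # 9 45
```

Sub-grids in a list are added, not multiplied, which is why splitting a grid into several dicts can cut the workload considerably.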

A Better Interpretation Habit

When reading verbose output, think in terms of "fits completed," not "parameter sets completed." One parameter set is not done until all folds for that setting have finished. That distinction matters because a line from one fold does not mean the full candidate has been evaluated.

If you want a more stable ETA, wait until at least ten to twenty fits have finished before trusting the average timing.
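That waiting rule can be built into a small helper that refuses to report an average until enough fits have finished. The sketch below assumes timings appear as `total time=` followed by a number and either `s` or `min`, which matches the example line earlier but may differ between versions; the helper names and the default threshold of 10 are illustrative choices.

```python
import re
from statistics import mean

# Assumed timing format: "total time=   2.1s" or "total time= 1.5min".
TIME_RE = re.compile(r"total time=\s*([\d.]+)(s|min)")


def parse_fit_seconds(line):
    """Return the fit time in seconds, or None if the line has no timing."""
    match = TIME_RE.search(line)
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2)
    return value * 60 if unit == "min" else value


def stable_mean_fit_seconds(lines, min_samples=10):
    """Return the mean fit time, or None until min_samples timings are available."""
    times = [t for t in map(parse_fit_seconds, lines) if t is not None]
    if len(times) < min_samples:
        return None
    return mean(times)


# Made-up sample lines, for illustration only.
lines = ["[CV 1/5] END max_depth=10;, score=0.840 total time=   2.0s"] * 12

print(stable_mean_fit_seconds(lines[:5]))  # None
print(stable_mean_fit_seconds(lines))  # 2.0
```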

Common Pitfalls

The biggest pitfall is forgetting that total work equals candidates times folds. Counting only the parameter combinations underestimates the runtime by a factor equal to the number of folds.

Another mistake is assuming every fit takes the same amount of time. In real model grids, some settings are much slower, so an ETA based on the first few fits can be far off in either direction.

Parallel execution also causes confusion. Output lines may arrive out of order or in bursts, so progress does not always look linear even when the search is healthy.

Summary

  • Read the first verbose line to get the total number of fits.
  • Estimate progress as completed-fit lines divided by total fits.
  • Multiply candidate count by cross-validation folds to understand the real workload.
  • Use per-fit times for ETA, but expect rough estimates early in the run.
  • Parallel jobs speed things up, but they also make console progress appear uneven.
