Random forest class_weight and sample_weight parameters
Introduction
Imbalanced classification data can make a random forest look accurate while still failing on the minority class. In scikit-learn, class_weight and sample_weight are two different levers for this problem. They sound similar, but they operate at different levels and should be chosen based on whether imbalance is class-wide or record-specific.
Understanding that difference early prevents weeks of misleading evaluation and unnecessary model complexity.
What class_weight Changes
class_weight applies a multiplier per class label. During split evaluation, mistakes on higher-weight classes cost more, so trees are encouraged to create boundaries that better protect those classes.
The most common option is class_weight="balanced", which scales each class inversely to its frequency in the training data: each class receives the weight n_samples / (n_classes * count of that class). This is a fast baseline when imbalance is global and you do not have extra confidence signals per row.
This often improves minority recall, though precision may drop. Evaluate with class-specific metrics, not only overall accuracy.
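As a minimal sketch of the "balanced" option, the snippet below fits a forest on synthetic data and reproduces the weights by hand. The dataset, its 95/5 class split, and the variable names are assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.class_weight import compute_class_weight

# Synthetic, illustrative data: roughly 95% negatives and 5% positives.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)

# Weights that class_weight="balanced" derives from the training labels.
classes = np.unique(y)
balanced = compute_class_weight(class_weight="balanced", classes=classes, y=y)

# Same values computed by hand: n_samples / (n_classes * count_per_class).
manual = len(y) / (len(classes) * np.bincount(y))

# The forest applies these class-level multipliers during fitting.
clf = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
clf.fit(X, y)

print(dict(zip(classes.tolist(), balanced.round(3).tolist())))
```

The rarer class ends up with the larger multiplier, which is what pushes the trees to protect it.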
What sample_weight Changes
sample_weight is passed to fit and weights individual training rows. Use it when some records are more trustworthy, more expensive to misclassify, or need temporal emphasis.
For example, you might upweight recent fraud records or downweight noisy labels.
With sample_weight, you can go beyond class imbalance. You can express business value per row, confidence by source system, or freshness effects in evolving datasets.
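One way to express row-level importance is sketched below. The per-row metadata (age_days, noisy_label), the 180-day half-life, and the 0.3 trust factor are all hypothetical values chosen for illustration, not recommendations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic, illustrative data.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.9, 0.1], random_state=0
)

rng = np.random.default_rng(0)
# Hypothetical per-row metadata: record age in days and a noisy-label flag.
age_days = rng.integers(0, 365, size=len(y))
noisy_label = rng.random(len(y)) < 0.1

# Recency emphasis: a row's weight halves every 180 days (assumed half-life).
recency = 0.5 ** (age_days / 180.0)
# Downweight rows whose labels are suspected to be noisy (assumed factor).
trust = np.where(noisy_label, 0.3, 1.0)

sample_weight = recency * trust

# sample_weight is passed to fit, not to the constructor.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, y, sample_weight=sample_weight)
```

Because the weights are built from row attributes rather than labels, they can encode freshness or trust even when the classes are balanced.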
Using Both Together
You can combine class_weight and sample_weight. Internally, they interact multiplicatively during training. That can be useful, but it can also over-amplify minority or high-cost regions if you are not careful.
A practical approach:
- start with class_weight="balanced" alone.
- tune core tree settings such as depth and minimum leaf size.
- add sample_weight only when you have a clear row-level rationale.
- re-evaluate calibration and threshold performance.
Do not assume higher weighting is always better. Very aggressive weighting can increase variance and false positives.
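To see how quickly combined weights can over-amplify, the sketch below multiplies "balanced" class weights by a row-level weight and checks how much of the total weight the minority class ends up holding. The 90/10 labels and the 3x cost multiplier are assumed purely for illustration.

```python
import numpy as np
from sklearn.utils.class_weight import compute_sample_weight

# Illustrative labels: 90 majority rows and 10 minority rows.
y = np.array([0] * 90 + [1] * 10)

# Hypothetical business rule: minority rows are 3x as costly to miss.
row_weight = np.ones(len(y))
row_weight[y == 1] = 3.0

# Per-row expansion of the "balanced" class weights.
class_part = compute_sample_weight("balanced", y)

# The two weightings multiply, so the effective per-row weight is the product.
effective = class_part * row_weight

balanced_share = class_part[y == 1].sum() / class_part.sum()
combined_share = effective[y == 1].sum() / effective.sum()
print(balanced_share, combined_share)
```

With "balanced" alone the minority class holds half of the total weight; stacking the 3x row weight on top raises that to three quarters, which may be more correction than intended.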
Evaluation Strategy That Matches Imbalance
Use metrics that reflect the target decision quality:
- recall and precision for minority class.
- PR-AUC for rare positive cases.
- confusion matrix at your operating threshold.
- cost-based score when false positives and false negatives have different impact.
For thresholded decisions, tune the threshold after model training, not during random forest fitting. Weighting influences the learned probabilities and ranking, while the threshold controls the business trade-off at inference time.
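The sketch below shows one possible way to compute these metrics and pick a threshold after training. The synthetic data, the 0.80 recall target, and the rule "highest precision among thresholds that meet the recall target" are assumptions for illustration; a real operating point should come from your own cost trade-off.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    average_precision_score,
    confusion_matrix,
    precision_recall_curve,
)
from sklearn.model_selection import train_test_split

# Synthetic, illustrative data with a rare positive class.
X, y = make_classification(
    n_samples=3000, n_features=10, weights=[0.95, 0.05], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=0
)
clf.fit(X_train, y_train)
scores = clf.predict_proba(X_val)[:, 1]  # estimated P(class 1)

# PR-AUC, estimated here with average precision.
pr_auc = average_precision_score(y_val, scores)

# precision and recall have one more entry than thresholds; drop the last.
precision, recall, thresholds = precision_recall_curve(y_val, scores)
target_recall = 0.80  # hypothetical business requirement
meets_target = recall[:-1] >= target_recall
best = np.argmax(np.where(meets_target, precision[:-1], -1.0))
threshold = thresholds[best]

# Confusion matrix at the chosen operating threshold.
y_pred = (scores >= threshold).astype(int)
cm = confusion_matrix(y_val, y_pred)
print(f"PR-AUC={pr_auc:.3f} threshold={threshold:.3f}")
print(cm)
```

The threshold is selected on held-out data, so the forest itself is never refit while exploring operating points.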
Common Pitfalls
- Judging success by overall accuracy on highly imbalanced data.
- Using both class_weight and heavy sample_weight without checking overcorrection.
- Forgetting stratified splitting and creating unstable evaluation sets.
- Assuming weighting replaces feature engineering and data-quality cleanup.
- Not re-tuning the decision threshold, or recalibrating probabilities, after changing class or sample weights.
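The stratified-splitting pitfall can be illustrated with a small sketch. The 95/5 label vector and the placeholder feature matrix are assumptions; the point is only that stratification spreads the few minority rows evenly across folds.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Illustrative labels: 95 majority rows and 5 minority rows.
y = np.array([0] * 95 + [1] * 5)
X = np.zeros((len(y), 1))  # placeholder features; only the labels matter here

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Count minority rows that land in each validation fold.
minority_per_fold = [int(y[test_idx].sum()) for _, test_idx in skf.split(X, y)]
print(minority_per_fold)
```

Without stratification, some folds could contain no minority rows at all, which makes per-fold recall undefined or highly unstable.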
Summary
- class_weight handles class-level imbalance and is a strong first baseline.
- sample_weight handles row-level importance and business-specific costs.
- Both can be combined, but over-weighting can hurt precision and stability.
- Evaluate with minority-focused metrics, not only aggregate accuracy.
- Tune model settings during training, then choose the decision threshold on held-out data for production performance.

