Cost function in logistic regression gives NaN as a result
Introduction
Logistic regression is a foundational machine learning algorithm used for binary classification tasks. It estimates the probability that a given input belongs to a category, usually employing the logistic (sigmoid) function. However, during the training phase, particularly when calculating the cost function, practitioners may encounter an unusual problem where the cost function returns NaN (Not a Number). This can result from numerical instability or data preprocessing issues. This article aims to explore the reasons why this happens and how to resolve it.
The Logistic Regression Model
Logistic Function
Logistic regression models the probability that the dependent variable belongs to a particular category. The logistic (sigmoid) function is defined as:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

where $z$ is the weighted sum of the input features.
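As a minimal sketch, assuming NumPy is available, the definition can be transcribed directly into Python. The names `sigmoid` and `z` are illustrative and not taken from any particular library.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), applied element-wise."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Moderate inputs map smoothly into the open interval (0, 1).
probs = sigmoid(np.array([-2.0, 0.0, 2.0]))
```

This direct form is fine for moderate inputs. For inputs of very large magnitude, `np.exp` can overflow, which is one of the failure modes this article discusses.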
Cost Function
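The cost function for logistic regression is given by the log loss function:

$$J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log \hat{y}^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - \hat{y}^{(i)}\right) \right]$$

where:
- $m$ is the number of training samples.
- $y^{(i)}$ is the true label of sample $i$.
- $\hat{y}^{(i)}$ is the predicted probability for sample $i$.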
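The log loss can be sketched in a few lines of NumPy. This is an illustrative, unprotected implementation (the function name `log_loss` and the sample values are assumptions made for the example), so it only behaves well when every predicted probability lies strictly between 0 and 1.

```python
import numpy as np

def log_loss(y_true, y_pred):
    """Mean log loss; assumes 0 < y_pred < 1 for every sample."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    m = y_true.shape[0]  # number of training samples
    per_sample = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return -np.sum(per_sample) / m

# Hypothetical labels and predicted probabilities, all safely inside (0, 1).
y = np.array([1, 0, 1])
p = np.array([0.9, 0.2, 0.6])
cost = log_loss(y, p)
```

Each sample contributes only one of the two logarithm terms, because the other is multiplied by zero. That multiplication is harmless here, but it becomes the source of NaN once a prediction reaches exactly 0 or 1.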
Why Does the Cost Function Return NaN?
Several issues can lead to NaN values in the cost function:
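- Logarithm of Zero: When the predicted probability $\hat{y}$ is exactly 0 or 1, $\log(\hat{y})$ or $\log(1 - \hat{y})$ evaluates to $-\infty$. Multiplying that by a zero label term gives $0 \cdot (-\infty)$, which is NaN.
- Numerical Overflow/Underflow: Large positive or negative inputs to the sigmoid function can overflow the exponential. Depending on how the sigmoid is written, this either produces $\infty / \infty$ (NaN) directly or saturates the output to exactly 0 or 1, which then triggers the logarithm problem.
- Data Precision: Very small feature values or a large range of values in a dataset can push intermediate results past floating-point precision limits.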
- Improper Data Scaling: Features that have not been normalized or standardized can produce very large weighted sums, which saturate the sigmoid and cause the numerical issues above.
- Extreme Learning Rate: A learning rate that is too large can cause drastic parameter updates, so the weights diverge and the sigmoid inputs overflow.
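Two of these failure modes are easy to reproduce. The sketch below assumes NumPy with default float64 arrays; the variable names and input values are chosen only for illustration.

```python
import numpy as np

# Suppress NumPy's floating-point warnings so only the values are shown.
with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
    # Case 1: predictions saturated at exactly 0 or 1.
    y = np.array([1.0, 0.0])
    y_hat = np.array([1.0, 0.0])  # confident and correct, yet problematic
    terms = y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)
    cost = -np.mean(terms)        # 0 * log(0) = 0 * (-inf) = nan

    # Case 2: a sigmoid written as exp(z) / (1 + exp(z)) with a huge input.
    z = np.array([1000.0])
    ratio = np.exp(z) / (1.0 + np.exp(z))  # inf / inf = nan
```

In the first case the predictions are perfectly correct, and the cost is still NaN because the unused logarithm term is $-\infty$ and is multiplied by zero.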
Handling NaN in the Cost Function
Techniques to Prevent NaN
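- Clipping Predictions: Constrain the predicted probability $\hat{y}$ to a range slightly away from 0 and 1, such as $[\epsilon, 1 - \epsilon]$ for a small $\epsilon$, so that neither logarithm ever receives zero.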
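A minimal sketch of clipping with `np.clip` follows. The function name `clipped_log_loss` and the default `eps=1e-15` are assumptions made for this example. A suitable epsilon depends on the floating-point type in use.

```python
import numpy as np

def clipped_log_loss(y_true, y_pred, eps=1e-15):
    """Mean log loss with predictions clipped to [eps, 1 - eps]."""
    y_true = np.asarray(y_true, dtype=float)
    # Keep every probability strictly inside (0, 1) before taking logs.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    per_sample = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return -np.mean(per_sample)

y = np.array([1.0, 0.0])
# Saturated but correct predictions: finite cost close to zero.
cost_correct = clipped_log_loss(y, np.array([1.0, 0.0]))
# Saturated and wrong predictions: large but finite cost.
cost_wrong = clipped_log_loss(y, np.array([0.0, 1.0]))
```

Clipping trades a tiny bias in the reported cost for a guarantee that the result stays finite, so a confidently wrong prediction yields a large penalty instead of NaN or infinity.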

