Cast string to float is not supported in Linear Model
Introduction
This error means your linear model received string data where it expected numeric features. Linear models operate on numbers, so if a column contains category labels such as "red", numbers stored as text such as "42", or missing values encoded as words such as "NA", the model cannot automatically turn those values into a usable floating-point tensor.
The fix is not to force the model to cast blindly. The fix is to preprocess the input correctly: convert numeric-looking strings into numeric values, and encode categorical strings into numeric features before training.
Why Linear Models Need Numeric Input
A linear model computes a weighted sum of feature values. That requires arithmetic such as multiplication and addition, which only makes sense for numeric tensors.
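To make the arithmetic concrete, here is a minimal pure-Python sketch of a weighted sum. The function name `linear_predict` and the sample values are hypothetical and chosen only for illustration; a real framework performs the same computation on tensors.

```python
def linear_predict(weights, features, bias):
    """Weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


# Numeric features: multiplication and addition are well defined.
prediction = linear_predict([0.5, 2.0], [34.0, 1.0], bias=1.0)

# String features: multiplying a float weight by a string raises TypeError,
# which is the plain-Python analogue of the cast error.
try:
    linear_predict([0.5, 2.0], ["34", "red"], bias=1.0)
    failed = False
except TypeError:
    failed = True
```

Note that even "34", which looks numeric, fails here: nothing in the arithmetic parses text on your behalf.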
If your dataset contains columns such as age, salary, and city, then age and salary may be convertible to numbers, but city is categorical text. Those two cases need different preprocessing.
Convert Numeric Strings Explicitly
If the feature is conceptually numeric but stored as text, parse it before the model sees it.
Example with pandas:
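A minimal sketch, assuming a DataFrame whose `age` and `salary` columns arrived as text. The sample values are made up for illustration. It uses `pd.to_numeric` with `errors="coerce"`, which turns unparseable entries into NaN instead of raising an exception.

```python
import pandas as pd

# Hypothetical data: numeric values that were loaded as strings.
df = pd.DataFrame({
    "age": ["34", "27", "41", "n/a"],
    "salary": ["52000", "61000.5", "48000", "75000"],
})

# Parse each column; entries that cannot be parsed become NaN.
df["age"] = pd.to_numeric(df["age"], errors="coerce")
df["salary"] = pd.to_numeric(df["salary"], errors="coerce")

# Count the values that failed to parse so they can be inspected or imputed.
bad_age_rows = int(df["age"].isna().sum())

# Cast to float32, the dtype most models expect for numeric features.
df["age"] = df["age"].astype("float32")
df["salary"] = df["salary"].astype("float32")
```

If you would rather fail fast than collect NaN values, leaving `errors` at its default of `"raise"` is a reasonable alternative.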
After conversion, age is numeric and can be used directly by a linear model.
If parsing fails, that is useful information. It means the column is not actually clean numeric data yet.
Encode Categorical Strings Instead of Casting Them
If the feature is true category text such as a city name or product type, do not cast it to float. Encode it.
A simple TensorFlow preprocessing pipeline looks like this:
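One possible sketch, assuming a hypothetical `cities` tensor as the training data. It uses `tf.keras.layers.StringLookup` with `output_mode="one_hot"`. The explicit cast to float32 is a defensive assumption, since the layer's default output dtype may differ between Keras versions.

```python
import tensorflow as tf

# Hypothetical categorical feature values.
cities = tf.constant(["Toronto", "Paris", "Toronto", "Lima"])

# Learn the vocabulary from the data; one extra slot is reserved
# for out-of-vocabulary values by default.
lookup = tf.keras.layers.StringLookup(output_mode="one_hot")
lookup.adapt(cities)

# Each string becomes a one-hot row; cast so the dtype is float32.
encoded = tf.cast(lookup(cities), tf.float32)
```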
Now the string category becomes a numeric one-hot vector that a linear layer can consume.
You can combine numeric and categorical preprocessing in a Keras model:
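A hedged sketch of one way to do this. The feature names, sample values, and the `preprocess` helper are assumptions for illustration. This version applies the preprocessing layers to the tensors before they reach a single `Dense` layer rather than embedding them in the model graph, which avoids depending on string-typed model inputs.

```python
import tensorflow as tf

# Hypothetical training data: one numeric and one categorical feature.
age = tf.constant([[34.0], [27.0], [41.0], [52.0]])
city = tf.constant([["Toronto"], ["Paris"], ["Toronto"], ["Lima"]])
target = tf.constant([[52.0], [61.0], [48.0], [75.0]])

# Numeric branch: standardize using statistics learned from the data.
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(age)

# Categorical branch: map strings to one-hot vectors.
city_lookup = tf.keras.layers.StringLookup(output_mode="one_hot")
city_lookup.adapt(city)


def preprocess(age_batch, city_batch):
    """Turn raw inputs into one float32 feature matrix."""
    age_feature = normalizer(age_batch)
    city_feature = tf.cast(city_lookup(city_batch), tf.float32)
    return tf.concat([age_feature, city_feature], axis=-1)


features = preprocess(age, city)

# The linear model itself: a single Dense unit over numeric features.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(features.shape[-1],)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")
model.fit(features, target, epochs=2, verbose=0)
```

At serving time the same `preprocess` function must be applied to incoming data so that training and inference see identical feature layouts.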
The important part is that the raw string never reaches the linear layer unprocessed.
Check the Input Pipeline, Not Just the Model
Many of these errors come from the data pipeline rather than the model definition. Common sources include:
- CSV readers that infer every column as string
- missing values represented by text such as "NA"
- feature dictionaries where one field has the wrong dtype
- training data and serving data using different schemas
So when you see this error, inspect the actual tensor dtypes entering the model. In TensorFlow, printing tensor.dtype or validating your dataset schema early usually saves time.
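As a sketch of such a check, assuming the features arrive as a dictionary in a `tf.data.Dataset`: the feature values and the `string_features` helper are hypothetical. It reads `element_spec` to list every string-typed feature before training starts, then parses the numeric-looking one with `tf.strings.to_number`.

```python
import tensorflow as tf

# Hypothetical feature dictionary; "age" was loaded as text by mistake.
raw_features = {
    "age": ["34", "27"],
    "salary": [52000.0, 61000.0],
    "city": ["Toronto", "Paris"],
}
dataset = tf.data.Dataset.from_tensor_slices(raw_features).batch(2)


def string_features(ds):
    """Return the names of all features whose dtype is tf.string."""
    return sorted(
        name for name, spec in ds.element_spec.items()
        if spec.dtype == tf.string
    )


before = string_features(dataset)

# Parse the numeric-looking column; "city" still needs categorical encoding.
fixed = dataset.map(
    lambda row: {**row, "age": tf.strings.to_number(row["age"], out_type=tf.float32)}
)
after = string_features(fixed)
```

Here `before` lists both `age` and `city`, while `after` lists only `city`, which tells you exactly which features still need encoding.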
Separate Numeric Parsing from Categorical Encoding
Do not lump all strings together. Ask two questions for each column:
- Is this feature fundamentally numeric but stored as text?
- Or is it categorical text that needs encoding?
Numeric text should be parsed. Categorical text should be encoded. Those are different transformations with different meanings.
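One rough way to answer the first question per column is to measure how many values parse as numbers. The helper name `numeric_parse_rate` and the sample data below are assumptions, and the result is a heuristic to guide a human decision rather than an automatic classifier.

```python
import pandas as pd


def numeric_parse_rate(column: pd.Series) -> float:
    """Fraction of non-missing values that parse as numbers."""
    non_missing = column.dropna()
    if non_missing.empty:
        return 0.0
    parsed = pd.to_numeric(non_missing, errors="coerce")
    return float(parsed.notna().mean())


# Hypothetical columns, all stored as text.
df = pd.DataFrame({
    "age": ["34", "27", "41", "NA"],
    "city": ["Toronto", "Paris", "Toronto", "Lima"],
})

rates = {name: numeric_parse_rate(df[name]) for name in df.columns}
```

A high rate suggests numeric text with a few dirty entries, and a rate near zero suggests a categorical feature. Numeric codes such as postal codes or product IDs will also score high even though they are categories, so the final call still depends on what the feature means.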
Common Pitfalls
- Trying to cast category labels such as "Toronto" directly to float.
- Assuming the model will infer how to parse numeric-looking strings automatically.
- Letting CSV import keep numeric columns as object or string types and never validating them.
- Mixing training and inference schemas so the same feature arrives as float in one place and string in another.
Summary
- A linear model requires numeric features, so raw strings must be preprocessed first.
- Convert numeric-looking strings with explicit parsing.
- Encode categorical strings with one-hot, embedding, or similar feature transformations.
- Validate dtypes in the input pipeline, not only in the model code.
- Treat numeric text and categorical text as different problems with different fixes.

