ValueError: x and y must be the same size
Introduction
The error ValueError: x and y must be the same size appears when plotting data in Matplotlib, most often from scatter; plot reports the same kind of mismatch with a slightly different message (x and y must have same first dimension). Either way, the library expected one y value for every x value, but the two sequences have different lengths or incompatible shapes.
What the Error Actually Means
In a basic plot, points are paired by position:
- x[0] goes with y[0]
- x[1] goes with y[1]
- and so on
If one sequence is longer, Matplotlib has no unambiguous way to match points.
This code fails:
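A minimal reproduction, with illustrative values chosen for this sketch. The try/except is only there so the message can be printed; without it the script stops at the scatter call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]   # four x values (illustrative)
y = [10, 20, 30]   # only three y values (illustrative)

fig, ax = plt.subplots()
try:
    ax.scatter(x, y)           # sizes differ, so this raises ValueError
    error_message = None
except ValueError as exc:
    error_message = str(exc)   # keep the message for inspection

print(error_message)
plt.close(fig)
```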
There are four x values and only three y values, so Matplotlib raises the size error before drawing anything.
Diagnose the Problem First
Before changing the data, inspect both length and shape.
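A minimal sketch of such a check, assuming x and y are NumPy arrays with made-up values:

```python
import numpy as np

x = np.array([1, 2, 3, 4])   # illustrative data
y = np.array([10, 20, 30])   # illustrative data, one value short

# Compare lengths and shapes side by side
print(len(x), len(y))
print(x.shape, y.shape)

lengths_match = len(x) == len(y)
print("lengths match:", lengths_match)
```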
This quick check tells you whether the mismatch is obvious. In real projects, the arrays often look similar enough that the bug hides inside filtering, missing values, or a preprocessing step that touched only one side.
If you build arrays through pandas, inspect them right before plotting:
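One way this might look, assuming a hypothetical DataFrame with time and value columns and an earlier filter that touched only y:

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for this sketch
df = pd.DataFrame({"time": [0, 1, 2, 3, 4], "value": [5, 3, 8, 1, 9]})

x = df["time"]                         # all five rows
y = df.loc[df["value"] > 2, "value"]   # an earlier step filtered only this column

# Inspect immediately before plotting
print(len(x), len(y))
print(x.shape, y.shape)
```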
That matters because earlier transformations may have changed the row counts.
Fix the Data at the Source
The best fix is to generate x and y from the same filtered dataset instead of trimming one side blindly.
For example, this pattern is correct:
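A sketch of that pattern, assuming a hypothetical DataFrame with time and value columns: filter the rows once, then take both columns from the filtered result.

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions for this sketch
df = pd.DataFrame({"time": [0, 1, 2, 3, 4], "value": [5, 3, 8, 1, 9]})

# Apply the condition to the whole DataFrame, not to a single column
filtered = df[df["value"] > 2]

x = filtered["time"]
y = filtered["value"]

print(len(x), len(y))
```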
Now both sequences come from the same rows, so the lengths match naturally.
A Common Real Bug: Filtering Only One Array
Many mismatches happen when you drop missing values or apply a condition to only one column.
Broken version:
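A sketch of the bug, assuming a hypothetical four-row DataFrame whose y column contains one missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are assumptions for this sketch
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2.0, np.nan, 6.0, 8.0]})

x = df["x"]            # untouched: still four rows
y = df["y"].dropna()   # missing value dropped from this column only

print(len(x), len(y))
```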
Here x still has four rows, but y has only three after dropna().
Correct version:
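A matching sketch of the fix, using the same hypothetical DataFrame (column names and values are assumptions): drop incomplete rows from the DataFrame first, then select both columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data; column names and values are assumptions for this sketch
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [2.0, np.nan, 6.0, 8.0]})

# Drop every row where either plotted column is missing
clean = df.dropna(subset=["x", "y"])

x = clean["x"]
y = clean["y"]

print(len(x), len(y))
```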
The filtering is applied to the dataset as a whole, so paired rows stay paired.
When You Only Have y
Sometimes you do not have an explicit x array and only want to plot values against their index. In that case, create an x array with the same length as y.
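A minimal sketch, assuming y is a NumPy array with made-up values:

```python
import numpy as np

y = np.array([3.0, 1.5, 4.2, 2.8])   # illustrative values

# One integer position per y value: 0, 1, ..., len(y) - 1
x = np.arange(len(y))

print(x)
print(len(x) == len(y))
```

With x built this way, a call such as plt.scatter(x, y) receives inputs of matching size.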
This works because np.arange(len(y)) creates exactly one x position per y value.
Shape Problems with NumPy Arrays
Length mismatches are the usual issue, but shape mismatches can also confuse debugging. For example, a column vector and a flat array may both contain the same count of numbers while still behaving differently in downstream code.
Matplotlib often handles these cases, but if behavior looks odd, flatten the inputs deliberately:
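A sketch using np.ravel, assuming x arrived as a column vector and y as a flat array (both with made-up values):

```python
import numpy as np

x = np.array([[1], [2], [3]])   # column vector, shape (3, 1)
y = np.array([10, 20, 30])      # flat array, shape (3,)

# Flatten both inputs to one dimension before plotting
x_flat = np.ravel(x)
y_flat = np.ravel(y)

print(x.shape, y.shape)
print(x_flat.shape, y_flat.shape)
```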
That gives you a predictable one-dimensional representation.
Common Pitfalls
The most common mistake is fixing the error by slicing one array to match the other without asking why the mismatch happened. That can hide a data-quality bug rather than solve it.
Another problem is applying filtering, sorting, or grouping to only one variable. If x and y are supposed to describe the same observations, every row-level transformation must be applied consistently.
People also forget to inspect shapes after converting from pandas to NumPy. Selecting df["col"] returns a Series with shape (n,), while df[["col"]] returns a one-column DataFrame with shape (n, 1), which matters if earlier code expected a flat vector.
Finally, if you are plotting in a function, print or assert lengths before calling Matplotlib. A short validation step is much cheaper than tracing a plotting error later.
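One possible shape for such a validation step. The helper name check_same_size is hypothetical, not part of Matplotlib; it would be called at the top of a plotting function, before any Matplotlib call.

```python
import numpy as np

def check_same_size(x, y):
    """Hypothetical helper: fail early with a message that names both sizes."""
    x_arr = np.asarray(x)
    y_arr = np.asarray(y)
    if x_arr.size != y_arr.size:
        raise ValueError(
            f"x has {x_arr.size} values but y has {y_arr.size}; "
            "check the filtering and cleaning steps upstream"
        )
    return x_arr, y_arr

# Matching inputs pass through unchanged
x_ok, y_ok = check_same_size([1, 2, 3], [10, 20, 30])

# Mismatched inputs are rejected before any plotting happens
try:
    check_same_size([1, 2, 3, 4], [10, 20, 30])
    mismatch_caught = False
except ValueError as exc:
    mismatch_caught = True
    print(exc)
```

Reporting both sizes in the message makes it easier to see which side lost rows.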
Summary
- The error means Matplotlib cannot pair each x value with a corresponding y value.
- Check len() and .shape immediately before plotting.
- Prefer fixing the mismatch at the data-preparation step, not by trimming arrays blindly.
- Filter or clean the whole dataset so paired rows stay aligned.
- If you only have y, create x with np.arange(len(y)).

