pyplot scatter plot marker size
Introduction
In Matplotlib, scatter marker size is controlled with the s argument to plt.scatter or Axes.scatter. The important detail is that s represents marker area in points squared, not diameter. Marker width grows with the square root of s, so doubling s makes a point only about 1.41 times as wide, not twice as wide.
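The area-versus-width relationship can be checked with a few lines of arithmetic. This is a minimal sketch; marker_width is a hypothetical helper introduced only for illustration, and it assumes a marker whose width is the square root of its area, which holds for the default circle marker.

```python
import math


def marker_width(s):
    """Approximate marker width in points for an area s given in points squared.

    Hypothetical helper for illustration; not part of Matplotlib.
    """
    return math.sqrt(s)


# Compare how the width changes as the area doubles and quadruples.
for s in (100, 200, 400):
    print(f"s={s:>3}  width ~ {marker_width(s):.2f} pt")
```

To make a marker look twice as wide, multiply s by four rather than by two.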
Basic Usage of s
If you want every point to have the same size, pass one scalar value.
This is enough for ordinary scatter plots where size does not encode data.
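A minimal sketch of the uniform case follows. The data values and the choice of s=80 are made up for illustration, and the Agg backend line is only there so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt

# Illustrative data, not from any real dataset.
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]

fig, ax = plt.subplots()
# A single scalar gives every marker the same area (80 points squared).
points = ax.scatter(x, y, s=80)
ax.set_title("Uniform marker size")
# Call plt.show() or fig.savefig(...) to view the result.
```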
Use an Array for Per-Point Sizes
If marker size should represent another variable, pass a list or NumPy array the same length as the input points.
This is how bubble charts are built. The extra variable is mapped to area, not radius.
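The per-point case might look like the sketch below. The random data and the 20 to 300 size range are assumptions chosen only to produce a visible spread.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(20)
y = rng.random(20)
# One area value (points squared) per point; same length as x and y.
sizes = rng.uniform(20, 300, size=20)

fig, ax = plt.subplots()
bubbles = ax.scatter(x, y, s=sizes)
ax.set_title("Per-point marker sizes")
```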
Normalize Raw Values Before Plotting
Raw business metrics often have extreme ranges. If you feed them directly to s, one point can dominate the chart and make smaller points unreadable.
Normalization is often more important than the exact numeric value you choose for s.
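One possible way to normalize is sketched below: compress the range with a logarithm, then rescale to a fixed band of areas. The revenue figures, the 20 to 400 band, and the use of log10 are all assumptions for illustration; plain min-max scaling or a square-root transform may suit other data better.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np

# Made-up metric with one extreme outlier.
revenue = np.array([1_200, 8_500, 15_000, 40_000, 2_500_000], dtype=float)
x = np.arange(len(revenue))
y = np.array([3.0, 1.5, 4.0, 2.0, 3.5])

s_min, s_max = 20.0, 400.0  # target area band in points squared (assumed)

# Log-compress so the outlier does not dwarf everything else,
# then min-max scale into [s_min, s_max].
logged = np.log10(revenue)
scaled = (logged - logged.min()) / (logged.max() - logged.min())
sizes = s_min + scaled * (s_max - s_min)

fig, ax = plt.subplots()
ax.scatter(x, y, s=sizes)
```

With linear min-max scaling alone, the four smaller values here would all land very close to s_min, which is why the sketch applies the log step first.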
Improve Readability With Alpha and Edge Styling
Marker size rarely works alone. Overlap can make dense areas unreadable, so adjust transparency and edge styling too.
A lower alpha often improves readability more than endlessly tuning the raw size.
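A sketch of combining size with transparency and edge styling is shown here. The clustered data is synthetic, and the specific values for alpha, edge color, and line width are starting points to tune, not recommendations from the original text.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Synthetic, tightly clustered points that overlap heavily.
x = rng.normal(loc=0.0, scale=1.0, size=300)
y = rng.normal(loc=0.0, scale=1.0, size=300)

fig, ax = plt.subplots()
points = ax.scatter(
    x,
    y,
    s=120,               # marker area in points squared
    alpha=0.4,           # transparency reveals where points pile up
    edgecolors="black",  # thin outline separates neighbouring markers
    linewidths=0.5,
)
```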
Build a Reusable Mapping Helper
In reports and dashboards, size mapping logic tends to repeat. Put it in a helper so different charts stay visually consistent.
Once size mapping is centralized, you can tune the visual range in one place instead of hard-coding magic numbers in every chart.
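Such a helper could be sketched as follows. The name scale_sizes, its defaults, and the midpoint fallback for constant input are hypothetical choices made for this example, not an established API.

```python
import numpy as np


def scale_sizes(values, s_min=20.0, s_max=400.0):
    """Map raw values linearly onto marker areas in [s_min, s_max].

    Hypothetical helper for illustration. If every value is identical,
    all markers get the midpoint of the range to avoid dividing by zero.
    """
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        return np.full(values.shape, (s_min + s_max) / 2)
    return s_min + (values - values.min()) / span * (s_max - s_min)


# Usage: pass the result straight to the s argument of scatter.
sizes = scale_sizes([0, 5, 10], s_min=10, s_max=110)
print(sizes)
```

Every chart that calls the same helper with the same bounds will use the same visual range, so changing s_min or s_max in one place updates them all.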
Choose Marker Size for the Output Medium
A scatter plot that looks good in a notebook may look terrible in a slide deck or a printed report. Marker size should be tested at the final figure dimensions.
If a chart will be displayed on a small screen, large markers can hide relationships. If it will be printed at high resolution, very small markers can disappear. The right value for s is partly about medium, not just data.
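Because s is expressed in points (1/72 inch), the same value covers a different share of the figure depending on the figure's physical size. The sketch below estimates that share; marker_width_fraction is a hypothetical helper, and the slide and print widths are assumed values (6.4 inches is Matplotlib's default figure width).

```python
import math


def marker_width_fraction(s, fig_width_in):
    """Fraction of the figure width covered by one marker of area s.

    Hypothetical helper for illustration. Assumes marker width is
    sqrt(s) points and that there are 72 points per inch.
    """
    width_in = math.sqrt(s) / 72.0
    return width_in / fig_width_in


s = 144  # a marker roughly 12 points wide
for label, fig_width_in in [
    ("default figure", 6.4),
    ("wide slide (assumed)", 13.33),
    ("print column (assumed)", 3.5),
]:
    share = marker_width_fraction(s, fig_width_in)
    print(f"{label:<24} {share:.2%} of figure width")
```

A marker that looks balanced at the default size takes up roughly twice the relative width in a narrow print column, which is one reason to check s at the final figure dimensions.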
Common Pitfalls
A common mistake is assuming s is the marker radius or diameter. It is area, so marker width grows only with the square root of s: quadrupling s doubles the width.
Another mistake is passing raw values directly from a dataset without scaling them first. Large outliers can make the chart unreadable.
It is also easy to forget overlap handling. If points cluster tightly, marker size alone will not solve the problem. Use alpha, edge colors, or even jitter when necessary.
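Jitter can be sketched as a small random offset added to positions that would otherwise coincide. The category values, the jitter width of 0.15, and the styling values here are assumptions for illustration; the offset should stay small enough that points remain visually attached to their true position.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit in a notebook or GUI session
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(42)
# 50 points at each of three discrete x positions, so markers stack up.
x = np.repeat([1.0, 2.0, 3.0], 50)
y = rng.normal(size=x.size)

# Spread the points horizontally by at most 0.15 in either direction.
jitter_width = 0.15
x_jittered = x + rng.uniform(-jitter_width, jitter_width, size=x.size)

fig, ax = plt.subplots()
ax.scatter(x_jittered, y, s=40, alpha=0.5, edgecolors="black", linewidths=0.3)
```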
Summary
- Control scatter marker size with the s parameter.
- s represents area in points squared, not diameter.
- Pass one scalar for uniform size or an array for per-point sizes.
- Normalize large value ranges before mapping them to marker area.
- Tune size together with alpha, edges, and final display dimensions.

