pyplot scatter plot marker size

Introduction

In Matplotlib, scatter marker size is controlled with the s argument to plt.scatter or Axes.scatter. The important detail is that s represents marker area in points squared (one point is 1/72 inch), not diameter. Because width grows with the square root of s, doubling s makes a marker only about 1.4 times as wide; to double the width you have to quadruple s.
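To make the area relationship concrete, here is a minimal sketch that converts between s and marker width. The helper names width_for_area and area_for_width are illustrative and not part of Matplotlib. The sketch assumes the marker width in points is the square root of s, which is consistent with Matplotlib's default of s = rcParams['lines.markersize'] ** 2.

```python
import math


def width_for_area(s):
    """Approximate marker width in points for a given s (points squared)."""
    return math.sqrt(s)


def area_for_width(width_pt):
    """Value of s that gives a marker roughly width_pt points wide."""
    return width_pt ** 2


print(width_for_area(100))  # 10.0
print(width_for_area(200))  # roughly 14.14, not 20
print(width_for_area(400))  # 20.0
```

If you think in terms of "how wide should this dot be", computing s with something like area_for_width is usually easier than guessing values for s directly.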

Basic Usage of s

If you want every point to have the same size, pass one scalar value.

python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 6]

# A scalar s gives every marker the same area (in points squared).
plt.scatter(x, y, s=80)
plt.title("Uniform Marker Size")
plt.xlabel("x")
plt.ylabel("y")
plt.show()

This is enough for ordinary scatter plots where size does not encode data.

Use an Array for Per-Point Sizes

If marker size should represent another variable, pass a list or NumPy array the same length as the input points.

python
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [15, 18, 10, 25]
weights = [100, 400, 900, 1600]  # one area per point, same length as x and y

# alpha and a dark edge keep large, overlapping bubbles distinguishable.
plt.scatter(x, y, s=weights, alpha=0.6, edgecolors="black")
plt.title("Bubble Scatter")
plt.xlabel("Feature A")
plt.ylabel("Feature B")
plt.show()

This is how bubble charts are built. The extra variable is mapped to area, not radius.
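A bubble chart is hard to read unless the viewer knows what the sizes mean. One possible way to add a size legend is the legend_elements method on the collection returned by scatter (available in Matplotlib 3.1 and later). The sketch below reuses the data from the example above; the legend title and spacing values are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt

x = [10, 20, 30, 40]
y = [15, 18, 10, 25]
weights = [100, 400, 900, 1600]

fig, ax = plt.subplots()
sc = ax.scatter(x, y, s=weights, alpha=0.6, edgecolors="black")

# legend_elements(prop="sizes") returns one handle and label per representative size.
handles, labels = sc.legend_elements(prop="sizes", alpha=0.6)

# Extra spacing keeps the large legend markers from overlapping each other.
ax.legend(handles, labels, title="Weight", labelspacing=1.5, borderpad=1.2)
ax.set_title("Bubble Scatter With Size Legend")
```

Call plt.show() or fig.savefig(...) afterwards to display or export the figure. The labels here show the values passed to s; if you rescale your data before plotting, the legend will show the rescaled areas unless you supply a conversion through the func argument.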

Normalize Raw Values Before Plotting

Raw business metrics often have extreme ranges. If you feed them directly to s, one point can dominate the chart and make smaller points unreadable.

python
import numpy as np
import matplotlib.pyplot as plt

raw = np.array([5, 8, 13, 100, 250], dtype=float)
min_area = 40
max_area = 600

# Min-max scale raw values into [min_area, max_area]; assumes raw is not constant.
scaled = min_area + (raw - raw.min()) * (max_area - min_area) / (raw.max() - raw.min())

x = np.arange(len(raw))
y = np.array([1, 3, 2, 4, 3])

# Color still shows the raw metric, so the colorbar reads in original units.
plt.scatter(x, y, s=scaled, c=raw, cmap="viridis", alpha=0.8)
plt.colorbar(label="Raw metric")
plt.title("Normalized Marker Sizes")
plt.show()

Normalization is often more important than the exact numeric value you choose for s.
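Linear min-max scaling bounds the areas, but with heavy-tailed data the small values can still end up nearly indistinguishable near the minimum area. One option, shown here as a sketch rather than a recommendation for every dataset, is to apply a log transform before scaling. The helper name min_max_scale is illustrative, and np.log1p assumes the raw values are non-negative.

```python
import numpy as np

raw = np.array([5, 8, 13, 100, 250], dtype=float)
min_area, max_area = 40, 600


def min_max_scale(values, lo, hi):
    """Linearly rescale values to [lo, hi]; assumes values are not all equal."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    return lo + (values - values.min()) * (hi - lo) / span


linear = min_max_scale(raw, min_area, max_area)

# log1p compresses the gap between the small values and the large outliers.
logged = min_max_scale(np.log1p(raw), min_area, max_area)

print(np.round(linear, 1))
print(np.round(logged, 1))
```

With the log transform, area is no longer proportional to the raw value, so a chart built this way should say so in its legend or caption.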

Improve Readability With Alpha and Edge Styling

Marker size rarely works alone. Overlap can make dense areas unreadable, so adjust transparency and edge styling too.

python
import matplotlib.pyplot as plt

# Points are close together, so large markers will overlap.
x = [1.0, 1.1, 1.2, 1.3, 1.4]
y = [2.0, 2.1, 1.9, 2.2, 2.0]

plt.scatter(
    x,
    y,
    s=220,
    alpha=0.4,           # transparency reveals stacked markers
    edgecolors="navy",   # outlines keep individual markers visible
    linewidths=1.0,
)
plt.title("Overlapping Points")
plt.show()

A lower alpha often improves readability more than endlessly tuning the raw size.

Build a Reusable Mapping Helper

In reports and dashboards, size mapping logic tends to repeat. Put it in a helper so different charts stay visually consistent.

python
import numpy as np


def map_to_marker_area(values, min_area=30, max_area=500):
    """Linearly map values to marker areas in [min_area, max_area]."""
    arr = np.asarray(values, dtype=float)
    # Constant input would divide by zero, so fall back to the midpoint area.
    if arr.max() == arr.min():
        return np.full(arr.shape, (min_area + max_area) / 2.0)
    return min_area + (arr - arr.min()) * (max_area - min_area) / (arr.max() - arr.min())


sizes = map_to_marker_area([3, 5, 7, 20, 35])
print(sizes)

Once size mapping is centralized, you can tune the visual range in one place instead of hard-coding magic numbers in every chart.
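One caveat: a helper that rescales each array by its own minimum and maximum will give the same data value different sizes in different charts. If charts need to be comparable, a possible variation is to pass the data range explicitly. The function name map_to_marker_area_fixed and the vmin and vmax parameters below are hypothetical, chosen for this sketch.

```python
import numpy as np


def map_to_marker_area_fixed(values, vmin, vmax, min_area=30, max_area=500):
    """Map values to marker areas using a caller-supplied data range."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    arr = np.asarray(values, dtype=float)
    # Clip so out-of-range values cannot exceed the chosen area bounds.
    frac = np.clip((arr - vmin) / (vmax - vmin), 0.0, 1.0)
    return min_area + frac * (max_area - min_area)


chart_a = map_to_marker_area_fixed([3, 5, 7, 20, 35], vmin=0, vmax=50)
chart_b = map_to_marker_area_fixed([20, 40, 50], vmin=0, vmax=50)

# The value 20 appears in both datasets and gets the same area in both charts.
print(chart_a[3], chart_b[0])
```

The trade-off is that you have to know, or agree on, a sensible range up front; values outside it are clipped rather than drawn larger or smaller.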

Choose Marker Size for the Output Medium

A scatter plot that looks good in a notebook may look terrible in a slide deck or a printed report. Marker size should be tested at the final figure dimensions.

If a chart will be displayed on a small screen, large markers can hide relationships. If it will be printed at high resolution, very small markers can disappear. The right value for s is partly about medium, not just data.
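Because s is measured in points, a marker keeps roughly the same physical size when the figure gets larger, so it looks relatively smaller on a big slide than in a small notebook cell. One rough way to reason about this is to pick s as a fraction of the figure width. The helper area_for_width_fraction and the figure widths below are assumptions made for illustration, and the calculation uses the full figure width rather than the axes width.

```python
POINTS_PER_INCH = 72  # a typographic point is 1/72 inch


def area_for_width_fraction(fig_width_in, fraction):
    """Return s so that a marker spans about `fraction` of the figure width."""
    width_pt = fig_width_in * POINTS_PER_INCH * fraction
    return width_pt ** 2


# Same relative marker width (2% of the figure) on two different figure sizes.
notebook_s = area_for_width_fraction(fig_width_in=6.4, fraction=0.02)
slide_s = area_for_width_fraction(fig_width_in=13.33, fraction=0.02)

print(round(notebook_s, 1), round(slide_s, 1))
```

Treat the result as a starting point and still check the exported figure at its final dimensions.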

Common Pitfalls

A common mistake is assuming s is the marker radius or diameter. It is area, so visual changes are nonlinear.

Another mistake is passing raw values directly from a dataset without scaling them first. Large outliers can make the chart unreadable.

It is also easy to forget overlap handling. If points cluster tightly, marker size alone will not solve the problem. Use alpha, edge colors, or even jitter when necessary.
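Jitter is not built into scatter, so it has to be added to the data before plotting. The following is a minimal sketch using NumPy's random generator; the helper name add_jitter, the noise scale, and the seed are illustrative choices, and the right scale depends on the spacing of your data.

```python
import numpy as np


def add_jitter(values, scale, seed=0):
    """Return a copy of values with uniform noise in [-scale, scale] added."""
    rng = np.random.default_rng(seed)
    arr = np.asarray(values, dtype=float)
    return arr + rng.uniform(-scale, scale, size=arr.shape)


# Discrete x values stack on top of each other without jitter.
x = [1, 1, 1, 2, 2, 2]
x_jittered = add_jitter(x, scale=0.05)
print(x_jittered)
```

Pass x_jittered to scatter in place of x. Jitter changes the plotted positions, so it is best suited to categorical or rounded values and should be mentioned in the caption.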

Summary

  • Control scatter marker size with the s parameter.
  • s represents area in points squared, not diameter.
  • Pass one scalar for uniform size or an array for per-point sizes.
  • Normalize large value ranges before mapping them to marker area.
  • Tune size together with alpha, edges, and final display dimensions.
