pyplot scatter plot marker size

Question

In the pyplot documentation for scatter plots:

matplotlib.pyplot.scatter(x, y, s=20, c='b', marker='o', cmap=None, norm=None,
                              vmin=None, vmax=None, alpha=None, linewidths=None,
                              faceted=True, verts=None, hold=None, **kwargs)

The marker size

s: size in points^2. It is a scalar or an array of the same length as x and y.

What kind of unit is points^2? What does it mean? Does s=100 mean 10 pixels x 10 pixels?

Basically I'm trying to make scatter plots with different marker sizes, and I want to figure out what the s number means.

@tcaswell, you mean s=20 means the marker size equals that of a fontsize=20 letter?
– LWZ, Commented Feb 12, 2013 at 19:19
matplotlib.pyplot.plot() has the ms (markersize) parameter, the equivalent of the s (size) parameter of matplotlib.pyplot.scatter(). Just a reminder.
– niekas, Commented Nov 6, 2014 at 10:08
@niekas it seems to me they are not equivalent, since one is a length in points (markersize) and the other is in this weird squared-points unit (size). This has always been confusing to me, but I believe it has to do with scatterplot marker size being used to denote amount in a visually proportional way.
– heltonbiker, Commented May 26, 2017 at 18:02
@heltonbiker is right on this one. If you want to match the markersize from the plot function to s from the scatter function, you need to square it, i.e. s = markersize**2.
– Marses, Commented Sep 20, 2020 at 9:01

Mateen Ulhaq · Accepted Answer · 2025-08-28 01:44:07Z

704

`s` is an area

s is an area (measured in pt²), and is the square of a length (measured in pt):

s = area = length**2

For example, these are all proportional to length**2:

square_area = w**2 --> 1 * length**2
circle_area = π * r**2 --> π/4 * length**2

To double the width and height of any marker, multiply s by a factor of 4:

A = W * H       -->   (2W) * (2H) = 4A

(Warning: s is proportional to the marker's shaded area, but is usually not equal to it.)
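
A minimal sketch of the scaling rule above. It assumes only that s is an area in pt²; the helper name scale_s is hypothetical and introduced here purely for illustration:

```python
def scale_s(s, width_factor):
    """Return the s value for a marker width_factor times as wide and tall.

    Hypothetical helper: s is an area, so it scales with the square
    of any linear dimension of the marker.
    """
    return s * width_factor ** 2


base = 20                      # some starting s value, in pt^2
doubled = scale_s(base, 2)     # twice the width needs four times the s value
halved = scale_s(base, 0.5)    # half the width needs a quarter of the s value
print(doubled, halved)
```

Either result can be passed directly as the s argument of plt.scatter.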

Why is it like this?

There is a reason that marker size is defined this way. Because area scales as the square of width, doubling the width makes a marker look more than twice as big (its area increases by a factor of 4). To see this, consider the following two examples and the output they produce.

import matplotlib.pyplot as plt

# doubling the width of markers
x = [0,2,4,6,8,10]
y = [0]*len(x)
s = [20*4**n for n in range(len(x))]  # s is multiplied by 4 at each step
plt.scatter(x,y,s=s)
plt.show()

gives

[Image: six markers in a row, each twice as wide as the previous one]

Notice how the size increases very quickly. If instead we have

import matplotlib.pyplot as plt

# doubling the area of markers
x = [0,2,4,6,8,10]
y = [0]*len(x)
s = [20*2**n for n in range(len(x))]  # s is multiplied by 2 at each step
plt.scatter(x,y,s=s)
plt.show()

gives

[Image: six markers in a row, each with twice the area of the previous one]

Now each marker looks roughly twice as big as the one before it, which matches the intuitive meaning of doubling the size.

As for the exact meaning of a 'point': it is a typographic unit (1/72 inch), but for plotting purposes the absolute size rarely matters; you can just scale all of your sizes by a constant until they look reasonable.

Edit: (In response to comment from @Emma)

It's probably confusing wording on my part. The question asked about doubling the width of a circle, so in the first picture each circle (as we move from left to right) has a width double that of the previous one; for the area this is an exponential with base 4. Similarly, in the second example each circle has double the area of the previous one, which gives an exponential with base 2.

However, it is in the second example (where we are scaling area) that each circle appears twice as big to the eye as the previous one. Thus, if we want a circle to appear a factor of n bigger, we should increase the area by a factor of n, not the radius, so that the apparent size scales linearly with the area.
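
To make the distinction concrete, here is a hedged sketch of the two ways of encoding a data value in marker size. The function names and the s_max parameter are illustrative assumptions, not part of matplotlib, and the values are assumed to be positive:

```python
def sizes_by_area(values, s_max=200.0):
    """Map values to s so that marker AREA is proportional to the value."""
    largest = max(values)
    return [s_max * v / largest for v in values]


def sizes_by_width(values, s_max=200.0):
    """Map values to s so that marker WIDTH is proportional to the value."""
    largest = max(values)
    return [s_max * (v / largest) ** 2 for v in values]


values = [1, 2, 3, 4]
area_sizes = sizes_by_area(values)    # s grows linearly with the value
width_sizes = sizes_by_width(values)  # s grows with the square of the value
# Either list can be passed as plt.scatter(x, y, s=...).
```

With sizes_by_area, a value twice as large gets twice the shaded area, which is the encoding argued for above; sizes_by_width makes large values stand out much more strongly.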

Edit to visualize the comment by @TomaszGandor:

This is what it looks like for different functions of the marker size:

import matplotlib.pyplot as plt

x = [0,2,4,6,8,10,12,14,16,18]
s_exp = [20*2**n for n in range(len(x))]
s_square = [20*n**2 for n in range(len(x))]
s_linear = [20*n for n in range(len(x))]
plt.scatter(x,[1]*len(x),s=s_exp, label='$s=2^n$', lw=1)
plt.scatter(x,[0]*len(x),s=s_square, label='$s=n^2$')
plt.scatter(x,[-1]*len(x),s=s_linear, label='$s=n$')
plt.ylim(-1.5,1.5)
plt.legend(loc='center left', bbox_to_anchor=(1.1, 0.5), labelspacing=3)
plt.show()

edited Aug 28 at 1:44

Mateen Ulhaq


answered Feb 13, 2013 at 18:59

Dan



7 Comments

Emma Over a year ago

I'm probably misunderstanding your point, but in your second example you are increasing s exponentially (s=[20, 40, 80, 160, 320, 640]) and saying that that gives us a nice linear-looking size increase. Wouldn't it make more sense if increasing the size linearly (ex. s=[20, 40, 60, 80, 100, 120]) gave us the linear-looking result?

Dan Over a year ago

@Emma Your intuition is right, it's poor wording on my part (alternatively poor choice of x axis scaling). I explained some more in an edit because it was too long for a comment.

Sigur Over a year ago

Is it possible to change s value according to the size of figure window? I mean, if we maximize the figure windows, I'd like to have bigger size marks.

Tomasz Gandor Over a year ago

Great example (just the necessary stuff!). This should not be 4 ** n and 2 ** n, but n ** 4 and n ** 2. With 2 ** n the second plot does not scale linearly in terms of circle diameter. It still goes too fast (just not that much over the top).

Tomasz Gandor Over a year ago

To put it shorter - the second plot shows square root of exponential - which is another exponential, just a bit less steep.


ImportanceOfBeingErnest · Accepted Answer · 2019-02-17 04:28:36Z

Because other answers here claim that s denotes the area of the marker, I'm adding this answer to clarify that this is not necessarily the case.

Size in points^2

The argument s in plt.scatter denotes the markersize**2. As the documentation says

s : scalar or array_like, shape (n, ), optional
size in points^2. Default is rcParams['lines.markersize'] ** 2.

This can be taken literally. In order to obtain a marker which is x points large, you need to square that number and give it to the s argument.

So the relationship between the markersize of a line plot and the scatter size argument is the square. In order to produce a scatter marker of the same size as a plot marker of size 10 points you would hence call scatter( .., s=100).

import matplotlib.pyplot as plt

fig,ax = plt.subplots()

ax.plot([0],[0], marker="o",  markersize=10)
ax.plot([0.07,0.93],[0,0],    linewidth=10)
ax.scatter([1],[0],           s=100)

ax.plot([0],[1], marker="o",  markersize=22)
ax.plot([0.14,0.86],[1,1],    linewidth=22)
ax.scatter([1],[1],           s=22**2)

plt.show()
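
The relationship used in the example above can be written as a pair of small conversion helpers. This is only a sketch; the function names are hypothetical and not part of matplotlib:

```python
def markersize_to_s(markersize):
    """s to pass to scatter(..., s=...) to match plot(..., markersize=...)."""
    return markersize ** 2


def s_to_markersize(s):
    """Inverse conversion: marker width in points for a given s."""
    return s ** 0.5


s_small = markersize_to_s(10)   # matches markersize=10 in the example above
s_large = markersize_to_s(22)   # matches markersize=22, i.e. s=22**2
width = s_to_markersize(100)    # a scatter marker with s=100 is 10 points wide
```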

Connection to "area"

So why do other answers and even the documentation speak about "area" when it comes to the s parameter?

Of course the units of points**2 are area units.

For the special case of a square marker, marker="s", the area of the marker is indeed directly the value of the s parameter.
For a circle, the area of the circle is area = pi/4*s.
For other markers there may not even be any obvious relation to the area of the marker.

In all cases however the area of the marker is proportional to the s parameter. This is the motivation to call it "area" even though in most cases it isn't really.
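
As a rough illustration of the two cases with a simple formula, the sketch below computes the shaded area for a given s. The helper shaded_area is hypothetical, and it ignores the edge line drawn around the marker:

```python
import math


def shaded_area(s, marker):
    """Shaded area in pt^2 of a marker drawn with size s (edge line ignored)."""
    if marker == "s":          # square: side length is sqrt(s)
        return s
    if marker == "o":          # circle: diameter is sqrt(s)
        return math.pi / 4 * s
    raise ValueError("no simple area formula assumed for marker %r" % marker)


square = shaded_area(100, "s")   # the full 100 pt^2
circle = shaded_area(100, "o")   # about 78.5 pt^2
```

Doubling s doubles both results, which is the proportionality the paragraph above refers to.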

Specifying the size of the scatter markers in terms of some quantity which is proportional to the area of the marker makes sense in so far as it is the area of the marker that is perceived when comparing different patches, rather than its side length or diameter. I.e. doubling the underlying quantity should double the area of the marker.

What are points?

So far, the answer to what the size of a scatter marker means is given in units of points. Points are often used in typography, where font sizes are specified in points. Line widths are also often specified in points. The standard size of points in matplotlib is 72 points per inch (ppi); 1 point is hence 1/72 inch.

It might be useful to be able to specify sizes in pixels instead of points. If the figure dpi is 72 as well, one point is one pixel. If the figure dpi is different (matplotlib default is fig.dpi=100),

1 point == fig.dpi/72. pixels

While a scatter marker whose size is given in points would hence look different for different figure dpi, one could produce a 10 by 10 pixel marker, which always covers the same number of pixels:

import matplotlib.pyplot as plt

for dpi in [72,100,144]:

    fig,ax = plt.subplots(figsize=(1.5,2), dpi=dpi)
    ax.set_title("fig.dpi={}".format(dpi))

    ax.set_ylim(-3,3)
    ax.set_xlim(-2,2)

    ax.scatter([0],[1], s=10**2, 
               marker="s", linewidth=0, label="100 points^2")
    ax.scatter([1],[1], s=(10*72./fig.dpi)**2, 
               marker="s", linewidth=0, label="100 pixels^2")

    ax.legend(loc=8,framealpha=1, fontsize=8)

    fig.savefig("fig{}.png".format(dpi), bbox_inches="tight")

plt.show()
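
The conversion used in the loop above can be wrapped in two small helpers. This is a sketch; the function names are hypothetical, and dpi is meant to be the figure's fig.dpi:

```python
def points_to_pixels(points, dpi):
    """Convert a length in points to pixels: 1 point = dpi/72 pixels."""
    return points * dpi / 72.0


def s_for_pixel_width(width_px, dpi):
    """s value for a marker that is width_px pixels wide at the given dpi."""
    width_pt = width_px * 72.0 / dpi
    return width_pt ** 2


s_72 = s_for_pixel_width(10, 72)     # at 72 dpi, 10 px equals 10 pt
s_144 = s_for_pixel_width(10, 144)   # at 144 dpi, 10 px is only 5 pt
```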

If you are interested in a scatter in data units, check this answer.
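
For a rough idea of what sizing in data units involves, the following sketch chains the conversions from data units to pixels to points. It is not the method from the linked answer; the function and its parameters are hypothetical, and a linear x axis is assumed:

```python
def s_for_data_diameter(diameter_data, data_span, axes_width_px, dpi):
    """s value whose marker width spans diameter_data along the x axis.

    Assumes a linear x axis. The result is only valid for the given
    limits, axes size and dpi, and goes stale if any of them change.
    """
    px_per_data_unit = axes_width_px / data_span
    width_px = diameter_data * px_per_data_unit
    width_pt = width_px * 72.0 / dpi
    return width_pt ** 2


# Axes 360 px wide showing x from 0 to 1 at 72 dpi:
# a diameter of 0.5 data units is 180 px, i.e. 180 pt.
s = s_for_data_diameter(0.5, data_span=1.0, axes_width_px=360, dpi=72)
```

In matplotlib the inputs could be read from ax.get_window_extent().width, ax.get_xlim() and fig.dpi. Because markers stay fixed in points, the value has to be recomputed after zooming or resizing the window; drawing patches in data coordinates avoids that.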

Wondering how would one calculate what s parameter to give to scatter to get a circle which covers diameter of, let's say, 0.1 in real coordinates of the plot (so as to fill the gap between let's say 0.4 and 0.5 on a plot from (0,0) to (1,1)?
@ImportanceOfBeingErnest could you pls explain how to get the radius of a scatter based on the s param passed? I thought it is np.sqrt(s)/2. stackoverflow.com/q/64399664/9900084

jmd_dk · Accepted Answer · 2018-10-24 13:54:37Z

38

It is the area of the marker. I mean if you have s1 = 1000 and then s2 = 4000, the relation between the radius of each circle is: r_s2 = 2 * r_s1. See the following plot:

import matplotlib.pyplot as plt

plt.scatter(2, 1, s=4000, c='r')
plt.scatter(2, 1, s=1000, c='b')
plt.scatter(2, 1, s=10, c='g')
plt.show()

[Image: three concentric circles in red, blue and green, drawn with s=4000, s=1000 and s=10]

I had the same doubt when I saw the post, so I did this example then I used a ruler on the screen to measure the radii.
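
The ruler measurement can be cross-checked with a short calculation. It assumes, as a sketch, that a circular marker's width in points is sqrt(s) and that the edge line is ignored; radius_from_s is a hypothetical helper:

```python
import math


def radius_from_s(s):
    """Radius in points of a circular marker: half of its width sqrt(s)."""
    return math.sqrt(s) / 2


r_small = radius_from_s(1000)
r_large = radius_from_s(4000)
ratio = r_large / r_small   # close to 2: four times the s, twice the radius
```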

edited Oct 24, 2018 at 13:54

jmd_dk


answered Apr 20, 2016 at 19:24

Joaquin


1 Comment

Ayan Mitra Over a year ago

This is the cleanest and most fat free answer. Thanks

zhaoqing · Accepted Answer · 2017-04-13 06:54:20Z

36

You can use markersize to specify the size of the circles in the plot method:

import numpy as np
import matplotlib.pyplot as plt

x1 = np.random.randn(20)
x2 = np.random.randn(20)
plt.figure(1)
# you can specify the marker size two ways directly:
plt.plot(x1, 'bo', markersize=20)  # blue circles with size 20
plt.plot(x2, 'ro', ms=10)  # red circles with size 10; ms is just an alias for markersize
plt.show()

From here

answered Apr 13, 2017 at 6:54

zhaoqing


4 Comments

Dom Over a year ago

The question was about scatterplot, and in matplotlib the two plotting functions have different parameters (markersize for plot, and s for scatter). So this answer doesn't apply.

Przemek D Over a year ago

@Dom I upvoted, because this question pops up as the first result in google even when I search "pyplot plot marker size", so this answer helps.

zhaoqing Over a year ago

I know the plot method and the scatter method are different in plt, but both can produce a 'scatter plot' and adjust the marker size, so this answer is just another workaround if you use the plot method @Dom

fchen Over a year ago

scatter can specify colormap.

Ike · Accepted Answer · 2017-06-05 18:20:46Z

9

I also attempted to use 'scatter' initially for this purpose. After quite a bit of wasted time - I settled on the following solution.

import matplotlib.pyplot as plt
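# Each circle is described in data units: centre (x, y), radius and RGB colour.
input_list = [{'x': 100, 'y': 200, 'radius': 50, 'color': (0.1, 0.2, 0.3)}]
output_list = []
for point in input_list:
    output_list.append(plt.Circle((point['x'], point['y']), point['radius'],
                                  color=point['color'], fill=False))
ax = plt.gca()
ax.cla()
# Keep circles round; passing aspect= to gca() is not supported by recent matplotlib versions.
ax.set_aspect('equal')
ax.set_xlim((0, 1000))
ax.set_ylim((0, 1000))
for circle in output_list:
    ax.add_artist(circle)
plt.show()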

This is based on an answer to this question

answered Jun 5, 2017 at 18:20

Ike


2 Comments

grabantot Over a year ago

Very helpful. But why use two loops?

Ike Over a year ago

@grabantot no reason, just didn't think too much into it.

cottontail · Accepted Answer · 2023-08-09 20:42:20Z

Since this is the top search engine result for "how to change scatter plot marker sizes in python", here is a summary of scatter plot marker size definitions in some of the most popular plotting libraries in Python:

matplotlib (where s=markersize**2):
- plt.scatter(x, y, s=9)
- plt.plot(x, y, 'o', markersize=3)
pandas:
- df.plot(x='A', y='B', kind='scatter', s=9)
- df.plot(x='A', y='B', marker='o', linestyle='', markersize=3)
seaborn: sns.scatterplot(x=x, y=y, s=9)

On the subject of scatter plot marker size, one important thing not mentioned here is that each marker defines edge color and width and depending on how these are set, the resulting markers may end up with different sizes.

By default, an edge with width=1 (lw=1) point is drawn around the markers, which makes the markers bigger. As the following code shows, if we remove the edge lines (by setting lw=0), we get a much smaller marker. In fact, s=1 with lw=1 draws a marker the same size as s=4 with lw=0.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6/72, 2/72), dpi=72*50)
ax.scatter([-0.3], [0], s=1, lw=0)
ax.scatter([1.5], [0], s=1)
ax.scatter([3.8], [0], s=4, lw=0)
ax.set(position=[0,0,1,1], xticks=[], yticks=[], xlim=(-1,5))
plt.setp(ax.spines.values(), linewidth=0.01);
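
The arithmetic behind the s=1 versus s=4 comparison can be sketched as follows. It assumes the edge line is centred on the marker outline, so that half of its width falls outside; both helper names are hypothetical:

```python
def outer_width(s, linewidth):
    """Overall marker width in points, assuming the edge is centred on the outline."""
    return s ** 0.5 + linewidth


def equivalent_s_without_edge(s, linewidth):
    """s that gives the same overall width when the marker is drawn with lw=0."""
    return outer_width(s, linewidth) ** 2


# s=1 with a 1 pt edge is 2 pt wide overall, the same as s=4 with no edge.
same_as = equivalent_s_without_edge(1, 1)
```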

For a more concrete example, in the following figure, the same marker size (s=36) was passed to seaborn, matplotlib and pandas scatter-plot plotters but because the default edge color and marker edge width in seaborn are different from those in the other two methods, the scatter plots end up with markers of different sizes.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

x = range(20)
df = pd.DataFrame({'A': x, 'B': x})

fig, axs = plt.subplots(1, 3, figsize=(15,3), facecolor='white')

sns.scatterplot(x=x, y=x, ax=axs[0], s=36);
axs[1].scatter(x, x, s=36);
df.plot.scatter('A', 'B', ax=axs[2], s=36);

ms = {f"ax{i}": ax.collections[0] for i, ax in enumerate(axs)}
axs[0].set_title(f"seaborn scatterplot\nmarker edgewidth: {ms['ax0'].get_lw()[0]}\nedgecolor: {ms['ax0'].get_ec()[0]}")
axs[1].set_title(f"matplotlib scatter\nmarker edgewidth: {ms['ax1'].get_lw()[0]}\nedgecolor: {ms['ax1'].get_ec()[0].round(2)}");
axs[2].set_title(f"pandas scatter plot\nmarker edgewidth: {ms['ax2'].get_lw()[0]}\nedgecolor: {ms['ax2'].get_ec()[0].round(2)}");

DomTomCat · Accepted Answer · 2016-11-05 19:37:50Z

2

If the size of the circles corresponds to the square of the parameter in s=parameter, then take the square root of each element you append to your size array, like this: s=[1, 1.414, 1.73, 2.0, 2.24]. When these values are squared for display, the relative size increase is the square of a square-root progression, which is a linear progression.

That is, if each one is squared as it gets output to the plot: output=[1, 2, 3, 4, 5]. Try a list comprehension: s=[numpy.sqrt(i) for i in s]

edited Nov 5, 2016 at 19:37

DomTomCat


answered May 26, 2015 at 20:47

user34028


2 Comments

Sigur Over a year ago

It should be i in output, shouldn't it?

user3212761 Over a year ago

Agree with @Sigur
