I have tried to use the adjust_text function from adjustText to get the scatter point labels in matplotlib to not overlap.

#Adding the names
for i, txt in enumerate(bigdf['Player']):
    if bigdf['Goals'][i] >= 5 or bigdf["Assists"][i] >= 3:
        ax.annotate(txt, (bigdf['Goals'][i]+0.15, bigdf["Assists"][i]))
        adjust_text(ax.annotate, x=bigdf['Goals'], y=bigdf["Assists"])
    else:
        None

I am using data stored in a DataFrame (bigdf), and I want each player's name to appear next to their scatter point on the graph. However, when I plot them, some of the names overlap and become unreadable. I tried the code above to adjust the text so the labels do not overlap, but to no avail.

That's what it looks like right now: Figure showing overlapping labels

Portion of DataFrame

Any suggestions?

  • A picture of what it looks like right now, as well as your data set/a dummy dataset would be nice to reproduce what you are experiencing. Commented Sep 1, 2020 at 14:52
  • Hi Nils, I have added what you recommended. Commented Sep 1, 2020 at 19:00
  • Please add the dataframe as copyable code instead of as a picture. Commented Sep 2, 2020 at 13:13
  • Further, I don't know of an automated way to add the player names without overlapping, but I would suggest just placing the text manually, as this seems to occur only 4 times anyway. Commented Sep 2, 2020 at 13:15

1 Answer

adjust_text() works on a list of text objects, so collect what ax.annotate() returns in a list and pass that list to adjust_text() once, after the loop. The first graph has no embellishments, and the second has arrows pointing to the scatter points. Note: Some of the scatter marks are missing for unknown reasons.

import pandas as pd

df = pd.read_csv('./Data/PremierLeague_1920.csv', encoding='utf-8')
df.head()
|    |   RANK | PLAYER                    | TEAM            |   GP |   GS |   MIN |   G |   ASST |   SHOTS |   SOG |
|---:|-------:|:--------------------------|:----------------|-----:|-----:|------:|----:|-------:|--------:|------:|
|  0 |      1 | Jamie Vardy               | Leicester City  |   35 |   34 |  3034 |  23 |      5 |      71 |    43 |
|  1 |      2 | Daniel William John Ings  | Southampton     |   38 |   32 |  2812 |  22 |      2 |      66 |    38 |
|  2 |      3 | Pierre-Emerick Aubameyang | Arsenal         |   36 |   35 |  3138 |  22 |      3 |      70 |    42 |
|  3 |      4 | Raheem Shaquille Sterling | Manchester City |   33 |   30 |  2660 |  20 |      1 |      68 |    38 |
|  4 |      5 | Mohamed Salah Ghaly       | Liverpool       |   34 |   33 |  2884 |  19 |     10 |      95 |    59 |

# Pick out two teams
df1 = df[(df['TEAM'] == 'Leicester City') | (df['TEAM'] == 'Liverpool')]

import matplotlib.pyplot as plt
from adjustText import adjust_text

fig = plt.figure(figsize=(6,6),dpi=144)
ax = fig.add_subplot(111)

players = []
team_name = ['Leicester City','Liverpool']
for index, row in df1.iterrows():
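    # Access columns by name; positional row[1] on a labelled Series is deprecated in recent pandas
    player_name = row['PLAYER']
    team = row['TEAM']
    goal = row['G']
    assist = row['ASST']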
    if team == team_name[0]:
        color = 'b'
    else:
        color = 'r'
    ax.scatter(goal, assist, c=color, s=25, alpha=0.8, edgecolors='none')
    if goal >=5 or assist >=3:
        players.append(ax.annotate(player_name, xy=(goal + 1, assist + 1), size=8))

adjust_text(players)
ax.legend(loc='best', labels=team_name)
ax.grid(False)

plt.show()

[Figure: labels after adjust_text(), without arrows]

adjust_text(players, arrowprops=dict(arrowstyle='->', color='red'))

[Figure: labels after adjust_text(), with arrows]
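For reference, the loop in the question passes the ax.annotate function itself to adjust_text() and calls it on every iteration, rather than passing the list of text objects once. Below is a minimal sketch of the same idea applied to the question's column names (Player, Goals, Assists). The data is made up for illustration, ax.text() is used in place of ax.annotate() (both return text objects), and the import guard and Agg backend are only there so the sketch still runs without the third-party adjustText package or a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

try:
    from adjustText import adjust_text  # third-party: pip install adjustText
except ImportError:
    adjust_text = None  # the labels are still drawn, just not repositioned

# Dummy stand-in for bigdf; the values are invented for illustration only
bigdf = pd.DataFrame({
    "Player": ["Player A", "Player B", "Player C", "Player D", "Player E"],
    "Goals": [6, 6, 5, 2, 1],
    "Assists": [3, 3, 4, 3, 0],
})

fig, ax = plt.subplots()
ax.scatter(bigdf["Goals"], bigdf["Assists"])

# Same condition as in the question: label only the notable players
mask = (bigdf["Goals"] >= 5) | (bigdf["Assists"] >= 3)

# Collect the text objects in a list while creating them
texts = [
    ax.text(row.Goals, row.Assists, row.Player, size=8)
    for row in bigdf[mask].itertuples(index=False)
]

# Call adjust_text once, after all labels exist, with the list of text objects
if adjust_text is not None:
    adjust_text(texts, ax=ax)

# plt.show()  # uncomment when running interactively
```

Building the boolean mask first also removes the need for the empty else branch in the original loop.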

1 Comment

Thanks! The code works in getting the player names separated. Also, I like how you have used df.iterrows(); previously I was plotting team by team.
