convert pandas dataframe multi values column into separate rows

Question

I am working with fifa dataset and there I have 2 columns player name and playing positions. The playing position column contains multiple values as players can play in multiple positions. It is a fifa dataset from kaggle

df = pd.read_csv('dataset.csv')
df = df['name', 'pos']
df.head()

index	name	pos
0	messi	ST, RW, LW
1	lewa	ST
2	ronaldo	LW, ST

I want to convert that and map those to 4 separate positions for instance to Att, mid, def, gk . I tried using replace and give it a dictionary with mapping but seems to only work with players that have only 1 position. For instance messi has RW, LW, ST so my mapping doesnot work but someone like Lewandowski who has only ST it works.

Then I tried splitting the column and exploding it into different rows, and then replace , which made three 0 indexed rows with messi but that replaces only on first instance of index as well. How do i solve this ?

df1 = df[['name', 'pos']]
df1['pos'] = df['pos'].str.split(",")
df1 = df1.explode('pos')
new_pos = {'RW': 'ATT', 'LW': 'ATT', 'ST': 'ATT', 'CF': 'ATT', 'CAM': 'MID', 'CDM': 'MID', 'GK': 'GK', 'CM': 'MID' }
df1.replace({'pos': new_pos}, inplace=True)
df1.head()

below is how the df looks, there is no index column i have just added here to show to more clearly.

index	name	pos
0	messi	ATT
0	messi	ST
0	messi	LW

Expected output:

index	name	pos
0	messi	ATT
1	lewa	ATT
2	ronaldo	ATT

Please provide an explicit reproducible example of your input and the matching expected output — mozway
– mozway, Commented Oct 3, 2022 at 19:08

mozway · Accepted Answer · 2022-10-03 19:27:04Z

1

If you only want the first position, the best might be to use a list comprehension:

df['new_pos'] = [new_pos[next(iter(x.split(',')))] for x in df['pos']]

output:

   index     name         pos new_pos
0      0    messi  ST, RW, LW     ATT
1      1     lewa          ST     ATT
2      2  ronaldo      LW, ST     ATT

The (likely much less efficient) pandas equivalent would be:

df['new_pos'] = (df['pos']
                 .str.split(',')
                 .explode()
                 .map(new_pos)
                 .groupby(level=0).first()
                )

answered Oct 3, 2022 at 19:27

mozway

267k13 gold badges56 silver badges106 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

convert pandas dataframe multi values column into separate rows

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related