I am working with fifa dataset and there I have 2 columns player name and playing positions. The playing position column contains multiple values as players can play in multiple positions. It is a fifa dataset from kaggle
df = pd.read_csv('dataset.csv')
df = df['name', 'pos']
df.head()
| index | name | pos |
|---|---|---|
| 0 | messi | ST, RW, LW |
| 1 | lewa | ST |
| 2 | ronaldo | LW, ST |
I want to convert that and map those to 4 separate positions for instance to Att, mid, def, gk . I tried using replace and give it a dictionary with mapping but seems to only work with players that have only 1 position. For instance messi has RW, LW, ST so my mapping doesnot work but someone like Lewandowski who has only ST it works.
Then I tried splitting the column and exploding it into different rows, and then replace , which made three 0 indexed rows with messi but that replaces only on first instance of index as well. How do i solve this ?
df1 = df[['name', 'pos']]
df1['pos'] = df['pos'].str.split(",")
df1 = df1.explode('pos')
new_pos = {'RW': 'ATT', 'LW': 'ATT', 'ST': 'ATT', 'CF': 'ATT', 'CAM': 'MID', 'CDM': 'MID', 'GK': 'GK', 'CM': 'MID' }
df1.replace({'pos': new_pos}, inplace=True)
df1.head()
below is how the df looks, there is no index column i have just added here to show to more clearly.
| index | name | pos |
|---|---|---|
| 0 | messi | ATT |
| 0 | messi | ST |
| 0 | messi | LW |
Expected output:
| index | name | pos |
|---|---|---|
| 0 | messi | ATT |
| 1 | lewa | ATT |
| 2 | ronaldo | ATT |