I have a dataset 'df' with 3 columns.
>> Original Data
  Student Id    Name  Marks
0       id_1    John    112
1       id_2    Rafs    181
2       id_2    Rafs    182
3       id_2    Rafs    183
4       id_3    Juan    222
5       id_3    Juan    312
6       id_3  Roller     21
I want to keep the 'Student Id' and 'Name' columns as they are but spread 'Marks' across multiple columns, so that each unique ('Student Id', 'Name') pair gets a single row holding all of its marks. The Marks columns should not be created manually; they should be generated dynamically, depending on how many marks the pairs have.
>> Expected Output
  Student Id    Name  Marks1  Marks2  Marks3
0       id_1    John     112    <NA>    <NA>
1       id_2    Rafs     181     182     183
2       id_3    Juan     222     312    <NA>
3       id_3  Roller      21    <NA>    <NA>
Sample data to replicate the input
import pandas as pd
data = [
["id_1", 'John', 112],
["id_2", 'Rafs', 181],
["id_2", 'Rafs', 182],
["id_2", 'Rafs', 183],
["id_3", 'Juan', 222],
["id_3", 'Juan', 312],
["id_3", 'Roller', 21]
]
df = pd.DataFrame(data, columns = ['Student Id', 'Name', 'Marks'])
I tried the following, but I am not getting the desired output. The index comes back as tuples in parentheses, and the Marks values are missing.
df3 = df.pivot_table(index=['Student Id','Name'], columns='Marks', aggfunc = 'max')
>> Output
Empty DataFrame
Columns: []
Index: [(id_1, John), (id_2, Rafs), (id_3, Juan), (id_3, Roller)]
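A likely reason the result is empty: 'Student Id' and 'Name' are used as the index and 'Marks' as the columns, so no column is left over for pivot_table to aggregate as values. One possible approach (a sketch, not necessarily the only or best way) is to number each mark within its ('Student Id', 'Name') group using groupby().cumcount() and pivot on that number. The helper names pos and wide are made up for illustration, and passing a list to pivot's index is assumed to be supported (pandas 1.1 or later).

```python
import pandas as pd

data = [
    ["id_1", "John", 112],
    ["id_2", "Rafs", 181],
    ["id_2", "Rafs", 182],
    ["id_2", "Rafs", 183],
    ["id_3", "Juan", 222],
    ["id_3", "Juan", 312],
    ["id_3", "Roller", 21],
]
df = pd.DataFrame(data, columns=["Student Id", "Name", "Marks"])

# Position of each mark within its (Student Id, Name) group: 1, 2, 3, ...
pos = df.groupby(["Student Id", "Name"]).cumcount() + 1

wide = (
    df.assign(pos=pos)                       # "pos" is a hypothetical helper column
      .pivot(index=["Student Id", "Name"], columns="pos", values="Marks")
      .add_prefix("Marks")                   # integer labels 1, 2, 3 become Marks1, Marks2, Marks3
      .astype("Int64")                       # nullable integers, so gaps show as <NA>
      .rename_axis(columns=None)             # drop the leftover "pos" axis label
      .reset_index()
)
print(wide)
```

Pivoting on the integer position and adding the prefix afterwards keeps the columns in numeric order even when a group has ten or more marks; building the string labels first would sort Marks10 before Marks2.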