I am stuck on a problem and cannot move forward. I have an input Excel sheet in this format:

Name    usn          Sub     marks
dhdn    1bm15mca13   c       90
                     java    95
                     python  98
subbu   1bm15mca13   java    92
                     perl    91
paddu   1bm15mca13   c#      80
                     java    81

I am trying to get a list of dictionaries in this format:

d = [
{
"name":"dhdn",
"usn":"1bm15mca13",
"sub":["c","java","python"],
"marks":[90,95,98]
},
{
"name":"subbu",
"usn":"1bm15mca14",
"sub":["java","perl"],
"marks":[92,91]
},
{
"name":"paddu",
"usn":"1bm15mca17",
"sub":["c#","java"],
"marks":[80,81]
}
]

I tried the code below, but it only works for two columns:

import pandas as pd
existing_excel_file = 'test.xls'

df_service = pd.read_excel(existing_excel_file, sheet_name='Sheet1')

# Forward-fill the blank name/usn cells (fillna(method='ffill') is deprecated in recent pandas)
df_service = df_service.ffill()
result = [{'name': k, 'sub': g["sub"].tolist()} for k, g in df_service.groupby("name")]
print(result)

Please suggest an idea or approach to solve this.

  • Why don't you group by both name and usn? Then each group key will be a tuple, with the name first and the usn second. Also, each group still has all the columns, so g['marks'].tolist() will give you the marks list. Commented Feb 27, 2020 at 18:53
  • @najeem Could you please show how? Commented Feb 27, 2020 at 19:03
  • Please paste data that I can use instead of a screenshot, maybe as it looks after the ffill. Commented Feb 27, 2020 at 19:10
  • I tried it this way, but name and usn both end up together in a single key value: import pandas as pd; existing_excel_file = 'test.xls'; df_service = pd.read_excel(existing_excel_file, sheet_name='Sheet2'); df_service = df_service.fillna(method='ffill'); result = [{'name':k,'usn':k,'sub':g["sub"].tolist(),"marks":g["marks"].tolist()} for k,g in df_service.groupby(['name', 'usn'])]; print(result) Commented Feb 27, 2020 at 19:13
  • 'name':k[0], 'usn':k[1] Commented Feb 27, 2020 at 19:17
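
The tuple keys discussed in the comments can be seen on a small in-memory frame. This is only a sketch: the DataFrame below is a hypothetical, already forward-filled stand-in for the sheet, not the asker's actual file.

```python
import pandas as pd

# Hypothetical, already forward-filled sample (not the real spreadsheet)
df = pd.DataFrame({
    "name": ["dhdn", "dhdn", "subbu"],
    "usn": ["1bm15mca13", "1bm15mca13", "1bm15mca14"],
    "sub": ["c", "java", "perl"],
    "marks": [90, 95, 91],
})

# Grouping by a list of two columns yields one (name, usn) tuple per group
keys = [k for k, _ in df.groupby(["name", "usn"])]
print(keys)
```

Because each key is a tuple, k[0] is the name and k[1] is the usn, which is why 'name':k put both values under one key.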

1 Answer

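import pandas as pd
from pprint import pprint

existing_excel_file = 'test.xls'

df_service = pd.read_excel(existing_excel_file, sheet_name='Sheet1')

# Forward-fill the blank name/usn cells so every row carries its student
df_service = df_service.ffill()

# Grouping by two columns gives (name, usn) tuples as keys: k[0] is name, k[1] is usn
result = [{'name': k[0], 'usn': k[1], 'sub': v["sub"].tolist(), "marks": v["marks"].tolist()}
          for k, v in df_service.groupby(['name', 'usn'])]
pprint(result)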
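
To try the same approach without the Excel file, here is a minimal sketch on an in-memory frame. The data is an assumption: it mimics what pd.read_excel would likely return (blank cells as missing values) and uses the usn values from the expected output in the question, which differ from those in the input table. Passing sort=False is an addition, not part of the answer above; it keeps the students in their original sheet order instead of sorting them by name.

```python
import pandas as pd

# Hypothetical stand-in for the sheet; None marks the blank cells
df = pd.DataFrame({
    "name": ["dhdn", None, None, "subbu", None, "paddu", None],
    "usn": ["1bm15mca13", None, None, "1bm15mca14", None, "1bm15mca17", None],
    "sub": ["c", "java", "python", "java", "perl", "c#", "java"],
    "marks": [90, 95, 98, 92, 91, 80, 81],
})

# Forward-fill only the key columns, leaving sub/marks untouched
df[["name", "usn"]] = df[["name", "usn"]].ffill()

# sort=False keeps groups in order of first appearance in the sheet
result = [
    {"name": name, "usn": usn, "sub": g["sub"].tolist(), "marks": g["marks"].tolist()}
    for (name, usn), g in df.groupby(["name", "usn"], sort=False)
]
print(result)
```

Filling only name and usn avoids accidentally copying a value into a genuinely empty sub or marks cell, should the real sheet contain any.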