1

I have a nested dict with the following structure: each key is a course_id, and each value is a nested dict holding 2 recommended courses and the number of purchases for each of them. For example, entries of this dict look something like this:

 {490: {566: 253, 551: 247},
 357: {571: 112, 356: 100},
 507: {570: 172, 752: 150}}

I tried this code to make a dataframe from this dict:

result=pd.DataFrame.from_dict(dicts, orient='index').stack().reset_index()
result.columns=['Course ID','Recommended course','Number of purchases']

Please see the output.

This doesn't quite work for me, because I want an output with 5 columns: Course ID, recommended course 1, purchases 1, recommended course 2, purchases 2. Is there a solution for this? Thanks in advance.

2
  • 1
    Please show what the expected output looks like with a synthetic dataframe. Commented Jun 19, 2020 at 16:44
  • 1
    Yes, and please explain what is what in the nested dictionary. Commented Jun 19, 2020 at 16:46

3 Answers

1

I would recommend that you just reshape your dictionary and then re-create your dataframe. However, you're not far off from getting your target output from your current dataframe.

We can group by Course ID and use cumcount to number each recommendation within its course, then unstack that counter into the columns and flatten the resulting MultiIndex header into single column names.

s1 = result.groupby(['Course ID',
             result.groupby(['Course ID']).cumcount() + 1]).first().unstack()

s1.columns = [f"{x}_{y}" for x,y in s1.columns]


              Recommended course_1  Recommended course_2  Number of purchases_1  \
Course ID                                                                      
357                         571                   356                  112.0   
490                         566                   551                  253.0   
507                         570                   752                  172.0   

           Number of purchases_2  
Course ID                         
357                        100.0  
490                        247.0  
507                        150.0
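
The reshaping route recommended at the start of this answer is not spelled out, so here is a minimal sketch of one way it might look. The helper name `to_wide_rows` is hypothetical, and the column names simply mirror the ones produced by the groupby approach; the sample data is copied from the question.

```python
# Sample data copied from the question.
dicts = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150},
}


def to_wide_rows(nested):
    """Turn {course: {rec_course: purchases, ...}} into one flat record per course.

    Hypothetical helper, not part of pandas.
    """
    rows = []
    for course_id, recs in nested.items():
        row = {"Course ID": course_id}
        # Number recommendations from 1, in the dict's own insertion order.
        for n, (rec_course, purchases) in enumerate(recs.items(), start=1):
            row[f"Recommended course_{n}"] = rec_course
            row[f"Number of purchases_{n}"] = purchases
        rows.append(row)
    return rows


rows = to_wide_rows(dicts)
print(rows[0])
```

Passing `rows` to `pd.DataFrame(rows)` should then give one row per course. Because the records are complete before the dataframe is built, no NaN padding is introduced along the way, so the purchase counts should stay integers instead of becoming floats as in the output above. Note that the columns come out interleaved (course 1, purchases 1, course 2, purchases 2) rather than grouped by name.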

0

Not an efficient approach, but it should work in your case:

df = pd.DataFrame(
    [(k, list(v.keys())[0], list(v.values())[0], list(v.keys())[1], list(v.values())[1])
     for k, v in a.items()],  # `a` is the nested dict from the question
    columns=['Course ID', 'Recommended course 1', 'purchases 1',
             'Recommended Course 2', 'purchases 2'],
)
print(df)

Output:

   Course ID  Recommended course 1  purchases 1  Recommended Course 2  \
0        490                   566          253                   551
1        357                   571          112                   356
2        507                   570          172                   752

   purchases 2
0          247
1          100
2          150


0

You can use itertools.chain to flatten each inner dict into a flat list of alternating keys and values, store those lists in a dictionary d2 (built with a dict comprehension) keyed by course id, and then build the dataframe from d2 with pandas.

import pandas as pd
from itertools import chain

d = {
    490: {566: 253, 551: 247},
    357: {571: 112, 356: 100},
    507: {570: 172, 752: 150}
}

d2 = {k: list(chain.from_iterable(v.items())) for k, v in d.items()}
df = pd.DataFrame.from_dict(d2, orient='index').reset_index()
df.columns = ['id','rec_course1', 'n_purch_1', 'rec_course2', 'n_purch_2']

df

    id   rec_course1  n_purch_1  rec_course2  n_purch_2
0  490           566        253          551        247
1  357           571        112          356        100
2  507           570        172          752        150
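
The hard-coded column list above assumes exactly two recommendations per course. If some courses had fewer, the same chain idea could be extended roughly as follows. This is only a sketch: the helper `flatten_with_padding` and the shortened sample data are hypothetical, and the generated column names just follow the naming used above.

```python
from itertools import chain

# Hypothetical data: course 357 has only one recommendation.
d = {
    490: {566: 253, 551: 247},
    357: {571: 112},
}


def flatten_with_padding(nested, fill=None):
    """Return (columns, rows), padding short rows with `fill`.

    Hypothetical helper, not part of pandas or itertools.
    """
    # Largest number of recommendations attached to any course.
    width = max((len(recs) for recs in nested.values()), default=0)

    columns = ["id"]
    for n in range(1, width + 1):
        columns += [f"rec_course{n}", f"n_purch_{n}"]

    rows = []
    for course_id, recs in nested.items():
        # Same flattening as above: [course, purchases, course, purchases, ...]
        flat = list(chain.from_iterable(recs.items()))
        # Each recommendation occupies two cells, so pad up to 2 * width.
        flat += [fill] * (2 * width - len(flat))
        rows.append([course_id, *flat])
    return columns, rows


columns, rows = flatten_with_padding(d)
print(columns)
print(rows)
```

`pd.DataFrame(rows, columns=columns)` should then build the frame, with the padded cells showing up as missing values.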
