How to build a MultiIndex Pandas DataFrame from a nested dictionary with lists

Question

I have the following dictionary:

d = {'key1': {'sub-key1': ['a','b','c','d','e']},
     'key2': {'sub-key2': ['1','2','3','5','8','9','10']}}

With the help of this post, I managed to successfully convert this dictionary to a DataFrame.

import pandas as pd

df = pd.DataFrame.from_dict({(i,j): d[i][j]
                            for i in d.keys()
                            for j in d[i].keys()},
                            orient='index')

However, my DataFrame takes the following form:

                  0  1  2  3  4     5     6
(key1, sub-key1)  a  b  c  d  e  None  None
(key2, sub-key2)  1  2  3  5  8     9    10

I can work with tuples as index values, but I think it is better to work with a multilevel DataFrame. Posts such as this one have helped me create it in two steps, but I am struggling to do it in one step (i.e. at initial creation), because the lists within the dictionary, and the tuples afterwards, add a level of complication.

So you already have a working solution and would like to improve your code? Please post your working solution, and use codereview.stackexchange.com
– WNG, Commented Nov 21, 2017 at 14:58
Use df.index = pd.MultiIndex.from_tuples(df.index) on what you've created already?
– Zero, Commented Nov 21, 2017 at 15:16
@Zero it's been a long time since I've seen you. Where have you been?
– Bharath M Shetty, Commented Nov 21, 2017 at 15:21

jezrael · Accepted Answer · 2017-11-21 15:22:10Z

19

I think you are close. To get a MultiIndex, you can use the MultiIndex.from_tuples method:

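# Flatten the nested dict: each key becomes an (outer, inner) tuple.
# Note that this rebinds the name d to the flattened dict.
d = {(i,j): d[i][j]
       for i in d.keys()
       for j in d[i].keys()}

# Build the MultiIndex from the tuple keys, then pass it as the index
mux = pd.MultiIndex.from_tuples(d.keys())
df = pd.DataFrame(list(d.values()), index=mux)
print(df)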
               0  1  2  3  4     5     6
key1 sub-key1  a  b  c  d  e  None  None
key2 sub-key2  1  2  3  5  8     9    10

Thanks to Zero for another solution, which converts the tuple index after the DataFrame has been created:

df = pd.DataFrame.from_dict({(i,j): d[i][j] 
                            for i in d.keys() 
                            for j in d[i].keys()},
                            orient='index')

df.index = pd.MultiIndex.from_tuples(df.index)
print(df)
               0  1  2  3  4     5     6
key1 sub-key1  a  b  c  d  e  None  None
key2 sub-key2  1  2  3  5  8     9    10
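
For reference, here is a self-contained sketch of the flatten-then-index approach above, wrapped in a function so the original dictionary is not overwritten. The helper name nested_lists_to_frame and the level names 'key' and 'sub_key' are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

nested = {'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']},
          'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']}}


def nested_lists_to_frame(nested, names=('key', 'sub_key')):
    """Hypothetical helper: two-level dict of lists -> MultiIndex DataFrame."""
    # Flatten to {(outer, inner): list} without touching the input dict
    flat = {(outer, inner): values
            for outer, inner_dict in nested.items()
            for inner, values in inner_dict.items()}
    # The tuple keys become the two levels of the row index
    index = pd.MultiIndex.from_tuples(list(flat.keys()), names=list(names))
    # Shorter lists are padded with missing values on the right
    return pd.DataFrame(list(flat.values()), index=index)


df = nested_lists_to_frame(nested)
print(df)
print(df.loc['key1'])  # select every row under the outer key 'key1'
```

With a real MultiIndex, selecting by the outer key alone (df.loc['key1']) works directly, which is the main practical gain over a plain index of tuples.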

edited Nov 21, 2017 at 15:22

answered Nov 21, 2017 at 15:03

jezrael


3 Comments

Bharath M Shetty Over a year ago

Sir, I see @Zero's comment now. You can add his name if you update the answer.

Brad Solomon Over a year ago

One improvement: just write mux = pd.MultiIndex.from_tuples(d). This works for the same reason that for k in dictionary... iterates over the keys, not the key/value pairs.

BENY Over a year ago

@Bharath Just curious: why not make the change within the DataFrame rather than in the dict?
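
Brad Solomon's suggestion can be checked with a small sketch. It assumes the dictionary has already been flattened to tuple keys (named flat here for illustration) and compares the index built from flat.keys() with the one built from the dict itself.

```python
import pandas as pd

# Already-flattened dict with (outer, inner) tuple keys; the name is illustrative
flat = {('key1', 'sub-key1'): ['a', 'b', 'c', 'd', 'e'],
        ('key2', 'sub-key2'): ['1', '2', '3', '5', '8', '9', '10']}

# Iterating over a dict yields its keys, so both calls see the same tuples
mux_from_keys = pd.MultiIndex.from_tuples(flat.keys())
mux_from_dict = pd.MultiIndex.from_tuples(flat)
same = mux_from_keys.equals(mux_from_dict)

df = pd.DataFrame(list(flat.values()), index=mux_from_dict)
print(same)
print(df)
```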

BENY · Answer · 2017-11-21 15:47:53Z

3

I would use stack for a two-level dict:

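# d is the original nested dict here, not the flattened one
df = pd.DataFrame(d)

# Transpose, stack into a Series of lists, then expand each list into columns
df.T.stack().apply(pd.Series)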
Out[230]: 
               0  1  2  3  4    5    6
key1 sub-key1  a  b  c  d  e  NaN  NaN
key2 sub-key2  1  2  3  5  8    9   10

answered Nov 21, 2017 at 15:47

BENY


2 Comments

BENY Over a year ago

@Bharath Also, I am just curious why we need to modify the dict. Personally, I do not like reconstructing the dict.

Bharath M Shetty Over a year ago

Maybe because it's the MultiIndex way of doing it, and it would be much faster than transpose and apply.
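
A self-contained sketch of the stack approach follows. Whether stack() drops the missing (key, sub-key) combinations appears to depend on the pandas version, so this sketch calls dropna() explicitly; that call and the variable names are additions, not part of the original answer.

```python
import pandas as pd

nested = {'key1': {'sub-key1': ['a', 'b', 'c', 'd', 'e']},
          'key2': {'sub-key2': ['1', '2', '3', '5', '8', '9', '10']}}

# Outer keys become columns and inner keys become rows; each cell is a list or NaN
wide = pd.DataFrame(nested)

# Transpose so outer keys are rows, then stack into a Series indexed by
# (outer, inner); dropna() removes combinations that do not exist in the dict
stacked = wide.T.stack().dropna()

# Expand each list into its own row of columns; shorter lists are padded with NaN
out = stacked.apply(pd.Series)
print(out)
```

This leaves the original dictionary untouched, at the cost of one Series construction per row in the apply step.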
