I would like to construct a MultiIndex DataFrame from a deeply-nested dictionary of the form

md = {'50': {'100': {'col1': ('0.100',
                              '0.200',
                              '0.300',
                              '0.400'),
                     'col2': ('6.263E-03',
                              '6.746E-03',
                              '7.266E-03',
                              '7.825E-03')},
             '101': {'col1': ('0.100',
                              '0.200',
                              '0.300',
                              '0.400'),
                     'col2': ('6.510E-03',
                              '7.011E-03',
                              '7.553E-03',
                              '8.134E-03')},
             '102': ...
            },
      '51': ...
     }

I've tried

df = pd.DataFrame.from_dict({(i,j): md[i][j][v] for i in md.keys() for j in md[i].keys() for v in md[i][j]}, orient='index')

following "Construct pandas DataFrame from items in nested dictionary", but I get a DataFrame with 1 row and many columns.

Bonus: I'd also like to label the MultiIndex keys and the columns 'col1' and 'col2', as well as convert the strings to int and float, respectively.

How can I reconstruct my original dictionary from the dataframe? I tried df.to_dict('list').

1 Answer

Check out this answer: https://stackoverflow.com/a/24988227/9404057. That method unpacks the nested keys and values of the dictionary into a flat dictionary keyed by tuples, which pandas turns into a MultiIndex. Note that on Python 3 you need .items() rather than the .iteritems() shown in the linked answer:

>>> import pandas as pd
>>> reform = {(firstKey, secondKey, thirdKey): values
...           for firstKey, middleDict in md.items()
...           for secondKey, innerDict in middleDict.items()
...           for thirdKey, values in innerDict.items()}
>>> df = pd.DataFrame(reform)
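As a minimal, self-contained sketch of the step above, the following uses a shortened sample in the same shape as md (two values per column instead of four) and also names the column levels. The level names "first", "second" and "field" are made up for illustration and are not part of the question.

```python
import pandas as pd

# Shortened sample in the same shape as md from the question.
md = {
    "50": {
        "100": {"col1": ("0.100", "0.200"), "col2": ("6.263E-03", "6.746E-03")},
        "101": {"col1": ("0.100", "0.200"), "col2": ("6.510E-03", "7.011E-03")},
    }
}

# Flatten the three dictionary levels into tuple keys.
reform = {
    (first, second, third): values
    for first, middle in md.items()
    for second, inner in middle.items()
    for third, values in inner.items()
}

# Tuple keys become a three-level MultiIndex on the columns.
df = pd.DataFrame(reform)
df.columns.names = ["first", "second", "field"]  # hypothetical level names
print(df)
```

Each tuple of values becomes one column, so the frame has one row per position in the tuples.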

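To change the labels col1 and col2 to an int and a float, you can then use pandas.DataFrame.rename() and specify any values you want. Note that this changes only the column labels; the values themselves stay strings until you convert them, for example with df.astype(float):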

df.rename({'col1':1, 'col2':2.5}, axis=1, level=2, inplace=True)

If you'd rather have the levels on the index than on the columns, you can transpose the result with pandas.DataFrame.T (df.T).
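If the goal is the layout from the question's bonus (keys on the index, col1 and col2 as ordinary numeric columns), one alternative to transposing is to build one small frame per key pair and concatenate them. This is only a sketch under assumptions: it reads the bonus as "keys to int, values to float", uses a shortened sample, and the level names are invented for illustration.

```python
import pandas as pd

# Shortened sample in the same shape as md from the question.
md = {
    "50": {
        "100": {"col1": ("0.100", "0.200"), "col2": ("6.263E-03", "6.746E-03")},
        "101": {"col1": ("0.100", "0.200"), "col2": ("6.510E-03", "7.011E-03")},
    }
}

# One small frame per (outer, inner) key pair; the keys are cast to int here.
frames = {
    (int(first), int(second)): pd.DataFrame(inner)
    for first, middle in md.items()
    for second, inner in middle.items()
}

# The tuple keys become the outer index levels; each frame's own
# row number becomes the last level. astype(float) parses the strings.
tidy = pd.concat(frames).astype(float)
tidy.index.names = ["first", "second", "row"]  # hypothetical level names
print(tidy)
```

With this layout a single value can be looked up by its full key, for example tidy.loc[(50, 101, 1), "col2"].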

If you wanted to reconstruct your dictionary from this MultiIndex, you could do something like this:

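>>> md2 = {}
>>> for i in df.columns:
...     # create the nested dictionaries the first time a key is seen
...     if i[0] not in md2:
...         md2[i[0]] = {}
...     if i[1] not in md2[i[0]]:
...         md2[i[0]][i[1]] = {}
...     # i is the full (first, second, third) column key
...     md2[i[0]][i[1]][i[2]] = tuple(df[i].values)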
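To check that the reconstruction really round-trips, here is a compact sketch of the same loop using dict.setdefault, again on a shortened sample. It assumes the columns have not been renamed and the values have not been converted; otherwise the rebuilt dictionary will differ from the original in its innermost keys or values.

```python
import pandas as pd

# Shortened sample in the same shape as md from the question.
md = {
    "50": {
        "100": {"col1": ("0.100", "0.200"), "col2": ("6.263E-03", "6.746E-03")},
        "101": {"col1": ("0.100", "0.200"), "col2": ("6.510E-03", "7.011E-03")},
    }
}

# Build the frame with three-level MultiIndex columns, as in the answer.
df = pd.DataFrame({
    (first, second, third): values
    for first, middle in md.items()
    for second, inner in middle.items()
    for third, values in inner.items()
})

# Rebuild the nested dictionary; setdefault creates each level on first use.
md2 = {}
for first, second, third in df.columns:
    column = df[(first, second, third)]
    md2.setdefault(first, {}).setdefault(second, {})[third] = tuple(column.tolist())

print(md2 == md)
```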

2 Comments

How can I reconstruct my original dictionary from the dataframe?
@redhotsnow I added a reconstruction technique to my answer.
