Converting lists of lists of tuples to MultiIndex pandas dataframe

Question

I have a structure like this:

[
  [
    ('2019-12-01', '0.03555', '0.03', '0.03', '0.03'), 
    ('2019-12-02', '0.03', '0.03', '1', '0.03')
  ],
  [
    ('2019-12-01', '0.111', '0.02', '0.03', '0.03'), 
    ('2019-12-02', '0.03', '0.03', '0.03', '0.03')
  ]
]

I would like each list entry to be an index in a pandas dataframe, with the tuples being rows in the df. Something like this:

                         LIST_1                      LIST_2
         date      p1    p2     p3    p4    |   p1    p2     p3    p4
0   2019-12-01  0.03555  0.03  0.03   0.03  | 0.03  0.03  0.03   0.03
1   2019-12-02     0.03  0.03     1   0.03  | 0.03  0.03  0.03   0.03

I know this is messy. To be honest, I'm unsure of the best way to structure it in pandas as I'm new to it, so any advice would be appreciated.

I have tried to flatten the structure using:

d = pd.DataFrame([t for lst in a for t in lst])

But then, as expected, I just end up with a flat df like this:

        0          1     2     3      4
0   2019-12-01  0.03555  0.03  0.03   0.03
1   2019-12-02     0.03  0.03     1   0.03
2   2019-12-01    0.111  0.02  0.03   0.03
3   2019-12-02     0.03  0.03  0.03   0.03

But this isn't suitable, since the rows from both lists end up mixed together.

jezrael · Accepted Answer · 2020-02-13 14:15:50Z

First, build the labels for the top level of the MultiIndex with a list comprehension and f-strings, one label per inner list (the outer list is called lst here).

Then, in the main list comprehension, convert each inner list to a DataFrame, move its first column (the dates) into the index with DataFrame.set_index, and rename the remaining columns with DataFrame.add_prefix.

Finally, join the list of DataFrames with concat, passing axis=1 and the keys parameter to create the top level of the MultiIndex, and remove the leftover index name 0 with DataFrame.rename_axis:

import pandas as pd

# lst is the list of lists of tuples from the question
L = [f'LIST_{i}' for i in range(1, len(lst)+1)]
df = (pd.concat([pd.DataFrame(x).set_index(0).add_prefix('p') for x in lst], axis=1, keys=L)
        .rename_axis(None))
print(df)
             LIST_1                   LIST_2                  
                 p1    p2    p3    p4     p1    p2    p3    p4
2019-12-01  0.03555  0.03  0.03  0.03  0.111  0.02  0.03  0.03
2019-12-02     0.03  0.03     1  0.03   0.03  0.03  0.03  0.03
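
To make each step easier to follow, the same chain can be unrolled into separate statements. This is only a sketch of the answer's approach using the sample data from the question. The names keys, frames, frame and numeric are chosen here for illustration, and the final astype(float) conversion is an optional extra that the original answer does not include.

```python
import pandas as pd

# Sample data copied from the question (named lst to match the answer).
lst = [
    [
        ('2019-12-01', '0.03555', '0.03', '0.03', '0.03'),
        ('2019-12-02', '0.03', '0.03', '1', '0.03'),
    ],
    [
        ('2019-12-01', '0.111', '0.02', '0.03', '0.03'),
        ('2019-12-02', '0.03', '0.03', '0.03', '0.03'),
    ],
]

# Step 1: one top-level label per inner list -> ['LIST_1', 'LIST_2'].
keys = [f'LIST_{i}' for i in range(1, len(lst) + 1)]

# Step 2: build one DataFrame per inner list.
frames = []
for inner in lst:
    frame = pd.DataFrame(inner)    # columns are the integers 0..4
    frame = frame.set_index(0)     # column 0 (the date) becomes the index
    frame = frame.add_prefix('p')  # columns 1..4 become 'p1'..'p4'
    frames.append(frame)

# Step 3: place the frames side by side; keys become the top column level.
# rename_axis(None) drops the index name 0 left over from set_index(0).
df = pd.concat(frames, axis=1, keys=keys).rename_axis(None)

# Optional (not in the original answer): the values are still strings,
# so convert them if numeric operations are needed.
numeric = df.astype(float)
```

Because both inner lists share the same dates, concat aligns the rows on the date index rather than on row position.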

edited Feb 13, 2020 at 14:15

answered Feb 13, 2020 at 12:44


7 Comments

Bob Over a year ago

Thank you @jezrael :D - are you able to just edit the post and describe what each section is doing? will help me consolidate my understanding rather than just ctrl-c and ctrl-v - also could you show me an example of how to query such a structure

jezrael Over a year ago

@Bob - I think for selecting you can check this
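
As a rough illustration of querying a frame with two column levels, a minimal sketch follows. It rebuilds the answer's result directly so that it runs on its own. The selections shown are standard pandas idioms and are not necessarily the ones from the post jezrael refers to; the variable names are chosen here for illustration.

```python
import pandas as pd

# Rebuild the answer's result: two column levels, dates as the index.
columns = pd.MultiIndex.from_product(
    [['LIST_1', 'LIST_2'], ['p1', 'p2', 'p3', 'p4']]
)
df = pd.DataFrame(
    [
        ['0.03555', '0.03', '0.03', '0.03', '0.111', '0.02', '0.03', '0.03'],
        ['0.03', '0.03', '1', '0.03', '0.03', '0.03', '0.03', '0.03'],
    ],
    index=['2019-12-01', '2019-12-02'],
    columns=columns,
)

# Everything under one top-level label: a DataFrame with columns p1..p4.
list_1 = df['LIST_1']

# A single column, addressed by a (top level, second level) tuple: a Series.
one_col = df[('LIST_2', 'p1')]

# A single cell: row label first, then the column tuple.
value = df.loc['2019-12-01', ('LIST_2', 'p1')]

# The same second-level column across every list: columns LIST_1, LIST_2.
all_p1 = df.xs('p1', axis=1, level=1)
```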

Bob Over a year ago

Thanks for the edit and will take a look at that post.

Bob Over a year ago

What's the best way for me to add an index to the first col on the MultiIndex df?

jezrael Over a year ago

@Bob - hmmm, I think it is possible by df = df.reset_index()
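
A small sketch of what reset_index does here, assuming the goal is a 0, 1, ... row index with the dates moved into a regular column. The frame below is a cut-down stand-in for the answer's result, and naming the index 'date' first is an optional touch added for this example.

```python
import pandas as pd

# Cut-down stand-in for the answer's result (two columns per list).
columns = pd.MultiIndex.from_product([['LIST_1', 'LIST_2'], ['p1', 'p2']])
df = pd.DataFrame(
    [['0.03555', '0.03', '0.111', '0.02'],
     ['0.03', '0.03', '0.03', '0.03']],
    index=['2019-12-01', '2019-12-02'],
    columns=columns,
)

# Name the index so the new column gets a readable label, then move it
# into the columns; the rows are renumbered 0, 1, ...
out = df.rename_axis('date').reset_index()

# With MultiIndex columns the new column's label is padded to two levels.
first_label = out.columns[0]
dates = out.iloc[:, 0].tolist()
```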
