Create pandas dataframe from list of dictionary of list

Question

How can I create a DataFrame from a list of dictionaries, where each dictionary maps a column name to a list of row values? See the example below:

>>> import pandas as pd
>>> rec_set1 = {'col1': [1,2,3], 'col2': [5,3,4], 'col3': ['x','y','z']}
>>> rec_set2 = {'col1': [5,6,7], 'col2': [-4,6,2], 'col3': ['p','q','r']}
>>> rec_set_all = [rec_set1, rec_set2]
>>> df = pd.DataFrame.from_records(rec_set1)
>>> df
   col1  col2 col3
0     1     5    x
1     2     3    y
2     3     4    z

All good so far.
Now I try to append rec_set2 and this is what happens:

>>> df = df.append(rec_set2, ignore_index=True)
>>> df
        col1        col2       col3
0          1           5          x
1          2           3          y
2          3           4          z
3  [5, 6, 7]  [-4, 6, 2]  [p, q, r]

Not what I was expecting. Which append function should I use?
And rather than doing it in a loop, is there a simple one-line way to create the entire dataframe from rec_set_all?

pd.concat([pd.DataFrame(rec_set1), pd.DataFrame(rec_set2)])?
– user2285236, Commented Jan 8, 2020 at 21:08
Wouldn't df = df.append(pd.DataFrame(rec_set2), ignore_index=True) work? As in, you just forgot to turn the other dictionary into a DataFrame?
– Buckeye14Guy, Commented Jan 8, 2020 at 21:09
"Not what I was expecting." Really? Have you looked at the docs for .append()?
– AMC, Commented Jan 8, 2020 at 21:22
I forgot to add: Where is this data coming from? Odds are we can avoid this issue entirely.
– AMC, Commented Jan 8, 2020 at 21:29

ddjanke · Accepted Answer · 2020-01-08 21:28:37Z

2

Assuming you are starting out with a list of dictionaries of lists, you can start by using a list comprehension to turn it into a list of DataFrames:

rec_set1 = {'col1': [1,2,3], 'col2': [5,3,4], 'col3': ['x','y','z']}
rec_set2 = {'col1': [5,6,7], 'col2': [-4,6,2], 'col3': ['p','q','r']}
... (etc.)
rec_setn = {...}
rec_set_all = [rec_set1, rec_set2,...,rec_setn]

df_list = [pd.DataFrame(r) for r in rec_set_all]

Next, you can use pd.concat to combine them all into one DataFrame:

df_all = pd.concat(df_list)

If you want to reset the index so that it is continuous rather than 0,1,2,0,1,2, etc., you can renumber the rows from 0:

df_all.reset_index(inplace=True, drop=True)

The result from your example would be:

   col1  col2 col3
0     1     5    x
1     2     3    y
2     3     4    z
3     5    -4    p
4     6     6    q
5     7     2    r

Edit

Including info from the comment from AMC, it can be written as a one-liner:

df = pd.concat([pd.DataFrame(r) for r in rec_set_all], ignore_index = True)
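
For reference, here is a self-contained version of the one-liner that can be pasted into a fresh session. It is a minimal sketch that reuses the two record sets and the variable names from the question, and it assumes every dictionary has the same keys and that the lists inside each dictionary are of equal length.

```python
import pandas as pd

rec_set1 = {'col1': [1, 2, 3], 'col2': [5, 3, 4], 'col3': ['x', 'y', 'z']}
rec_set2 = {'col1': [5, 6, 7], 'col2': [-4, 6, 2], 'col3': ['p', 'q', 'r']}
rec_set_all = [rec_set1, rec_set2]

# Build one DataFrame per dictionary, then stack them row-wise.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0, 1, 2.
df = pd.concat([pd.DataFrame(r) for r in rec_set_all], ignore_index=True)

print(df)
```

Because ignore_index=True is passed to pd.concat, no separate reset_index call is needed.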

edited Jan 8, 2020 at 21:28

answered Jan 8, 2020 at 21:20

ddjanke


5 Comments

AMC Over a year ago

pandas.concat() has an ignore_index parameter, so you can probably avoid having to do the .reset_index().

AMC Over a year ago

It might be more efficient to use a generator expression instead: df = pd.concat((pd.DataFrame(r) for r in rec_set_all), ignore_index = True)

deeSo Over a year ago

df = pd.concat([pd.DataFrame(r) for r in rec_set_all], ignore_index = True) works perfectly! Although it looks like there is no simple pd.DataFrame(rec_set_all) type of call that does the iteration internally for you.

AMC Over a year ago

@deeSo How did you end up in this situation? Surely there has to be a better way of doing things, no?

deeSo Over a year ago

@AMC I used a simple example to provide clarity. I am actually working on a larger data set - multiple input files each having json in the format described in rec_set1
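
For the multi-file case described in the last comment, the same pattern might look like the sketch below. The helper name load_record_sets and the file names part1.json and part2.json are made up for illustration; the sketch assumes each file holds a single JSON object in the rec_set1 format (column name mapped to a list of values). The demo writes two temporary files only so that the example runs on its own.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_record_sets(paths):
    """Read each JSON file (a dict of equal-length lists) and stack the rows."""
    frames = []
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            frames.append(pd.DataFrame(json.load(fh)))
    return pd.concat(frames, ignore_index=True)


# Demo with two throwaway files; the file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "part1.json": {'col1': [1, 2, 3], 'col2': [5, 3, 4], 'col3': ['x', 'y', 'z']},
        "part2.json": {'col1': [5, 6, 7], 'col2': [-4, 6, 2], 'col3': ['p', 'q', 'r']},
    }
    for name, records in samples.items():
        (Path(tmp) / name).write_text(json.dumps(records), encoding="utf-8")

    # sorted() keeps the row order deterministic across runs.
    df_files = load_record_sets(sorted(Path(tmp).glob("*.json")))

print(df_files)
```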

Ali · Accepted Answer · 2020-01-08 21:17:03Z

0

This will also work. Just append the new dict as a DataFrame.

rec_set1 = {'col1': [1,2,3], 'col2': [5,3,4], 'col3': ['x','y','z']}
rec_set2 = {'col1': [5,6,7], 'col2': [-4,6,2], 'col3': ['p','q','r']}
rec_set_all = [rec_set1, rec_set2]
df = pd.DataFrame(rec_set1)

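# Append rec_set2 as a DataFrame. append() returns a new DataFrame, so assign the result.
# Note: DataFrame.append was removed in pandas 2.0; use pd.concat on newer versions.
df = df.append(pd.DataFrame(rec_set2), ignore_index=True)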

answered Jan 8, 2020 at 21:17

Ali


1 Comment

AMC Over a year ago

It's better to concatenate than to append repeatedly.
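
Another option, not taken from either answer, is to merge the column lists in plain Python first and call the DataFrame constructor only once. The sketch below uses itertools.chain for this and assumes every dictionary has exactly the same keys as the first one; the names merged and df_merged are illustrative. It also sidesteps DataFrame.append, which was removed in pandas 2.0.

```python
from itertools import chain

import pandas as pd

rec_set1 = {'col1': [1, 2, 3], 'col2': [5, 3, 4], 'col3': ['x', 'y', 'z']}
rec_set2 = {'col1': [5, 6, 7], 'col2': [-4, 6, 2], 'col3': ['p', 'q', 'r']}
rec_set_all = [rec_set1, rec_set2]

# For each column, chain the per-dictionary lists into one long list.
# Assumption: all dictionaries share the keys of the first dictionary.
merged = {
    key: list(chain.from_iterable(r[key] for r in rec_set_all))
    for key in rec_set_all[0]
}

# A single constructor call; the index is 0..n-1 by default.
df_merged = pd.DataFrame(merged)

print(df_merged)
```

Whether this is faster than pd.concat depends on the data; it mainly avoids creating one intermediate DataFrame per dictionary.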
