Take multiple lists into dataframe

Question

How do I take multiple lists and put them as different columns in a pandas DataFrame? I tried this solution but had some trouble.

Attempt 1:

I have three lists, zip them together, and use the result: res = zip(lst1,lst2,lst3)
This yields just one column.

Attempt 2:

percentile_list = pd.DataFrame({'lst1Title' : [lst1],
                                'lst2Title' : [lst2],
                                'lst3Title' : [lst3] },
                                columns=['lst1Title','lst2Title', 'lst3Title'])

This yields either one row by three columns (as written above) or, if I transpose it, three rows by one column.

How do I get a 100 row (length of each independent list) by 3 column (three lists) pandas dataframe?

maxymoo · Accepted Answer · 2018-08-14 23:47:47Z

514

I think you're almost there. Try removing the extra square brackets around the lists. (Also, you don't need to specify the column names when you create a DataFrame from a dict like this.)

import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
percentile_list = pd.DataFrame(
    {'lst1Title': lst1,
     'lst2Title': lst2,
     'lst3Title': lst3
    })

percentile_list
    lst1Title  lst2Title  lst3Title
0          0         0         0
1          1         1         1
2          2         2         2
3          3         3         3
4          4         4         4
5          5         5         5
6          6         6         6
...
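To see why the extra brackets matter, here is a small sketch comparing the two constructions. The list contents and the names wrapped and flat are made up for illustration; only the shapes are the point.

```python
import pandas as pd

# Illustrative data (not from the question): three lists of length 100.
lst1 = list(range(100))
lst2 = list(range(100, 200))
lst3 = list(range(200, 300))

# Wrapping each list in another list makes the whole list a single cell,
# so every column holds exactly one element.
wrapped = pd.DataFrame({'lst1Title': [lst1],
                        'lst2Title': [lst2],
                        'lst3Title': [lst3]})

# Passing the lists directly gives one row per list element.
flat = pd.DataFrame({'lst1Title': lst1,
                     'lst2Title': lst2,
                     'lst3Title': lst3})

print(wrapped.shape)  # one row, three columns
print(flat.shape)     # one hundred rows, three columns
```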

If you need a more performant solution, you can use np.column_stack rather than zip as in your first attempt. This gives around a 2x speedup on the example here, though in my opinion it comes at a bit of a cost in readability:

import numpy as np
percentile_list = pd.DataFrame(np.column_stack([lst1, lst2, lst3]), 
                               columns=['lst1Title', 'lst2Title', 'lst3Title'])

edited Aug 14, 2018 at 23:47

answered May 29, 2015 at 6:40

maxymoo

3 Comments

user48956 Over a year ago

Is np.column_stack a view, or does it copy the data? If it copies, it seems like this could be made much more efficient (O(1), not O(n)).
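One way to check this yourself is with np.shares_memory. The sketch below uses two small made-up arrays; it suggests that column_stack builds a new array rather than a view of its inputs.

```python
import numpy as np

# Illustrative inputs (not from the thread).
a = np.arange(5)
b = np.arange(5, 10)

stacked = np.column_stack([a, b])

# The stacked result is a new array, so it shares no memory with the inputs.
print(np.shares_memory(a, stacked))

# Modifying the result therefore leaves the original array untouched.
stacked[0, 0] = -1
print(a[0])
```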

joe5 Over a year ago

@maxymoo can column names be automatically set to the list name?

user6386155 Over a year ago

numpy column stack does not work well if the lists are of different datatypes
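A short sketch of that caveat, using made-up integer and float lists (the names ids and scores are hypothetical): column_stack has to produce a single array with one dtype, so the integers are converted to floats, while the dict constructor keeps a separate dtype per column.

```python
import numpy as np
import pandas as pd

# Illustrative data: one integer list and one float list.
ids = [1, 2, 3]
scores = [0.5, 1.5, 2.5]

# column_stack produces one 2-D array with a single common dtype (float here).
stacked_df = pd.DataFrame(np.column_stack([ids, scores]),
                          columns=['id', 'score'])

# The dict constructor builds each column separately and keeps its own dtype.
dict_df = pd.DataFrame({'id': ids, 'score': scores})

print(stacked_df.dtypes)
print(dict_df.dtypes)
```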

legoscia · Accepted Answer · 2017-06-16 09:47:41Z

92

Adding to Aditya Guru's answer here: there is no need to use map. You can simply do:

pd.DataFrame(list(zip(lst1, lst2, lst3)))

This sets the column names to 0, 1, 2. To set your own column names, pass the keyword argument columns to the DataFrame call above:

pd.DataFrame(list(zip(lst1, lst2, lst3)),
              columns=['lst1_title','lst2_title', 'lst3_title'])

edited Jun 16, 2017 at 9:47

legoscia

answered Jun 16, 2017 at 9:22

Abhinav Gupta

1 Comment

Sarfraaz Ahmed Over a year ago

In Python 3.8 and pandas 1.0, we don't need the list call, since DataFrame accepts an iterable and zip() returns one. So pd.DataFrame(zip(lst1, lst2, lst3)) should also work.
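A minimal sketch of the zip form without list, assuming a reasonably recent pandas (1.0 or later, as in the comment). The list contents are made up for illustration. Note that zip stops at the shortest input, so lists of unequal length would be silently truncated.

```python
import pandas as pd

# Illustrative lists of mixed types (not from the question).
lst1 = [1, 2, 3]
lst2 = ['a', 'b', 'c']
lst3 = [0.1, 0.2, 0.3]

# Each tuple produced by zip becomes one row of the DataFrame.
df = pd.DataFrame(zip(lst1, lst2, lst3),
                  columns=['lst1_title', 'lst2_title', 'lst3_title'])

print(df)
```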

oopsi · Accepted Answer · 2018-07-07 08:18:16Z

21

Adding one more scalable solution.

lists = [lst1, lst2, lst3, lst4]
df = pd.concat([pd.Series(x) for x in lists], axis=1)

answered Jul 7, 2018 at 8:18

oopsi

2 Comments

ZakS Over a year ago

can you explain this one a bit?

yona bendelac Over a year ago

You join (concat) the Series side by side as columns (axis=1) to create a DataFrame from the list of lists.
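As a rough illustration of what axis=1 does here, the sketch below uses made-up lists of unequal length. Each Series becomes one column, the columns are aligned on the index, and missing positions are filled with NaN.

```python
import pandas as pd

# Illustrative lists; the second one is deliberately shorter.
lists = [[1, 2, 3], [4, 5], [6, 7, 8]]

# Each Series becomes one column; rows are aligned on the index.
df = pd.concat([pd.Series(x) for x in lists], axis=1)

print(df)
print(df.shape)
```

Because the Series have no names, the columns are labeled 0, 1, 2. Giving each Series a name, or assigning df.columns afterwards, sets your own labels.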

Reetesh Kumar · Accepted Answer · 2020-07-09 05:47:50Z

21

There are several ways to create a dataframe from multiple lists.

list1=[1,2,3,4]
list2=[5,6,7,8]
list3=[9,10,11,12]

pd.DataFrame({'list1':list1, 'list2':list2, 'list3':list3})
pd.DataFrame(data=zip(list1,list2,list3),columns=['list1','list2','list3'])

answered Jul 9, 2020 at 5:47

Reetesh Kumar

Aditya Guru · Accepted Answer · 2017-02-19 18:44:47Z

15

Just adding that, using the first approach (zip), it can be done as:

pd.DataFrame(list(map(list, zip(lst1,lst2,lst3))))

answered Feb 19, 2017 at 18:44

Aditya Guru

Wickkiey · Accepted Answer · 2019-01-16 14:55:33Z

12

Adding to the above answers, we can also build the DataFrame on the fly, one column at a time:

df= pd.DataFrame()
list1 = list(range(10))
list2 = list(range(10,20))
df['list1'] = list1
df['list2'] = list2
print(df)

Hope it helps!

answered Jan 16, 2019 at 14:55

Wickkiey

a_parida · Accepted Answer · 2019-11-27 14:30:53Z

6

@oopsi used pd.concat() but didn't include the column names. You could do the following, which, unlike the first solution in the accepted answer, gives you explicit control over the column order (it avoids dicts, which were unordered before Python 3.7):

import pandas as pd
lst1 = range(100)
lst2 = range(100)
lst3 = range(100)

s1=pd.Series(lst1,name='lst1Title')
s2=pd.Series(lst2,name='lst2Title')
s3=pd.Series(lst3 ,name='lst3Title')
percentile_list = pd.concat([s1,s2,s3], axis=1)

percentile_list
Out[2]: 
    lst1Title  lst2Title  lst3Title
0           0          0          0
1           1          1          1
2           2          2          2
3           3          3          3
4           4          4          4
5           5          5          5
6           6          6          6
7           7          7          7
8           8          8          8
...

edited Nov 27, 2019 at 14:30

a_parida

answered Aug 17, 2019 at 4:50

dabru

1 Comment

jtlz2 Over a year ago

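dicts are not unordered in Python 3.7+; they preserve insertion order (this was already the case in CPython 3.6 as an implementation detail).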
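A quick sketch of this point, assuming Python 3.7+ and a reasonably recent pandas: the columns follow the dict's insertion order, and you can still reorder them explicitly afterwards. The keys and values are made up for illustration.

```python
import pandas as pd

# Keys are deliberately not in alphabetical order.
df = pd.DataFrame({'z': [1, 2], 'a': [3, 4], 'm': [5, 6]})

# Columns appear in the order the keys were inserted.
print(list(df.columns))

# Selecting with a list of labels gives any order you want.
reordered = df[['a', 'm', 'z']]
print(list(reordered.columns))
```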

jtlz2 · Accepted Answer · 2022-11-09 17:45:06Z

5

You can simply use the following code:

train_data['labels']= train_data[["LABEL1","LABEL1","LABEL2","LABEL3","LABEL4","LABEL5","LABEL6","LABEL7"]].values.tolist()
train_df = pd.DataFrame(train_data, columns=['text','labels'])

edited Nov 9, 2022 at 17:45

jtlz2

answered Jun 4, 2020 at 19:28

Shaina Raza

jtlz2 · Accepted Answer · 2022-10-13 19:41:37Z

1

I just did it like this (python 3.9):

import pandas as pd
my_dict=dict(x=x, y=y, z=z) # Set column ordering here
my_df=pd.DataFrame.from_dict(my_dict)

This seems to be reasonably straightforward (albeit in 2022) unless I am missing something obvious...

In python 2 one could've used a collections.OrderedDict().

answered Oct 13, 2022 at 19:41

jtlz2
