python pandas flatten a dataframe to a list

Question

I have a df like so:

import pandas
a=[['1/2/2014', 'a', '6', 'z1'], 
   ['1/2/2014', 'a', '3', 'z1'], 
   ['1/3/2014', 'c', '1', 'x3'],
   ]
df = pandas.DataFrame.from_records(a[1:],columns=a[0])

I want to flatten the df so it is one continuous list like so:

['1/2/2014', 'a', '6', 'z1', '1/2/2014', 'a', '3', 'z1','1/3/2014', 'c', '1', 'x3']

I can loop through the rows and extend a list, but is there a much easier way to do it?

possible duplicate of Comprehension for flattening a sequence of sequences?
– hlt, Commented Aug 22, 2014 at 5:22
I looked at that answer when searching. That question isn't in a DataFrame setting. If that answer had solved my problem, I wouldn't have needed to post my question.
– jason, Commented Aug 22, 2014 at 6:22

Saullo G. P. Castro · Accepted Answer · 2021-04-15 21:43:17Z

139

You can use .flatten() on the DataFrame converted to a NumPy array:

df.to_numpy().flatten()

and you can also add .tolist() if you want the result to be a Python list.
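A minimal sketch of the full chain follows. It assumes the frame is built with pd.DataFrame(a) so that all three rows from the question are kept as data (the question's own construction turns the first row into column labels); the names flat and flat_values are only for illustration.

```python
import pandas as pd

a = [['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]

# Assumption: keep every row as data, with default integer column labels
df = pd.DataFrame(a)

# DataFrame -> 2-D NumPy array -> 1-D array (row by row) -> Python list
flat = df.to_numpy().flatten().tolist()

# The older .values attribute gives the same result here
flat_values = df.values.flatten().tolist()

print(flat)
```

The values come out row by row, which is the order asked for in the question.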

Edit

In previous versions of Pandas, the .values attribute was used instead of the .to_numpy() method, as mentioned in the comments below.

edited Apr 15, 2021 at 21:43

answered Aug 22, 2014 at 6:02

Saullo G. P. Castro


3 Comments

Frank Over a year ago

pandas now recommends using .to_numpy() instead of .values.

endolith Over a year ago

@Frank Why? .values already exists, it's a numpy array under the hood. Why call a function?

Frank Over a year ago

@endolith I'm just passing along what the docs say – ask them, not me. Some more context here: stackoverflow.com/a/54508052

meloncholy · Accepted Answer · 2014-08-22 05:55:57Z

20

Maybe use stack?

df.stack().values
array(['1/2/2014', 'a', '3', 'z1', '1/3/2014', 'c', '1', 'x3'], dtype=object)

(Edit: Incidentally, the DF in the Q uses the first row as labels, which is why they're not in the output here.)
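To make the point about the labels concrete, here is a small sketch comparing the question's construction with one that keeps the first row as data. The variable names df_labels and df_all are made up for this example, and using pd.DataFrame(a) is just one possible way to keep all rows.

```python
import pandas as pd

a = [['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]

# As in the question: the first row becomes the column labels
df_labels = pd.DataFrame.from_records(a[1:], columns=a[0])
flat_without_first = df_labels.stack().tolist()   # 8 values

# Alternative (assumption): keep all three rows as data
df_all = pd.DataFrame(a)
flat_all = df_all.stack().tolist()                # 12 values

print(list(df_labels.columns))  # the "missing" first row lives here
print(flat_all)
```

Depending on the pandas version, stack() may emit a FutureWarning about its new implementation; the ordering of the result for a frame like this one is the same either way.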

edited Aug 22, 2014 at 5:55

answered Aug 22, 2014 at 5:49

meloncholy


Chitrasen · Accepted Answer · 2014-08-22 05:36:13Z

4

You can try with numpy

import numpy as np
np.reshape(df.values, (1,df.shape[0]*df.shape[1]))
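One thing worth noting: reshaping to (1, n) gives a 2-D array with a single row, not a flat 1-D array. The sketch below, using the df from the question, shows the difference; passing -1 as the shape is a standard NumPy way to get one dimension, and the names as_row and flat are only illustrative.

```python
import numpy as np
import pandas as pd

a = [['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]
df = pd.DataFrame.from_records(a[1:], columns=a[0])

n_values = df.shape[0] * df.shape[1]             # 2 rows * 4 columns

as_row = np.reshape(df.values, (1, n_values))    # 2-D: one row, n columns
flat = np.reshape(df.values, -1)                 # 1-D: n values

print(as_row.shape, flat.shape)
```

If a plain list is the goal, flat.tolist() gives a flat list, while as_row.tolist() gives a list nested one level deep.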

answered Aug 22, 2014 at 5:36

Chitrasen


ahmed hindi · Accepted Answer · 2021-07-22 00:08:37Z

4

You can use the reshape method:

df.values.reshape(-1)
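For reference, this is roughly what that looks like on the df from the question (where the first row has become the column labels). The .tolist() call at the end is an addition to get a Python list rather than a NumPy array.

```python
import pandas as pd

a = [['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]
df = pd.DataFrame.from_records(a[1:], columns=a[0])

# -1 lets NumPy infer the length, collapsing the array to one dimension
result = df.values.reshape(-1).tolist()

print(result)
# ['1/2/2014', 'a', '3', 'z1', '1/3/2014', 'c', '1', 'x3']
```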

edited Jul 22, 2021 at 0:08

answered Jul 20, 2021 at 22:55

ahmed hindi


1 Comment

Carmoreno Over a year ago

Hi ahmed, you could improve your answer by formatting your code, linking to the official documentation, and showing the output your answer produces.

ZwiTrader · Accepted Answer · 2022-01-28 17:51:19Z

0

The previously mentioned df.values.flatten().tolist() and df.to_numpy().flatten().tolist() are concise and effective, but I spent a very long time trying to learn how to 'do the work myself' via list comprehension, without resorting to built-in functions.

For anyone else who is interested, try:

[ row for col in df for row in df[col] ]

Turns out that this solution to flattening a df via list comprehension (which I haven't found elsewhere on SO) is just a small modification to the solution for flattening nested lists (that can be found all over SO):

[ val for sublst in lst for val in sublst ]
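One caveat about ordering: iterating over a DataFrame yields its column labels, so the comprehension above walks the data column by column, whereas the output requested in the question is row by row. The sketch below shows both orders; it assumes a frame built with pd.DataFrame(a) so that all three rows are data, and the row-wise version using itertuples is one possible variant, not something from the answer above.

```python
import pandas as pd

a = [['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]

# Assumption: keep every row as data, with default integer column labels
df = pd.DataFrame(a)

# Column by column: all of column 0, then all of column 1, ...
by_column = [val for col in df for val in df[col]]

# Row by row: itertuples yields one plain tuple per row
by_row = [val for row in df.itertuples(index=False, name=None) for val in row]

print(by_column)
print(by_row)
```

Here by_row matches the order of the original nested list, while by_column groups the dates first, then the letters, and so on.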

answered Jan 28, 2022 at 17:51

ZwiTrader
