Modifying DataFrames in loop

Question

Given this data frame:

import pandas as pd
df=pd.DataFrame({'A':[1,2,3],'B':[4,5,6],'C':[7,8,9]})
df
    A   B   C
0   1   4   7
1   2   5   8
2   3   6   9

I'd like to create 3 new data frames, one from each column. I can do this one at a time like this:

a=pd.DataFrame(df[['A']])
a
    A
0   1
1   2
2   3

But instead of doing this for each column, I'd like to do it in a loop.

Here's what I've tried:

a=b=c=df.copy()
dfs=[a,b,c]
fields=['A','B','C']
for d,f in zip(dfs,fields):
    d=pd.DataFrame(d[[f]])

...but when I then print each one, I get the whole original data frame as opposed to just the column of interest.

a
        A   B   C
    0   1   4   7
    1   2   5   8
    2   3   6   9
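
A minimal sketch of why the loop above leaves a unchanged, using the same df: a=b=c=df.copy() makes a single copy and binds three names to it, and the assignment to d inside the loop only rebinds the loop variable. The names same_object and last are added here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Chained assignment makes ONE copy and binds three names to it.
a = b = c = df.copy()
same_object = a is b and b is c

dfs = [a, b, c]
fields = ['A', 'B', 'C']
last = None
for d, f in zip(dfs, fields):
    # This rebinds the loop variable d only; a, b, c and the list are untouched.
    d = d[[f]]
    last = d

print(same_object)          # True
print(list(a.columns))      # ['A', 'B', 'C']
print(list(last.columns))   # ['C']
```

The one-column frames are created on each iteration, but nothing keeps a reference to them once d is rebound again.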

Update: My actual data frame will have some columns that I do not need and the columns will not be in any sort of order, so I need to be able to get the columns by name.

Thanks in advance!

cs95 · Accepted Answer · 2017-08-04 16:13:37Z

3

A simple list comprehension should be enough.

In [68]: df_list = [df[[x]] for x in df.columns]

Printing out the list, this is what you get:

In [69]: for d in df_list:
    ...:     print(d)
    ...:     print('-' * 5)
    ...:     
   A
0  1
1  2
2  3
-----
   B
0  4
1  5
2  6
-----
   C
0  7
1  8
2  9
-----

Each element in df_list is its own data frame, one for each column of the original. Furthermore, you don't even need fields; iterate over df.columns instead.
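
For the case in the question's update, where only some columns are wanted and they have to be picked by name, the same idea should work with an explicit list of names. This is a minimal sketch, not the only way to do it; the extra column D, the wanted list, and the frames dict are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with unordered columns and one ('D') that is not needed.
df = pd.DataFrame({'C': [7, 8, 9], 'A': [1, 2, 3], 'D': [0, 0, 0], 'B': [4, 5, 6]})

wanted = ['A', 'B', 'C']  # column names to extract, in any order

# Double brackets keep each result a one-column DataFrame; .copy() makes
# each one independent of df.
frames = {name: df[[name]].copy() for name in wanted}

print(frames['B'])
```

Keying the results by column name means frames['A'] can stand in for the separate variable a from the question.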

answered Aug 4, 2017 at 16:13


BENY · Accepted Answer · 2017-08-04 16:26:31Z

1

Alternatively, instead of creating copies of df, you can assign each column to its own named variable, so that each result is a separate DataFrame rather than an element of a list. That said, I think storing the DataFrames in a list is better.

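dfs=['a','b','c']
fields=['A','B','C']
# At module level (e.g. in an IPython session), locals() is the module namespace
variables = locals()
for d,f in zip(dfs,fields):
    # Create a variable named d that holds column f as a one-column DataFrame
    variables[d] = df[[f]]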

a
Out[743]: 
   A
0  1
1  2
2  3
b
Out[744]: 
   B
0  4
1  5
2  6
c
Out[745]: 
   C
0  7
1  8
2  9
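
One caveat with the locals() approach: it works at module level or in an interactive session, where locals() is the module's namespace, but in CPython a write to locals() inside a function does not create a local variable. The sketch below compares it with a plain dict; the function names and col_a are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def split_with_locals():
    # Inside a function, writing to the dict returned by locals() does not
    # create a real local variable.
    locals()['col_a'] = df[['A']]
    try:
        return col_a  # looked up as a global, which does not exist
    except NameError:
        return None

def split_with_dict(names):
    # An explicit dict behaves the same at module level and inside functions.
    return {name.lower(): df[[name]] for name in names}

print(split_with_locals())   # None
frames = split_with_dict(['A', 'B', 'C'])
print(sorted(frames))        # ['a', 'b', 'c']
```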

edited Aug 4, 2017 at 16:26

answered Aug 4, 2017 at 16:18


1 Comment

Dance Party2 Over a year ago

Follow-up question posted here: stackoverflow.com/questions/45511995/…

Jon Deaton · Accepted Answer · 2017-08-04 16:12:35Z

1

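You should use loc, which selects columns by label. Passing a list of labels keeps the result a DataFrame:

a = df.loc[:, ['A']]

and then loop through like

# dfs and fields are the lists from the question
for i, f in enumerate(fields):
    dfs[i] = df.loc[:, [f]]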
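
For reference, loc selects by label and iloc by position, and a scalar selector returns a Series while a list selector returns a DataFrame. A small sketch using the df from the question; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

by_label = df.loc[:, ['A']]     # label-based; list selector -> DataFrame
by_position = df.iloc[:, [0]]   # position-based; list selector -> DataFrame
as_series = df.loc[:, 'A']      # scalar selector -> Series

# The columns here are labelled 'A', 'B', 'C', so the integer 0 is not a
# valid label for loc.
try:
    df.loc[:, 0]
    int_label_ok = True
except KeyError:
    int_label_ok = False

print(type(by_label).__name__, type(as_series).__name__)  # DataFrame Series
print(by_label.equals(by_position))                       # True
print(int_label_ok)                                       # False
```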

answered Aug 4, 2017 at 16:12


2 Comments

cs95 Over a year ago

This is overkill, considering you could iterate over columns directly. And use df.loc.

Jon Deaton Over a year ago

Ahh okay, yes your answer is better
