Create multiple dataframes based on the original dataframe columns number

Question

I've searched for quite a while, but I haven't found a similar question. If there is one, please let me know!

I am trying to split one dataframe into n dataframes, where n is equal to the number of columns of the original dataframe. Every resulting dataframe must keep the first column of the original dataframe. As an extra, I would like to gather them all together in a list, for example, for later access.

To illustrate what I mean, here is a brief example:

 >> original df

 GeneID   A      B      C      D      E
   1     0.3    0.2    0.6    0.4    0.8
   2     0.5    0.3    0.1    0.2    0.6
   3     0.4    0.1    0.5    0.1    0.3
   4     0.9    0.7    0.1    0.6    0.7
   5     0.1    0.4    0.7    0.2    0.5

My desired output would be something like this:

And so on, until all the columns of the original dataframe are covered. What would be the best solution?

rnso · Accepted Answer · 2018-08-27 03:36:24Z

1

You can use df.columns to get all column names and then create sub-dataframes:

outdflist =[]
# for each column beyond first: 
for col in oridf.columns[1:]:
    # create a subdf with desired columns:
    subdf = oridf[['GeneID',col]]
    # append subdf to list of df: 
    outdflist.append(subdf)

# to view all dataframes created: 
for df in outdflist:
    print(df)

Output:

   GeneID    A
0       1  0.3
1       2  0.5
2       3  0.4
3       4  0.9
4       5  0.1
   GeneID    B
0       1  0.2
1       2  0.3
2       3  0.1
3       4  0.7
4       5  0.4
   GeneID    C
0       1  0.6
1       2  0.1
2       3  0.5
3       4  0.1
4       5  0.7
   GeneID    D
0       1  0.4
1       2  0.2
2       3  0.1
3       4  0.6
4       5  0.2
   GeneID    E
0       1  0.8
1       2  0.6
2       3  0.3
3       4  0.7
4       5  0.5

The for loop above can also be written more concisely as a list comprehension:

outdflist = [ oridf[['GeneID', col]] 
              for col in oridf.columns[1:] ]
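For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the data from the question's table and reuses the names oridf and outdflist from above; the variables key and outdfdict are my own additions for illustration. Taking the first column by position avoids hard-coding 'GeneID', and the dictionary is just one possible way to look the pieces up by column name.

```python
import pandas as pd

# Example data copied from the table in the question.
oridf = pd.DataFrame({
    "GeneID": [1, 2, 3, 4, 5],
    "A": [0.3, 0.5, 0.4, 0.9, 0.1],
    "B": [0.2, 0.3, 0.1, 0.7, 0.4],
    "C": [0.6, 0.1, 0.5, 0.1, 0.7],
    "D": [0.4, 0.2, 0.1, 0.6, 0.2],
    "E": [0.8, 0.6, 0.3, 0.7, 0.5],
})

# Label of the first column, taken by position (hypothetical helper name).
key = oridf.columns[0]

# One two-column dataframe per remaining column; .copy() makes each
# piece independent of oridf so later edits do not touch the original.
outdflist = [oridf[[key, col]].copy() for col in oridf.columns[1:]]

# Optional: the same pieces keyed by column name.
outdfdict = dict(zip(oridf.columns[1:], outdflist))

print(len(outdflist))
print(outdfdict["C"])
```

Note that keeping the first column in every piece gives one dataframe per remaining column, so five pieces for the six-column example.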

edited Aug 27, 2018 at 3:36

answered Aug 27, 2018 at 2:44

rnso


1 Comment

João Fernandes Over a year ago

Thanks, it works just fine. However, I was trying to do it without looping. Wen's answer does it perfectly.

BENY · Accepted Answer · 2018-08-27 02:50:20Z

1

You can do this with groupby:

d={'df'+ str(x): y for x , y in df.groupby(level=0,axis=1)}
d
Out[989]: 
{'dfA':      A
 0  0.3
 1  0.5
 2  0.4
 3  0.9
 4  0.1, 'dfB':      B
 0  0.2
 1  0.3
 2  0.1
 3  0.7
 4  0.4, 'dfC':      C
 0  0.6
 1  0.1
 2  0.5
 3  0.1
 4  0.7, 'dfD':      D
 0  0.4
 1  0.2
 2  0.1
 3  0.6
 4  0.2, 'dfE':      E
 0  0.8
 1  0.6
 2  0.3
 3  0.7
 4  0.5, 'dfGeneID':    GeneID
 0       1
 1       2
 2       3
 3       4
 4       5}
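Two caveats on the result above: GeneID ends up in its own dataframe instead of next to each value column, and, as far as I know, groupby with axis=1 is deprecated in recent pandas releases (2.1 and later). Below is a minimal sketch of one alternative that keeps the 'df' + name keys but carries GeneID into every piece. It assumes the question's example data; the variable names indexed and dflist are hypothetical. It still iterates over the columns, just inside a comprehension.

```python
import pandas as pd

# Example data copied from the table in the question.
df = pd.DataFrame({
    "GeneID": [1, 2, 3, 4, 5],
    "A": [0.3, 0.5, 0.4, 0.9, 0.1],
    "B": [0.2, 0.3, 0.1, 0.7, 0.4],
    "C": [0.6, 0.1, 0.5, 0.1, 0.7],
    "D": [0.4, 0.2, 0.1, 0.6, 0.2],
    "E": [0.8, 0.6, 0.3, 0.7, 0.5],
})

# Move the key column into the index so it travels with every value column.
indexed = df.set_index("GeneID")

# indexed[[col]] is a one-column dataframe; reset_index() turns GeneID
# back into an ordinary first column.
d = {"df" + str(col): indexed[[col]].reset_index() for col in indexed.columns}

# The same pieces as a list, if a list is preferred over a dictionary.
dflist = list(d.values())

print(d["dfB"])
```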

answered Aug 27, 2018 at 2:50

BENY


4 Comments

João Fernandes Over a year ago

Thank you! Very brief and simple. Is there any way to put the new dataframes in a list instead of a dictionary?

BENY Over a year ago

@JoãoFernandes you just need [ y for x , y in df.groupby(level=0,axis=1)]

BENY Over a year ago

@JoãoFernandes is this what you need?

João Fernandes Over a year ago

Yes, it is! Thank you!

Joseph Seung Jae Dollar · Accepted Answer · 2018-08-27 03:06:59Z

0

You can build a list of the column names, loop over it, and create a new DataFrame on each iteration.

>>> import pandas as pd
>>> d = {'col1':[1,2,3], 'col2':[3,4,5], 'col3':[6,7,8]}
>>> df = pd.DataFrame(data=d)
>>> df
   col1  col2  col3
0     1     3     6
1     2     4     7
2     3     5     8
>>> newstuff=[]
>>> columns = list(df)
>>> for column in columns:
...     newstuff.append(pd.DataFrame(data=df[column]))

Unless your dataframe is unreasonably large, the code above should do the job.
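To spell out how this works: list(df) returns the column labels, df[column] selects a single column as a Series, and pd.DataFrame(data=...) wraps that Series in a one-column DataFrame. As written, the loop does not keep the first column in each piece, which the question asks for. Here is a minimal sketch of a variant that does, using the same toy data; the variable name first is my own addition.

```python
import pandas as pd

d = {'col1': [1, 2, 3], 'col2': [3, 4, 5], 'col3': [6, 7, 8]}
df = pd.DataFrame(data=d)

# Iterating over a DataFrame yields its column labels.
columns = list(df)
first = columns[0]  # hypothetical name for the column to keep everywhere

newstuff = []
for column in columns[1:]:
    # Passing a list of labels inside [] returns a DataFrame, not a Series.
    newstuff.append(df[[first, column]])

print(newstuff[0])
```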

edited Aug 27, 2018 at 3:06

answered Aug 27, 2018 at 2:46

Joseph Seung Jae Dollar


1 Comment

rnso Over a year ago

It will help if you can explain how the code is working.
