
How to pass df10 and df20 (and even more dataframes) through func simultaneously and keep their names for further use?

import pandas as pd
import numpy as np

df = pd.DataFrame( {
   'A': ['d','d','d','d','d','d','g','g','g','g','g','g','k','k','k','k','k','k'],
   'B': [5,5,6,4,5,6,-6,7,7,6,-7,7,-8,7,-6,6,-7,50],
   'C': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2],
   'S': [2012,2013,2014,2015,2016,2012,2012,2014,2015,2016,2012,2013,2012,2013,2014,2015,2016,2014]     
    } );

df10 = (df.B + df.C).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)

df20 = (df['B'] - df['C']).groupby([df.A, df.S]).agg(['sum','size']).unstack(fill_value=0)

def func(df):
    df1 = df.groupby(level=0, axis=1).sum()
    new_cols = list(zip(df1.columns.get_level_values(0), ['total'] * len(df1.columns)))
    df1.columns = pd.MultiIndex.from_tuples(new_cols)
    df2 = pd.concat([df1,df], axis=1).sort_index(axis=1).sort_index(axis=1, level=1)
    df2.columns = ['_'.join((col[0], str(col[1]))) for col in df2.columns]
    df2.columns = df2.columns.str.replace('sum_','')
    df2.columns = df2.columns.str.replace('size_','T')
    return df2

EDIT: as requested, here are the dataframes printed:

print(df10)
print(df20)

df10:

     sum                      size
S   2012 2013 2014 2015 2016  2012 2013 2014 2015 2016
A
d     13    6    7    5    6     2    1    1    1    1
g    -11    8    8    8    7     2    1    1    1    1
k     -6    9   48    8   -5     1    1    2    1    1

df20:

     sum                      size
S   2012 2013 2014 2015 2016  2012 2013 2014 2015 2016
A
d      9    4    5    3    4     2    1    1    1    1
g    -15    6    6    6    5     2    1    1    1    1
k    -10    5   40    4   -9     1    1    2    1    1

  • Can you update your code with a sample of what df10 and df20 look like, please? Commented Dec 17, 2016 at 12:08
  • I think it would be easiest with a for loop over a list of all of the DataFrames that you wish to apply the function to, though it depends on what you wish to do with these DataFrames after func. Commented Dec 17, 2016 at 12:57

1 Answer

Edit: There is probably a much better way to do this; I just thought I would offer this suggestion. If it is not as required, please let me know, and I will delete.

How to pass df10 and df20 (and even more dataframes) through func simultaneously and keep their names for further use?

If all you want to do is pass multiple dataframes through func, and all your dataframes have the same format, something like the following may work.

For simplicity take the dataframes:

df10 = pd.DataFrame({'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]})
df20 = pd.DataFrame({'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]})
df30 = pd.DataFrame({'one' : [1., 2., 3., 4.], 'two' : [4., 3., 2., 1.]})

and a simple function:

def your_func(df):
    # Perform some action/change on df, e.g. keep only the first row
    df2 = df.head(1)
    return df2

Create a list of your original dataframes:

A = [df10,df20,df30]

A = [   one  two
    0  1.0  4.0
    1  2.0  3.0
    2  3.0  2.0
    3  4.0  1.0,    
        one  two
    0  1.0  4.0
    1  2.0  3.0
    2  3.0  2.0
    3  4.0  1.0,    
        one  two
    0  1.0  4.0
    1  2.0  3.0
    2  3.0  2.0
    3  4.0  1.0]

Then use a for loop to pass each dataframe in the list through the function. This only rebinds the list elements, so your original dataframes stay unchanged.

for i in range(0,len(A)):
    A[i] = your_func(A[i])

Output:

A = [
 one  two
0  1.0  4.0,
 one  two
0  1.0  4.0,
 one  two
0  1.0  4.0]

The list A now contains the new dataframes, while the names df10, df20, etc. still refer to the unchanged originals. Index into A (for example A[0]) to access the new dataframes.
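If the names matter, one option is to keep the dataframes in a dictionary keyed by name instead of a plain list. The following is only a minimal sketch under the same simplified assumptions as above: the `frames` and `results` names are made up for illustration, and `your_func` is the stand-in function from this answer, not the `func` from the question.

```python
import pandas as pd

def your_func(df):
    # Stand-in transformation from the answer: keep only the first row.
    return df.head(1)

# Hypothetical container: the original dataframes stored under their names.
frames = {
    'df10': pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]}),
    'df20': pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]}),
    'df30': pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]}),
}

# Apply the function to every dataframe and keep each result under the same name.
results = {name: your_func(frame) for name, frame in frames.items()}

# Look up a transformed dataframe by its original name.
print(results['df10'])
```

The originals in `frames` are left as they were, and each result can be retrieved by name rather than by remembering its position in a list.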


6 Comments

Alternatively, use map: newA = list(map(your_func, A)) or a list comprehension: newA = [your_func(i) for i in A]
@Parfait - Your way is much cleaner and more efficient. I always forget about map. Thanks for the comment: I'm learning too.
Thanks guys, however I need to change the df's and be able to call them individually afterwards.
@Zanshin Can you keep track of their position in the list via their index? Then it will be easy to call them afterwards. Alternatively you could create a new, blank dataframe for as many dataframes that you start with.
Do you have an example?
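
One possible example, offered as a sketch rather than a definitive answer: if the goal is for the names themselves to point at the transformed dataframes, the results can be unpacked back into the same names. The data and `your_func` here are illustrative stand-ins, not the question's `func`.

```python
import pandas as pd

def your_func(df):
    # Stand-in transformation: keep only the first row.
    return df.head(1)

# Illustrative inputs (hypothetical data).
df10 = pd.DataFrame({'one': [1., 2., 3., 4.], 'two': [4., 3., 2., 1.]})
df20 = pd.DataFrame({'one': [5., 6., 7., 8.], 'two': [8., 7., 6., 5.]})

# Transform every dataframe, then rebind each name to its own result.
# The order on the left must match the order of the tuple on the right.
df10, df20 = [your_func(d) for d in (df10, df20)]

print(df10)
print(df20)
```

Note that this replaces what the names refer to, so the untransformed dataframes are no longer reachable through `df10` and `df20` afterwards.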