pandas Update dataframe after lambda operation

Question

I am currently doing a pandas operation with the .apply() function:

fund_table[fund_table.fund_class == 'EQ']['fund_weight'].apply(lambda x: ((x*overall_wts[1])/100))

fund_table[fund_table.fund_class == 'DB']['fund_weight'].apply(lambda x: ((x*overall_wts[0])/100))

fund_table[fund_table.fund_class == 'LQ']['fund_weight'].apply(lambda x: ((x*overall_wts[2])/100))

Each line modifies a certain subset of rows. How do I update the main dataframe with the results?

I tried something like this:

fund_table['fund_weight'] = fund_table[fund_table.fund_class == 'EQ']['fund_weight'].apply(lambda x: ((x*overall_wts[1])/100))
fund_table['fund_weight'] = fund_table[fund_table.fund_class == 'DB']['fund_weight'].apply(lambda x: ((x*overall_wts[0])/100))
fund_table['fund_weight'] = fund_table[fund_table.fund_class == 'LQ']['fund_weight'].apply(lambda x: ((x*overall_wts[2])/100))

But it fails: all the values of the column 'fund_weight' change to NaN.

What is the correct way to do it?

vluzko · Accepted Answer · 2017-11-03 15:04:48Z

When you assign to fund_weight, you overwrite whatever the column previously held, so then the next line is working with the wrong data.

Furthermore, when you filter based on fund_class, you create a smaller dataframe. fund_table[fund_table.fund_class == 'EQ']['fund_weight'] has fewer rows than fund_table, so the series produced by your apply has fewer rows too. When you assign this series to the whole column, pandas aligns it on the index and fills every row that has no matching label with NaN.

As a result, your first line turns every row of fund_weight into NaN, except the rows where fund_class equals 'EQ'. Your next line selects the rows where fund_class equals 'DB', which by then hold only NaN, and assigning that result back wipes out the 'EQ' values as well, so now all of fund_weight is NaN.
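To see the alignment at work, here is a minimal sketch on a made-up three-row table. The sample values and the contents of overall_wts are assumptions for illustration, not the asker's data:

```python
import pandas as pd

# Hypothetical sample data; the real fund_table and overall_wts are not shown in the question.
fund_table = pd.DataFrame({
    "fund_class": ["EQ", "DB", "LQ"],
    "fund_weight": [50.0, 30.0, 20.0],
})
overall_wts = [40, 50, 10]  # assumed order: DB, EQ, LQ

# The filtered Series only carries the index labels of the 'EQ' rows.
eq_only = fund_table[fund_table.fund_class == "EQ"]["fund_weight"].apply(
    lambda x: x * overall_wts[1] / 100
)

# Assigning it to the whole column aligns on the index; unmatched rows become NaN.
broken = fund_table.copy()
broken["fund_weight"] = eq_only
print(broken)
```

Only the 'EQ' row keeps a number here; the 'DB' and 'LQ' rows come out as NaN.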

You want something more like:

def calc_new_weight(row):
    # Pick the overall weight that matches this row's fund class.
    if row['fund_class'] == 'EQ':
        overall_wt = overall_wts[1]
    elif row['fund_class'] == 'DB':
        overall_wt = overall_wts[0]
    elif row['fund_class'] == 'LQ':
        overall_wt = overall_wts[2]
    return row['fund_weight'] * overall_wt / 100

# axis=1 passes each row to the function; the result goes into a new column.
fund_table['fund_weight_calc'] = fund_table.apply(calc_new_weight, axis=1)
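If the row-wise apply turns out to be slow on a large table, the same arithmetic can probably be done on whole columns. This is a different technique from the function above, sketched under the assumption that every fund_class is one of 'DB', 'EQ' or 'LQ'; the sample data and the class_to_wt name are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; the real fund_table and overall_wts are not shown in the question.
fund_table = pd.DataFrame({
    "fund_class": ["EQ", "DB", "LQ", "EQ"],
    "fund_weight": [50.0, 30.0, 20.0, 10.0],
})
overall_wts = [40, 50, 10]  # assumed order: DB, EQ, LQ

# Same class-to-index pairing as in calc_new_weight.
class_to_wt = {"DB": overall_wts[0], "EQ": overall_wts[1], "LQ": overall_wts[2]}

# Series.map looks up each row's class; a class missing from the dict would give NaN.
fund_table["fund_weight_calc"] = (
    fund_table["fund_weight"] * fund_table["fund_class"].map(class_to_wt) / 100
)
print(fund_table)
```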

Andy Hayden · Accepted Answer · 2017-11-03 15:09:06Z

You can use .loc:

fund_table.loc[fund_table.fund_class == 'EQ', 'fund_weight'] = fund_table.loc[fund_table.fund_class == 'EQ', 'fund_weight'].apply(lambda x: ((x*overall_wts[1])/100))
# ...
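For reference, one way the same .loc pattern might be applied to all three classes is a short loop. This is only a sketch: the sample data, the overall_wts values and the wt_index mapping are assumptions, with the class-to-index pairing taken from the question:

```python
import pandas as pd

# Hypothetical sample data; the real fund_table and overall_wts are not shown in the question.
fund_table = pd.DataFrame({
    "fund_class": ["EQ", "DB", "LQ"],
    "fund_weight": [50.0, 30.0, 20.0],
})
overall_wts = [40, 50, 10]  # assumed order: DB, EQ, LQ

# Class -> position in overall_wts, as used in the question.
wt_index = {"DB": 0, "EQ": 1, "LQ": 2}

for fund_class, idx in wt_index.items():
    mask = fund_table.fund_class == fund_class
    # The same mask is used for reading and writing, so only matching rows change.
    fund_table.loc[mask, "fund_weight"] = (
        fund_table.loc[mask, "fund_weight"] * overall_wts[idx] / 100
    )
print(fund_table)
```

Because each row belongs to exactly one class, every value is scaled once and no row is left as NaN.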

However, this might be better rewritten as a groupby:

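# Map each fund class to its overall weight (positions 0, 1, 2 of overall_wts).
wts = dict(zip(["DB", "EQ", "LQ"], overall_wts))
# Select the numeric column before grouping; x.name is the group key, e.g. 'EQ'.
fund_table["fund_weight"] = fund_table.groupby("fund_class")["fund_weight"].transform(lambda x: x * wts[x.name] / 100)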

2 Comments

Sriram Arvind Lakshmanakumar Over a year ago

I couldn't get the groupby to work; an error occurred, so I used the loc method. But what is x.name in the groupby method?

Andy Hayden Over a year ago

@Zaiku197212 x.name returns 'EQ' for the EQ group.
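A small sketch of what that means in practice, on made-up data: pandas sets .name on each group it passes to the function, so recording it shows the group keys. The record_name helper and the sample table are hypothetical, and this uses transform on the fund_weight column only:

```python
import pandas as pd

# Hypothetical sample data for illustration.
fund_table = pd.DataFrame({
    "fund_class": ["EQ", "DB", "EQ"],
    "fund_weight": [50.0, 30.0, 10.0],
})

seen = []

def record_name(group):
    # group is the fund_weight Series for one fund_class; .name holds that class.
    seen.append(group.name)
    return group

fund_table.groupby("fund_class")["fund_weight"].transform(record_name)
print(seen)
```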
