Pandas Dataframe: How to update multiple columns by applying a function?

Question

I have a Dataframe df like this:

   A   B   C    D
2  1   O   s    h
4  2   P    
7  3   Q
9  4   R   h    m

I have a function f to calculate C and D based on B for a row:

def f(p):  # p is the value of column B for a row
    return p + 'k', p + 'n'

How can I populate the missing values for rows 4 and 7 by applying the function f to the Dataframe?

The expected outcome is like below:

   A   B   C    D
2  1   O   s    h
4  2   P   Pk   Pn
7  3   Q   Qk   Qn
9  4   R   h    m

The function f has to be used because the real function is very complicated. Also, the function only needs to be applied to the rows that are missing C and D.

Could you update the question with the complete function so that the whole code can be reproduced?
– Fabio Lamanna, Commented Sep 16, 2015 at 8:22
What is the expected output? Sorry, but I do not really get your function.
– Colonel Beauvel, Commented Sep 16, 2015 at 8:40

Community · Accepted Answer · 2017-05-23 10:34:12Z

20

Maybe there is a more elegant way, but I would do it this way:

df['C'] = df['B'].apply(lambda x: f(x)[0])
df['D'] = df['B'].apply(lambda x: f(x)[1])

This applies the function to column B and takes the first and the second value of its output. Note that it recomputes C and D for every row, not only for the rows with missing values. It returns:

   A  B   C   D
0  1  O  Ok  On
1  2  P  Pk  Pn
2  3  Q  Qk  Qn
3  4  R  Rk  Rn

EDIT:

More concisely, thanks to this answer:

df[['C','D']] = df['B'].apply(lambda x: pd.Series(f(x)))  # f is called once per row
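
Neither form above restricts the work to the rows that are missing C and D, which is what the question asks for. Here is a minimal sketch of one way to do that. It assumes the gaps are stored as NaN (the question does not say whether they are NaN or empty strings), and the names `missing` and `filled` are hypothetical:

```python
import numpy as np
import pandas as pd

def f(p):  # same toy function as in the question
    return p + 'k', p + 'n'

# Rebuild the question's frame; using NaN for the gaps is an assumption.
df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': ['O', 'P', 'Q', 'R'],
     'C': ['s', np.nan, np.nan, 'h'],
     'D': ['h', np.nan, np.nan, 'm']},
    index=[2, 4, 7, 9],
)

# Hypothetical mask: rows where both C and D are missing.
missing = df['C'].isna() & df['D'].isna()

# Call f only for those rows, keeping their index labels for alignment.
filled = pd.DataFrame(
    [f(p) for p in df.loc[missing, 'B']],
    index=df.index[missing],
    columns=['C', 'D'],
)

# Write the results back into the masked rows only.
df.loc[missing, ['C', 'D']] = filled
print(df)
```

Because f runs only on the masked rows, this may matter when the real function is expensive.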

answered Sep 16, 2015 at 8:44 by Fabio Lamanna · edited May 23, 2017 at 10:34 by Community Bot

6 Comments

John Smith Over a year ago

The function f has to be used as the real function is very complicated. Also, the function only needs to be applied to the rows missing C and D.

Fabio Lamanna Over a year ago

As long as the function returns two values, it should work this way.

John Smith Over a year ago

Thanks @Fiabetto. How can we apply the function only to the rows missing values in C and D?

Fabio Lamanna Over a year ago

Oops, sorry @ColonelBeauvel, I had been working on that before reading your answer! I've updated the answer to credit you!

Colonel Beauvel Over a year ago

Still, our answers do not answer the question! I edited mine for this purpose.


Colonel Beauvel · Accepted Answer · 2015-09-16 11:39:56Z

11

If you want to use your function as is, here is a one-liner:

df.update(df.B.apply(lambda x: pd.Series(dict(zip(['C','D'],f(x))))), overwrite=False)

In [350]: df
Out[350]:
   A  B   C   D
2  1  O   s   h
4  2  P  Pk  Pn
7  3  Q  Qk  Qn
9  4  R   h   m
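
The key part is `overwrite=False`, which makes `DataFrame.update` fill only the cells that are NA in the original frame and leave existing values alone. Here is a small sketch with two made-up frames, `left` and `right` (both hypothetical names), assuming the gaps are NaN:

```python
import numpy as np
import pandas as pd

# Frame with gaps (NaN is an assumption about how the gaps are stored).
left = pd.DataFrame({'C': ['s', np.nan], 'D': ['h', np.nan]}, index=[2, 4])

# Candidate values for every row, as f would produce them.
right = pd.DataFrame({'C': ['Ok', 'Pk'], 'D': ['On', 'Pn']}, index=[2, 4])

# Only the NA cells of `left` are filled; update works in place.
left.update(right, overwrite=False)
print(left)
```

Note that in the one-liner f is still called for every row of B; only the write-back is limited to the missing cells.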

You can also do:

df1 = df.copy()

df[['C','D']] = df.apply(lambda x: pd.Series([x['B'] + 'k', x['B'] + 'n']), axis=1)

df1.update(df, overwrite=False)

answered Sep 16, 2015 at 8:50 by Colonel Beauvel · edited Sep 16, 2015 at 11:39

2 Comments

John Smith Over a year ago

This looks nice but it does not use the function f.

Colonel Beauvel Over a year ago

The solution now uses your function f without modifying it!

patti_jane · Accepted Answer · 2021-06-05 21:40:12Z

9

I have an easier way to do it, if the table is not too big.

def f(row):  # row is one row of the DataFrame (a Series)
    if row['C'] == '':
        row['C'] = row['B'] + 'k'
    if row['D'] == '':
        row['D'] = row['B'] + 'n'
    return row

df = df.apply(f, axis=1)
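
The comparison with `''` only matches empty strings. If the gaps are NaN instead (the question does not say which), a variant along these lines might be closer. `fill_row` is a hypothetical name, and `f` here is the question's original function rather than the redefined one above:

```python
import numpy as np
import pandas as pd

def f(p):  # the question's original function
    return p + 'k', p + 'n'

def fill_row(row):
    # Work on a copy so the row handed in by apply is not mutated.
    row = row.copy()
    # Call f only when both C and D are missing (NaN assumed).
    if pd.isna(row['C']) and pd.isna(row['D']):
        row['C'], row['D'] = f(row['B'])
    return row

df = pd.DataFrame(
    {'A': [1, 2, 3, 4],
     'B': ['O', 'P', 'Q', 'R'],
     'C': ['s', np.nan, np.nan, 'h'],
     'D': ['h', np.nan, np.nan, 'm']},
    index=[2, 4, 7, 9],
)

df = df.apply(fill_row, axis=1)
print(df)
```

Row-wise `apply` is a Python-level loop, so as the answer says, this is best kept to tables that are not too big.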

answered Sep 26, 2018 at 16:41 by Zenith · edited Jun 5, 2021 at 21:40 by patti_jane

Michael Dausmann · Accepted Answer · 2021-02-03 04:58:23Z

I found this super confusing but eventually figured out a way of achieving this that didn't hurt my brain. Here it is, sorry if it doesn't match the example well...

DataFrame with no explicit index (default RangeIndex)

# function to do the calcs
def f(row):
    my_a = row['a'] # row is a Series, my_a is a scalar string

    if my_a == 'a':  # dummy logic to calc new values based on the row values
        return [1, 2] # return 2 values to update 2 columns
    else:
        return [4, 5]

# simple test frame
input = pd.DataFrame.from_dict({
    'a': ['a', 'd'],
    'b': ['b', 'e'],
    'c': ['c', 'f'],
    'x': [0, 0],
    'y': [0, 0]
})

# apply the function to update the x and y columns with the returned values
input[['x','y']] = input.apply(f, axis=1)

dataframe with an index

If your DataFrame has an index, you need to be a bit more explicit when doing the apply, to ensure that "list-like results will be turned into columns":

def f(row): # function to do the calcs
    my_a = row['a'] # row is a Series, my_a is a scalar string
    my_index = row.name # you might also want to use the index value in the calcs

    if my_a == 'a': # dummy logic to calc new values based on the row values
        return [1, 2] # return 2 values to update 2 columns
    else:
        return [4, 5]

input = pd.DataFrame.from_dict({
    'an_index': ['indx1', 'indx2'],
    'a': ['a', 'd'],
    'b': ['b', 'e'],
    'c': ['c', 'f'],
    'x': [0, 0],
    'y': [0, 0]
}).set_index(['an_index'])

# apply the function to update the x and y columns with the returned values
input[['x','y']] = input.apply(f, axis=1, result_type='expand')
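
To see what `result_type='expand'` does, it can help to look at the intermediate result before assigning it. Here is a small sketch, using a hypothetical frame named `frame` and a function `g` (so that the built-in `input` is not shadowed):

```python
import pandas as pd

def g(row):
    # Dummy logic: return two values per row, one for each target column.
    return [1, 2] if row['a'] == 'a' else [4, 5]

frame = pd.DataFrame(
    {'a': ['a', 'd'], 'x': [0, 0], 'y': [0, 0]},
    index=['indx1', 'indx2'],
)

# With result_type='expand', each returned list becomes a row of a new
# DataFrame whose columns are labelled 0 and 1.
expanded = frame.apply(g, axis=1, result_type='expand')
print(expanded)

# The two expanded columns are assigned to x and y in order.
frame[['x', 'y']] = expanded
print(frame)
```

The expanded frame keeps the original index labels, so each row's values end up back on the row they were computed from.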

Nader Hisham · Accepted Answer · 2015-09-16 10:24:35Z

0

Simply do the following:

df.loc[df.C.isnull(), 'C'] = df.B + 'k'

df.loc[df.D.isnull(), 'D'] = df.B + 'n'

A single df.loc[rows, column] assignment writes to df itself, whereas chained indexing such as df.C.loc[...] may write to a temporary copy. See the pandas documentation section indexing-view-versus-copy for why loc is used.

answered Sep 16, 2015 at 9:50 by Nader Hisham · edited Sep 16, 2015 at 10:24
