How to call a function on a pandas dataframe with multiple arguments

Question

I would like to define a function that is applied to specific columns of a dataframe whenever it is called. I don't want to hard-code the column names when defining the function. Below is my sample code. The lambda function may be a complex one, but I am trying with a simple one.

def add(X, **args):
  for arg in args:
    X[arg].apply(lambda x: x + 10)
  return X

But if I call this function on my dataframe as below, I get an error even though these columns exist in my dataframe.

y = add(df_final['ABC', 'XYZ'])

KeyError: ('ABC', 'XYZ')

I also tried calling it like this:

y = add(df_final, ['ABC', 'XYZ'])

TypeError: add() takes 1 positional argument but 2 were given

It seems that I am missing something basic here. How should I modify the above code to make it work?

It would be helpful if you shared a sample input dataframe with the expected output according to your function.
– sammywemmy, Commented Aug 10, 2020 at 11:36

Rob Raymond · Accepted Answer · 2020-08-10 12:13:37Z

You can follow the **kwargs pattern, which allows optional keyword parameters in addition to named parameters. For the purpose of demonstration, if no source parameter is given, dest is used as the column the function is applied to.

import pandas as pd

df = pd.DataFrame({"ABC": list(range(10)), "XYZ": list(range(10))})

def add(X, dest="", **kwargs):
    # Read from "source" if it was passed, otherwise from the dest column itself
    c = kwargs.get("source", dest)
    # Assign the result back so the change is stored in the dataframe
    X[dest] = X[c].apply(lambda x: x + 10)
    return X

df = add(df, dest="ABC")
df = add(df, dest="XYZ", source="ABC")
df = add(df, dest="new", source="XYZ")
df = add(df, dest="new", source="new")
print(df.to_string(index=False))

Output:

 ABC  XYZ  new
  10   20   40
  11   21   41
  12   22   42
  13   23   43
  14   24   44
  15   25   45
  16   26   46
  17   27   47
  18   28   48
  19   29   49
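
Since the question mentions that the real lambda may be more complex, one possible variation is to pass the function in as well. This is only a sketch under that assumption, not part of the answer above: `apply_to_column` and its `func` parameter are hypothetical names, and the sample dataframe is made up.

```python
import pandas as pd

def apply_to_column(frame, dest, func, source=None):
    """Hypothetical variant: store func applied to frame[source] in frame[dest]."""
    # Fall back to the destination column when no source is given
    column = dest if source is None else source
    result = frame.copy()  # leave the caller's dataframe untouched
    result[dest] = result[column].apply(func)
    return result

df = pd.DataFrame({"ABC": list(range(3)), "XYZ": list(range(3))})
out = apply_to_column(df, dest="new", func=lambda x: x * 2 + 1, source="XYZ")
print(out.to_string(index=False))
```

Working on a copy is a design choice here; the original `add` modifies the dataframe it is given.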


Håkan Svensson · Accepted Answer · 2020-08-10 12:13:27Z

The **args definition collects keyword arguments (name=value pairs) into a dict, so add() accepts no extra positional arguments beyond X. You need to use *args if you want to pass an arbitrary number of positional arguments after your mandatory X argument.
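
A minimal sketch of the difference, using a hypothetical `show` helper that is not part of the question's code. It simply returns what Python collected into each parameter:

```python
def show(first, *args, **kwargs):
    """Hypothetical helper: return the collected positional and keyword extras."""
    return args, kwargs

# Extra positional values are gathered into the args tuple
positional, keyword = show("df", "ABC", "XYZ")
print(positional, keyword)

# name=value pairs are gathered into the kwargs dict
positional, keyword = show("df", source="ABC")
print(positional, keyword)

# A leading * unpacks a list into separate positional arguments
positional, keyword = show("df", *["ABC", "XYZ"])
print(positional, keyword)
```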

In your function you also need to assign the result back to the dataframe column, because apply returns a new Series rather than modifying the column in place. So, given

def add(X, *args):
    for arg in args:
        X[arg] = X[arg].apply(lambda x: x + 10)
    return X

You will get the following:

>>> df
    a   b  ABC  XYZ
0   1   1    6    1
1  34  34    5    2
2  34  34    4    4
3  34  34    3    5
4   d  23    2    6
5   2   2    1    7

df = add(df, *['ABC','XYZ'])

>>> df
    a   b  ABC  XYZ
0   1   1   16   11
1  34  34   15   12
2  34  34   14   14
3  34  34   13   15
4   d  23   12   16
5   2   2   11   17
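
As a side note on the `KeyError` in the question: `df_final['ABC', 'XYZ']` looks up a single column whose label is the tuple `('ABC', 'XYZ')`, whereas selecting several columns takes a list inside the brackets. The sketch below uses a small made-up dataframe and also shows that, for a simple operation like adding 10, the columns can be updated together without `apply`:

```python
import pandas as pd

df = pd.DataFrame({"ABC": [6, 5, 4], "XYZ": [1, 2, 4]})  # made-up sample data

try:
    df["ABC", "XYZ"]  # a single key: the tuple ("ABC", "XYZ")
except KeyError as err:
    print("KeyError:", err)

cols = ["ABC", "XYZ"]
subset = df[cols]  # a list selects several columns at once

shifted = df.copy()
shifted[cols] = shifted[cols] + 10  # vectorized, no per-element lambda
print(shifted)
```

For a genuinely complex per-element function, the `apply` loop above remains the more general option.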
