Lambda function without passing argument

Question

I have an example DataFrame with columns 'one' and 'two' containing random integers. While trying to understand some code that uses a lambda function, I was puzzled that it seems to work even when no argument is passed to the lambda.

First I create a new column 'newcol' with the pandas assign() method, passing df explicitly to a lambda function: func(df). The function returns the natural log of the values in df's 'one' column:

df=df.assign(newcol=func(df))

So far so good.

However, what puzzles me is that the code also works without passing df:

df=df.assign(newcol2=func)

Even if I don't pass (df) into the lambda function, it correctly performs the operation. How does the interpreter know that df is being passed into the lambda function?

Example code below and output:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(1,10,size=16).reshape(8,2),columns=["one","two"])
func=lambda x: np.log(x.one)
df=df.assign(newcol=func(df))
print(df)

#This one works too, but why?
df=df.assign(newcol2=func)
print(df)

Output:
   one  two    newcol   newcol2
0    1    8  0.000000  0.000000
1    6    7  1.791759  1.791759
2    2    6  0.693147  0.693147
3    2    8  0.693147  0.693147
4    4    2  1.386294  1.386294
5    9    3  2.197225  2.197225
6    2    2  0.693147  0.693147
7    4    7  1.386294  1.386294

(Note: I could have written the lambda inline in assign(), but I keep it explicit here for the sake of clarity.)

I don't know much about df, but this code: df.assign(newcol=func(df)) means you have already called func with param df. However this: df.assign(newcol2=func) means you are passing func in without calling it, so maybe df can call it when it wants to.
– quamrana, Commented Oct 15, 2019 at 9:57
As @quamrana says. In the documentation it says "If the values are callable, they are computed on the DataFrame and assigned to the new columns.... If the values are not callable, (e.g. a Series, scalar, or array), they are simply assigned." So in the second example, it's applying the function.
– David Buck, Commented Oct 15, 2019 at 10:01
As a side note, Python code is interpreted, not compiled (typically).
– norok2, Commented Oct 15, 2019 at 10:14

norok2 · Accepted Answer · 2019-10-15 10:10:01Z

1

If you pass a callable to pd.DataFrame.assign(), pandas calls it with the DataFrame itself as its only argument.

For example, if you change your code to the following:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(1,10,size=16).reshape(8,2),columns=["one","two"])
func=lambda c, x: np.log(x.one + c)
df=df.assign(newcol=func(1, df))
print(df)

#This one will no longer work!
df=df.assign(newcol2=func)
print(df)

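the last call to assign() will not work: assign() calls func with the DataFrame as its only argument, so the second parameter is never supplied and Python raises a TypeError.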
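
One possible workaround, sketched here on a small fixed DataFrame rather than the random one above, is to bind the extra argument with functools.partial first, so that what reaches assign() is again a callable taking only the DataFrame. The data values and the choice of partial are illustrative assumptions, not part of the original answer.

```python
from functools import partial

import numpy as np
import pandas as pd

# Small fixed frame (illustrative data, not the random frame from the question).
df = pd.DataFrame({"one": [1, 2, 4], "two": [8, 6, 2]})

func = lambda c, x: np.log(x.one + c)

# Passing the two-argument callable directly fails: assign() calls it as
# func(df), so the parameter x is never supplied.
try:
    df.assign(newcol2=func)
    failed = False
except TypeError:
    failed = True

# Binding c=1 up front leaves a one-argument callable that assign() can use.
df = df.assign(newcol=func(1, df), newcol2=partial(func, 1))
print(failed)
print(df)
```

A lambda such as lambda d: func(1, d) would do the same job as partial here.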

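This is explained in the official documentation. The line df.assign(newcol=func(1, df)) uses the non-callable pathway, because func has already been called and its result (a Series) is what gets passed. The line df.assign(newcol2=func) uses the callable pathway, because the function object itself is passed.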
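
The two pathways can be modelled with a few lines of plain Python. This is a simplified sketch of the documented behaviour, not the pandas source: assign_sketch is a hypothetical helper, and it uses the single-argument func from the question on a small fixed DataFrame.

```python
import numpy as np
import pandas as pd


def assign_sketch(frame, **kwargs):
    """Simplified stand-in for DataFrame.assign(); not the pandas source."""
    out = frame.copy()
    for name, value in kwargs.items():
        # Callable pathway: call the value with the frame built so far.
        # Non-callable pathway: use the value as given.
        out[name] = value(out) if callable(value) else value
    return out


df = pd.DataFrame({"one": [1, 2, 4], "two": [8, 6, 2]})
func = lambda x: np.log(x.one)

explicit = assign_sketch(df, newcol=func(df))  # value is already a Series
implicit = assign_sketch(df, newcol=func)      # value is called with the frame
print(explicit.equals(implicit))
print(implicit.equals(df.assign(newcol=func)))
```

In both cases the same Series ends up in the new column. The only difference is who calls func: the caller, or assign().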

edited Oct 15, 2019 at 10:10

answered Oct 15, 2019 at 10:02

norok2


Florian Bernard · Answer · 2019-10-15 10:14:08Z

0

It's not compilation; it's simply how the assign source code is written. The pandas assign documentation mentions this case:

"Where the value is a callable, evaluated on df:"
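
One situation where the callable form is handy is a method chain, where the intermediate DataFrame has no name you could pass explicitly. The following is a minimal illustration with made-up data and a hypothetical ratio column; it is not the example from the pandas documentation.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 6, 2, 9], "two": [8, 7, 6, 3]})

result = (
    df[df["one"] > 1]  # filtered frame that is never bound to a name
    .assign(ratio=lambda d: d["one"] / d["two"])  # d is that filtered frame
)
print(result)
```

Here the lambda receives the filtered frame, so ratio is computed only for the rows that survived the filter, and df itself is left unchanged.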

answered Oct 15, 2019 at 10:14

Florian Bernard
