
A very simple question, but I can't seem to find an appropriate answer. I want to pass a Pandas method, e.g. .sum(), as an input to my function.

def something(dataframe,col_name,func):
    return dataframe.col_name.func

something(df,'a',sum())

TypeError: sum expected at least 1 arguments, got 0.

Python confuses it with the built-in function sum().

  • I'd just go with return dataframe.groupby(col_name).agg(method1). Commented Aug 15, 2018 at 10:02
  • Please do not change your question after you have received 3 answers. If you have a new question, please ask a new question so it can be answered separately. Commented Aug 15, 2018 at 10:03

2 Answers


You can use operator.methodcaller for this:

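from operator import methodcaller

import pandas as pd

df = pd.DataFrame({'a': range(11)})

def foo(df, col, method):
    # methodcaller(method) builds a callable that invokes the named method on its argument
    return methodcaller(method)(df[col])

res_sum = foo(df, 'a', 'sum')   # 55
res_avg = foo(df, 'a', 'mean')  # 5.0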

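The reason for your error is that sum() in your call is evaluated immediately: Python calls the built-in sum with no arguments before something is ever invoked, and the built-in requires at least one argument. You are passing the result of a function call rather than the function itself.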
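To make the distinction concrete, here is a minimal sketch of passing a callable instead of a string. The helper name apply_func is hypothetical and not part of the question; it assumes func accepts a Series, as the unbound methods pd.Series.sum and pd.Series.mean do.

```python
import pandas as pd

df = pd.DataFrame({'a': range(11)})

def apply_func(dataframe, col_name, func):
    # apply_func is a hypothetical helper used for illustration;
    # func is called here, inside the function, on the selected column
    return func(dataframe[col_name])

# Pass the function object itself: no parentheses after the name
res_sum = apply_func(df, 'a', pd.Series.sum)
res_avg = apply_func(df, 'a', pd.Series.mean)

# Writing sum() calls the built-in immediately, before apply_func runs,
# which raises the TypeError from the question
try:
    apply_func(df, 'a', sum())
except TypeError as exc:
    error_message = str(exc)
```

Whether to accept strings or callables is a design choice: strings restrict callers to methods that exist on the Series, while callables allow arbitrary functions.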

The benefit of passing strings is that you rely on tried-and-tested methods built into the Pandas framework, e.g. pd.Series.sum, pd.Series.mean, etc. While you can attempt to use Python built-ins and NumPy functions directly with Pandas series, their results may differ from what you expect, for example in how missing values are handled. Stick with documented Pandas methods where possible.
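As a small illustration of such discrepancies, the sketch below compares Pandas methods with the Python built-in sum and with NumPy's array method std. The sample values are made up for the example.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, float('nan')])

pandas_total = s.sum()   # Pandas skips NaN by default (skipna=True)
builtin_total = sum(s)   # the built-in propagates NaN through the addition

t = pd.Series([1, 2, 3, 4])

pandas_std = t.std()             # sample standard deviation (ddof=1 by default)
numpy_std = t.to_numpy().std()   # population standard deviation (ddof=0 by default)
```

Here pandas_total is a finite number while builtin_total is NaN, and the two standard deviations differ because the libraries use different default degrees of freedom.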


1 Comment

Works as intended. Thanks for the explanation

I do not recommend looking up functions this way in the general case, but here is a solution that needs no additional imports. Python has a built-in getattr function, which will "[r]eturn the value of the named attribute of object." Its signature is getattr(object, name[, default]). So you can rewrite your function as follows.

def something(dataframe,col_name,func):
    return getattr(dataframe[col_name], func)

something(df,'a',"sum")

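As written, the function returns the bound method rather than its result. If you want the result of calling sum, change the return statement to return getattr(dataframe[col_name], func)(). Note that the column must be selected with dataframe[col_name]; dataframe.col_name would look for a column literally named "col_name".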
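A slightly more defensive variant of the getattr approach might look like the sketch below. The name call_method and the choice to raise ValueError are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': range(11)})

def call_method(dataframe, col_name, method_name):
    # call_method is a hypothetical name used for illustration
    column = dataframe[col_name]
    # The third argument to getattr is returned when the attribute is missing
    method = getattr(column, method_name, None)
    if not callable(method):
        raise ValueError(f"{method_name!r} is not a callable Series method")
    # Call the bound method to get the result rather than the method itself
    return method()

res_sum = call_method(df, 'a', 'sum')
res_max = call_method(df, 'a', 'max')
```

Checking callable() before the call turns a misspelled method name into a clear error instead of an AttributeError or an attempt to call a non-callable attribute such as shape.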
