I am finding the indexes of values above certain cutoffs in a pandas DataFrame. So far I have achieved that using a series of lambda functions:

data.apply([lambda v: v[v >= 0.25].idxmin(),
            lambda v: v[v >= 0.25].idxmin(),
            lambda v: v[v >= 0.50].idxmin(),
            lambda v: v[v >= 0.75].idxmin(),
            lambda v: v[v >= 0.90].idxmin()])

I have attempted to parametrize the lambda function over an arbitrary list of cutoff values. However, if I use the following, the results are not correct: all the lambda functions have the same name, and basically only the last one is present in the DataFrame returned by apply. How can I parametrize these lambdas correctly?

 cutoff_values=[25,50,100]
 agg_list=[lambda v,c:v[v>=(float(c)/100.0)].idxmin() for c in cutoff_values]
 data.apply(agg_list)

What would be a better, more Pythonic and pandas-idiomatic approach?

  • Is there a specific reason why you are using multiple lambdas and not a function? And can you elaborate on "my cutoff list is changing"? Commented Dec 28, 2021 at 11:08
  • Would a named function be any different? Commented Dec 28, 2021 at 11:09
  • Regarding "my cutoff list is changing": I need to make my cutoffs parameters. Commented Dec 28, 2021 at 11:09
  • With a function it would be much easier to just hand over a set of cutoff values instead of copying and pasting the lambda functions. And with a named function you do not run into the problem that only the last one is executed; you can do everything in one function. Or, as @jezrael proposed in an answer, use nested lambdas. Commented Dec 28, 2021 at 11:16
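
The named-function idea from the comments could look roughly like the sketch below. The function name `cutoff_indexes`, its `cutoffs` parameter, and the sample data are all made up for illustration and do not come from the question. The sketch relies on `DataFrame.apply` forwarding extra keyword arguments to the function, and on a function that returns a Series, so that each cutoff becomes one row of the result.

```python
import pandas as pd


def cutoff_indexes(column, cutoffs):
    """For each cutoff, return the index label of the smallest value >= cutoff."""
    # Hypothetical helper: one Series entry per cutoff, keyed by the cutoff itself.
    return pd.Series({c: column[column >= c].idxmin() for c in cutoffs})


# Hypothetical sample data, not taken from the question.
data = pd.DataFrame({
    "a": [0.1, 0.3, 0.6, 0.8, 1.0],
    "b": [0.2, 0.5, 0.7, 0.9, 1.0],
})

# Extra keyword arguments given to apply are passed through to the function.
result = data.apply(cutoff_indexes, cutoffs=[0.25, 0.50, 0.75, 0.90])
print(result)
```

Note that `idxmin()` raises an error on an empty selection, so this sketch assumes every column contains at least one value at or above each cutoff.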

2 Answers

Nested lambda functions work for me, like this:

q = lambda c: lambda x: x[x >= c].idxmin()
cutoff_values = [25, 50, 90]
print(data.apply([q(float(c) / 100.0) for c in cutoff_values]))
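
The nested form works because each call to the outer function creates its own scope that holds one cutoff. As a rough illustration, the sketch below contrasts a factory function with lambdas that read the loop variable directly, a common pitfall where the variable is only looked up when the lambda is called. The names `make_cutoff_func` and `first_at_or_above`, the `cutoff_0.25`-style naming scheme, and the sample Series are assumptions made for this example.

```python
import pandas as pd


def make_cutoff_func(cutoff):
    """Build a one-argument function with `cutoff` fixed at creation time."""
    def first_at_or_above(v):
        return v[v >= cutoff].idxmin()
    # Give each function a distinct name (naming scheme is illustrative only).
    first_at_or_above.__name__ = f"cutoff_{cutoff:.2f}"
    return first_at_or_above


# Hypothetical sample column, not taken from the question.
s = pd.Series([0.1, 0.3, 0.6, 0.8, 1.0])

cutoff_values = [25, 50, 90]
funcs = [make_cutoff_func(c / 100.0) for c in cutoff_values]

# Pitfall for comparison: these lambdas look up `c` only when called,
# so all of them end up using the last value of the loop variable.
late_bound = [lambda v: v[v >= c / 100.0].idxmin() for c in cutoff_values]

bound_results = [f(s) for f in funcs]        # one result per cutoff
late_results = [f(s) for f in late_bound]    # the same result three times
print(bound_results, late_results)
```

A list built this way can be passed to `data.apply` in the same way as the list of nested lambdas above.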

You can use this:

import pandas as pd
cutoff_values = [25, 50, 90]  # percentages, as in the question
df = pd.DataFrame(data={'col': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]})
df = df[['col']].apply(lambda x: [x[x >= (float(c) / 100.0)].idxmin() for c in cutoff_values])
