
I would like to apply a given operation to every column of my DataFrame except the last one.

I got this working with help from Google, but the solution seems clumsy to me.

Can you help me improve it?

import pandas as pd

d = {
    'col1': [1, 2, 4, 7],
    'col2': [3, 4, 9, 1],
    'col3': [5, 2, 11, 4],
    'col4': [True, True, False, True]
}
df = pd.DataFrame(data=d)

def do_nothing(x):
    return x

def minor(x):
    return x<2

def multi_func(functions):
    def f(col):
        return functions[col.name](col)
    return f

result = df.apply(multi_func({'col1': minor, 'col2': minor,
                               'col3': minor, 'col4': do_nothing}))

Thank you all

  • What is the actual problem you wish to solve? Changing the hard-coded 2 in minor()? Commented Feb 5, 2021 at 10:44
  • I hope there is a clever way of doing that. All three functions seem like overkill to me. For example, I know that you can pass a list to an operator, e.g. df < [2,2,2,2]. Would it be possible to say "do nothing" for the last column? Commented Feb 5, 2021 at 10:48

1 Answer


Use the aggregate function instead, which allows more options for the func parameter:

res = df.aggregate({'col1': minor, 'col2': minor, 'col3': minor, 'col4': do_nothing})

print(res)

Output (in the context of the script in question):


    col1   col2   col3   col4
0   True  False  False   True
1  False  False  False   True
2  False  False  False  False
3  False   True  False   True
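The mapping passed to aggregate does not have to be typed out by hand. The following is a minimal sketch, assuming the column to leave untouched is always the last one; the name funcs is introduced here for illustration only and is not part of the original script.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 4, 7],
    'col2': [3, 4, 9, 1],
    'col3': [5, 2, 11, 4],
    'col4': [True, True, False, True],
})

def do_nothing(x):
    return x

def minor(x):
    return x < 2

# Assumption: every column except the last gets minor,
# and the last column is passed through unchanged.
funcs = {name: minor for name in df.columns[:-1]}
funcs[df.columns[-1]] = do_nothing

res = df.aggregate(funcs)
print(res)
```

With this approach, adding more columns in front of the last one should not require touching the mapping.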

An option to write all this a bit “smarter” is to make the literal 2 a variable and to replace do_nothing with a name that better reflects the way the input is handled:

import pandas as pd
 
d = {
    'col1': [1, 2, 4, 7], 
    'col2': [3, 4, 9, 1], 
    'col3': [5, 2, 11, 4], 
    'col4': [True, True, False, True]
}
df = pd.DataFrame(data=d)

# identity function:
copy = lambda x: x

# lt (less than arg). returns a function that compares to the bound argument:
def lt(arg):
    return lambda x: x < arg

res = df.aggregate({'col1': lt(2), 'col2': lt(2), 'col3': lt(2), 'col4': copy})

print(res)

Same output as above.
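If the operation is a plain comparison, the helper functions may not be needed at all. As an alternative to aggregate (not the approach shown above), the sketch below compares all columns except the last directly and then reattaches the last column. It assumes the untouched column is always the last one and that the threshold is the same for every other column.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 4, 7],
    'col2': [3, 4, 9, 1],
    'col3': [5, 2, 11, 4],
    'col4': [True, True, False, True],
})

# Compare every column except the last against the threshold in one step.
compared = df.iloc[:, :-1] < 2

# Reattach the last column unchanged (df.iloc[:, -1:] keeps it as a DataFrame).
res = pd.concat([compared, df.iloc[:, -1:]], axis=1)
print(res)
```

This trades the flexibility of per-column functions for brevity, so it fits best when all affected columns share one operation.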


3 Comments

Sorry, I thought accepting would mean upvoting too.
Thanks @mat, this way I learned something about pandas, at least that it definitely deserves a deeper look :)
@mat upvoting and accepting are quite different: votes can be changed only under certain circumstances, while the accept status may be changed at will. Maybe "the tour" could be a good start to recap the workings.
