I want to filter a DataFrame with a more complex function that depends on several values in each row.
Is it possible to filter DataFrame rows with a boolean function, as you can with, e.g., the ES6 filter function?
An extremely simplified example to illustrate the problem:
import pandas as pd

def filter_fn(row):
    # Workaround: return False for rows to drop, the row itself otherwise
    if row['Name'] == 'Alisa' and row['Age'] > 24:
        return False
    return row

d = {
    'Name': ['Alisa', 'Bobby', 'jodha', 'jack', 'raghu', 'Cathrine',
             'Alisa', 'Bobby', 'kumar', 'Alisa', 'Alex', 'Cathrine'],
    'Age': [26, 24, 23, 22, 23, 24, 26, 24, 22, 23, 24, 24],
    'Score': [85, 63, 55, 74, 31, 77, 85, 63, 42, 62, 89, 77]}

df = pd.DataFrame(d, columns=['Name', 'Age', 'Score'])
# result_type='broadcast' keeps the original shape
# (it replaces broadcast=True, which was removed in pandas 1.0)
df = df.apply(filter_fn, axis=1, result_type='broadcast')
I found a way using apply(), but with a boolean function it only returns rows filled with True/False, which is expected.
My workaround is to return the row itself when the function result is True, and False otherwise. But this requires an additional filtering step afterwards. The result looks like this:
        Name    Age  Score
0      False  False  False
1      Bobby     24     63
2      jodha     23     55
3       jack     22     74
4      raghu     23     31
5   Cathrine     24     77
6      False  False  False
7      Bobby     24     63
8      kumar     22     42
9      Alisa     23     62
10      Alex     24     89
11  Cathrine     24     77
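
For comparison, here is a minimal sketch of how the extra filtering step might be avoided: the function returns a plain boolean, apply() without broadcasting yields one boolean per row, and that Series is used as a mask. The names keep_row, mask, filtered and filtered_vec are hypothetical and chosen only for this illustration.

```python
import pandas as pd

def keep_row(row):
    # Hypothetical predicate: True keeps the row, False drops it
    return not (row['Name'] == 'Alisa' and row['Age'] > 24)

d = {
    'Name': ['Alisa', 'Bobby', 'jodha', 'jack', 'raghu', 'Cathrine',
             'Alisa', 'Bobby', 'kumar', 'Alisa', 'Alex', 'Cathrine'],
    'Age': [26, 24, 23, 22, 23, 24, 26, 24, 22, 23, 24, 24],
    'Score': [85, 63, 55, 74, 31, 77, 85, 63, 42, 62, 89, 77]}
df = pd.DataFrame(d, columns=['Name', 'Age', 'Score'])

# axis=1 calls keep_row once per row; the result is a boolean Series
mask = df.apply(keep_row, axis=1)

# Boolean indexing keeps only the rows where the mask is True
filtered = df[mask]

# For a condition this simple, a vectorized mask gives the same rows
# without calling a Python function per row
filtered_vec = df[~((df['Name'] == 'Alisa') & (df['Age'] > 24))]
```

This drops the two rows where Name is 'Alisa' and Age is above 24 and leaves the other rows unchanged, with their original index labels. The row-wise apply() is probably only worth its cost when the condition cannot be written with column-wise operations.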
