
I have a DataFrame that I load from a CSV file using

df = pd.read_csv('data.csv')

I want to select some of the rows of this DataFrame and create a new DataFrame, but the logic for selecting those rows is complex and needs to live inside a function. The filter logic uses data from that row only, not from any other rows in the DataFrame. How can I create a new DataFrame by using this filter function to select rows from the original one?

1 Answer


Why not just use a boolean mask, like this:

mask = df['foo'] == 'bar'
df_slice = df.loc[mask].copy()

Alternatively, use query:

df_slice = df.query('col1 > 2 and col2 .....').copy()
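As a rough sketch of both options on made-up data (the column names col1 and col2 and the values are assumptions for illustration only), the mask and the query string below should select the same rows:

```python
import pandas as pd

# Hypothetical sample data; col1 and col2 are illustrative names.
df = pd.DataFrame({"col1": [1, 3, 5, 7], "col2": [10, 20, 30, 40]})

# Boolean mask: combine conditions with & and wrap each one in parentheses.
mask = (df["col1"] > 2) & (df["col2"] < 40)
by_mask = df[mask].copy()

# The same filter written as a query string.
by_query = df.query("col1 > 2 and col2 < 40").copy()
```

Both keep the original index labels, so call reset_index(drop=True) afterwards if you want a fresh 0..n-1 index.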

If you really need to apply a function to each row, I would do it like this:

# Define your function here; it receives one row as a Series.

def check_condition(s):
    # `condition` is a placeholder: replace it with your own test on the values in s.
    if condition:
        return 1
    return 0

# axis=1 passes each row to check_condition; no lambda wrapper is needed.
df['matches_cond'] = df[['foo', 'bar'...]].apply(check_condition, axis=1)

Then you can filter on the new matches_cond column using loc or query.
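Here is a minimal end-to-end sketch of the row-wise approach. The sample data, the columns foo and bar, and the rule inside check_condition are all made up for illustration; the function returns True/False here so that the result can be used directly as a mask:

```python
import pandas as pd

# Hypothetical sample data standing in for pd.read_csv('data.csv').
df = pd.DataFrame({"foo": ["a", "bar", "bar", "c"], "bar": [1, 5, 2, 9]})

def check_condition(row):
    # Illustrative rule only; it looks at values from this row alone.
    if row["foo"] == "bar" and row["bar"] > 3:
        return True
    return bool(row["bar"] > 8)

# axis=1 passes each row to the function as a Series.
mask = df.apply(check_condition, axis=1)

# Keep the matching rows in a new, independent DataFrame.
df_slice = df[mask].copy()
```

Row-wise apply calls the Python function once per row, so it tends to be slow on large frames; if the rule can be written as a vectorized mask, that is usually the better choice.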

If you need something different, please add a short example of your data and the desired output.
