2

I would like to create a custom function to filter a pandas dataframe.

def df_filter(df, first_elem, col1, col2, other_elems):
    '''
    df: main dataframe
    first_elem: first element to search
    col1: first column to search for the first element
    col2: second column to search for first element
    other_elems: list with other elements to search for
    '''
    first_flt = df.loc[(df[col1] == first_elem) | (df[col2] == first_elem)]
    second_flt = first_flt.loc[(first_flt[col1] == other_elems[0]) | (first_flt[col1] == other_elems[1])] 
    return second_flt

The first filter searches col1 and col2 for the first element and keeps the matching rows as first_flt. This part works.

In the second filter I would like to search for further values, provided in a list (other_elems), and filter again. The crucial point is that the number of items in this list can differ depending on what I pass in, e.g. other_elems = ['one', 'two', 'three'] or other_elems = ['one', 'two', 'three', 'four'].

Therefore this part has to be built from the number of elements in other_elems:

first_flt.loc[(first_flt[col1] == other_elems[0]) | (first_flt[col1] == other_elems[1])...] 

Any ideas how to do this?

1
  • Can you provide a small example of what you would like to achieve? Commented Dec 4, 2019 at 16:11

2 Answers

2

If other_elems is an iterable, you can use the pandas isin method (here Series.isin, since first_flt[col1] is a single column).

In your example:

second_flt = first_flt.loc[first_flt[col1].isin(other_elems)]
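
A minimal sketch of how isin could slot into the question's function. The sample dataframe, its column names `a` and `b`, and the values are made up for illustration; only the function body follows the thread.

```python
import pandas as pd


def df_filter(df, first_elem, col1, col2, other_elems):
    # Keep rows where first_elem appears in col1 or col2.
    first_flt = df.loc[(df[col1] == first_elem) | (df[col2] == first_elem)]
    # isin accepts a list of any length, so no chained == comparisons are needed.
    return first_flt.loc[first_flt[col1].isin(other_elems)]


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "a": ["one", "two", "x", "three", "x"],
    "b": ["x", "x", "one", "x", "y"],
})

result = df_filter(df, "x", "a", "b", ["one", "two", "three"])
print(result)
```

Because the length of other_elems never appears in the code, passing a list of three or four items should work the same way.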

1 Comment

Yes, it is an iterable and this works, thanks! The line that ran for me, with balanced parentheses: second_flt = first_flt.loc[first_flt[col1].isin(other_elems)]
0

You can build a single filter by combining two individual filters:

def df_filter(df, first_elem, col1, col2, other_elems):
    '''
    df: main dataframe
    first_elem: first element to search
    col1: first column to search for the first element
    col2: second column to search for first element
    other_elems: list with other elements to search for
    '''
    filt1 = (df[col1] == first_elem) | (df[col2] == first_elem) # rows where col1 or col2 match first_elem
    filt2 = (df[col1] == other_elems[0]) | (df[col1] == other_elems[1]) # rows where col1 matches other_elems[0] or other_elems[1]
    filt_final = filt1 & filt2
    return df[filt_final]
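
The function above hardcodes two elements of other_elems. One possible way to extend the same mask-combining idea to a list of any non-empty length is to OR the comparisons together in a loop. This is a sketch, not part of the original answer: the name `df_filter_masks` and the sample data are hypothetical.

```python
from functools import reduce
import operator

import pandas as pd


def df_filter_masks(df, first_elem, col1, col2, other_elems):
    # Hypothetical variant of df_filter; assumes other_elems is non-empty.
    filt1 = (df[col1] == first_elem) | (df[col2] == first_elem)
    # OR together one comparison per element:
    # (df[col1] == e0) | (df[col1] == e1) | ...
    filt2 = reduce(operator.or_, (df[col1] == elem for elem in other_elems))
    return df[filt1 & filt2]


# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "a": ["one", "two", "x", "three", "x"],
    "b": ["x", "x", "one", "x", "y"],
})

result = df_filter_masks(df, "x", "a", "b", ["one", "two", "three"])
print(result)
```

For plain equality checks, `df[col1].isin(other_elems)` builds the same mask more concisely; the reduce form is mainly useful when each condition is something other than a simple equality test.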
