
I want to filter a dataframe using two different conditions.

But I want to omit rows that don't satisfy the conditions, and I only want to keep values that occur at least twice in column A.

This is the sample data:

df1 = df[(df['A-B occurrence'] >= 3) & (df['A occurrence'] >= 2)]

Above is the code I am using and this is the output I get:

[image: filtered output]

In column A, 17 satisfies the condition in only one row, so I want to omit 17 altogether because it does not meet the condition. In other words, I only want to keep values that appear in column A two or more times.

  • "So as in column A, 17 is satisfying condition in one row only" is not true ('A occurrence' >= 2). Commented Oct 28, 2021 at 14:09
  • I don't fully understand what you mean. Do you want an 'or' statement rather than an 'and'? Commented Oct 28, 2021 at 14:10
  • Unless I'm mistaken, row 17 in df was filtered out. Commented Oct 28, 2021 at 14:11
  • I think OP wants to keep only the duplicated As (see my answer). Commented Oct 28, 2021 at 14:13
  • I don't want to filter out row 17, but the number 17. Commented Oct 28, 2021 at 14:13

1 Answer

IIUC you want to keep only the rows for which A has duplicates.

You can use:

df2 = df1[df1['A'].duplicated(keep=False)]

Output: this should remove the rows with index 14 (A=17) and 19 (A=19).
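Since the sample data was only posted as an image, here is a minimal sketch on made-up data (the values and index labels below are assumptions and do not match the OP's frame). It applies the OP's two-condition filter first, then `duplicated(keep=False)`, which marks every row whose value in A appears more than once in the filtered frame:

```python
import pandas as pd

# Hypothetical data: the original sample was an image, so these values are invented.
df = pd.DataFrame({
    "A": [15, 15, 17, 17, 19, 19, 20, 20],
    "A-B occurrence": [3, 4, 3, 1, 5, 2, 3, 3],
    "A occurrence": [2, 2, 2, 2, 2, 2, 2, 2],
})

# Step 1: the OP's filter on both occurrence columns.
df1 = df[(df["A-B occurrence"] >= 3) & (df["A occurrence"] >= 2)]

# Step 2: keep only rows whose A value still appears at least twice in df1.
# keep=False flags all members of a duplicated group, not just the repeats.
df2 = df1[df1["A"].duplicated(keep=False)]

print(df2)
```

Here 17 and 19 each survive step 1 in a single row only, so step 2 drops them entirely, while 15 and 20 are kept.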

NB. You can apply the same strategy to the other columns if needed.
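If you need this for several columns, or for a threshold other than two, one possible generalization is a small helper. Note that this is a different technique from `duplicated`: it counts group sizes with `groupby(...).transform("size")`. The function name `keep_values_with_min_count` and the sample data are hypothetical, for illustration only:

```python
import pandas as pd

def keep_values_with_min_count(frame, column, min_count=2):
    """Keep rows whose value in `column` occurs at least `min_count` times."""
    # transform("size") returns, for every row, the size of the group it belongs to.
    counts = frame.groupby(column)[column].transform("size")
    return frame[counts >= min_count]

# Hypothetical data, not the OP's.
data = pd.DataFrame({
    "A": [15, 15, 17, 19, 20, 20, 20],
    "B": [1, 1, 2, 3, 4, 4, 5],
})

at_least_twice = keep_values_with_min_count(data, "A")
at_least_three = keep_values_with_min_count(data, "A", min_count=3)
b_duplicates = keep_values_with_min_count(data, "B")
```

With `min_count=2` this should select the same rows as `duplicated(keep=False)` for a column without missing values.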
