Why does df.query("ColumnB > '6'") return only the second row of the dataframe? I expected the second through fifth rows, because those are the rows where the values in ColumnB are greater than 6.


  • For that comparison to work, ColumnB should be of type int or float, and the 6 should not be quoted. It should be: df.query("ColumnB > 6") Commented May 23, 2020 at 12:34
  • Probably because it's doing a string comparison? You quoted the 6. Commented May 23, 2020 at 12:34
  • You can also try doing df[df['ColumnB'] > 6]. Check if this gets you the right result. Commented May 23, 2020 at 12:36
  • Thank you for pointing that out. However, when I unquote 6, I get a type error: TypeError: '>' not supported between instances of 'str' and 'int'. As a side note, I'm aware this can be done through boolean indexing, but I am currently exploring ways to filter data with the query function specifically. Any ideas how to make it evaluate correctly? Commented May 23, 2020 at 12:39
  • Do this: df['ColumnB'] = df['ColumnB'].astype(int) and then run df.query("ColumnB > 6"). Commented May 23, 2020 at 12:44

1 Answer

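ColumnB holds strings (the TypeError quoted in the comments confirms this), so "ColumnB > '6'" compares the values character by character rather than numerically. Convert ColumnB to int first, then compare against an unquoted 6: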

# ColumnB holds strings; cast it to int so the comparison is numeric
df['ColumnB'] = df['ColumnB'].astype(int)
df.query("ColumnB > 6")
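
To see why the quoted version matches only one row, here is a minimal reproduction. The original dataframe was only posted as an image, so the values below are hypothetical, chosen to reproduce the symptom described in the question. Passing engine="python" is also an assumption made here so the sketch does not depend on whether numexpr is installed; it is not required by the fix itself.

```python
import pandas as pd

# Hypothetical data: the real frame is not available, so these values
# are an assumption that reproduces the behaviour from the question.
df = pd.DataFrame({
    "ColumnA": ["a", "b", "c", "d", "e"],
    "ColumnB": ["5", "7", "10", "12", "15"],  # numbers stored as strings
})

# String comparison is lexicographic: "10" sorts before "6" because
# the first character "1" is less than "6".
as_strings = df.query("ColumnB > '6'", engine="python")

# Cast to int, then compare against an unquoted 6.
df["ColumnB"] = df["ColumnB"].astype(int)
as_ints = df.query("ColumnB > 6", engine="python")

print(as_strings.index.tolist())  # only the second row (index 1)
print(as_ints.index.tolist())     # second through fifth rows
```

If some values in the column might not parse as integers, pd.to_numeric(df["ColumnB"], errors="coerce") is one alternative to astype(int); unparseable entries become NaN instead of raising.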