Why does df.query("ColumnB > '6'") return only the second row of the dataframe? I expected the second through fifth rows, because those are the rows where the values in ColumnB are greater than 6.
For that comparison to work, ColumnB should be of type int or float, so you might want to remove the quotes from 6. It should be: df.query("ColumnB > 6")
Thank you for pointing that out. However, when I unquote 6, I get a type error: TypeError: '>' not supported between instances of 'str' and 'int'. As a side note, although I'm aware this can be done through boolean indexing, I am currently exploring how to filter data with the query function specifically. Any ideas on how to make it evaluate correctly?
Try df[df['ColumnB'] > 6]. Check if this gets you the right result.
Convert the column with df['ColumnB'] = df['ColumnB'].astype(int) and then run df.query("ColumnB > 6").
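To illustrate what is probably going on, here is a minimal sketch. The dataframe in the question is not shown, so the data below is made up, and it assumes ColumnB holds digits stored as strings. Strings are compared character by character, so '10' > '6' is False because '1' sorts before '6'. Passing engine="python" is my own addition, only so the sketch behaves the same whether or not numexpr is installed.

```python
import pandas as pd

# Hypothetical data: the dataframe from the question is not shown in the thread.
df = pd.DataFrame({
    "ColumnA": ["a", "b", "c", "d", "e"],
    "ColumnB": ["3", "7", "10", "25", "42"],  # numbers stored as strings
})

# String comparison is lexicographic: only "7" sorts after "6".
as_strings = df.query("ColumnB > '6'", engine="python")

# Convert the column to integers, then compare against a number.
df["ColumnB"] = df["ColumnB"].astype(int)
as_numbers = df.query("ColumnB > 6", engine="python")

print(as_strings.index.tolist())  # [1]
print(as_numbers.index.tolist())  # [1, 2, 3, 4]
```

If the column might contain values that are not clean integers, astype(int) will raise; pd.to_numeric(df["ColumnB"], errors="coerce") is one alternative that turns unparseable values into NaN instead.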