I am making a Python PyQt5 CSV comparison tool in which the user can add conditions for querying a pandas DataFrame one by one before they are executed.

At the moment I have a nested list of conditions with each element containing the field, operation (==,!=,>,<), and value for comparison as strings. With just one condition I can use .query as it takes a string condition:

data.query('{} {} {}'.format(field,operation,value))

But as far as I can tell the formatting for multiple queries would use loc similar to this:

data.loc[(data.query('{} {} {}'.format(field[0],operation[0],value[0]))) & (data.query('{} {} {}'.format(field[1],operation[1],value[1]))) & ...]

Firstly I wanted to make sure my understanding of the loc function was correct (do I need a primary key at the end maybe?).

And secondly, how would I represent this multiple condition query with an unknown number of conditions set?

Thanks

2 Answers

3

Would this work?

conds = [
    f'{f} {o} {v}' for f, o, v in zip(field, operation, value)
]
data.query(' and '.join(conds))
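
For the nested [field, operation, value] layout mentioned in the question, the same idea might look roughly like the sketch below. The sample DataFrame, the column names, and the build_query helper are made up for illustration. It assumes string literals arrive already quoted (for example "'b'") and skips the query entirely when no conditions are set, since an empty query string is rejected.

```python
import pandas as pd

# Hypothetical sample data; the real tool would load this from a CSV.
data = pd.DataFrame({"id": [94, 96, 98, 101], "name": ["a", "b", "c", "d"]})

# Conditions as [field, operation, value], all strings.
# Assumption: string literals are already quoted inside the value.
conditions = [["id", "<", "100"], ["id", ">", "95"], ["name", "!=", "'b'"]]


def build_query(conds):
    """Join any number of conditions into one query string."""
    return " and ".join(f"{field} {op} {value}" for field, op, value in conds)


query_string = build_query(conditions)

# An empty query string is not valid, so fall back to the full frame.
result = data.query(query_string) if conditions else data
print(query_string)
print(result)
```

Because the conditions are joined in a loop, the number of conditions does not need to be known in advance.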

1

Warning: Not tested, more like a comment but put here for proper format:

data.query returns a dataframe, you can't just do dataframe1 & dataframe2. You would do something like

data.query(' AND '.join(['{} {} {}'.format(f, o, v) 
                         for f, o, v in zip(fields, operations, values)
                       ])
          )
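
On the loc part of the question: loc takes a boolean mask (one True/False per row), not DataFrames, so each condition has to become a mask before they are combined with &. A rough sketch of that route is below. The OPS mapping, the build_mask helper, and the sample data are hypothetical, and converting every value with float() assumes all compared columns are numeric.

```python
import operator
from functools import reduce

import pandas as pd

# Map the operation strings onto the matching comparison functions.
OPS = {"==": operator.eq, "!=": operator.ne, ">": operator.gt, "<": operator.lt}

# Hypothetical sample data and conditions in [field, operation, value] form.
data = pd.DataFrame({"id": [94, 96, 98, 101], "score": [1.0, 2.5, 3.0, 4.5]})
conditions = [["id", "<", "100"], ["id", ">", "95"]]


def build_mask(df, conds):
    """Combine any number of conditions into a single boolean mask."""
    # Assumption: every value is numeric, so float() is a safe conversion.
    masks = [OPS[op](df[field], float(value)) for field, op, value in conds]
    # Start from an all-True mask so an empty condition list keeps every row.
    all_true = pd.Series(True, index=df.index)
    return reduce(operator.and_, masks, all_true)


mask = build_mask(data, conditions)
result = data.loc[mask]
print(result)
```

No primary key is needed at the end; the mask alone selects the rows.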

3 Comments

Hi, thanks for the reply and help with the query function. I didn't realise you could use the literal 'AND' as opposed to separating with &. From what I can see, your code works when I have one condition but not multiple. One thing to mention is that the format of my conditions is actually [[f,o,v],[f,o,v]], not [f,f] [o,o] [v,v] (the example I gave was misleading), but I think referencing the list instead of a zip is the same logic there. However, I'm getting an error with multiple conditions and the 'AND' separator: ``` SyntaxError: invalid syntax: id <100 AND id >95``` Any idea why?
Never mind, your answer works perfectly if you just change the 'AND' to '&' but it can still be a string! Thanks :)
@BarnieGill Using & and lowercase and like that in a query string is identical. Pandas literally replaces & (in that context) with and before processing it. I accidentally discovered that :)
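
A quick way to check the behaviour discussed in these comments is sketched below, using a made-up DataFrame. It compares lowercase and with & and tries uppercase AND, catching the SyntaxError reported above.

```python
import pandas as pd

# Hypothetical sample data for the check.
data = pd.DataFrame({"id": [94, 96, 98, 101]})

# Lowercase 'and' and '&' select the same rows inside a query string.
with_and = data.query("id < 100 and id > 95")
with_amp = data.query("id < 100 & id > 95")
same_rows = with_and.equals(with_amp)

# Uppercase 'AND' is not a valid operator in a query string.
try:
    data.query("id < 100 AND id > 95")
    upper_ok = True
except SyntaxError:
    upper_ok = False

print(same_rows, upper_ok)
```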
