Code Snippet 1

import pandas as pd  
df = pd.read_csv("filename.txt", sep='\t', header = 0, names = ['E', 'S', 'D'])  
Result = df.query(df.E.head(**n=100**) == 0)

Code Snippet 1 works as expected and returns a DataFrame containing the rows where df.E equals 0. However,

Code Snippet 2

import pandas as pd  
df = pd.read_csv("filename.txt", sep='\t', header = 0, names = ['E', 'S', 'D'])  
Result = df.query(df.E.head(**n=101**) == 0)

Code Snippet 2 does not work and fails with the following error:

"SyntaxError: ('invalid syntax', ('<unknown>', 1, 602, '[True ,True
,True ,True ,True ,True ,True ,True ,True ,True ,True ,True ,True
,True ,True ,True ,... ,True ,True ,True ,True ,True ,True ,True ,True
,True ,True ,True ,True ,True ,True ,True ,...]\n'))"

Please note that the only change between the two snippets is n=100 versus n=101.

The error is still present with .head(n=101) removed. I have tried many values greater than 100, and they all throw the same error.

  • Why are you using query that way? You're supposed to pass a string, not an actual condition. See the documentation. Commented Oct 27, 2014 at 7:33
  • Why are the ** there? Is that a typo? Commented Oct 27, 2014 at 7:38
  • @BrenBarn The actual query is df.query(df.E == dfEgounique[eachEgo]). If I change it to df.query('df.E == dfEgounique[eachEgo]'), it throws a "NotImplementedError". Commented Oct 27, 2014 at 7:44
  • @Paul ** was added when the question was edited. Will remove it. Commented Oct 27, 2014 at 7:45

1 Answer

df.query accepts a query string. You are passing an already-evaluated condition, not a string of valid Python (query actually accepts a slight superset of Python). So I wouldn't expect either of your code snippets to work at all, hence the SyntaxError.

Straight out of the docstring:

Parameters
----------
expr : string
    The query string to evaluate.  You can refer to variables
    in the environment by prefixing them with an '@' character like
    ``@a + b``.
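
To make the string requirement concrete, here is a minimal sketch contrasting a string expression with an already-evaluated condition. The small frame is made up for illustration, and the exception type is deliberately caught broadly because it varies across pandas versions (recent releases reject non-string input up front; 0.15.0 behaved as described in the question).

```python
import pandas as pd

# Made-up data for illustration; only the column name E comes from the question.
df = pd.DataFrame({"E": [0, 1, 0, 2]})

# A string expression: query parses and evaluates it against the columns.
ok = df.query("E == 0")
print(ok)

# An already-evaluated condition is a boolean Series, not a string.
mask = df.E == 0
try:
    df.query(mask)
    raised = False
except Exception as exc:  # exact exception type depends on the pandas version
    raised = True
    print(type(exc).__name__, exc)
```

If the condition has already been computed as a boolean Series, it can be used directly as df[mask] without going through query.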


In [14]: pd.set_option('max_rows',10)

In [15]: np.random.seed(1234)

In [16]: df = DataFrame(np.random.randint(0,10,size=100).reshape(-1,1),columns=list('a'))

In [17]: df
Out[17]: 
    a
0   3
1   6
2   5
3   4
4   8
.. ..
95  9
96  2
97  9
98  1
99  3

[100 rows x 1 columns]

In [18]: df.query('a==3')
Out[18]: 
    a
0   3
21  3
26  3
28  3
30  3
32  3
51  3
60  3
99  3

In [19]: var = 3

In [20]: df.query('a==@var')
Out[20]: 
    a
0   3
21  3
26  3
28  3
30  3
32  3
51  3
60  3
99  3
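
For the case raised in the question's comments, where the value to compare against comes out of another collection, one cautious approach is to pull the value into a plain local variable first and reference it with @. This is only a sketch: rows_matching and the sample data are hypothetical names, not part of the original code.

```python
import pandas as pd

# Made-up data; the column names E and S follow the question.
df = pd.DataFrame({"E": [0, 1, 0, 2, 1], "S": [10, 20, 30, 40, 50]})


def rows_matching(frame, value):
    """Return the rows of frame whose E column equals value."""
    # value is a local variable here, so the query string can see it via @value.
    return frame.query("E == @value")


# Hypothetical stand-in for looping over the unique values of E.
results = {int(v): rows_matching(df, v) for v in df["E"].unique()}

for key, subset in results.items():
    print(key, list(subset.index))
```

Keeping the lookup outside the query string means the string only contains a simple column-to-variable comparison.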

4 Comments

Thanks Jeff. df.query('e==0') works. Further, I want to compare it with a variable instead of 0, like df.query('e==var').
Thanks. I had been trying df(df.e==var) instead of df[df.e==var]. Silly mistake. Out of curiosity, I do wonder how my first snippet of code works if the syntax is wrong.
With the information you have provided, it is impossible to tell what var is. Please show df.info() and pd.show_versions().
version = '0.15.0', short_version = '0.15.0'. var contains an int.
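
The bracket-versus-parenthesis mix-up mentioned in these comments can be sketched as follows. The data is made up, and var is assumed to be a plain int as the last comment states.

```python
import pandas as pd

# Made-up data; the column name e and the variable var follow the comments.
df = pd.DataFrame({"e": [0, 3, 0, 5]})
var = 0

# Square brackets with a boolean Series select the matching rows.
by_mask = df[df.e == var]

# The same selection written as a query string, referencing var with @.
by_query = df.query("e == @var")

# Parentheses try to call the DataFrame like a function, which fails.
try:
    df(df.e == var)
    call_error = None
except TypeError as exc:
    call_error = str(exc)

print(by_mask.equals(by_query))
print(call_error)
```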
