One issue may be that Python is not reading my backtick but is instead treating it like a blank space:

'BACKTICK_QUOTED_STRING_PAY__LPAR_£M_SLASH_m_RPAR_'
                                                 ^
"`PAY (£M/m)` < '3'"
            ^

I'm searching a CSV file that has spaces in its column names (I cannot change this naming decision).

I also need to use query because the query is dynamic and changes depending on inputs. (I haven't included that code, as I am just trying to get the query to work manually first.)

I've tried this without the quotes around 3 and nothing changed.

I'm performing a query with the following code:

data = df.query("`PAY (£M/m)` < '3'")

SyntaxError: Could not convert 'BACKTICK_QUOTED_STRING_PAY__LPAR_£M_SLASH_m_RPAR_' to a valid Python identifier.

Weirdly, the error above shows a double underscore (__) between the word PAY and the left parenthesis, even though what I've shown here is the console's output plus the code exactly as I typed it into the query.

Below is my CSV file exported and copied from a txt:

PAY (£M/m)  INITIALS   ID
         2        ZE  223
         5        NY  532
         1        MA  122
         3        ON  873
         3        LS  235
  • Is there an issue with doing df = df[df['PAY (£M/m)'] < 3]? Commented Feb 9, 2023 at 15:35
  • @TYZ I want to keep df as it's used multiple times, it is also a dynamic search query which can change (I didn't include this code as it's working and not relevant). Commented Feb 9, 2023 at 16:09
  • It's the pound sign that's causing the problem (£). I'm surprised no one else mentioned that. Have you considered replacing it with something like GBP? The column could then be PAY (GBP M/m). That's the ISO 4217 code for it, btw. Commented Feb 9, 2023 at 17:05
  • @user19077881 That's not the issue. The error happens when trying to do a .query() on that column name. Commented Feb 9, 2023 at 17:16
  • @OttoLuck It's in the docs you linked: "For other characters that fall outside the ASCII range (U+0001..U+007F) and those that are not further specified in PEP 3131, the query parser will raise an error." Commented Feb 9, 2023 at 17:20

1 Answer

The issue was the £ sign. I replaced it with

df.columns = df.columns.str.replace('£', 'GBP')

and changed the query to suit this new change and the error was fixed.
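
For anyone who wants to reproduce the fix end to end, here is a minimal sketch. It rebuilds the sample table from the question by hand instead of reading the CSV, and it compares against the number 3 instead of the string '3', on the assumption that the PAY column is numeric.

```python
import pandas as pd

# Sample data mirroring the table in the question (built by hand, not read from the CSV).
df = pd.DataFrame({
    "PAY (£M/m)": [2, 5, 1, 3, 3],
    "INITIALS": ["ZE", "NY", "MA", "ON", "LS"],
    "ID": [223, 532, 122, 873, 235],
})

# Replace the pound sign in every column name (literal match, not a regex).
df.columns = df.columns.str.replace("£", "GBP", regex=False)

# Spaces, parentheses and the slash are accepted inside backticks.
# The comparison uses the number 3, assuming the column holds numbers.
data = df.query("`PAY (GBPM/m)` < 3")
print(data)
```

Note that the replacement leaves no space after GBP, so the column is now named PAY (GBPM/m) and the query has to use exactly that name.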

From the Pandas documentation:

During parsing a number of disallowed characters inside the backtick quoted string are replaced by strings that are allowed as a Python identifier. These characters include all operators in Python, the space character, the question mark, the exclamation mark, the dollar sign, and the euro sign. For other characters that fall outside the ASCII range (U+0001..U+007F) and those that are not further specified in PEP 3131, the query parser will raise an error.
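
If renaming the column is not an option, the boolean-mask approach suggested in the comments may be worth a look: it never goes through the query parser, so the £ sign is not a problem, and it returns a new DataFrame without modifying df. Below is a rough sketch of how that could be made dynamic. The names OPS and filter_rows are hypothetical helpers made up for this illustration, not part of pandas.

```python
import operator

import pandas as pd

df = pd.DataFrame({
    "PAY (£M/m)": [2, 5, 1, 3, 3],
    "INITIALS": ["ZE", "NY", "MA", "ON", "LS"],
    "ID": [223, 532, 122, 873, 235],
})

# Hypothetical lookup table from an operator given as text to its function.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def filter_rows(frame, column, op, value):
    """Return the rows of frame where frame[column] <op> value holds."""
    mask = OPS[op](frame[column], value)
    return frame[mask]  # new DataFrame; frame itself is left untouched


data = filter_rows(df, "PAY (£M/m)", "<", 3)
print(data)
```

The column name, operator and value are all ordinary Python values here, so they can come from user input without building a query string.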
