
My problem is quite simple. I want to parse a string like this one:

string = 'SENT (ADVWH Pourquoi) (NP (DET ce) (NC theme)) (PONCT ?)'

I want to use a regex (I am not an expert; I have only used them a few times before). I want to extract the first level of brackets, i.e. the result should be:

(ADVWH Pourquoi)
(NP (DET ce) (NC theme))
(PONCT ?)

I used this regex, which I tested successfully on regex101, but it won't even compile:

re.compile(r"\(([^()]|(?R))*\)")

I also tried these, which also work on regex101:

re.compile(r"\(([^\(\)]|(?R))*\)")
re.compile(r"\((([^\(\)]|(?R))*)\)")

I always get the same error from Python: unexpected end of pattern.

I really don't see what the problem is here, or why it works on regex101 but not in Python.

Thanks a lot in advance!

1 Answer


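Python's built-in re module does not support recursion (the (?R) construct), which is why the pattern fails to compile. You need the third-party regex package from PyPI instead; regex101 accepts the pattern because its default flavour (PCRE) supports recursion.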
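As a rough sketch of how this could look in practice, the block below shows two options. The first uses the third-party regex package (installed with pip install regex) and keeps the recursive pattern from the question, with the inner group made non-capturing so that the whole match is returned. The second is a plain depth counter that needs only the standard library, for cases where adding a dependency is not an option. The function names top_level_groups_regex and top_level_groups_scan are made up for this illustration, and the scan version assumes that stray closing brackets can simply be ignored.

```python
def top_level_groups_regex(text):
    """Recursive-pattern version; requires the third-party `regex` package."""
    # Imported here so that the fallback below stays usable without the package.
    import regex

    # (?R) re-enters the whole pattern, so nested brackets are matched as a unit.
    pattern = regex.compile(r"\((?:[^()]|(?R))*\)")
    return [m.group(0) for m in pattern.finditer(text)]


def top_level_groups_scan(text):
    """Standard-library fallback: track the nesting depth by hand."""
    groups = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i  # remember where a first-level group opens
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])
    return groups


string = 'SENT (ADVWH Pourquoi) (NP (DET ce) (NC theme)) (PONCT ?)'
for group in top_level_groups_scan(string):
    print(group)
# (ADVWH Pourquoi)
# (NP (DET ce) (NC theme))
# (PONCT ?)
```

With the regex package installed, top_level_groups_regex(string) should give the same three groups. Note that re.findall-style calls return the capture group rather than the whole match when the pattern contains one, which is why the sketch uses finditer and group(0).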
