2

I need to parse (not to evaluate) Xpath expressions in Python to change them, e.g. I have expressions like

//div[...whatever...]//some-other-node...

and I need to change them to (for example):

/changed-node[@attr='value' and ...whatever...]/another-changed-node[@attr='value' ...

As it seems to me I need to split the original expression to steps and the steps to axes+nodes and predicates. Is there some tool I can do it with or is there a nice and easy way to do it without one?

The catch is I can't be sure that the predicates of original expressions won't contain something like [@id='some/value/with/slashes'] so I can't parse them with naive regexes.

2

1 Answer 1

3

You might be able to use the REx parser generator from Gunther Rademacher. See http://www.bottlecaps.de/rex/ This will generate a parser for any grammar from a suitable BNF, and suitable BNF for various XPath versions is available. REx is a superb piece of technology spoilt only by extremely poor documentation. It can generate parsers in a number of languages including Javascript, XQuery, and XSLT. It's used in the Saxon-JS product to parse dynamic XPath expressions within the browser.

Another approach is to use the XQuery to XQueryX converters available from W3C (XPath is a subset of XQuery so these will handle XPath as well. These produce a representation of the syntax tree in XML).

Sign up to request clarification or add additional context in comments.

1 Comment

2024 Update: Rex is now (finally!) available as open source at github.com/GuntherRademacher/rex-parser-generator

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.