Given an XPath string whose form can vary, I need to find all the attributes it references using a regex.

Example string: /html/body/div[1]/div/a/(@title|@href)

It needs to return ['@title', '@href']

I did some research and created a regex pattern like this: /@\w+/g

I tried it on regex101 and it seemed to work: https://regex101.com/r/cO8lqs/9124

But when I coded it in Python:

import re
xpath = "/html/body/div[1]/div/a/(@title|@href)"

print(re.findall(r"/@\w+/g", xpath))  # expected this to work

It returns []

As mentioned above, it needs to return ['@title', '@href']

Did I miss something?

  • Python's regex syntax is different. Try this: re.findall("@\w+", xpath) Commented Mar 24, 2019 at 9:32
  • @FailSafe How silly of me :) You should get the rep. Thank you so much. Commented Mar 24, 2019 at 9:34
  • No worries, man. Commented Mar 24, 2019 at 9:36

2 Answers

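As @FailSafe suggested in a comment on the question, it turns out I needed to change the regex pattern from /@\w+/g to @\w+. The surrounding slashes and the trailing g are JavaScript-style regex literal syntax; Python's re module treats them as literal characters to match, and re.findall already returns every match without a global flag.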
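For illustration, here is a minimal sketch comparing the two patterns on the XPath from the question. The variable names broken and fixed are only for this example; the raw-string prefix r"..." is an addition that keeps \w from being read as a string escape.

```python
import re

xpath = "/html/body/div[1]/div/a/(@title|@href)"

# JavaScript-style literal: in Python the "/" and "/g" are part of the pattern,
# so it looks for the literal text "/@...​/g", which never occurs in the string.
broken = re.findall(r"/@\w+/g", xpath)

# Plain pattern: "@" followed by one or more word characters.
fixed = re.findall(r"@\w+", xpath)

print(broken)  # []
print(fixed)   # ['@title', '@href']
```

Note that \w does not match hyphens, so an attribute such as @data-id would be cut short at @data; widening the character class (for example [\w-]+) is one possible adjustment if such names can appear.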


You can also use a different XPath expression to get the same output:

/html/body/div[1]/div/a/@*[name()="title" or name()="href"]

1 Comment

I really appreciate your answer. The input is flexible, so it might look like what you suggested, in which case my regex pattern would not be able to pick out the attributes. Something for me to reconsider. Thank you.
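As a rough sketch of how both forms from this thread could be handled, the pattern below accepts either @attr or name()="attr". The helper find_attributes and the constant ATTR_PATTERN are hypothetical names made up for this example, and this is a text heuristic, not an XPath parser: it assumes attribute names consist only of word characters and covers only these two spellings.

```python
import re

# Hypothetical pattern: group 1 captures @attr, group 2 captures the quoted
# name in name()="attr" or name()='attr'. "@*" is skipped because "*" is not \w.
ATTR_PATTERN = re.compile(r"""@(\w+)|name\(\)\s*=\s*["'](\w+)["']""")

def find_attributes(xpath):
    """Return attribute names, prefixed with '@', in order of appearance."""
    found = []
    for direct, by_name in ATTR_PATTERN.findall(xpath):
        # Exactly one of the two groups is non-empty for each match.
        found.append("@" + (direct or by_name))
    return found

print(find_attributes('/html/body/div[1]/div/a/(@title|@href)'))
print(find_attributes('/html/body/div[1]/div/a/@*[name()="title" or name()="href"]'))
```

Both calls print ['@title', '@href']. If the input can take still other shapes, parsing the expression with a real XPath library would likely be more robust than extending the regex.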
