1

For the string "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']", I want to find each substring of the form "@..'...'", such as "@id~'objectnavigator-card-list'" or "@class~'outbound-alert-settings'". But when I use the regex ((@.+)\~(\'.*?\')), it finds "@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings'" as a single match. How should I modify the regex so that it finds each of these substrings separately?

1
  • Please format question properly. Commented May 15, 2017 at 3:13

3 Answers

3

Make the inner groups non-capturing and, rather than matching any character, match only characters that are not the terminating one (~ after the attribute name, ] after the quoted value), e.g.:

 re.findall(r"((?:@[^\~]+)\~(?:\'[^\]]*?\'))", test)

On your test string returns:

 ["@id~'objectnavigator-card-list'", "@class~'outbound-alert-settings'"]
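
To see why the original pattern over-matches, here is a minimal sketch comparing it with a negated-character-class version. The second pattern is a simplified variant of the one above, not the exact pattern from this answer, and it assumes the quoted value never contains a single quote.

```python
import re

test = "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']"

# Original pattern: the greedy `.+` first runs to the end of the string and
# then backtracks only as far as the LAST `~`, so both attributes are
# swallowed into one match.
greedy = re.search(r"((@.+)\~(\'.*?\'))", test).group(1)
print(greedy)

# Negated character classes cannot cross a `~` or a closing quote, so each
# attribute is matched on its own.
# Assumption: the quoted value never contains a single quote.
fixed = re.findall(r"@[^~]+~'[^']*'", test)
print(fixed)
```

Because the second pattern has no capturing groups, re.findall returns the full matches as plain strings.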

1

Restrict the characters allowed between the quotes so that they cannot include the quote itself:

>>> re.findall(r'@[a-z]+~\'[-a-z]*\'', x)

I find it's much easier to look for only the characters I know are going to be in a matching section rather than omitting characters from more permissive matches.
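
As a rough sketch of this whitelist approach, the same character classes can be wrapped in named groups to pull out the attribute name and value separately. The group names and the second input string are hypothetical, chosen only to illustrate the trade-off: anything outside the whitelist (uppercase letters, digits, underscores, hyphens in the attribute name) is simply not matched.

```python
import re

x = "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']"

# Only lowercase letters in the attribute name; only lowercase letters and
# hyphens in the quoted value. The group names are illustrative.
pattern = re.compile(r"@(?P<name>[a-z]+)~'(?P<value>[-a-z]*)'")

pairs = [(m.group("name"), m.group("value")) for m in pattern.finditer(x)]
print(pairs)

# Hypothetical input with characters outside the whitelist: nothing matches,
# which is the price of being strict about the allowed characters.
skipped = pattern.findall("//div[@data-id~'Card_1']")
print(skipped)
```

If the real attribute names or values can contain other characters, the classes would need to be widened accordingly.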


1

For your current test string, you can try this pattern:

import re 

a = "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']"
# match everything that starts with '@', up to but not including the next ']'
regex = re.compile(r'(@[^\]]+)')
strings = re.findall(regex, a)
# Or simply:
# strings = re.findall('(@[^\\]]+)', a)

print(strings)

Output:

["@id~'objectnavigator-card-list'", "@class~'outbound-alert-settings'"]
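
One caveat with this pattern is that it accepts anything between '@' and ']', whether or not a ~'...' part is present. The sketch below contrasts it with a stricter variant on a hypothetical input; both the sample string and the stricter pattern are assumptions added for illustration, not part of the question.

```python
import re

# Hypothetical input: the second predicate has no ~'...' part.
sample = "//div[@id~'card-list']//a[@href]"

# Pattern from this answer (without the capturing group): '@' up to the next ']'.
loose = re.compile(r"@[^\]]+")

# Stricter variant (an assumption): additionally require ~ followed by a
# single-quoted value.
strict = re.compile(r"@[^~\]]+~'[^']*'")

loose_matches = loose.findall(sample)
strict_matches = strict.findall(sample)
print(loose_matches)
print(strict_matches)
```

For the test string in the question both patterns give the same result, so the looser one is sufficient there.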

