Pattern srcAttrPattern = Pattern.compile("(?i)(?<=src=\")[^\"]*(?<!\")");
Matcher srcMatcher = srcAttrPattern.matcher("src=\"\"");
System.out.println(srcMatcher.find());

This prints false. How do I interpret the above code? Is any modification needed so that the pattern also matches src="", i.e. so that it works for an empty value as well as a filled one? The statement is meant to match the src attribute of <img> tags in HTML content.

    You can remove the last assertion, but then nothing guarantees a double quote at the end; change it to (?=") instead. But why go to all the trouble with assertions that are as slow as paint drying? Use something more reasonable, such as src="(.*?)". Commented Mar 18, 2016 at 21:42
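The capturing-group approach suggested in the comment could look roughly like the sketch below. The class name SrcCaptureDemo and the helper srcValues are made up for illustration, and the (?i) flag is carried over from the question's pattern rather than from the comment. Note that this simple pattern would also pick up attributes whose name merely ends in src (for example data-src).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SrcCaptureDemo {
    // src="(.*?)" from the comment; group 1 holds the attribute value.
    // (?i) is an assumption taken from the question's original pattern.
    static final Pattern SRC = Pattern.compile("(?i)src=\"(.*?)\"");

    // Hypothetical helper: collects every src value, including empty ones.
    static List<String> srcValues(String html) {
        List<String> values = new ArrayList<>();
        Matcher m = SRC.matcher(html);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(srcValues("<img src=\"\"><img src=\"a.png\">"));
    }
}
```

Because the closing quote is part of the match itself, no lookaround is needed to guarantee that the value is properly terminated.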

1 Answer

Note that to parse HTML, you are better off using a dedicated parser (e.g. Jsoup).

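As for the current issue of matching a src="" string: the final negative lookbehind (?<!") requires the character immediately before the current position not to be a double quote. With an empty value, [^"]* matches nothing, so the current position is directly after the opening quote and the lookbehind fails. Since the negated character class [^"]* (0+ characters other than ") can never end on a quote anyway, you simply do not need that lookbehind.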

Remove (?<!") and the resulting pattern, "(?i)(?<=src=\")[^\"]*", will match the empty string in src="".
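To see the difference, here is a minimal sketch that runs the question's pattern and the shortened one against the same inputs. The class name SrcLookbehindDemo and the helper firstMatch are made up for illustration; only the two pattern strings come from the question and this answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SrcLookbehindDemo {
    // Pattern from the question: the trailing (?<!") rejects an empty value.
    static final Pattern ORIGINAL = Pattern.compile("(?i)(?<=src=\")[^\"]*(?<!\")");
    // Same pattern with the trailing lookbehind removed.
    static final Pattern FIXED = Pattern.compile("(?i)(?<=src=\")[^\"]*");

    // Hypothetical helper: first match, or Optional.empty() if find() fails.
    static Optional<String> firstMatch(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Empty value: the original fails, the fixed pattern matches "".
        System.out.println(firstMatch(ORIGINAL, "src=\"\"").isPresent());
        System.out.println(firstMatch(FIXED, "src=\"\"").isPresent());
        // Filled value: both patterns return the text between the quotes.
        System.out.println(firstMatch(FIXED, "<img SRC=\"a.png\">").orElse("<none>"));
    }
}
```

Note that a successful find() with an empty group() means the attribute was found with an empty value, which is different from find() returning false.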
