Hi Stack Overflow, I have this link:

<a href="/view?view=1"></a>

I am looking for a regexp that extracts the value of the first href that starts with /view. How should I do it? I have been looking everywhere.

  • 1. Don't parse HTML with regex. 2. Using XPath or similar would be a much better idea. Commented Jan 22, 2015 at 22:52
  • Try a DOM parser instead, in the language you're using. Commented Jan 22, 2015 at 22:53
  • var regex = /^<a href="(\/view\?view=1)"><\/a>$/; "<a href=\"/view?view=1\"></a>".match(regex)[1] Commented Jan 22, 2015 at 22:54

1 Answer

As others stated in the comments, don't use regex to parse HTML; use a proper parser. See: RegEx match open tags except XHTML self-contained tags

$ echo '<a href="/view?view=1"></a>' |
    xmlstarlet sel -t -v '//a/@href[starts-with(., "/view")]' -

or

$ echo '<a href="/view?view=1"></a>' |
    saxon-lint --xpath 'string(//a/@href[starts-with(., "/view")])' -

OUTPUT:

/view?view=1

See xmlstarlet or saxon-lint.
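
If neither command-line tool is at hand, the same idea (parse the markup, then filter on the attribute) can be sketched with Python's standard library html.parser. This is only an illustration, not part of the tools above: the class name FirstViewHref and the helper first_view_href are hypothetical names chosen for this example, and the "/view" prefix comes from the question.

```python
from html.parser import HTMLParser


class FirstViewHref(HTMLParser):
    """Remembers the first <a href> value that starts with a given prefix."""

    def __init__(self, prefix="/view"):
        super().__init__()
        self.prefix = prefix
        self.result = None

    def handle_starttag(self, tag, attrs):
        # Ignore everything once a match is stored, and skip non-anchor tags.
        if self.result is not None or tag != "a":
            return
        for name, value in attrs:
            # value is None for attributes written without a value.
            if name == "href" and value and value.startswith(self.prefix):
                self.result = value
                return


def first_view_href(html, prefix="/view"):
    """Return the first matching href, or None when there is no match."""
    parser = FirstViewHref(prefix)
    parser.feed(html)
    parser.close()
    return parser.result


print(first_view_href('<a href="/view?view=1"></a>'))  # prints /view?view=1
```

Unlike the anchored regex from the comments, this does not depend on the exact spacing or attribute order of the tag, and it returns None rather than raising when no link matches.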
