I need to extract the matches of multiple regex patterns from a given String:

"<a href=\"https://page1.google.com/ab-cd/ABCDEF\">Hello</a> hiiii <a href=\"https://page2.yahoo.com/gr\">page</a><img src=\"https://image01.google.com/gr/content/attachment/987654321\" alt=\"demo image\"></a><a href=\"https://page3.google.com/hr\">"

With the code below:

import java.util.regex.Pattern

val p = Pattern.compile("href=\"(.*?)\"")
val m = p.matcher(str)
while (m.find()) {
  println(m.group(1))
}

I am getting output:

https://page1.google.com/ab-cd/ABCDEF
https://page2.yahoo.com/gr
https://page3.google.com/hr

After changing the pattern to:

val p = Pattern.compile("img src=\"(.*?)\"")

I am getting output:

https://image01.google.com/gr/content/attachment/987654321

But with Pattern:

val p = Pattern.compile("href=\"(.*?)\"|img src=\"(.*?)\"")

I am getting output:

https://page1.google.com/ab-cd/ABCDEF
https://page2.yahoo.com/gr
null
https://page3.google.com/hr

How can I match multiple regex patterns in one pass, or is there an easier way to do this?

Thanks

1 Answer

You may use

val rx = "(?:href|img src)=\"(.*?)\"".r
val results = rx.findAllMatchIn(s).map(_ group 1)
// println(results.mkString(", ")) prints:
//  https://page1.google.com/ab-cd/ABCDEF, 
//  https://page2.yahoo.com/gr, 
//  https://image01.google.com/gr/content/attachment/987654321, 
//  https://page3.google.com/hr

See the Scala demo

Details

  • (?:href|img src)=\"(.*?)\" matches either href or img src, then =", then lazily captures zero or more characters other than line break characters into Group 1, and finally matches the closing ".
  • .findAllMatchIn returns all matches, and .map(_ group 1) keeps only the Group 1 value of each match.
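
To see why the two-group pattern from the question prints a null, here is a minimal, self-contained sketch that runs both patterns side by side. The object name ExtractUrls, the helper names, and the shortened sample string are assumptions made for illustration; the sample assumes the input uses straight double quotes.

```scala
object ExtractUrls {
  // Shortened sample modeled on the question's input (hypothetical, straight quotes assumed).
  val html: String =
    "<a href=\"https://page1.google.com/ab-cd/ABCDEF\">Hello</a> hiiii " +
    "<img src=\"https://image01.google.com/gr/content/attachment/987654321\" alt=\"demo image\">" +
    "<a href=\"https://page3.google.com/hr\">"

  // Pattern from the question: each alternative has its own capturing group,
  // so Group 1 is unset (null) whenever the img src alternative matches.
  val twoGroups = "href=\"(.*?)\"|img src=\"(.*?)\"".r

  // Pattern from the answer: a non-capturing alternation followed by one shared group.
  val oneGroup = "(?:href|img src)=\"(.*?)\"".r

  // Reads Group 1 only, reproducing the behaviour seen in the question.
  def group1WithTwoGroups(s: String): List[String] =
    twoGroups.findAllMatchIn(s).map(_.group(1)).toList

  // Reads Group 1 of the single-group pattern, which is always set on a match.
  def urls(s: String): List[String] =
    oneGroup.findAllMatchIn(s).map(_.group(1)).toList

  def main(args: Array[String]): Unit = {
    println(group1WithTwoGroups(html))
    println(urls(html))
  }
}
```

With this sample, the second element returned by group1WithTwoGroups is null because the image URL was captured into Group 2, while urls returns all three URLs in document order. If the two-group pattern had to be kept, checking Group 2 when Group 1 is null would be another option.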