90

Let's say I have this code:

val string = "one493two483three"
val pattern = """two(\d+)three""".r
pattern.findAllIn(string).foreach(println)

I expected findAllIn to only return 483, but instead, it returned two483three. I know I could use unapply to extract only that part, but I'd have to have a pattern for the entire string, something like:

 val pattern = """one.*two(\d+)three""".r
 val pattern(aMatch) = string
 println(aMatch) // prints 483

Is there another way of achieving this, without using the classes from java.util directly, and without using unapply?

5 Answers 5

121

Here's an example of how you can access group(1) of each match:

val string = "one493two483three"
val pattern = """two(\d+)three""".r
pattern.findAllIn(string).matchData foreach {
   m => println(m.group(1))
}

This prints "483" (as seen on ideone.com).


The lookaround option

Depending on the complexity of the pattern, you can also use lookarounds to only match the portion you want. It'll look something like this:

val string = "one493two483three"
val pattern = """(?<=two)\d+(?=three)""".r
pattern.findAllIn(string).foreach(println)

The above also prints "483" (as seen on ideone.com).

References

Sign up to request clarification or add additional context in comments.

1 Comment

You can also use pattern.findAllMatchIn(string).foreach... instead
54
val string = "one493two483three"
val pattern = """.*two(\d+)three.*""".r

string match {
  case pattern(a483) => println(a483) //matched group(1) assigned to variable a483
  case _ => // no match
}

4 Comments

This is the simplest way by far. You use the regex object ("pattern") in a match/case and extracts the group into the variable a483. The problem withthis case is that the pattern should have wildcards on both sides: val pattern = """.*two(\d+)three.*""".r
Yes. I don't think the above is immediately clear, but once you understand that it's assigning the digit matching group to the variable 'a483', then it makes more sense. Perhaps rewrite in a clearer fashion ?
This is the scala way with regex. For people don't understand the magic behind this answer, try search "scala regex extractor" or "scala unapply regex" etc.
the semantics is unclear. is this the first, last, or a random match from the string?
23

Starting Scala 2.13, as an alternative to regex solutions, it's also possible to pattern match a String by unapplying a string interpolator:

"one493two483three" match { case s"${x}two${y}three" => y }
// String = "483"

Or even:

val s"${x}two${y}three" = "one493two483three"
// x: String = one493
// y: String = 483

If you expect non matching input, you can add a default pattern guard:

"one493deux483three" match {
  case s"${x}two${y}three" => y
  case _                   => "no match"
}
// String = "no match"

Comments

16

You want to look at group(1), you're currently looking at group(0), which is "the entire matched string".

See this regex tutorial.

1 Comment

can you illustrate on the input I provided? I tried to call group(1) on what's returned by findAllIn but I get an IllegalStateException.
5
def extractFileNameFromHttpFilePathExpression(expr: String) = {
//define regex
val regex = "http4.*\\/(\\w+.(xlsx|xls|zip))$".r
// findFirstMatchIn/findAllMatchIn returns Option[Match] and Match has methods to access capture groups.
regex.findFirstMatchIn(expr) match {
  case Some(i) => i.group(1)
  case None => "regex_error"
}
}
extractFileNameFromHttpFilePathExpression(
    "http4://testing.bbmkl.com/document/sth1234.zip")

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.