I have the following string:

val line:String = "PE018201804527901"

that matches this regex:

(.{2})(.{4})(.{9})(.{2})

I need to extract each group from the regex to an Array.

The result should be:

Array("PE", "0182", "018045279", "01")

I tried this:

val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val x= regex.findAllIn(line).toArray

but it doesn't work: it returns the whole matched string rather than the individual groups.

3 Answers

8
regex.findAllIn(line).subgroups.toArray

5

Note that findAllIn does not automatically anchor the regex pattern, so it will also find a match inside a much longer string. If you only want to accept strings that match in full (exactly 17 characters here), you can use a match block like this:

val line = "PE018201804527901"
val regex =  """(.{2})(.{4})(.{9})(.{2})""".r
val results = line match {
  case regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)
  case _ => Array[String]()
}
// Demo printing
results.foreach { m =>
  println(m)
} 
// PE
// 0182
// 018045279
// 01

It also handles the no-match case cleanly by returning an empty string array.
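
To make the anchoring difference concrete, here is a minimal sketch that runs the same regex both ways. The object and method names (AnchoringDemo, extractExact, extractAnywhere) are made up for illustration, and findFirstMatchIn stands in for findAllIn as the unanchored call:

```scala
object AnchoringDemo {
  val regex = """(.{2})(.{4})(.{9})(.{2})""".r

  // Using the regex as an extractor requires the WHOLE input to match.
  def extractExact(line: String): Array[String] = line match {
    case regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)
    case _                     => Array.empty[String]
  }

  // findFirstMatchIn is unanchored: it accepts a match anywhere in the input.
  def extractAnywhere(line: String): Array[String] =
    regex.findFirstMatchIn(line).map(_.subgroups.toArray).getOrElse(Array.empty[String])

  def main(args: Array[String]): Unit = {
    println(extractExact("PE018201804527901").mkString(","))       // 17 chars: four groups
    println(extractExact("PE018201804527901XYZ").length)           // 20 chars: no match
    println(extractAnywhere("PE018201804527901XYZ").mkString(",")) // matches the first 17 chars
  }
}
```

With a 20-character input, extractExact falls through to the default case, while extractAnywhere still returns the four groups from the first 17 characters.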

If you need to get all matches and all groups, then you will need to grab the groups into a list and then add the list to a list buffer (scala.collection.mutable.ListBuffer):

import scala.collection.mutable.ListBuffer

val line = "PE018201804527901%E018201804527901"
val regex = """(.{2})(.{4})(.{9})(.{2})""".r
val results = ListBuffer[List[String]]()

val mi = regex.findAllIn(line)
while (mi.hasNext) {
  mi.next() // advance to the next match; the groups are then read from the iterator
  results += List(mi.group(1), mi.group(2), mi.group(3), mi.group(4))
}
// Demo printing
results.foreach { m =>
  println("------")
  println(m)
  m.foreach { l => println(l) }
}

Results:

------
List(PE, 0182, 018045279, 01)
PE
0182
018045279
01
------
List(%E, 0182, 018045279, 01)
%E
0182
018045279
01
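
If the mutable buffer is not needed, the same "all matches, all groups" result can probably be had without the loop. This is only a sketch, with AllGroupsDemo and allGroups as made-up names; it relies on findAllMatchIn, which yields Match objects instead of strings:

```scala
object AllGroupsDemo {
  val regex = """(.{2})(.{4})(.{9})(.{2})""".r

  // One inner List per match, holding that match's capture groups in order.
  def allGroups(line: String): List[List[String]] =
    regex.findAllMatchIn(line).map(_.subgroups).toList

  def main(args: Array[String]): Unit =
    allGroups("PE018201804527901%E018201804527901").foreach(println)
}
```

Each Match carries its own groups, so nothing depends on the state of the iterator.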

2 Comments

Is there no more succinct way than regex(g1, g2, g3, g4) => Array(g1, g2, g3, g4)?
@Narfanator No, not if you want to do it with regex pattern matching.
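
For what it's worth, if the match block itself is not required, calling the extractor method directly avoids naming each group. A minimal sketch, with SuccinctDemo and groups as made-up names:

```scala
object SuccinctDemo {
  val regex = """(.{2})(.{4})(.{9})(.{2})""".r

  // unapplySeq is the method behind `case regex(...)`; it needs a full-string match.
  def groups(line: String): Array[String] =
    regex.unapplySeq(line).map(_.toArray).getOrElse(Array.empty[String])
}
```

Like the match block, this returns an empty array when the whole string does not match.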
5

Your solution, @sheunis, was very helpful. In the end I solved it with this method:

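import scala.collection.mutable.ListBuffer
import scala.util.matching.Regex

def extractFromRegex(regex: Regex, line: String): Array[String] = {
  val list = ListBuffer[String]()
  // Collect every capture group of every match, in order
  for {
    m <- regex.findAllIn(line).matchData
    e <- m.subgroups
  } list += e
  list.toArray
}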

I needed this because your solution, with this code:

val line:String = """PE0182"""
val regex ="""(.{2})(.{4})""".r  
val t = regex.findAllIn(line).subgroups.toArray

throws the following exception:

Exception in thread "main" java.lang.IllegalStateException: No match available
at java.util.regex.Matcher.start(Matcher.java:372)
at scala.util.matching.Regex$MatchIterator.start(Regex.scala:696)
at scala.util.matching.Regex$MatchData$class.group(Regex.scala:549)
at scala.util.matching.Regex$MatchIterator.group(Regex.scala:671)
at scala.util.matching.Regex$MatchData$$anonfun$subgroups$1.apply(Regex.scala:553)
at scala.util.matching.Regex$MatchData$$anonfun$subgroups$1.apply(Regex.scala:553)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at scala.util.matching.Regex$MatchData$class.subgroups(Regex.scala:553)
at scala.util.matching.Regex$MatchIterator.subgroups(Regex.scala:671)
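
The trace bottoms out in java.util.regex.Matcher.start, which throws IllegalStateException when no match has been attempted yet or the last attempt failed. So it looks as if subgroups was called on the iterator before it had been advanced to a match. One way to sidestep the iterator state altogether, sketched here with made-up names (SafeGroupsDemo, firstGroups), is to ask for the first Match as an Option:

```scala
import scala.util.matching.Regex

object SafeGroupsDemo {
  // Returns the capture groups of the first match, or an empty array when nothing matches.
  def firstGroups(regex: Regex, line: String): Array[String] =
    regex.findFirstMatchIn(line) match {
      case Some(m) => m.subgroups.toArray
      case None    => Array.empty[String]
    }
}
```

For the example above, SafeGroupsDemo.firstGroups("""(.{2})(.{4})""".r, "PE0182") should give the two groups PE and 0182, and an input with no match gives an empty array instead of an exception.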

1 Comment

Or, in a more functional style: val list = regex.findAllIn(line).matchData.flatMap(_.subgroups)
