Here is what I want to do.

Given the input string "abc From: blah", I want to split it so that the result is

["abc", "From: blah"] or ["abc", "From", "blah"]

I have several other patterns to match as well, e.g. ["abcd", "To:", "blah"].

So I have the following regex:

val datePattern = """((.*>.*)|(.*(On).*(wrote:)$)|(.*(Date):.*(\+\d\d\d\d)?$)|(.*(From):.*(\.com)?(\]|>)?$))"""
val reg = datePattern.r

If I do a match the result comes out fine. If I do a split on the same regex I get an empty list.

inputStr match {
  case reg(_*) => return "Match"
  case _ => return "Output: None"
}

On the input string

"abc From: blah blah"

returns Match

The split, however,

inputStr.split(datePattern)

returns an empty array. What am I missing?

  • You cannot have one and the same regex to match and to split a string, unless you expect to get different results. Commented Dec 3, 2015 at 14:18
  • Sorry, I didn't quite get that. So the split regex has to be different? I tried just doing the split without the match, and even then the split failed. Scala noob here. Commented Dec 3, 2015 at 14:20
  • I think you can do that with "abc From: blah".split("\\W+"). Commented Dec 3, 2015 at 14:40

1 Answer

Since the regex matches the entire string, split treats the whole string as a single separator and removes it.
That leaves two empty strings, one on each side of the separator, but by default split discards trailing empty strings, so you get an empty array, as described in the split specification.

https://stackoverflow.com/a/14602089/1287856
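To make the empty-array behavior concrete, here is a minimal sketch that uses the pattern and input from the question. The object name `SplitDemo` and the value names are only illustrative. It relies on the `limit` argument of Java's `String.split`: the one-argument form drops trailing empty strings, while a negative limit keeps them.

```scala
object SplitDemo {
  // Pattern and input copied from the question.
  val datePattern = """((.*>.*)|(.*(On).*(wrote:)$)|(.*(Date):.*(\+\d\d\d\d)?$)|(.*(From):.*(\.com)?(\]|>)?$))"""
  val inputStr = "abc From: blah blah"

  // The regex consumes the whole input, so the only pieces left are the
  // empty strings before and after it, and both are discarded.
  val parts: Array[String] = inputStr.split(datePattern)

  // A negative limit keeps trailing empty strings.
  val kept: Array[String] = inputStr.split(datePattern, -1)

  def main(args: Array[String]): Unit = {
    println(parts.length) // 0
    println(kept.length)  // 2
  }
}
```

So the separator is found; there is simply nothing left on either side of it.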

As for why your regex matches the string in its entirety, you might find this website useful (the link is set up with your example):

https://regex101.com/r/zY0lX9/1
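The same thing can be checked from Scala. This is a small sketch with a simplified pattern that keeps only the "From:" alternative of the question's regex; the names `WhyWholeMatch`, `fromAlt` and `tight` are made up for illustration. The `.*` on each side of `From:` is what absorbs the rest of the line.

```scala
object WhyWholeMatch {
  val input = "abc From: blah blah"

  // Simplified stand-in for the "From" alternative of the question's regex.
  val fromAlt = """.*(From):.*$""".r

  // The leading .* absorbs "abc " and the trailing .* absorbs " blah blah".
  val matched: Option[String] = fromAlt.findFirstIn(input)

  // Without the surrounding .* only the literal text is matched.
  val tight: Option[String] = "From:".r.findFirstIn(input)

  def main(args: Array[String]): Unit = {
    println(matched) // Some(abc From: blah blah)
    println(tight)   // Some(From:)
  }
}
```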

split finds every occurrence of the regex, removes them from the string, and returns the pieces in between as an array. You may want to split on something like "(?=From:)", a zero-width lookahead, so that nothing is removed.
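A short sketch of the lookahead approach on the input from the question. The variant that also consumes the whitespace before the header, and the `To`/`Date` alternation, are my own assumptions about what the other patterns need, not something taken from the question's regex.

```scala
object LookaheadSplit {
  val input = "abc From: blah"

  // Zero-width lookahead: split just before "From:" and remove nothing.
  val keepSpace: Array[String] = input.split("(?=From:)")

  // Assumed variant: also consume the whitespace before the header.
  val trimmed: Array[String] = input.split("""\s+(?=From:)""")

  // Assumed extension to several header names.
  val headers: Array[String] = "abcd To: blah".split("""\s+(?=(?:From|To|Date):)""")

  def main(args: Array[String]): Unit = {
    println(keepSpace.mkString("[", "|", "]")) // [abc |From: blah]
    println(trimmed.mkString("[", "|", "]"))   // [abc|From: blah]
    println(headers.mkString("[", "|", "]"))   // [abcd|To: blah]
  }
}
```

Note that the plain lookahead leaves the trailing space on "abc ", which is why the second variant moves the whitespace into the separator.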

1 Comment

Thank you. That was very helpful. I'm still a bit stumped as to why "abc From: def" is matched in its entirety. I'm expecting only the 'From:' substring to match, as the regex suggests. So shouldn't I get "abc" "From: def" or something like that upon splitting?
