
I'm currently working my way through a Scala project and converting it to Java. Everything's going fine, but I've stumbled on this snippet:

    Pattern fileNamePattern = Pattern.compile("^(\\w+).*(_\\w+\\.xml)$");


    new File(filePath).getName match {
        case FileNamePattern(first, last) => return first + last
        case n => return n
    }

I understand the regex: one or more word characters (letters, digits or underscores), followed by zero or more characters of any kind, followed by an underscore, one or more word characters, and a literal .xml. The purpose of this function is to get the file name from a file path, but that is really very straightforward in Java, so I would have thought the Scala developer wouldn't make it so needlessly complex.

The problem is, I don't want to march ahead and assume the developer was an idiot, when maybe they're trying to do something a little more clever, and my lack of experience with Scala is stopping me from seeing it. So could someone please explain:

  • The syntax with match
  • Where the hell first and last came from
  • The Java equivalent of this snippet, or documentation that leads to it

Here is the full Scala function:

    def getFileName(filePath: String): String = {

        if (filePath == null || filePath.trim.length == 0) {
            return filePath
        }

        val FileNamePattern = new Regex("^(\\w+).*(_\\w+\\.xml)$")

        new File(filePath).getName match {
            case FileNamePattern(first, last) => return first + last
            case n => return n
        }
    }

  • Minor thing but \w is most certainly not whitespace. Commented Aug 7, 2013 at 11:23
  • Lord, all this Scala is frying my brain. Apologies for the mistake. Commented Aug 7, 2013 at 11:26

1 Answer

The match construct performs pattern matching. The expression to the left of match is the value being matched, and the case clauses in the block to its right are the patterns it is tested against. The patterns are tried in the order they appear, and when one matches the value, the code after its => is executed.

The first and last variables in the match expression are bound by the pattern matching machinery when the pattern they appear in matches the value. Their values are the corresponding parts of the object being matched. In other words, they are implicitly declared by the pattern, and are properly initialized in the "consequence" clause after =>.

Classic example:

trait Expr

case class Const(val value: Int) extends Expr
case class Add(val left: Expr, val right: Expr) extends Expr

def evaluate(exp: Expr): Int = exp match {
    case Const(cv) => cv
    case Add(exp1, exp2) => evaluate(exp1) + evaluate(exp2)
    case _ => throw new IllegalArgumentException("did not understand: " + exp)
}
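
Since the goal is a port to Java, here is a rough sketch of what the match above amounts to when written by hand with instanceof checks. It is only an illustration: the wrapper class ExprDemo and the plain final fields are my own assumptions, not something from the Scala project.

```java
// Hypothetical Java rendering of the Scala Expr example.
// ExprDemo is an illustrative wrapper, not part of the original project.
final class ExprDemo {

    interface Expr {}

    static final class Const implements Expr {
        final int value;
        Const(int value) { this.value = value; }
    }

    static final class Add implements Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    static int evaluate(Expr exp) {
        // Each branch corresponds to one "case" clause, tested in order.
        if (exp instanceof Const) {
            // "cv" is what the Scala pattern Const(cv) binds implicitly.
            int cv = ((Const) exp).value;
            return cv;
        } else if (exp instanceof Add) {
            // exp1 and exp2 are what Add(exp1, exp2) binds.
            Add add = (Add) exp;
            Expr exp1 = add.left;
            Expr exp2 = add.right;
            return evaluate(exp1) + evaluate(exp2);
        } else {
            // Counterpart of "case _".
            throw new IllegalArgumentException("did not understand: " + exp);
        }
    }

    public static void main(String[] args) {
        Expr e = new Add(new Const(1), new Add(new Const(2), new Const(3)));
        System.out.println(evaluate(e)); // prints 6
    }
}
```

The casts and local variables are exactly the bookkeeping that the Scala patterns do for you.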

Scala's regular expression objects provide special support for pattern matching (via their unapply/unapplySeq method). In this case, if the regular expression matches the string returned by getName, the variables first and last are bound to the substrings captured by the first and second groups of the regular expression, respectively.

Java has no match-like language construct, so the equivalent Java code is longer and more tedious. It might look something like this:

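// fileNamePattern is the compiled Pattern from the question
final String name = new File(filePath).getName();
final Matcher matcher = fileNamePattern.matcher(name);

if (matcher.matches()) {
    // groups 1 and 2 play the role of first and last in the Scala match
    final String first = matcher.group(1);
    final String last = matcher.group(2);
    return first + last;
} else {
    return name;
}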
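
For completeness, here is a self-contained sketch of the whole getFileName function in Java, including the null/blank guard from the Scala original. The class name FileNames, the constant name FILE_NAME_PATTERN and the sample paths are my own choices for illustration.

```java
import java.io.File;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder class; only getFileName mirrors the Scala code.
final class FileNames {

    // Same regex as in the question, compiled once and reused.
    private static final Pattern FILE_NAME_PATTERN =
            Pattern.compile("^(\\w+).*(_\\w+\\.xml)$");

    static String getFileName(String filePath) {
        // Same guard as the Scala version: null or blank input is returned unchanged.
        if (filePath == null || filePath.trim().isEmpty()) {
            return filePath;
        }
        String name = new File(filePath).getName();
        Matcher matcher = FILE_NAME_PATTERN.matcher(name);
        // matches() requires the whole name to match the pattern.
        if (matcher.matches()) {
            return matcher.group(1) + matcher.group(2);
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(getFileName("/tmp/data-2013-08-07_v2.xml")); // prints data_v2.xml
        System.out.println(getFileName("/tmp/notes.txt"));              // prints notes.txt
    }
}
```

The first sample shows that the function does more than return the file name: when the name matches, everything between the leading run of word characters and the final _something.xml suffix is dropped.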
2 Comments

Good explanation, and functioning replacement code (aside from groups being private) +1
Oops. Good catch. I meant group, not groups obviously.
