scala - error: ')' expected but '(' found

Question

I'm new to Scala and cannot figure out what is causing this error. I have searched similar topics, but none of the suggestions worked for me. I have some simple code that should find the line with the most words in a README.md file. The code I wrote is:

    val readme = sc.textFile("/PATH/TO/README.md")
    readme.map(lambda line :len(line.split())).reduce(lambda a, b: a if (a > b) else b)

and the error is:

    Name: Compile Error
    Message: <console>:1: error: ')' expected but '(' found.
    readme.map(lambda line :len(line.split()) ).reduce( lambda a, b: a                 
    if (a > b) else b )        ^

    <console>:1: error: ';' expected but ')' found.
    readme.map(lambda line :len(line.split()) ).reduce( lambda a, b: a 
    if (a > b) else b )                       ^

It doesn't work because it has nothing in common with Scala syntax.
– simpadjo, Commented Nov 28, 2017 at 13:54
Not every language is Python. There are different languages, and Scala is one of them. If you want to use Scala, learn Scala syntax.
– SergGr, Commented Nov 28, 2017 at 20:52

Mike Allen · Accepted Answer · 2017-11-28 15:31:38Z

Your code isn't valid Scala.

I think what you might be trying to do is to determine the largest number of words on a single line in a README file using Spark. Is that right? If so, then you likely want something like this:

    val readme = sc.textFile("/PATH/TO/README.md")
    readme.map(_.split(' ').length).reduce(Math.max)

That last line uses some argument abbreviations. This alternative version is equivalent, but a little more explicit:

    readme.map(line => line.split(' ').length).reduce((a, b) => Math.max(a, b))

The map function converts an RDD of Strings (each line in the file) into an RDD of Ints (the number of words on each line, delimited, in this particular case, by spaces). The reduce function then returns the larger of its two arguments, which ultimately yields a single Int value: the largest number of words on any single line of the file.
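
To see the two steps in isolation, here is a minimal sketch that runs without Spark. It assumes a plain Seq[String] can stand in for the RDD (Seq offers map and reduce with the same shape); the object name MaxWordsSketch and the sample lines are made up for illustration.

```scala
// Sketch only: a local Seq stands in for the RDD (an assumption, not Spark code).
object MaxWordsSketch {
  // Hypothetical sample lines, standing in for the contents of README.md.
  val lines: Seq[String] = Seq(
    "# Apache Spark",
    "Spark is a fast and general cluster computing system",
    "Building Spark"
  )

  // Step 1: String => Int, the number of space-delimited pieces on each line.
  val wordCounts: Seq[Int] = lines.map(line => line.split(' ').length)

  // Step 2: repeatedly keep the larger of two counts until one value remains.
  val maxWords: Int = wordCounts.reduce((a, b) => Math.max(a, b))

  def main(args: Array[String]): Unit = {
    println(wordCounts)
    println(maxWords)
  }
}
```

Note that splitting on a single space counts pieces rather than words in the strict sense, so consecutive spaces or a leading "#" each contribute to the total.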

After re-reading your question, it seems that you might want to know the line with the most words, rather than how many words are present. That's a little trickier, but this should do the trick:

    readme.map(line => (line.split(' ').length, line)).reduce((a, b) => if (a._1 > b._1) a else b)._2

Now map creates an RDD of (Int, String) tuples, where the first value is the number of words on the line and the second is the line itself. reduce then retains whichever of its two tuple arguments has the larger integer value (._1 refers to the first element of the tuple). Since the result is a tuple, we then use ._2 to retrieve the corresponding line (the second element of the tuple).
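
The tuple version can be tried the same way. This is a sketch under the same assumption as before: a local Seq stands in for the RDD, and the name LongestLineSketch and the sample lines are hypothetical.

```scala
// Sketch only: a local Seq stands in for the RDD (an assumption, not Spark code).
object LongestLineSketch {
  // Hypothetical sample lines with distinct word counts.
  val lines: Seq[String] = Seq("one", "two words", "three words here")

  // Pair each line with its space-delimited word count: (count, line).
  val tagged: Seq[(Int, String)] =
    lines.map(line => (line.split(' ').length, line))

  // Keep the pair with the larger count (._1), then take the line itself (._2).
  val longest: String =
    tagged.reduce((a, b) => if (a._1 > b._1) a else b)._2

  def main(args: Array[String]): Unit =
    println(longest)
}
```

When two lines have the same count, the comparison a._1 > b._1 is false and b is kept, so which of the tied lines you get depends on the order in which pairs are combined.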

I'd recommend you read a good book on Scala, such as Programming in Scala, 3rd Edition, by Odersky, Spoon & Venners. There are also some tutorials and an overview of the language on the main Scala language site. Coursera also has some free Scala training courses that you might want to sign up for.

@KarlBielefeldt or readme.map(_.split(' ').length).max which uses a lot less memory/storage. I was just relating my answer to his question. ;-)
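
For comparison, here is how the .max shortcut looks on ordinary Scala collections. This is a sketch, not Spark code: the object name MaxSketch and the sample lines are hypothetical, and maxBy is the Scala collections method, so whether an equivalent exists on RDDs is something to check in the Spark API docs.

```scala
// Sketch only: plain Scala collections, used here to illustrate max and maxBy.
object MaxSketch {
  // Hypothetical sample lines.
  val lines: Seq[String] = Seq("alpha", "beta gamma delta", "epsilon zeta")

  // Largest word count: max replaces the explicit reduce(Math.max).
  val maxWords: Int = lines.map(_.split(' ').length).max

  // The line with the most words: maxBy compares lines by their word count.
  val longestLine: String = lines.maxBy(_.split(' ').length)

  def main(args: Array[String]): Unit = {
    println(maxWords)
    println(longestLine)
  }
}
```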