
I am trying to run a simple word count program with spark-submit and getting an exception.

Exception in thread "main" java.lang.BootstrapMethodError: java.lang.NoClassDefFoundError: scala/collection/mutable/ArraySeq$ofRef
    at SparkWordCount$.main(SparkWordCount.scala:18)

The code, starting at line 18, is:

val count = input.flatMap(line => line.split(" "))
    .map(word => (word, 1))
    .reduceByKey(_ + _)
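For reference, the same pipeline expressed on a plain Scala collection (no Spark needed; the sample input lines here are made up) produces the per-word counts the RDD version would:

```scala
// Plain-collection analogue of the RDD pipeline above, for sanity-checking the logic
val input = Seq("to be or not", "to be")      // hypothetical sample lines

val count = input
  .flatMap(line => line.split(" "))           // split each line into words
  .map(word => (word, 1))                     // pair each word with a count of 1
  .groupBy { case (word, _) => word }         // group the pairs by word
  .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // sum counts per word

// count contains: to -> 2, be -> 2, or -> 1, not -> 1
```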

My environment:

  • Windows 10
  • java version "1.8.0_221"
  • spark-shell shows : Spark version 2.4.4 (Using Scala version 2.11.12)
  • scala -version command shows Scala code runner version 2.13.1
    2.11 and 2.13 are not compatible, but scala -version isn't necessarily relevant. If you have a build.sbt file, what does it look like? Commented Sep 27, 2019 at 9:22
  • 1
    Oh ok, got it. I don't use build.sbt. I removed system Scala (v 2.13.1) and installed Scala v 2.11 with scala-2.11.12.msi. Now I compile with Scala 2.11.12 and spark-submit to Spark 2.4.4 which is using Scala 2.11.12. Now the program runs. Commented Sep 27, 2019 at 9:26
  • 1
    @user1575148 I would recommend using sbt; compiling a project manually will be a nightmare. Commented Sep 27, 2019 at 12:15

2 Answers


As stated in the comments, the solution is to develop with the same Scala version that the cluster's Spark build uses.
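For the setup in the question (Spark 2.4.4, which is built against Scala 2.11.12), a minimal build.sbt might look like the sketch below; the project name is a placeholder:

```scala
// build.sbt — pin the Scala version to the one bundled with the cluster's Spark
name := "spark-word-count"   // hypothetical project name
scalaVersion := "2.11.12"    // must match the Scala version of the Spark build

// %% appends the Scala binary version (_2.11) to the artifact name,
// so the Spark dependency always matches the compiler version.
// "provided" keeps Spark out of the packaged jar, since the cluster supplies it.
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.4" % "provided"
```

With this in place, sbt compiles the project with the same Scala version that spark-submit runs it under, which avoids the NoClassDefFoundError above.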




I had the same issue, and here is how I fixed it. I ran spark-shell in cmd (Windows), which worked because the SPARK_HOME environment variable is set in my system/user environment variables. The banner in that same cmd window shows both the Spark version and the Scala version. I then opened the build.sbt in the base directory of my Scala project and changed the Scala version to match the one reported by spark-shell, like this:

ThisBuild / scalaVersion := "2.12.15"

and I had these dependencies:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.3.4", // Spark Core
  "org.apache.spark" %% "spark-sql" % "3.3.4"   // Spark SQL (optional, if you need SQL support)
) 

3.3.4 is the same version as my Spark installation. Then I ran spark-submit again like this:

spark-submit --class "main.wordcountapp" --master "local[*]" "F:\path\to\base_directory\target\scala-2.12\myscalaproj_2.12-0.1.0-SNAPSHOT.jar" 

and it ran as expected. main.wordcountapp is the fully qualified name of the object that contains the main function, and

"F:\path\to\base_directory\target\scala-2.12\myscalaproj_2.12-0.1.0-SNAPSHOT.jar"

is the path of the generated jar file. To generate it, run sbt package inside the base directory. Hope it helps.

