
I'm running Spark 2.4.5 on my Mac. When I execute spark-submit --version, I get:

      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.5
      /_/

Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_242
Branch HEAD
Compiled by user centos on 2020-02-02T19:38:06Z
Revision cee4ecbb16917fa85f02c635925e2687400aa56b
Url https://gitbox.apache.org/repos/asf/spark.git
Type --help for more information.

Note it's using Scala version 2.11.12. However, my app uses 2.12.8, and this throws the well-known java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V error.
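One quick way to confirm which Scala version a Spark installation targets is to inspect the jar names shipped with it: the Scala binary version is embedded in them (e.g. spark-core_2.11-2.4.5.jar). A minimal sketch, assuming a standard $SPARK_HOME/jars layout:

```shell
# Find the spark-core jar and pull the Scala binary version out of its name.
# Assumes the usual naming scheme spark-core_<scala>-<spark>.jar.
jar=$(basename "$(ls "$SPARK_HOME"/jars/spark-core_* | head -n 1)")
scala_bin="${jar#spark-core_}"   # strip the "spark-core_" prefix
scala_bin="${scala_bin%%-*}"     # keep everything before the first "-"
echo "Spark jars were built for Scala $scala_bin"
```

If this prints 2.11 while your application is compiled against 2.12, the binary incompatibility above is expected.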

My question is how to make my Spark 2.4.5 use Scala 2.12, as indicated on their official website under the Download section: Spark 2.4.5 uses Scala 2.12.

I tried brew search apache-spark and got

==> Formulae
apache-spark ✔

and brew info apache-spark returned me

apache-spark: stable 2.4.5, HEAD
Engine for large-scale data processing
https://spark.apache.org/
/usr/local/Cellar/apache-spark/2.4.4 (1,188 files, 250.7MB) *
  Built from source on 2020-02-03 at 14:57:17
From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/apache-spark.rb
==> Dependencies
Required: openjdk ✔
==> Options
--HEAD
    Install HEAD version
==> Analytics
install: 7,150 (30 days), 15,180 (90 days), 64,459 (365 days)
install-on-request: 6,900 (30 days), 14,807 (90 days), 62,407 (365 days)
build-error: 0 (30 days)

Any advice is appreciated!

  • I don't know who voted down my question. If you think there is something to be improved, please kindly support your vote with a comment. Commented Feb 26, 2020 at 6:12

2 Answers


The Spark community provides older versions of Spark on this website; you can choose any version according to your OS. For Windows you can use the tgz file.

https://archive.apache.org/dist/spark/
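For Spark 2.4.5 the archive also carries a prebuilt Scala 2.12 package. A sketch of fetching and unpacking it on macOS/Linux; the exact package name used here is an assumption, so verify it against the archive's directory listing first:

```shell
SPARK_VERSION=2.4.5
# Package name is an assumption; confirm it in the archive directory listing.
PKG="spark-${SPARK_VERSION}-bin-without-hadoop-scala-2.12"
URL="https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/${PKG}.tgz"

curl -fLO "$URL"                  # download the tarball
tar -xzf "${PKG}.tgz"             # unpack it into the current directory
export SPARK_HOME="$PWD/$PKG"     # point SPARK_HOME at the new install
"$SPARK_HOME/bin/spark-submit" --version
```

After this, spark-submit --version should report Scala 2.12.x instead of 2.11.12.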



You can build any custom version of Spark locally.

  • Clone https://github.com/apache/spark locally
  • Update pom file, focusing on scala.version, hadoop.version, scala.binary.version, and artifactId in https://github.com/apache/spark/blob/master/pom.xml
  • mvn -DskipTests clean package (from their README)
  • After a successful build, collect the jars in assembly/target/scala-2.12/jars (the directory name matches the Scala version you built against), external/../target, and any other external jars you need, which may be in the provided scope of the jars you submit.
  • Create a new directory and export SPARK_HOME="/path/to/directory_name" so that https://github.com/apache/spark/blob/master/bin/spark-submit will detect it (see the source to see why)
  • Copy the jars into $SPARK_HOME/jars and make sure there are no conflicting jars
  • The bin/ scripts should be the same, but if needed, specifically reference those and possibly even unlink the brew ones if you no longer need them
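The steps above can be sketched as a script. Note that the Spark repo ships a helper, dev/change-scala-version.sh, which rewrites the Scala version across the POMs and is less error-prone than editing pom.xml by hand; the profile flag and paths below are assumptions to check against the "Building Spark" docs for your version:

```shell
git clone https://github.com/apache/spark.git
cd spark
git checkout v2.4.5                      # build the release tag, not master

# Switch the whole build to Scala 2.12, then compile (skipping tests).
./dev/change-scala-version.sh 2.12
./build/mvn -Pscala-2.12 -DskipTests clean package

# Stage the built jars into a fresh SPARK_HOME.
SCALA_BIN=2.12
JAR_DIR="assembly/target/scala-${SCALA_BIN}/jars"
mkdir -p "$HOME/spark-2.4.5-scala212/jars"
cp "$JAR_DIR"/*.jar "$HOME/spark-2.4.5-scala212/jars/"
export SPARK_HOME="$HOME/spark-2.4.5-scala212"
```

With SPARK_HOME exported, bin/spark-submit from the repo (or a copy of the bin/ scripts) will pick up the Scala 2.12 jars.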

1 Comment

Thank you, @Elinda. I think this solves the local environment setup really nicely. Would you be able to extend this from a CI/CD perspective? I'm planning to deploy the solution onto an EC2 instance where Spark would be installed via a command invoked by a Jenkins pipeline.
