
I tried to run a Spark job written in Scala on a YARN cluster, and ran into this error:

[!@#$% spark-1.0.0-bin-hadoop2]$ export HADOOP_CONF_DIR="/etc/hadoop/conf"
[!@#$% spark-1.0.0-bin-hadoop2]$ ./bin/spark-submit --class "SimpleAPP" \
>     --master yarn-client \
>     test_proj/target/scala-2.10/simple-project_2.10-0.1.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.ClassNotFoundException: SimpleAPP
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:289)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

And this is my sbt file:

[!@#$% test_proj]$ cat simple.sbt 
name := "Simple Project"

version := "0.1"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"

// We need to be able to write Avro in Parquet
// libraryDependencies += "com.twitter" % "parquet-avro" % "1.3.2"

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

This is my SimpleApp.scala program; it is the canonical example:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/home/myname/spark-1.0.0-bin-hadoop2/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
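
Note that SimpleApp is declared without a package statement, so the value passed to --class is just SimpleApp. If the object lived inside a package, --class would need the fully qualified name. A hypothetical sketch (the package name com.example is made up for illustration):

package com.example // hypothetical package, for illustration only

object SimpleApp {
  def main(args: Array[String]) {
    // same body as above; this version would be submitted with
    // --class "com.example.SimpleApp"
  }
}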

The output of sbt package is as follows:

[!@#$% test_proj]$ sbt package
[info] Set current project to Simple Project (in build file:/home/myname/spark-1.0.0-bin-hadoop2/test_proj/)
[info] Compiling 1 Scala source to /home/myname/spark-1.0.0-bin-hadoop2/test_proj/target/scala-2.10/classes...
[info] Packaging /home/myname/spark-1.0.0-bin-hadoop2/test_proj/target/scala-2.10/simple-project_2.10-0.1.jar ...
[info] Done packaging.
[success] Total time: 12 s, completed Mar 3, 2015 10:57:12 PM

As suggested, I did the following:

jar tf simple-project_2.10-0.1.jar | grep .class

Something like the following shows up:

SimpleApp$$anonfun$1.class
SimpleApp$.class
SimpleApp$$anonfun$2.class
SimpleApp.class
  • And your jar contains SimpleAPP? Is that the full namespace, also? Commented Mar 3, 2015 at 22:48
  • I have done sbt package. Commented Mar 3, 2015 at 23:03
  • Please see the updated question. It seems to have been added to the jar, right? Commented Mar 3, 2015 at 23:03
  • run an sbt clean package Commented Mar 4, 2015 at 15:59
  • I did sbt clean package, nothing changed... :( Commented Mar 4, 2015 at 16:19

1 Answer


Verify whether the class name SimpleAPP actually exists in the jar.

Do this:

jar tf simple-project_2.10-0.1.jar | grep .class

And check if the name of the class is right.
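
Class loading on the JVM is case-sensitive, so the name must match exactly. An exact grep makes a case mismatch obvious (using the jar from the question):

jar tf simple-project_2.10-0.1.jar | grep 'SimpleAPP'   # prints nothing: no such class
jar tf simple-project_2.10-0.1.jar | grep 'SimpleApp'   # prints SimpleApp.class and its companion classes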


1 Comment

You should say SimpleApp and not SimpleAPP. ./bin/spark-submit --class "SimpleApp"
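
That is, the corrected invocation, using the same jar and master as in the question, would be:

./bin/spark-submit --class "SimpleApp" \
    --master yarn-client \
    test_proj/target/scala-2.10/simple-project_2.10-0.1.jar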
