18

When running a Scala file that uses the Spark Dataset type, I get the following stack trace:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/Dataset
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
    at java.lang.Class.getMethod0(Class.java:3018)
    at java.lang.Class.getMethod(Class.java:1784)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:125)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.Dataset
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 6 more

I find this strange because I have the following import:

import org.apache.spark.sql._

Also, in my build.sbt I have the following added to libraryDependencies:

  "org.apache.spark" %% "spark-core" % "1.6.2" % "provided",
  "org.apache.spark" %% "spark-sql" % "1.6.2" % "provided",
2 Comments

  • How are you running this? If you are submitting to a cluster, is it possible that the Spark version is not correct there? Commented Jul 8, 2016 at 14:23
  • Where are you running this? You are excluding the Spark core and SQL libraries from your package in your build file. Commented Jul 8, 2016 at 14:27

4 Answers

46

If you are executing this standalone, you can try removing "provided" from your dependencies. The "provided" scope means you expect the dependencies to already be on the classpath when the application runs, so the Spark dependencies won't be included in your jar.
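
For example, a minimal sketch of the question's libraryDependencies with the "provided" scope dropped, so sbt keeps the Spark jars on the runtime classpath (versions taken from the question):

  // build.sbt: Spark is now resolved at run time as well as compile time
  libraryDependencies ++= Seq(
    "org.apache.spark" %% "spark-core" % "1.6.2",
    "org.apache.spark" %% "spark-sql"  % "1.6.2"
  )

Note that a jar built this way is larger and can clash with the Spark version already installed on a cluster, which is why "provided" is the usual choice for spark-submit deployments.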


1 Comment

I have a similar problem running on IntelliJ, but the error will not go away even after removing "provided". The error includes: Error: A JNI error has occurred, please check your installation and try again / Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/Dataset – Gakuo
10

In IntelliJ 2020.3.2 Community Edition, go to Run, then Edit Configurations. Finally, under Modify options, select 'Include dependencies with "Provided" scope'.


9

Select the checkbox 'Include dependencies with "Provided" scope' in Run/Debug Configurations.

[image of the dropdown and checkbox]

1 Comment

This worked for me, but I'm curious what it did and why it worked.
2

Your build.sbt file specifies that the Spark dependencies are provided on the application's classpath, but at run time they couldn't be located. If you're not running on a cluster, you can try removing "provided" from your build.sbt, or put the Spark dependencies on your classpath yourself.
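
If you need to keep "provided" for cluster deployments, one common sbt idiom (a sketch, not something from this answer; sbt 0.13-style syntax to match the versions above) is to add the provided dependencies back onto the classpath that sbt run uses:

  // build.sbt: let `sbt run` see "provided" dependencies locally,
  // while `sbt package` still leaves them out of the jar
  run in Compile := Defaults.runTask(
    fullClasspath in Compile,
    mainClass in (Compile, run),
    runner in (Compile, run)
  ).evaluated

This keeps the packaged jar slim for spark-submit while making local runs work.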

