
I'm following the documentation example "Example: Estimator, Transformer, and Param".

And I got this error message:

15/09/23 11:46:51 INFO BlockManagerMaster: Registered BlockManager Exception in thread "main" java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror; at SimpleApp$.main(hw.scala:75)

And line 75 is the call to sqlContext.createDataFrame():

import java.util.Random

import org.apache.log4j.Logger
import org.apache.log4j.Level

import scala.io.Source

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.rdd._


import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.param.ParamMap
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.recommendation.{ALS, Rating, MatrixFactorizationModel}
import org.apache.spark.sql.Row
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local[4]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    val training = sqlContext.createDataFrame(Seq(
      (1.0, Vectors.dense(0.0, 1.1, 0.1)),
      (0.0, Vectors.dense(2.0, 1.0, -1.0)),
      (0.0, Vectors.dense(2.0, 1.3, 1.0)),
      (1.0, Vectors.dense(0.0, 1.2, -0.5))
    )).toDF("label", "features")
  }
}

And my build.sbt is as follows:

lazy val root = (project in file(".")).
  settings(
    name := "hello",
    version := "1.0",
    scalaVersion := "2.11.4"
  )

libraryDependencies ++= {
    Seq(
        "org.apache.spark" %% "spark-core" % "1.4.1" % "provided",
        "org.apache.spark" %% "spark-sql" % "1.4.1" % "provided",
        "org.apache.spark" % "spark-hive_2.11" % "1.4.1",
        "org.apache.spark"  % "spark-mllib_2.11" % "1.4.1" % "provided",
        "org.apache.spark" %% "spark-streaming" % "1.4.1" % "provided",
        "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.4.1" % "provided"
    )
}

I searched around and found this post, which is very similar to my issue. I tried changing the Spark versions in my sbt settings (spark-mllib_2.11 to 2.10, and Spark 1.4.1 to 1.5.0), but that caused even more dependency conflicts.

My intuition is that it's some version problem, but I cannot figure it out myself. Could anyone please help? Thanks a lot.

  • You should add spark-sql to the dependencies. Commented Sep 23, 2015 at 20:05
  • @zero323 thanks, I'll add it and try it out. Commented Sep 23, 2015 at 20:10
  • @zero323 it did not work, still the same error. I'm thinking maybe it's because of "val sqlContext = new SQLContext(sc); val training = sqlContext.createDataFrame()", where I new the wrong class with no such method? Commented Sep 23, 2015 at 20:17
  • @keypoint: 1. Try Scala 2.10.2. 2. Then replace % "spark-mllib_2.11" with %% "spark-mllib". 3. Add spark-sql as zero323 suggested. Commented Sep 23, 2015 at 20:31
  • @zero323: It is an error at runtime, not a compilation error, so a missing import would have been noticed at compile time. Commented Sep 23, 2015 at 20:32

1 Answer


It's working now for me. Just for the record (referencing @MartinSenne's answer), what I did is as below:

  1. clear all compiled files under the "project" folder
  2. switch to Scala version 2.10.4 (previously 2.11.4)
  3. change spark-sql to: "org.apache.spark" %% "spark-sql" % "1.4.1" % "provided"
  4. change MLlib to: "org.apache.spark" %% "spark-mllib" % "1.4.1" % "provided"
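
For reference, the resulting build.sbt looks roughly like this (a sketch only; I've left out the hive/streaming/kinesis dependencies from my original build, and if you keep them they should also use %% so the Scala version stays consistent):

lazy val root = (project in file(".")).
  settings(
    name := "hello",
    version := "1.0",
    scalaVersion := "2.10.4"
  )

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.4.1" % "provided",
  "org.apache.spark" %% "spark-sql"   % "1.4.1" % "provided",
  "org.apache.spark" %% "spark-mllib" % "1.4.1" % "provided"
)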

@note:

  1. I've already started a Spark cluster, and I submit the jar to the master with "sh spark-submit /path_to_folder/hello/target/scala-2.10/hello_2.10-1.0.jar". Running it with "sbt run" will fail (presumably because the Spark dependencies are marked "provided" and so are not on the runtime classpath).
  2. when changing from Scala 2.11 to Scala 2.10, remember that the jar path and name also change, from "scala-2.11/hello_2.11-1.0.jar" to "scala-2.10/hello_2.10-1.0.jar". When I re-packaged everything, I forgot to update the jar name in the submit command, so I packaged "hello_2.10-1.0.jar" but was still submitting "hello_2.11-1.0.jar", which caused me extra problems...
  3. I tried both "val sqlContext = new org.apache.spark.sql.SQLContext(sc)" and "val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)"; both work with createDataFrame().
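
A minimal sketch of point 3, assuming sc is the SparkContext from the question and spark-hive is on the classpath for the HiveContext variant:

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.hive.HiveContext

// plain SQLContext
val sqlContext = new SQLContext(sc)

// or a HiveContext, which extends SQLContext
val hiveContext = new HiveContext(sc)

// createDataFrame() works the same way on either one
val training = sqlContext.createDataFrame(Seq(
  (1.0, Vectors.dense(0.0, 1.1, 0.1)),
  (0.0, Vectors.dense(2.0, 1.0, -1.0))
)).toDF("label", "features")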