
I took the following code from the Spark examples page and am trying to run it from Eclipse, but it does not even compile.

import org.apache.spark._
import org.apache.spark.SparkContext._

object DataFrameExample {

  def main(args: Array[String]) {

    case class Person(name: String, age: Int)

    val conf = new SparkConf().setAppName("wordCount"); //.setMaster("local")
    conf.setMaster("local");

    val sc = new SparkContext(conf)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)

    import sqlContext._
    import sqlContext.implicits._

    val people = sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt)).toDF()
    people.registerTempTable("people")

    val teenagers = sqlContext.sql("SELECT name, age FROM people WHERE age >= 13 AND age <= 19")

    // The results of SQL queries are DataFrames and support all the normal RDD operations.
    // The columns of a row in the result can be accessed by field index:
    teenagers.map(t => "Name: " + t(0)).collect().foreach(println)

    // or by field name:
    teenagers.map(t => "Name: " + t.getAs[String]("name")).collect().foreach(println)

    // row.getValuesMap[T] retrieves multiple columns at once into a Map[String, T]
    teenagers.map(_.getValuesMap[Any](List("name", "age"))).collect().foreach(println)
    // Map("name" -> "Justin", "age" -> 19)
  }
}

But then I got the following errors. Did I miss anything here? Thanks!

[Image: screenshot of the Eclipse error messages]

The same error as text, from IntelliJ:

Error:(18, 93) No TypeTag available for Person
    val people = sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt)).toDF()
    ^

  • I have added the error message as text (as shown by IntelliJ). Could you please add the exact message as text from Eclipse as well? Commented Dec 1, 2015 at 20:18

1 Answer


Move the definition of the case class Person out of main, to the top level of the file:

case class Person(name: String, age: Int)

object DataFrameExample {
  def main(args: Array[String]) {
    // [...]
  }
}

That definition must be outside of the method using it.

As for the reason, see the following quote:

2- Move case class outside of the method:

case class, by use of which you define the schema of the DataFrame, should be defined outside of the method needing it.

It references the Scala issue https://issues.scala-lang.org/browse/SI-6649
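
To illustrate why the placement matters, here is a minimal sketch without Spark. It assumes Scala 2 with scala-reflect on the classpath. TypeTagSketch and fieldNames are hypothetical names made up for this illustration, not Spark APIs. fieldNames asks for an implicit TypeTag, the same kind of evidence the error message reports as missing for Person, and uses it to list the fields of the case class:

```scala
import scala.reflect.runtime.universe._

// Defined at the top level, so the compiler can provide a TypeTag for it.
case class Person(name: String, age: Int)

object TypeTagSketch {

  // Hypothetical helper (not part of Spark): requires an implicit TypeTag[T]
  // and uses it to collect the names of the case class accessor methods.
  def fieldNames[T: TypeTag]: List[String] =
    typeOf[T].decls.collect {
      case m: MethodSymbol if m.isCaseAccessor => m.name.toString
    }.toList

  def main(args: Array[String]): Unit = {
    // Compiles because Person is declared outside of main.
    println(fieldNames[Person])
  }
}
```

With Person at the top level this compiles. Moving the case class into main is the situation that typically produces "No TypeTag available for Person", as in the question.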
