
First, I convert a CSV file to a Spark DataFrame using

val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("/usr/people.csv")

After that, when I type df and press return, I see

res30: org.apache.spark.sql.DataFrame = [name: string, age: string, gender: string, deptID: string, salary: string]
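Note that every column, including age and salary, comes back as string: spark-csv does not infer column types unless you ask it to (it has an "inferSchema" option), so numeric fields need an explicit cast. A minimal plain-Scala sketch (hypothetical data, no Spark required) of what that cast amounts to:

```scala
// Why the all-string schema matters: spark-csv reads raw text, so a numeric
// field like age arrives as a String and must be cast before arithmetic.
val row = "Alice,30,F,10,50000".split(",")  // hypothetical people.csv line
val age = row(1).toInt                      // cast the string field to Int
println(age + 1)                            // arithmetic works after the cast
```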

Then I use df.registerTempTable("people") to register df as a Spark SQL temporary table.

But when I then type people, expecting to get the table back, I get

<console>:33: error: not found: value people

Is it because people is a temporary table?

Thanks


2 Answers


When you register a temp table with the registerTempTable command, as you did, it is available only inside your SQLContext.

This means that the following is incorrect and will give you the error you are getting:

scala> people.show
<console>:33: error: not found: value people

To use the temp table, you'll need to go through your sqlContext. Example:

scala> sqlContext.sql("select * from people")

Note: df.registerTempTable("df") registers a temporary table named df corresponding to the DataFrame df you call the method on.

So persisting df won't persist the table but the DataFrame, even though the SQLContext will be using that DataFrame.
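To make the scoping concrete, here is a minimal plain-Scala sketch (no Spark; the Map is a stand-in for the SQLContext's temp-table catalog, and the strings are placeholders) of why the bare name people is not a value in the REPL:

```scala
import scala.collection.mutable

// registerTempTable conceptually adds a name -> DataFrame entry to a catalog
// owned by the SQLContext; the name "people" lives in that catalog, not in
// the Scala REPL's scope.
val catalog = mutable.Map.empty[String, String] // stand-in for the catalog
val df = "the DataFrame"                        // stand-in for the real DataFrame
catalog("people") = df                          // like df.registerTempTable("people")

println(catalog("people")) // resolved through the catalog, like sqlContext.sql
// println(people)         // would not compile: `people` is not a Scala value
```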




The above answer applies to Zeppelin too. If you want to println the data, you have to collect it back to the driver first, otherwise the output lands on the executors where you won't see it.

val querystrings = sqlContext.sql("""
    select visitorDMA, visitorIpAddress, visitorState, allRequestKV
    from {redacted}
    limit 1000""")

querystrings.collect.foreach(entry => println(entry.getString(3)))
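The collect-then-foreach pattern can be sketched in plain Scala (hypothetical row data standing in for the collected Rows):

```scala
// collect brings the rows back to the driver, so the subsequent foreach runs
// locally and its println output is actually visible in the console/notebook.
val rows = Seq(Seq("DMA", "1.2.3.4", "CA", "k=v")) // stand-in for collected Rows
rows.foreach(entry => println(entry(3)))           // prints "k=v" on the driver
```

Be careful with collect on large results: it pulls everything into driver memory, which is why the query above limits to 1000 rows.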

