I have the code below running in a Spark environment:

import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import sqlContext.implicits._
import java.util.Properties

val conf = new SparkConf().setAppName("test").setMaster("local").set("spark.driver.allowMultipleContexts", "true");
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.format("jdbc").option("url","jdbc:sqlserver://server_IP:port").option("databaseName","DB_name").option("driver","com.microsoft.sqlserver.jdbc.SQLServerDriver").option("dbtable","tbl").option("user","uid").option("password","pwd").load()

val df2 = df.sqlContext.sql("SELECT col1,col2 FROM tbl LIMIT 5")
exit()

When I execute the above code, I get the error "org.apache.spark.sql.AnalysisException: Table not found: tbl;". However, if I remove df2 and run the code, I can see the contents of the table tbl successfully. Is there anything I am doing wrong? I am using Spark 1.6.1, and I checked the documentation: as far as I can tell, I am following the syntax for running a SQL query through sqlContext given at https://spark.apache.org/docs/1.6.0/sql-programming-guide.html (see the "Running SQL Queries Programmatically" section).

The following is the only relevant part of the full error trace:

conf: org.apache.spark.SparkConf = org.apache.spark.SparkConf@5eea8854
sc: org.apache.spark.SparkContext = org.apache.spark.SparkContext@7790a6fb
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@a9f4621
df: org.apache.spark.sql.DataFrame = [col1: int, col2: string, col3: string, col4: string, col5: string, col6: string, col7: string, col8: string, col9: timestamp, col10: timestamp, col11: string, col12: string]
org.apache.spark.sql.AnalysisException: Table not found: tbl;

1 Answer

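The df in your code is a DataFrame. Loading it over JDBC does not register the name tbl with the sqlContext, which is why the SQL query cannot find a table called tbl.

If you want to do select operations, call them on the DataFrame directly, e.g. df.select().

If you want to run a query with sqlContext.sql(), you first have to register the DataFrame as a temporary table with df.registerTempTable(tableName: String).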
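
Both options can be sketched as follows. This is a minimal sketch written against the Spark 1.6 API, not the asker's exact setup: the object name TempTableSketch is hypothetical, and a small in-memory DataFrame with only col1 and col2 stands in for the JDBC-loaded table so the example runs without a database.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical object name, used only for this illustration.
object TempTableSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("temp-table-sketch").setMaster("local")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Assumption: an in-memory stand-in for the DataFrame that the
    // question loads over JDBC; only two of its columns are modelled.
    val df = sqlContext
      .createDataFrame(Seq((1, "a"), (2, "b"), (3, "c")))
      .toDF("col1", "col2")

    // Option 1: query through the DataFrame API, no table name involved.
    df.select("col1", "col2").show(5)

    // Option 2: register the DataFrame under a name first, so that
    // sqlContext.sql() can resolve that name in the FROM clause.
    df.registerTempTable("tbl")
    val df2 = sqlContext.sql("SELECT col1, col2 FROM tbl LIMIT 5")
    df2.show()

    sc.stop()
  }
}
```

The name passed to registerTempTable is the one the SQL text must use; it does not have to match the dbtable option of the JDBC source. On Spark 2.0 and later, registerTempTable is deprecated in favor of createOrReplaceTempView.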

1 Comment

Your suggestion works like a charm. I tried both options and both work for me. For the record, here is the syntax for future visitors to this post. For the first suggestion, df.select(), you just name the columns you want to see, e.g. df.select("col1", "col2").show(). To limit the number of rows displayed, pass that number to show(). For the second suggestion, I wrote df.registerTempTable("test") and then df.sqlContext.sql("select * from test").collect.foreach(println). It worked. Thanks a lot. We can close the thread now.
