I have set up a test Cassandra + Spark cluster. I am able to successfully query Cassandra from Spark if I do the following:
import org.apache.spark.sql.cassandra.CassandraSQLContext
import sqlContext.implicits._
val cc = new CassandraSQLContext(sc)
val dataframe = cc.sql("select * from my_cassandra_table")
dataframe.first
I would now like to query data from a Python web app. All the docs on the web seem to show how to use Spark's Python shell (where the context, 'sc', is implicitly provided).
I need to be able to run Spark SQL from an independent Python script, perhaps one which serves web pages.
I haven't found any docs on this, and got no help on the apache-spark IRC channel. Am I just thinking about this wrong? Are there other tools which provide Spark SQL to less technical users? I'm completely new to Spark.
spark-submit is just a convenience wrapper. As long as all settings are correct, it is not really required. What you see in the docs is a valid standalone application.
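As a rough illustration, a standalone PySpark script can build its own context instead of relying on the shell's implicit sc. This is only a minimal sketch: the app name, master URL, host names, keyspace, and table name are placeholders, and it assumes the spark-cassandra-connector is available on the classpath (e.g. supplied via --packages when the script is launched):

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Build the context explicitly instead of relying on the shell's implicit `sc`
conf = (SparkConf()
        .setAppName("cassandra-web-query")                           # placeholder app name
        .setMaster("spark://spark-master:7077")                      # placeholder master URL
        .set("spark.cassandra.connection.host", "cassandra-host"))   # placeholder Cassandra host

sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

# Load a Cassandra table as a DataFrame through the spark-cassandra-connector
df = (sqlContext.read
      .format("org.apache.spark.sql.cassandra")
      .options(keyspace="my_keyspace", table="my_cassandra_table")   # placeholder names
      .load())

# Register it so plain SQL can be run against it
df.registerTempTable("my_cassandra_table")
result = sqlContext.sql("SELECT * FROM my_cassandra_table")
print(result.first())

You can then launch this with spark-submit (passing the connector through --packages or --jars), or run it directly with python as long as SPARK_HOME and the connector are on the path. In a web app you would typically create the SparkContext once at startup and reuse it across requests, since building a context is expensive.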