
I have a Cassandra cluster with a co-located Spark cluster, and I can run the usual Spark jobs by compiling them, copying them over, and using the ./spark-submit script. I wrote a small job that accepts a SQL statement as a command-line argument and submits it to Spark as Spark SQL; Spark runs that SQL against Cassandra and writes the output to a CSV file.
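For context, a job along those lines might look like the sketch below (this uses the current SparkSession API and the spark-cassandra-connector; the host, keyspace, table, and output path are placeholders, not my actual values):

import org.apache.spark.sql.SparkSession

object SqlToCsv {
  def main(args: Array[String]): Unit = {
    val query = args(0) // the SQL statement passed on the command line

    val spark = SparkSession.builder()
      .appName("sql-to-csv")
      .config("spark.cassandra.connection.host", "192.168.1.17") // placeholder host
      .getOrCreate()

    // Register a Cassandra table so the ad-hoc SQL has something to query
    spark.read
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "testks", "table" -> "testtable"))
      .load()
      .createOrReplaceTempView("testtable")

    // Run the query and write the result out as CSV
    spark.sql(query).write.option("header", "true").csv("/tmp/query-output")

    spark.stop()
  }
}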

Now I feel like I'm going round in circles trying to figure out whether it's possible to query Cassandra via Spark SQL directly over a JDBC connection (e.g. from Squirrel SQL). The Spark SQL documentation says

Connect through JDBC or ODBC.

A server mode provides industry standard JDBC and ODBC connectivity for business intelligence tools.

The Spark SQL Programming Guide says

Spark SQL can also act as a distributed query engine using its JDBC/ODBC or command-line interface. In this mode, end-users or applications can interact with Spark SQL directly to run SQL queries, without the need to write any code.

So I can run the Thrift Server and submit SQL to it. But what I can't figure out is how to get the Thrift Server to connect to Cassandra. Do I simply pop the DataStax Cassandra Connector on the Thrift Server classpath? How do I tell the Thrift Server the IP and port of my Cassandra cluster? Has anyone done this already and can give me some pointers?

2 Answers


Configure these properties in the spark-defaults.conf file:

spark.cassandra.connection.host    192.168.1.17,192.168.1.19,192.168.1.21
# if you have configured security in your Cassandra cluster
spark.cassandra.auth.username   smb
spark.cassandra.auth.password   bigdata@123

Start your Thrift Server with the spark-cassandra-connector dependencies (plus any JDBC driver dependencies you need, e.g. mysql-connector) on the classpath, bound to a port that you can connect to via JDBC or Squirrel:

sbin/start-thriftserver.sh --hiveconf hive.server2.thrift.bind.host=192.168.1.17 --hiveconf hive.server2.thrift.port=10003 --jars <shade-jar>-0.0.1.jar --driver-class-path <shade-jar>-0.0.1.jar
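If you prefer not to edit spark-defaults.conf, the same Cassandra properties can be passed on the command line instead, e.g. --conf spark.cassandra.connection.host=192.168.1.17, since start-thriftserver.sh accepts the standard spark-submit options.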

To expose a Cassandra table, run a Spark SQL statement like:

CREATE TEMPORARY TABLE mytable USING org.apache.spark.sql.cassandra OPTIONS (cluster 'BDI Cassandra', keyspace 'testks', table 'testtable');
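Note that a TEMPORARY table is scoped to the JDBC session that created it, so run the CREATE statement and any follow-up queries (e.g. SELECT * FROM mytable LIMIT 10;) over the same connection.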

Comments

The "USING" could be the missing link I couldn't figure out - I'll give it a go and see if it works!
With the Thrift Server turned on, can I connect to it directly using a JDBC UI, e.g. Squirrel SQL? Do I need a specific client jar on my JDBC UI's classpath in order to connect to the Thrift Server? (A minimal client sketch follows these comments.)
You can connect with Beeline: spark/bin/beeline -u jdbc:hive2://192.168.1.14:10000
Sorry, I was travelling for most of December so have only just had a chance to try this out. Beeline connects perfectly! Adding the jars was annoying as there are a number of dependencies (cassandra-connector, guava, cassandra-core, etc) so I created a big shaded bundle with maven, and voila!
Now to see if I can connect to it from Squirrel SQL :)
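On the client-jar question above: Beeline itself talks to the Thrift Server through the Hive JDBC driver, so any JDBC client (Squirrel SQL included) should work once hive-jdbc and its transitive dependencies are on the client classpath. A minimal sketch, assuming the host and port from the answer above:

import java.sql.DriverManager

object ThriftJdbcClient {
  def main(args: Array[String]): Unit = {
    // Older hive-jdbc versions may need the driver registered explicitly
    Class.forName("org.apache.hive.jdbc.HiveDriver")

    // Host and port are the ones used in the start-thriftserver.sh example
    val conn = DriverManager.getConnection("jdbc:hive2://192.168.1.17:10003", "", "")
    try {
      val rs = conn.createStatement().executeQuery("SELECT * FROM mytable LIMIT 10")
      while (rs.next()) println(rs.getString(1))
    } finally {
      conn.close()
    }
  }
}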

Why don't you use the spark-cassandra-connector and cassandra-driver-core? Just add the dependencies, specify the host address/login in your Spark context, and then you can read from and write to Cassandra using SQL.
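A minimal sketch of that approach (the host, keyspace, and table names reuse the examples from the other answer; note this still runs inside a Spark application, which is what the comment below gets at):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("direct-cassandra")
  .config("spark.cassandra.connection.host", "192.168.1.17")
  .getOrCreate()

// Expose a Cassandra table to Spark SQL and query it
spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "testks", "table" -> "testtable"))
  .load()
  .createOrReplaceTempView("testtable")

spark.sql("SELECT count(*) FROM testtable").show()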

1 Comment

I have done this in a job jar, but I still need to use the spark-submit script to submit my SQL job to Spark, which requires command-line access. Is it possible to run SQL directly from a PC, connecting to Spark/Cassandra?
