
I Googled for a way to create a table with Databricks and Azure SQL Server and load data into that same table. I found some sample code online that seems pretty straightforward, but apparently there is an issue somewhere. Here is my code.

CREATE TABLE MyTable
USING org.apache.spark.sql.jdbc 
OPTIONS (
  url "jdbc:sqlserver://server_name_here.database.windows.net:1433;database = db_name_here",
  user "u_name",
  password "p_wd",
  dbtable "MyTable"
);

Now, here is my error.

Error in SQL statement: SQLServerException: Invalid object name 'MyTable'.

My password, unfortunately, has spaces in it. That could perhaps be the problem, but I don't think so.

Basically, I would like this to recursively loop through files in a folder and its sub-folders, and load every file whose name matches a string pattern, like 'ABC*', into a table. The blocker is that I need the file name loaded into a field as well. So, I want to load data from MANY files into 4 fields of actual data, plus 1 field that captures the file name; the file name is the only way I can distinguish the different data sets. Is this possible? Or, is this an exercise in futility?

2 Answers


My suggestion is to use the Azure SQL Spark library, as also mentioned in the documentation:

https://docs.databricks.com/spark/latest/data-sources/sql-databases-azure.html#connect-to-spark-using-this-library

Bulk copy is what you want to use for good performance. Just load your files into a DataFrame and bulk copy them to Azure SQL:

https://docs.databricks.com/data/data-sources/sql-databases-azure.html#bulk-copy-to-azure-sql-database-or-sql-server
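
For reference, a minimal sketch of what that looks like, reusing the placeholder server, database, and credential names from your question. Note that the destination table must already exist in SQL Server, which is likely why you got 'Invalid object name' — the Spark JDBC source maps onto an existing remote table rather than creating one.

import com.microsoft.azure.sqldb.spark.config.Config
import com.microsoft.azure.sqldb.spark.connect._

// Placeholder values copied from the question; adjust to your environment.
val bulkCopyConfig = Config(Map(
  "url"               -> "server_name_here.database.windows.net",
  "databaseName"      -> "db_name_here",
  "dbTable"           -> "dbo.MyTable",
  "user"              -> "u_name",
  "password"          -> "p_wd",
  "bulkCopyBatchSize" -> "2500",
  "bulkCopyTableLock" -> "true",
  "bulkCopyTimeout"   -> "600"
))

// myDF is whatever DataFrame you loaded from your files (hypothetical name).
myDF.bulkCopyToSqlDB(bulkCopyConfig)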

To read files from subfolders, the answer is here:

How to import multiple csv files in a single load?
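
The short version, as a sketch (assuming the files sit under a mounted path like /mnt/rawdata/ with a consistent folder depth): each * wildcard in the path descends one folder level, so a single load can pick up every matching file across subfolders.

// Hypothetical layout: /mnt/rawdata/<year>/<month>/<day>/client/
// Each * matches one folder level; ABC* filters on the file-name pattern.
val allFiles = spark.read
  .format("csv")
  .option("sep", "|")
  .option("inferSchema", "true")
  .option("header", "false")
  .load("/mnt/rawdata/*/*/*/client/ABC*.gz")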


1 Comment

I started with Spark and switched to SQL, only because I couldn't get Spark to do what I needed. I'll take a second look at that option. Thanks!!

I finally, finally, finally got this working.

val myDFCsv = spark.read.format("csv")
  .option("sep", "|")            // pipe-delimited files
  .option("inferSchema", "true") // let Spark infer the column types
  .option("header", "false")     // the files have no header row
  .load("mnt/rawdata/2019/01/01/client/ABC*.gz") // wildcard loads every matching file

myDFCsv.show()
myDFCsv.count()
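
And to also get the file name into a field (the original requirement), Spark's built-in input_file_name() function can tag each row with the path of the file it came from — a small addition to the above:

import org.apache.spark.sql.functions.input_file_name

// Adds a fifth column holding the full path of the source file for each row.
val myDFCsvWithFile = myDFCsv.withColumn("file_name", input_file_name())
myDFCsvWithFile.show()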

Thanks for the point in the right direction, mauridb!!

1 Comment

My pleasure :) :) :)
