We are porting some code from SSIS to Python. As part of this project I'm recreating some packages, but I'm running into issues with database access. I've managed to query the DB like this:
employees_table = (spark.read
.format("jdbc")
.option("url", "jdbc:sqlserver://dev.database.windows.net:1433;database=Employees;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;")
.option("query", query)
.option("user", username)
.option("password", password)
.load()
)
With this data I make API calls, and I want to write the results back into the DB, but everything I've tried raises errors. My latest attempt:
df.write.method("append") \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://dev.database.windows.net:1433;database=Employees;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;") \
    .option("v_SQL_Insert", query) \
    .option("user", username) \
    .option("password", password) \
    .option("query", v_SQL_Insert) \
    .save()
This gives me the error AttributeError: 'DataFrame' object has no attribute 'write'.
I get the same error using spark.write, and also if I create a DataFrame from scratch, populate it, and call its write method.
My question is: what is the best way to populate a DB table from Python? Should I build a DataFrame and save it to the DB, or construct a SQL INSERT command? And how do I actually send the data in?
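For reference, this is the shape of write call I've been aiming for, pieced together from the Spark JDBC docs. The spark session, username/password variables, the dbo.EmployeeResults table name, the employee_id/status columns, and the rows_from_api helper are all placeholders I made up for illustration:

```python
def rows_from_api(results):
    # Flatten hypothetical API response dicts into tuples suitable
    # for spark.createDataFrame(). The field names are placeholders.
    return [(r["employee_id"], r["status"]) for r in results]

def write_results(spark, rows, jdbc_url, username, password):
    # Build a real Spark DataFrame and append it to the target table.
    df = spark.createDataFrame(rows, ["employee_id", "status"])
    (df.write
        .format("jdbc")
        .mode("append")                            # .mode(), not .method()
        .option("url", jdbc_url)
        .option("dbtable", "dbo.EmployeeResults")  # a target table, not a query
        .option("user", username)
        .option("password", password)
        .save())
```

Even when I restructure my code along these lines, I hit the same AttributeError, so I'm not sure whether the object I'm calling .write on is actually a Spark DataFrame.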

