I am using the Crealytics Spark library to read an Excel Workbook into a Spark Dataframe using a Databricks Python notebook.

Hardcoding the options like this works fine:

df = (spark.read.format("com.crealytics.spark.excel")
     .option("useHeader","true")
     .option("dataAddress","'Sheet1'!")
     .load("/FileStore/tables/Test.xlsx"))

I would like to read a dynamic list of options from a table into a Python structure (such as a list or dict) and pass these to the DataFrame reader as varargs.

However, it fails even when trying to pass in just one option:

test = {"useHeader":"True"}

df = (spark.read.format("com.crealytics.spark.excel")
     .option(*test)
     .option("dataAddress","'Sheet'!")
     .load("/FileStore/tables/Test.xlsx"))

TypeError: option() takes exactly 3 arguments (2 given)

1 Answer

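Use options, not option. Unpacking a dict with a single * yields only its keys, so option(*test) passes one argument where option expects a key and a value, which is what the TypeError reports.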

options(**options)

Adds input options for the underlying data source.

As you can see from the signature, it takes keyword arguments, hence dictionary unpacking will be a valid way to provide these.
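A minimal sketch of how this could look in the notebook, assuming the options arrive as (key, value) pairs from the table. The helper names options_from_rows, read_excel and show_unpacking are hypothetical and only for illustration; spark is the session the Databricks notebook already provides.

```python
def options_from_rows(rows):
    """Build an options dict from (key, value) pairs.

    Assumption: the options table has exactly two columns, so each
    collected row unpacks into a key and a value.
    """
    return {str(key): str(value) for key, value in rows}


def read_excel(spark, path, options):
    """Read an Excel file, forwarding every dict entry to the reader.

    Hypothetical helper, not part of Spark or spark-excel.
    """
    return (
        spark.read.format("com.crealytics.spark.excel")
        .options(**options)  # each dict entry becomes one key=value option
        .load(path)
    )


def show_unpacking(*args, **kwargs):
    """Return what a callee receives, to compare * with ** unpacking."""
    return args, kwargs


test = options_from_rows([("useHeader", "true"), ("dataAddress", "'Sheet1'!")])

# A single star unpacks only the keys, so the values never reach the callee.
print(show_unpacking(*test))

# A double star unpacks key=value pairs, which is what options() accepts.
print(show_unpacking(**test))
```

In the notebook the call would then be df = read_excel(spark, "/FileStore/tables/Test.xlsx", test). Because everything goes through one dict, adding or removing an option only changes the table, not the reading code.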
