In Databricks, I uploaded my source data under a volume folder (e.g., /Volumes/my_catalog/my_schema/landing_source/). Now, I want to create a DataFrame or table using this volume path as the source.
1. Using Spark API:
I am able to successfully create a DataFrame using the spark.read API:
df = spark.read.format("csv") \
    .option("header", "true") \
    .load("/Volumes/my_catalog/my_schema/landing_source/")
2. Using SQL CREATE TABLE with datasource:
When I try to create a table directly using SQL with the volume path as the data source:
CREATE TABLE IF NOT EXISTS my_catalog.my_schema.my_table
USING CSV
OPTIONS (
  path "/Volumes/my_catalog/my_schema/landing_source/",
  header "true",
  inferSchema "true"
)
I get the following exception:
[RequestId=4800ec53-1a9d-946e-b61f-0ab7c93b88fc ErrorClass=INVALID_PARAMETER_VALUE.INVALID_PARAMETER_VALUE] Missing cloud file system scheme
JVM stacktrace:
com.databricks.sql.managedcatalog.UnityCatalogServiceException
  at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException(ErrorDetailsHandler.scala:111)
  at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException$(ErrorDetailsHandler.scala:66)
  at com.databricks.managedcatalog.ManagedCatalogClientImpl.wrapServiceException(ManagedCatalogClientImpl.scala:266)
  at com.databricks.managedcatalog.ManagedCatalogClientImpl.recordAndWrapExceptionBase(ManagedCatalogClientImpl.scala:7309)
  at com.databricks.managedcatalog.ManagedCatalogClientImpl.recordAndWrapException(ManagedCatalogClientImpl.scala:7295)
  at com.databricks.managedcatalog.ManagedCatalogClientImpl.generateTemporaryPathCredentials(ManagedCatalogClientImpl.scala:6277)
Currently I am using a Databricks trial account for my learning. How can I fix this issue?
Instead of passing the path in OPTIONS, try using the LOCATION parameter.
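A sketch of that suggestion, reusing the catalog, schema, and table names from the question (untested against a trial workspace; Unity Catalog external tables normally point at cloud storage registered as an external location, so a `/Volumes/` path may still be rejected as a LOCATION):

```sql
-- Suggested fix: move the path out of OPTIONS into LOCATION.
CREATE TABLE IF NOT EXISTS my_catalog.my_schema.my_table
USING CSV
OPTIONS (
  header "true",
  inferSchema "true"
)
LOCATION '/Volumes/my_catalog/my_schema/landing_source/';

-- If the volume path is still rejected as a table location, an alternative
-- is a CTAS that reads the volume files into a managed table:
CREATE TABLE IF NOT EXISTS my_catalog.my_schema.my_table
AS SELECT * FROM read_files(
  '/Volumes/my_catalog/my_schema/landing_source/',
  format => 'csv',
  header => true
);
```

The CTAS form sidesteps the "Missing cloud file system scheme" error because the resulting table is managed by Unity Catalog rather than an external table bound to the volume path.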