
I'm new to PySpark. I'm running PySpark on Databricks, and my data is stored in Azure Data Lake Storage (ADLS). I'm trying to read a CSV file from ADLS into a PySpark DataFrame, so I wrote the following code:

import pyspark
from pyspark import SparkContext 
from pyspark import SparkFiles

df = sqlContext.read.csv(SparkFiles.get("dbfs:mycsv path in ADSL/Data.csv"), 
   header=True, inferSchema= True)

But I'm getting the following error message:

Py4JJavaError: An error occurred while calling o389.csv.

Can you suggest how to rectify this error?

1 Answer


The SparkFiles class is intended for accessing files shipped as part of a Spark job (e.g. files added with SparkContext.addFile). If you just need to read a CSV file that is available on ADLS, use spark.read.csv directly with the path, like:

df = spark.read.csv("dbfs:mycsv path in ADSL/Data.csv", 
  header=True, inferSchema=True)

It's better not to use sqlContext; it's kept only for backward compatibility.
