I am unable to load a CSV file directly from Azure Blob Storage into an RDD using PySpark in a Jupyter Notebook.

I have read through nearly all of the answers to similar questions, but I haven't found specific instructions for what I am trying to do. I know I could load the data into the notebook with pandas, but then I would need to convert the pandas DataFrame into an RDD afterwards.

My ideal solution would look something like this, but this code gives me an error saying that it can't infer a schema for CSV.

#Load Data
source = <Blob SAS URL>
elog = spark.read.format("csv").option("inferSchema", "true").option("url",source).load()

I have also taken a look at this answer, "reading a csv file from azure blob storage with PySpark", but I am having trouble defining the correct path.
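On defining the path: the error above is likely because "url" is not a recognized CSV reader option, so load() is called without any path. Spark's Azure connector (hadoop-azure) addresses blobs with a wasbs:// path rather than the https:// SAS URL, and takes the SAS token through a separate configuration key. The sketch below only shows how those pieces could be derived from a SAS URL. The helper name wasbs_parts and the example URL are hypothetical, and it assumes the usual https://<account>.blob.core.windows.net/<container>/<blob>?<token> layout.

```python
from urllib.parse import urlsplit


def wasbs_parts(sas_url):
    """Split a Blob SAS URL into the pieces the hadoop-azure connector uses.

    Hypothetical helper for illustration; assumes the URL has the form
    https://<account>.blob.core.windows.net/<container>/<blob>?<sas-token>
    """
    parts = urlsplit(sas_url)
    host = parts.netloc  # <account>.blob.core.windows.net
    # The first path segment is the container, the rest is the blob name
    container, _, blob_path = parts.path.lstrip("/").partition("/")
    return {
        "path": f"wasbs://{container}@{host}/{blob_path}",
        "conf_key": f"fs.azure.sas.{container}.{host}",
        "sas_token": parts.query,
    }


# Placeholder URL, not a real account or token
example = wasbs_parts(
    "https://myaccount.blob.core.windows.net/mycontainer/logs/events.csv"
    "?sv=2018-03-28&sig=abc"
)
print(example["path"])
print(example["conf_key"])

# Possible usage, assuming the hadoop-azure and azure-storage jars are on
# the Spark classpath (depending on the setup, the key may have to be set
# on the Hadoop configuration instead of spark.conf):
# spark.conf.set(example["conf_key"], example["sas_token"])
# elog = spark.read.csv(example["path"], header=True, inferSchema=True)
# rdd = elog.rdd
```

Keeping the token out of the path and passing it through the configuration key is how the connector expects to receive it; whether this works unchanged depends on the Spark and Hadoop versions in the notebook environment.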

Thank you very much for your help!

  • Well, what do you get by removing the inferSchema option? Commented Apr 27, 2019 at 22:28
  • It says that it can't infer the Schema either way. Commented Apr 29, 2019 at 13:55
  • Have you tried manually defining one? Commented Apr 29, 2019 at 14:11
  • Not yet, I was hoping for a more flexible solution. But if that is the only way I can try it. Commented May 2, 2019 at 20:50

1 Answer

Here is sample code that uses pandas to read a blob URL with a SAS token and then converts the pandas DataFrame to a PySpark one.

First, get a pandas DataFrame by reading the blob URL.

import pandas as pd

source = '<a csv blob url with SAS token>'
df = pd.read_csv(source)
print(df)

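Then you can convert it to a PySpark DataFrame. If you need an RDD, as in the question, spark_df.rdd returns the underlying RDD of Row objects.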

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("testDataFrame").getOrCreate()
spark_df = spark.createDataFrame(df)
spark_df.show()

Or get the same result with the code below.

from pyspark.sql import SQLContext
from pyspark import SparkContext

# Reuse the running SparkContext if the notebook already has one
sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)
spark_df = sqlContext.createDataFrame(df)
spark_df.show()

Hope it helps.
