I am new to AWS Glue.

I am facing an issue converting a Glue DynamicFrame to a PySpark DataFrame.

Below is the crawler configuration I created for reading the CSV file:

glue_cityMapDB="csvDb"
glue_cityMapTbl="csv table"

datasource2 = glue_context.create_dynamic_frame.from_catalog(database = glue_cityMapDB, table_name = glue_cityMapTbl, transformation_ctx = "datasource2")

datasource2.show()

print("Show the data source2 city DF")
cityDF=datasource2.toDF()
cityDF.show()

Output:

I get output from the Glue DynamicFrame (datasource2.show()), but after converting it to a PySpark DataFrame I get the following error:

S3NativeFileSystem (S3NativeFileSystem.java:open(1208)) - Opening 's3://s3source/read/names.csv' for reading
2020-04-24 05:08:39,789 ERROR [Executor task launch worker for task

I would appreciate any help with this.

1 Answer

Make sure your files are UTF-8 encoded. You can check the encoding with file and convert it with iconv, or use a text editor such as Sublime Text.
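If you would rather do the check in Python, here is a minimal sketch using only the standard library. It assumes you have a local copy of the CSV (for example, downloaded from S3) and that you know the file's current encoding. The helper names is_utf8 and convert_to_utf8 are hypothetical, not part of Glue or Spark.

```python
from pathlib import Path


def is_utf8(path):
    """Return True if the entire file decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src, dst, source_encoding):
    """Re-encode src as UTF-8 and write it to dst.

    source_encoding is an assumption you must supply (e.g. "latin-1");
    this sketch does not try to detect it.
    """
    raw = Path(src).read_bytes()
    # Decode with the known source encoding, then write UTF-8 bytes;
    # working on bytes keeps the original line endings untouched.
    Path(dst).write_bytes(raw.decode(source_encoding).encode("utf-8"))
```

This reads the whole file into memory, which is fine for small lookup files but not for very large ones.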

You can also read the files as a dataframe using:

df = spark.read.csv('s3://s3source/read/names.csv')

Then convert it to a DynamicFrame using fromDF().
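Put together, the workaround might look like the sketch below. The function name csv_to_dynamic_frame and the frame name "city_dyf" are hypothetical, and header=True assumes the CSV has a header row, so drop it if yours does not. The awsglue import is done inside the function because that module is only available in the Glue runtime.

```python
def csv_to_dynamic_frame(spark, glue_context, path, name="city_dyf"):
    """Read a CSV with Spark, then wrap the result in a Glue DynamicFrame."""
    # awsglue only exists inside a Glue job or dev endpoint, so import lazily.
    from awsglue.dynamicframe import DynamicFrame

    # header=True is an assumption about the file layout;
    # encoding is stated explicitly to match the advice above.
    df = spark.read.csv(path, header=True, encoding="UTF-8")
    return DynamicFrame.fromDF(df, glue_context, name)


# Example call inside a Glue job (not executed here):
# city_dyf = csv_to_dynamic_frame(
#     glue_context.spark_session, glue_context, "s3://s3source/read/names.csv"
# )
```

Reading with Spark directly bypasses the crawler's catalog table, so the column names and types come from the file itself rather than from the Glue Data Catalog.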
