I want to use AWS Glue to convert some CSV data to ORC.
The ETL job I created generated the following PySpark script:

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "tests", table_name = "test_glue_csv", transformation_ctx = "datasource0")

applymapping1 = ApplyMapping.apply(frame = datasource0, mappings = [("id", "int", "id", "int"), ("val", "string", "val", "string")], transformation_ctx = "applymapping1")

resolvechoice2 = ResolveChoice.apply(frame = applymapping1, choice = "make_struct", transformation_ctx = "resolvechoice2")

dropnullfields3 = DropNullFields.apply(frame = resolvechoice2, transformation_ctx = "dropnullfields3")

datasink4 = glueContext.write_dynamic_frame.from_options(frame = dropnullfields3, connection_type = "s3", connection_options = {"path": "s3://glue/output"}, format = "orc", transformation_ctx = "datasink4")
job.commit()

It takes the CSV data (from the S3 location that the Athena table tests.test_glue_csv points to) and writes the ORC output to s3://glue/output/.

How can I add some SQL manipulations to this script?

Thanks

2 Answers

11

You should first create a temp view from your dynamic frame by converting it to a Spark DataFrame:

dyf.toDF().createOrReplaceTempView("view_dyf")

Here, dyf is your dynamic frame.

Then use your spark object (the SparkSession from glueContext.spark_session) to run SQL queries against the view:

sqlDF = spark.sql("select * from view_dyf")
sqlDF.show()
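
To write the query result back out with the Glue sink from the question, the DataFrame has to be turned into a dynamic frame again. Here is a rough sketch of how the two steps could be wrapped up. The helper names run_sql_on_dynamic_frame and to_dynamic_frame, the view name, and the example query are my own placeholders, not something Glue generates; DynamicFrame.fromDF is the awsglue call I am assuming for the conversion back.

```python
def run_sql_on_dynamic_frame(dyf, spark, query, view_name="view_dyf"):
    """Register a DynamicFrame as a temp view and run a Spark SQL query on it.

    Returns a Spark DataFrame, not a DynamicFrame.
    (Hypothetical helper, named for illustration only.)
    """
    df = dyf.toDF()                         # DynamicFrame -> Spark DataFrame
    df.createOrReplaceTempView(view_name)   # make it visible to spark.sql
    return spark.sql(query)


def to_dynamic_frame(df, glue_context, name):
    """Convert a Spark DataFrame back to a DynamicFrame for Glue sinks.

    (Hypothetical helper; assumes it runs inside a Glue job where awsglue exists.)
    """
    # Imported lazily so the file can still be loaded outside a Glue runtime.
    from awsglue.dynamicframe import DynamicFrame
    return DynamicFrame.fromDF(df, glue_context, name)


# Possible use in the generated script, between dropnullfields3 and datasink4
# (the query itself is only an example):
#
#   sql_df = run_sql_on_dynamic_frame(
#       dropnullfields3, spark, "select id, upper(val) as val from view_dyf")
#   sql_dyf = to_dynamic_frame(sql_df, glueContext, "sql_dyf")
#
# and then pass frame = sql_dyf to write_dynamic_frame.from_options.
```

The temp view only lives for the duration of the Spark session, so nothing is left behind in the Glue catalog.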

0

You can use toDF() to convert the dynamic frame to a Spark DataFrame:

df = datasource0.toDF()
df.printSchema()
