- save Spark DataFrame to S3 as CSV with GZip compression
      (df.write
          .option("header", True)
          .option("encoding", "UTF-8")
          .mode(mode)
          .csv(s3_uri, compression=compression))

- set the tag `Content-Encoding` to `gzip`
- execute the Postgres `aws_s3` extension to `COPY` from S3 into a table:

      SELECT aws_s3.table_import_from_s3(
          'public.mytable1',
          '',
          '(format csv, header true)',
          aws_commons.create_s3_uri(
              'my-bucket-1',
              'my/object/key/part-00000-...-1-c000.csv',
              'us-east-1'
          )
      );
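For reference, the import call in the last step can be assembled programmatically. A minimal sketch (the table, bucket, and region are the question's own examples; the part-file key and the helper name `build_s3_import_sql` are hypothetical — in real use, pass values as bind parameters through your DB driver rather than formatting them into the string):

```python
def build_s3_import_sql(table: str, bucket: str, key: str, region: str,
                        options: str = "(format csv, header true)") -> str:
    """Build the aws_s3.table_import_from_s3 statement for a Spark part file."""
    return (
        "SELECT aws_s3.table_import_from_s3(\n"
        f"    '{table}',\n"
        "    '',\n"
        f"    '{options}',\n"
        f"    aws_commons.create_s3_uri('{bucket}', '{key}', '{region}')\n"
        ");"
    )

sql = build_s3_import_sql("public.mytable1", "my-bucket-1",
                          "my/object/key/part-00000-c000.csv", "us-east-1")
print(sql)
```

The statement would then be executed against the RDS/Aurora instance with any Postgres driver.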
1 Answer
The AWS documentation gives incorrect instructions for setting the object metadata. If you set the metadata manually (as user-defined metadata), S3 simply stores your tag as an arbitrary string instead of recognizing `Content-Encoding` as a reserved key.
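Concretely, with boto3 the two ways of "setting `Content-Encoding`" on upload are different parameters, and only one of them produces the real header. A sketch (bucket and key names are hypothetical):

```python
# 1) User-defined metadata: S3 stores this under the x-amz-meta- prefix,
#    so clients see "x-amz-meta-content-encoding" -- an arbitrary string
#    that nothing interprets as a Content-Encoding header.
wrong_kwargs = {
    "Bucket": "my-bucket-1",
    "Key": "data.csv.gz",
    "Metadata": {"Content-Encoding": "gzip"},
}

# 2) System-defined metadata: passed as a dedicated parameter, S3 serves
#    it back as the actual Content-Encoding response header.
right_kwargs = {
    "Bucket": "my-bucket-1",
    "Key": "data.csv.gz",
    "ContentEncoding": "gzip",
}

# With boto3 these would be used as: s3.put_object(Body=..., **kwargs)
```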
The default (user-defined) metadata behavior causes an error:

Force system-defined metadata (rather than the default user-defined metadata):
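Since Spark writes the objects itself, one way to force the system-defined header afterwards is an in-place copy with boto3's `copy_object`, using `MetadataDirective="REPLACE"` (a sketch; the bucket and key are placeholders, and `copy_object` only works for objects up to 5 GB):

```python
def make_fix_encoding_kwargs(bucket: str, key: str) -> dict:
    """Arguments for s3.copy_object(...) that copy the object onto itself,
    replacing its metadata so Content-Encoding becomes a system-defined
    header instead of a user-defined x-amz-meta-* string."""
    return {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "ContentType": "text/csv",
        "ContentEncoding": "gzip",
        "MetadataDirective": "REPLACE",  # required to overwrite metadata in place
    }

# usage (needs AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.copy_object(**make_fix_encoding_kwargs("my-bucket-1", "my/key.csv"))
```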

This wasted hours and four people's time. Feedback has been submitted to the AWS docs team.