I've been trying to read a CSV file directly from AWS S3 into NumPy. I've used:

import boto3

s3 = boto3.client(service_name='s3')

def s3_read(filename):
    # Fetch the object and return its raw contents as bytes
    s3_obj = s3.get_object(Bucket='bucket-name', Key=filename)
    body = s3_obj['Body']
    return body.read()

in an attempt to pull the data, but I'm running into a formatting issue from AWS that I don't know how to handle.

When I print the data returned by that function, there is a weird header before the data:

b'{\n "name":"filename",\n "data":{\n "type":"Buffer",\n "data":[\n 114,\n 97,...]}}'

So there's a bunch of \n's and the weird header. Does this have something to do with the way I uploaded the file to AWS, or am I messing something up when reading the file?

1 Answer

body.read() returns bytes, and the bytes you printed are a JSON document rather than the raw CSV text.

import json
j = json.loads(s3_obj['Body'].read().decode('utf-8'))

decode turns the bytes into a string, and json.loads parses that string into a dictionary.
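To get from that dictionary to a NumPy array, one possible follow-up is sketched below. It assumes the object really has the shape printed in the question, with the file's byte values stored as a list of integers under data -> data (the same shape Node.js produces when a Buffer is serialized to JSON), and that the decoded bytes are comma-separated text. The helper name buffer_json_to_array and its skip_header parameter are made up for illustration; they are not part of boto3 or NumPy.

```python
import io
import json

import numpy as np


def buffer_json_to_array(raw, skip_header=0):
    """Turn the JSON wrapper returned by body.read() into a NumPy array.

    Assumes raw is UTF-8 JSON shaped like
    {"name": ..., "data": {"type": "Buffer", "data": [114, 97, ...]}}.
    """
    payload = json.loads(raw.decode('utf-8'))
    # The inner list holds the original file's bytes as integers 0-255.
    csv_bytes = bytes(payload['data']['data'])
    # Assumption: the original file is UTF-8 encoded, comma-separated text.
    csv_text = csv_bytes.decode('utf-8')
    return np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                         skip_header=skip_header)
```

With the function from the question, this would be called as buffer_json_to_array(s3_read(filename)). The first two byte values in the printed output, 114 and 97, are the letters "r" and "a", so the file probably starts with a text header row; if so, passing skip_header=1 would skip it.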
