
I have a requirement that my Lambda function creates a new CSV file, writes to it, and then uploads it to S3. I am using Python boto3 for this.

import csv
import boto3

with open('mycsv.csv', 'w', newline='') as f:
    thewriter = csv.writer(f)
    thewriter.writerow(['col1', 'col2', 'col3'])
    s3_client = boto3.client('s3')
    response = s3_client.upload_file('/tmp/mycsv.csv', 'my-bucket', 'myfolder/mycsv.csv')

Note that the file 'mycsv.csv' does not exist beforehand; I want to create it on the fly as part of the Lambda function. Is this even possible? I get the following error when the Lambda is triggered:

[Errno 30] Read-only file system: 'mycsv.csv'

2 Answers


On Lambda, the filesystem is read-only with the exception of the /tmp directory. When you open the file for writing, it needs to go to /tmp/mycsv.csv:

with open('/tmp/mycsv.csv', 'w', newline='') as f:
    thewriter = csv.writer(f)
    thewriter.writerow(['col1', 'col2', 'col3'])

s3_client = boto3.client('s3')
response = s3_client.upload_file('/tmp/mycsv.csv', 'my-bucket', 'myfolder/mycsv.csv')

You might also consider using Python's tempfile.NamedTemporaryFile, which will automatically write to /tmp and will delete the file once you exit the context manager block.
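A minimal sketch of that approach (bucket and key names are placeholders, matching the snippet above; note the file must be uploaded inside the with block, after a flush, because it is deleted on exit):

import csv
import tempfile

import boto3

# NamedTemporaryFile creates the file in the default temp directory,
# which on Lambda is /tmp, and removes it when the block exits.
with tempfile.NamedTemporaryFile(mode='w', newline='', suffix='.csv') as f:
    writer = csv.writer(f)
    writer.writerow(['col1', 'col2', 'col3'])
    f.flush()  # make sure the rows are on disk before uploading

    s3_client = boto3.client('s3')
    s3_client.upload_file(f.name, 'my-bucket', 'myfolder/mycsv.csv')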


2 Comments

Careful... the upload should be outside of the with block, otherwise the file is still open.
Thanks for pointing that out! If the upload is inside the with block, it does not work.

You can skip the intermediate file and process your data completely in memory. This has the advantage of being faster and lets you handle larger amounts of data: currently, Lambda offers only 512 MB of disk space in /tmp, but up to 3 GB of memory.

import csv
import io

import boto3

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['col1', 'col2', 'col3'])

# upload_fileobj expects a binary file-like object, so encode the CSV text
body = io.BytesIO(buffer.getvalue().encode('utf-8'))

s3_client = boto3.client('s3')
s3_client.upload_fileobj(body, 'my-bucket', 'my-folder/mycsv.csv')

Consider also compressing your CSV files. This will result in faster and cheaper transfers to/from S3.

import gzip

# ...

compressed = io.BytesIO(gzip.compress(buffer.getvalue().encode('utf-8')))

s3_client = boto3.client('s3')
s3_client.upload_fileobj(compressed, 'my-bucket', 'my-folder/mycsv.csv.gz')
