
I am writing a Python Lambda that reads data from a CSV into a DataFrame, manipulates the data, converts it back to a CSV, and then makes an API call with the new CSV.

I am running into an issue with pandas.read_csv: the call ends my Lambda's triggered execution with no errors.

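import os
import pandas as pd

testdic = {}  # maps each CSV file name to the directory it was found in
os.chdir('/tmp')
for root, dirs, files in os.walk('/tmp', topdown=True):
    for name in files:
        if '.csv' in name:
            testdic[name] = root
            print(os.path.isfile('/tmp/' + name))  # prints True
            print(os.path.isfile(name))            # prints True
            # Both attempts below end the Lambda without an error
            df = pd.read_csv(name)
            df = pd.read_csv('/tmp/' + name)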

Both os.path.isfile calls return True. I have tried both versions of read_csv; neither works, and each ends the Lambda prematurely without an error.

I have confirmed that the CSV is downloaded into the Lambda's /tmp directory, and I can read and print rows of the CSV from there. However, when I run df = pd.read_csv('/tmp/file.csv'), or change my directory to /tmp and run df = pd.read_csv('file.csv'), the Lambda ends with no error and never gets past that point in the code. I am using pandas 0.23.4 because that is the version I need, and the code works locally. Any suggestions would be helpful.

Expected result: the CSV is read into a DataFrame so I can manipulate it.

FIXED: I could not just use '/tmp/' + filename; I had to use os.path.join(root, filename). I also had to increase the timeout of my Lambda because of the file size.
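A minimal sketch of the path fix described above, using only the standard library. The helper name find_csv_paths is hypothetical and not part of the original code. It relies on the fact that os.walk descends into subdirectories, so root is the directory that actually contains name, while '/tmp/' + name only points at the right file when the file sits directly in /tmp.

```python
import os


def find_csv_paths(base_dir):
    """Return the full path of every .csv file found under base_dir."""
    csv_paths = []
    for root, dirs, files in os.walk(base_dir, topdown=True):
        for name in files:
            if name.lower().endswith('.csv'):
                # root is the directory that contains name; it may be a
                # subdirectory of base_dir, so join the two to get a
                # path that is valid from any working directory.
                csv_paths.append(os.path.join(root, name))
    return sorted(csv_paths)
```

In the Lambda, each returned path could then be passed straight to pd.read_csv(path), with no need for os.chdir.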

  • Use file_path = os.path.join(root, name) and then pd.read_csv(file_path)? Commented Jun 3, 2019 at 16:57
  • Why is chdir needed? It can be done directly without it. Commented Jun 3, 2019 at 17:01
  • I used chdir based on other Stack Overflow advice. Using os.path.join let the smaller file be read in, which showed me that the other issue was my timeout being too short. Thanks! Commented Jun 4, 2019 at 16:02
  • Did it work finally? Commented Jun 4, 2019 at 16:13
  • Yes, increasing the Lambda timeout and using os.path.join got it working. Thank you. Commented Jun 5, 2019 at 17:05

1 Answer

os.path.join builds the path with the correct separator, so it works across platforms.

Use

file_path = os.path.join(root, name)

and then

pd.read_csv(file_path)

NOTE: Also increase the AWS Lambda timeout, as suggested in the comments by @Gabe Maurer.
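As a rough sketch of how the timeout issue could be spotted earlier: a Lambda that reaches its configured timeout is stopped by the runtime, so no Python exception is raised in the function's own code, which would match the "ends with no error" symptom. The helper below is hypothetical (has_time_left and min_remaining_ms are illustrative names); it assumes only the get_remaining_time_in_millis() method that the Lambda context object provides to Python handlers.

```python
def has_time_left(context, min_remaining_ms=10000):
    """Return True if at least min_remaining_ms remain before the timeout.

    context is the second argument AWS Lambda passes to the handler.
    min_remaining_ms is an illustrative threshold, not a recommended value.
    """
    remaining_ms = context.get_remaining_time_in_millis()
    # Printing the value puts it in the function's log output, which helps
    # when the invocation later stops without a traceback.
    print('Remaining time before timeout: %d ms' % remaining_ms)
    return remaining_ms >= min_remaining_ms
```

Calling has_time_left(context) just before pd.read_csv would show in the logs how much time the invocation had left when the read started. If that number is small compared with how long the read takes locally, raising the timeout in the function configuration is the likely fix.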
