
I have a requirement where my Lambda functions need to generate a text file with logs in JSON format and upload it to S3; a Spectrum table sits on top of that S3 location and later processes run against it. I know there might be different ways of doing this with CloudWatch, but we have to stick with this approach because of the processes that run on the logs afterwards (the upload step is sketched at the end of the question). Now... the problem is, I got the code working fine on my machine, and when I test the function it works... only the first time. After some troubleshooting, here is a way to replicate it:

import json
import os
import platform
from urllib.request import urlopen
import logging
import sys
from datetime import datetime
from os import listdir
from os.path import isfile, join

def log_start(log_prefix):
    # Build a timestamp-based log id and a log file path under /tmp
    now = datetime.now()
    log_id = str(now).replace(':', '').replace(' ', '').replace('.', '').replace('-', '')[:14]
    log_name = '/tmp/{}_{}.txt'.format(log_prefix, log_id)

    root = logging.getLogger()
    # Remove any handlers left over from a previous invocation
    if root.handlers:
        for handler in root.handlers:
            root.removeHandler(handler)

    # Write JSON-formatted log lines to the file in /tmp
    logging.basicConfig(level=logging.INFO, filename=log_name, filemode='a+',
                        format='''{{"log_id":"{}", "created_date":"%(asctime)s.%(msecs)03d", "action_text":"%(message)s"}}'''.format(
                            log_id),
                        datefmt="%Y-%m-%dT%H:%M:%S")
    root = logging.getLogger()
    root.setLevel(logging.INFO)

    # Echo the same JSON lines to stdout as well
    handler = logging.StreamHandler(sys.stdout)
    handler.setLevel(logging.INFO)
    formatter = logging.Formatter(
        '''{{"log_id":"{}", "created_date":"%(asctime)s.%(msecs)03d", "action_text":"%(message)s"}}'''.format(
            log_id),
        datefmt="%Y-%m-%dT%H:%M:%S")
    handler.setFormatter(formatter)
    root.addHandler(handler)

    return log_name, log_id

def lambda_handler(event, context):
    
    log_name, log_id = log_start('log_prefix')
    logging.info('test')
    print(log_name)
    print(log_id)
    print(isfile(log_name))
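
For reference, each line the formatter writes to the file (and echoes to stdout) is a JSON object along these lines (the timestamp values here are illustrative):

{"log_id":"20200806001500", "created_date":"2020-08-06T00:15:00.123", "action_text":"test"}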

If you run this several times, you will notice that the last print statement prints True the first time and then False on every subsequent invocation. If I run it again in, say, an hour, it's the same thing: True first, then always False. Why is this happening? Why does it work only the first time?

EDIT: This can't be because /tmp is running out of space. The issue also happens if I mount an EFS drive and point the log file at the EFS drive.
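
For context, the upload that happens after the handler finishes is roughly the following (using boto3; the bucket name and key prefix here are placeholders, not the real ones):

import boto3

def upload_log(log_name, log_id):
    # Placeholder bucket and key -- the real values come from configuration
    bucket = 'my-log-bucket'
    key = 'lambda-logs/{}.txt'.format(log_id)
    boto3.client('s3').upload_file(log_name, bucket, key)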

  • Not sure, but it might be due to the fact that AWS Lambda functions only have 512 MB of storage available in /tmp/. It is generally a good idea to delete files created in /tmp/ before exiting the function. Commented Aug 6, 2020 at 0:16
  • @JohnRotenstein Hi! Please see my edit; I forgot to mention this also happens if I try to use an EFS drive instead of /tmp. Commented Aug 6, 2020 at 0:32
  • @rodrigocf Add a print at the start: print(os.listdir("/tmp")). You will see that no new log files are created; only the first one is. That is why you get False later on. Commented Aug 6, 2020 at 0:36
  • So either your log_start has incorrect logic and does not create any subsequent files, or this is some inherent behavior of the logging module. Commented Aug 6, 2020 at 0:37

1 Answer


After a lot of troubleshooting, I realized the problem was here:

if root.handlers:
    for handler in root.handlers:
        root.removeHandler(handler)

The reason is that removeHandler removes an item from root.handlers (a list) while the loop is iterating over that same list, so the iteration skips elements and not all handlers actually get removed. Because Lambda reuses the same Python process across invocations, a stale handler from the previous run is still attached the next time around, and logging.basicConfig then does nothing (it is a no-op when the root logger already has handlers), so the new log file is never created. The solution was to change it to:

if root.handlers:
    root.handlers = []

And problem solved. :)
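
To see why the original loop misbehaves, here is a small standalone sketch (separate from the Lambda code) that reproduces the skipped-handler behavior:

import logging
import sys

root = logging.getLogger()

# Simulate the state left behind by a previous invocation:
# a file handler (added by basicConfig) plus a stdout handler.
root.addHandler(logging.FileHandler('/tmp/old_log.txt', delay=True))
root.addHandler(logging.StreamHandler(sys.stdout))

# Buggy cleanup: removing items from root.handlers while iterating over it
# shifts the remaining items, so the second handler is skipped.
for handler in root.handlers:
    root.removeHandler(handler)
print(len(root.handlers))  # 1 -- the StreamHandler is still attached

# Safe cleanup: iterate over a copy (or simply assign root.handlers = []).
for handler in root.handlers[:]:
    root.removeHandler(handler)
print(len(root.handlers))  # 0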
