I am invoking a Lambda function synchronously through the Python SDK from a Jupyter notebook. The event I send takes longer to process than the maximum possible timeout limit (15 minutes).

I noticed that, after the timeout error, the event is sometimes (not always) re-sent to the Lambda. This keeps happening until I shut the Lambda down by setting its concurrency to 0. It never happens if I lower the timeout limit (e.g., to 10 minutes): the event is not re-sent, and the log shows a single invocation, a single error, and no activity afterwards.

What is going on? How do I rationalize these observations?

  • Hi, have you looked at setting a maximum retry allocation as well as configuration of a DLQ? More information here: aws.amazon.com/about-aws/whats-new/2019/11/… Commented Jun 3, 2020 at 17:04
  • @mokugo-devops Yes, of course. But that applies only to asynchronous invocations in my understanding. Commented Jun 3, 2020 at 17:06
  • Alternatively can you try looking at putting it into a step function and invoking that instead? Commented Jun 3, 2020 at 17:08
  • @mokugo-devops thanks, I will give it a try Commented Jun 3, 2020 at 17:09

2 Answers

I recommend turning on DEBUG-level logging and examining the CloudWatch logs when you see the function executed more than once. I've seen this occasionally, and when I do, I usually find log entries from the SDK code itself showing that its built-in retry logic is running. If the call to invoke the Lambda doesn't get a proper response, the SDK may retry it. It is possible that the service received the original request and executed it, yet something went wrong with the response, and so the caller re-attempted the call.
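For the client side of that advice, here is a minimal sketch, assuming the caller is the Python SDK (boto3/botocore) running in the notebook. botocore logs through the standard `logging` module under logger names starting with `botocore`, so raising that logger to DEBUG should surface the SDK's own retry messages. The log format string is an arbitrary choice.

```python
import logging

# Send log records to stderr; in a notebook they appear in the cell output.
# basicConfig does nothing if the root logger already has handlers.
logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")

# botocore's loggers are named "botocore.<module>", so setting the parent
# logger to DEBUG enables records from all of them, including retry handling.
logging.getLogger("botocore").setLevel(logging.DEBUG)
```

The output is verbose, so it is probably best enabled only while reproducing the duplicate invocation.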

Check out what is said at this link: https://aws.amazon.com/premiumsupport/knowledge-center/lambda-function-retry-timeout-sdk/

Note: API calls can take longer than expected when network connection issues occur. Network issues can also cause retries and duplicated API requests. To prepare for these occurrences, your Lambda function must always be idempotent.

If you make an API call using an AWS SDK and the call fails, the SDK automatically retries the call. How long and how many times the SDK retries is determined by settings that vary among each SDK.
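The idempotency requirement quoted above can be illustrated with a rough sketch. Everything here is hypothetical: the `request_id` field is assumed to be added to each event by the caller, `do_work` stands in for the real job, and the in-memory dictionary stands in for a durable store shared across invocations.

```python
# Hypothetical in-memory store. It only survives within one warm container;
# a real function would need a durable store shared across invocations.
_results = {}


def do_work(event):
    """Placeholder for the actual job."""
    return {"echo": event.get("payload")}


def handler(event, context=None):
    # "request_id" is an assumed field that the caller attaches to each event.
    key = event["request_id"]
    if key in _results:
        # Duplicate delivery: return the stored result instead of re-running.
        return _results[key]
    result = do_work(event)
    _results[key] = result
    return result
```

Note that this only deduplicates jobs that finished. For a job that times out, the key would have to be recorded before the work starts so that a retried event can be recognized and rejected.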

That article has tips for troubleshooting or changing config settings.

Also see https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
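As a sketch of what changing those settings might look like with boto3: the values below are assumptions, not recommendations from the article. The read timeout is set slightly above Lambda's 15-minute (900 s) maximum so the client waits longer than the function can run, and `max_attempts` is set to 0 so the SDK does not re-send the request. The helper names `make_lambda_client` and `invoke_sync` are made up for this example.

```python
import json

# Assumed values; tune them for your own workload.
CLIENT_SETTINGS = {
    "read_timeout": 910,             # seconds; just above the 900 s Lambda maximum
    "connect_timeout": 10,           # seconds
    "retries": {"max_attempts": 0},  # no SDK retries after the initial request
}


def make_lambda_client(region_name):
    # Imported here so the settings above can be inspected without boto3.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "lambda", region_name=region_name, config=Config(**CLIENT_SETTINGS)
    )


def invoke_sync(client, function_name, event):
    # RequestResponse is the synchronous invocation type.
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps(event),
    )
    return json.loads(response["Payload"].read())
```

Usage would be along the lines of `invoke_sync(make_lambda_client("us-east-1"), "my-function", {"key": "value"})`, where the region and function name are placeholders.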


2 Comments

Great info, thanks so much - investigating. I will leave the question open for the time being in case someone else contributes.
Are you still waiting for more answers on this one?
Try looking at Step Functions. With a state machine you can control the retry logic for the Lambda invocation and mark the execution as failed.

If your Lambda function is taking 15 minutes, determine whether you can break the work down into smaller Lambda functions and invoke each of them in turn.
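A minimal sketch of such a state machine definition, built as a Python dictionary and printed as Amazon States Language JSON. The function ARN, state names, and error text are placeholders. `MaxAttempts` of 0 tells Step Functions never to retry the task, and the `Catch` clause routes any error to a `Fail` state.

```python
import json

# Placeholder ARN; substitute your own function.
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:my-function"

definition = {
    "StartAt": "RunJob",
    "States": {
        "RunJob": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": FUNCTION_ARN, "Payload.$": "$"},
            # Never retry, whatever the error.
            "Retry": [{"ErrorEquals": ["States.ALL"], "MaxAttempts": 0}],
            # Any error ends the execution in the Fail state below.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "JobFailed"}],
            "End": True,
        },
        "JobFailed": {
            "Type": "Fail",
            "Error": "JobFailed",
            "Cause": "Lambda invocation failed",
        },
    },
}

print(json.dumps(definition, indent=2))
```

The printed JSON could then be used as the definition when creating the state machine.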
