
I have been going at this for a while, trying to get pandas, numpy, and pytz added to AWS Lambda as layers rather than having to zip them up with my .py file and upload the whole thing to AWS. I have followed multiple tutorials and all of them failed.

I have resorted to following this guide whenever I need pandas, numpy, or pytz for any functionality (AWS Lambda with Pandas and NumPy - Ruslan Korniichuk - Medium). That works, but I do not want to have to recreate a zip every time my function's needs change, especially as my company is growing. We are simply trying to automate some tasks with Lambda, using CloudWatch to run jobs periodically. Nothing spectacular, and I know there may be a route with S3 and other services. However, I have been able to successfully create layers for every other library except pandas, numpy, and pytz.

So, I am worried about scalability with this method. I am working on a Mac and I am not sure what else to do: I have tried using Docker, and I have tried building from wheels. Are there any viable tutorials that explain how to do this in detail?
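For context, my Docker-based attempt looks roughly like the following. This is only a sketch: the build image tag (public.ecr.aws/sam/build-python3.8) and the Python version are assumptions and need to match whatever runtime the function uses.

# Sketch of the Docker-based layer build I have been attempting; the image tag and
# Python version are assumptions. The build image runs Amazon Linux, so the compiled
# wheels for pandas/numpy match the Lambda environment instead of macOS.
docker run --rm -v "$PWD":/var/task -w /var/task public.ecr.aws/sam/build-python3.8 \
  /bin/sh -c "pip install pandas numpy pytz -t python/"

# Layers expect the packages under a top-level python/ directory inside the zip.
zip -r pandas-layer.zip python/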

Here are some of the tutorials I have tried. That does not mean I followed them perfectly, but I did not succeed in the end with most of them:

You name it and I might have already gone through the steps, especially in these articles, to complete this task. I have also read a lot of Stack Overflow question comments, which have been very helpful and insightful.

Thanks in advance for any advice, just here to learn!

3 Answers


This is probably not the answer you want to hear, but honestly the pain of getting certain compiled libraries into Lambda layers was enough for my company to just stop using them. Instead we tend to use either Fargate or ECS with Docker containers.

Besides the issues of compiling packages for Lambda, we also ran into major problems with the maximum size of Lambda deployments. We were regularly hitting that cap and having to get hackier and hackier about removing files to make everything fit.

Update: AWS now lets you run Lambdas from containers in ECR, which solves this problem nicely.
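A rough sketch of that container route, in case it helps (the repository name, account ID, region, role, and function name below are all placeholders):

# Sketch only: repository name, account ID, region, role, and function name are placeholders.
aws ecr create-repository --repository-name my-lambda-image

# Build the image (a Dockerfile based on an AWS Lambda Python base image) and push it to ECR.
docker build -t my-lambda-image .
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
docker tag my-lambda-image:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-lambda-image:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-lambda-image:latest

# Create the function from the image instead of a zip + layers.
aws lambda create-function \
  --function-name my-container-fn \
  --package-type Image \
  --code ImageUri=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-lambda-image:latest \
  --role arn:aws:iam::123456789012:role/my-lambda-execution-role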


2 Comments

I agree that this can be a bit of a pain, especially with GCC version requirements in Lambda. We did have to pin certain versions of our ML dependencies to ensure they worked with the Lambda runtime. IMHO, this is a failure of the ML libraries more than the Lambda runtime though.
From what I have researched, I fear this is correct. I have only been in the industry a month, and given how far I hope to take my role with my skill set, I am wary of hitting this ceiling with AWS. Thank you all for your insights.

You should not need to recompile the layer every time you deploy. We have a Lambda layer specifically for ML libraries like numpy, pandas, and fbprophet. It works great because our Lambda deployment zip files are tiny, which speeds up development and deployment.
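As a rough illustration of that setup (the function name and layer ARN are placeholders): the heavy libraries get published as a layer once, and each function just references the layer ARN, so the per-function zip contains only the handler code.

# Sketch only: function name and layer ARN are placeholders.
# The deployment zip contains only the handler; the ML libraries live in the layer.
zip function.zip handler.py
aws lambda update-function-code --function-name my-scheduled-job --zip-file fileb://function.zip
aws lambda update-function-configuration \
  --function-name my-scheduled-job \
  --layers arn:aws:lambda:us-east-1:123456789012:layer:ml-libs:3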

I'm happy to help further. Can you give more information about what you tried and what was going wrong?

1 Comment

I have layers up in Lambda for libraries such as pygsheets that I know I do not need to recompile. What is happening is that I need to use pandas and have not been able to successfully get it onto Lambda as a layer. I have successfully deployed it in a zip with my .py file. So far I believe I have only been able to 'layerize' pandas, and numpy then continuously comes up as a module that cannot be found (sometimes pytz will come up as well). I would have to look back at my notes to recreate the situation where pandas was successful, but that's as far as I have gotten.

In case anyone else stumbles across this post, there are now pre-built layers in AWS that you can access:

https://github.com/keithrozario/Klayers

For those of you who want to make your own (as I did), this one comment in the Docker packaging script of that repo solved my problem: "python (build with python3.6 or python3.7 not python3)".

That comment made it click for me: build with the same version of Python you are using in AWS. In my original Makefile for the Lambda layer, I was using python3 to build the packages, and that would always give me the same error when I tried running the Lambda:

Runtime.ImportModuleError Unable to import module 'function' .... Original error was: No module named 'numpy.core._multiarray_umath'

However when I switched to this:

python3.8 -m pip install -r requirements.txt -t "$(ARTIFACTS_DIR)/python"

which matches my AWS Lambda runtime, I was able to run numpy, pandas, and openpyxl without issue.
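For completeness, the rest of my layer build is roughly the following (a sketch: the layer name is a placeholder, and $ARTIFACTS_DIR stands in for the Makefile's $(ARTIFACTS_DIR)):

# Sketch only: layer name is a placeholder; ARTIFACTS_DIR is the same directory the
# pip command above installed into. Lambda expects the packages under a top-level
# python/ directory inside the layer zip.
(cd "$ARTIFACTS_DIR" && zip -r layer.zip python/)
aws lambda publish-layer-version \
  --layer-name pandas-numpy-openpyxl \
  --zip-file "fileb://$ARTIFACTS_DIR/layer.zip" \
  --compatible-runtimes python3.8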
