
I'm trying to use scrapy in an AWS Lambda function as a layer.

I used pip to install scrapy into my project directory:

pip install scrapy 

The directory layout matches that of the layers I already have working. I zipped the directory, uploaded it as a layer, and attached the layer to the Lambda function. Then I import scrapy:

import scrapy

and when I run the function I get this error:

{
  "errorMessage": "Unable to import module 'lambda_function'"
}

and, in the logs:

Unable to import module 'lambda_function': /opt/python/lxml/etree.so: invalid ELF header
  • scrapy uses lxml. lxml requires native code (etree.so). Not sure it can be done with Lambda. – Commented Mar 6, 2019 at 11:54

1 Answer


As the comment by @balderman suggests, you need native libraries for scrapy to run. This is very much doable; I'll try to explain it as simply as possible.

The binaries for scrapy have to be compiled in the same environment as a Lambda instance. Lambda boots on Amazon Linux.

You can either boot up an EC2 instance running Amazon Linux or use Docker; the easiest way is to start a Docker container:

$ sudo docker run -it amazonlinux bash
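The whole build can also be scripted instead of done interactively. Below is a minimal sketch, assuming Docker is available, the target runtime is Python 3, and that `build_layer.sh` and `scrapy-layer.zip` are hypothetical names of my choosing; the script is only written to a file and syntax-checked here, since actually running it requires Docker:

```shell
# Hypothetical helper script: builds scrapy (with its native .so
# dependencies) inside an amazonlinux container and writes
# scrapy-layer.zip back to the host working directory.
cat > build_layer.sh <<'EOF'
#!/bin/bash
set -e
sudo docker run --rm -v "$PWD":/out amazonlinux bash -c '
  yum -y -q install python3 python3-pip zip &&
  pip3 install scrapy -t /layer/python &&
  cd /layer && zip -qr /out/scrapy-layer.zip python
'
EOF
# Syntax-check the script without executing it.
bash -n build_layer.sh && echo "script syntax OK"
```

The `pip3 install -t /layer/python` target matters: Lambda layers for Python are unpacked so that a top-level `python/` folder lands on the import path.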

Now you need to download/unpack all the required .so files into a directory and zip it. Make sure all the .so files sit inside a folder called lib inside the zip. After zipping, the archive should look something like this:

.
├── lib
│   ├── libcrypto.so.10
│   ├── libcrypto.so.1.0.2k
│   ├── libfontconfig.so.1
│   ├── libfontconfig.so.1.7.0
.......
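To sanity-check that layout before uploading, you can rebuild the zip locally and list its contents. A sketch with placeholder .so files (the real ones come from the Amazon Linux container); `python3 -m zipfile` is used so no separate `zip` binary is needed:

```shell
# Recreate the expected layer layout with empty placeholder files,
# zip it, and confirm the .so files sit under lib/ in the archive.
mkdir -p layer/lib
touch layer/lib/libcrypto.so.10 layer/lib/libfontconfig.so.1
(cd layer && python3 -m zipfile -c ../layer.zip lib/)
# List archive contents; every entry should start with "lib/".
python3 -m zipfile -l layer.zip
```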

Then you can upload the zip as a layer. It will be unpacked under /opt/ in your Lambda container, and AWS looks for library files under /opt/lib, among other locations.

The challenging part is figuring out which .so files scrapy actually needs in order to run properly.
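One way to enumerate them is `ldd`, which prints the shared libraries a binary or compiled extension is linked against. A sketch: on the real layer you would point it at lxml's extension module (the `/opt/python/lxml/etree.so` from the error above); here `/bin/sh` stands in as a target that exists on any Linux box, since lxml may not be installed:

```shell
# ldd lists the shared-object dependencies of an ELF file.
# Against the real layer you would run something like:
#   ldd /opt/python/lxml/etree.so
# /bin/sh is used here only as a universally available stand-in.
ldd /bin/sh

# Extract just the resolved library paths -- these are the files
# you would copy into the layer's lib/ folder.
ldd /bin/sh | awk '/=>/ {print $3}'
```

Repeating this over every `.so` that pip installed (and over the libraries those depend on in turn) gives the full set to bundle.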
