I am trying to run a script on AWS that uses Selenium, specifically webdriver:

driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')

The function requires geckodriver in order to run. Geckodriver is included in the zip file I uploaded to AWS, but I have no idea how to get the function to access it there. Locally it's not an issue, because the binary is in my working directory and everything runs.

I get the following error message when running the function via serverless:

{
  "errorMessage": "Message: 'geckodriver' executable needs to be in PATH. \n",
  "errorType": "WebDriverException",
  "stackTrace": [
    ["/var/task/handler.py", 66, "main", "print(TatamiClearanceScrape())"],
    ["/var/task/handler.py", 28, "TatamiClearanceScrape", "driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')"],
    ["/var/task/selenium/webdriver/firefox/webdriver.py", 164, "__init__", "self.service.start()"],
    ["/var/task/selenium/webdriver/common/service.py", 83, "start", "os.path.basename(self.path), self.start_error_message)"]
  ]
}

Error --------------------------------------------------

Invoked function failed

Any help would be appreciated.

EDIT:

def TatamiClearanceScrape():
    options = Options()
    options.add_argument('--headless')

    page_link = 'https://www.tatamifightwear.com/collections/clearance'
    # this is the url that we've already determined is safe and legal to scrape from.
    page_response = requests.get(page_link, timeout=5)
    # here, we fetch the content from the url, using the requests library
    page_content = BeautifulSoup(page_response.content, "html.parser")

    driver = webdriver.Firefox(executable_path='numpy-test/geckodriver', options=options, service_log_path ='/dev/null')
    driver.get('https://www.tatamifightwear.com/collections/clearance')

    labtnx = driver.find_element_by_css_selector('a.btn.close')
    labtnx.click()
    time.sleep(10)
    labtn = driver.find_element_by_css_selector('div.padding')
    labtn.click()
    time.sleep(5)
    # wait(driver, 50).until(lambda x: len(driver.find_elements_by_css_selector("div.detailscontainer")) > 30)
    html = driver.page_source
    page_content = BeautifulSoup(html, "html.parser")  # name the parser explicitly, as in the first call
    # we use the html parser to parse the url content and store it in a variable.
    textContent = []

    tags = page_content.findAll("a", class_="product-title")

    product_title = page_content.findAll(attrs={'class': "product-title"})  # allocates all product titles from site

    old_price = page_content.findAll(attrs={'class': "old-price"})

    new_price = page_content.findAll(attrs={'class': "special-price"})

    products = []
    for i in range(len(product_title) - 2):
        #  groups all products together in list of dictionaries, with name, old price and new price
        # named `product` so the built-in `object` is not shadowed
        product = {"Product Name": product_title[i].get_text(strip=True),
                   "Old Price:": old_price[i].get_text(strip=True),
                   "New Price": new_price[i].get_text(), 'date': str(datetime.datetime.now())
                   }
        products.append(product)



    return products
  • Can you post the lambda function code that tries to launch selenium? Commented Jan 20, 2019 at 19:09
  • Is the problem that the executable can't be accessed? Commented Jan 20, 2019 at 19:14
  • If it is, do you have paths in the S3 buckets? Commented Jan 20, 2019 at 19:16
  • Have you tried passing an absolute path as the executable_path argument? Commented Jan 20, 2019 at 19:25
  • I did consider that. I'll give it a go. Thanks. Commented Jan 20, 2019 at 19:29

1 Answer


You might want to have a look at AWS Lambda Layers for this. With Layers, your Lambda function can use libraries and other dependencies without including them in its deployment package. That way you do not have to upload the dependencies again on every change to your code; you just create an additional layer that contains all the required packages.

See the AWS Lambda Layers documentation for more details.
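Whichever way the binary is shipped, it may help to pass Selenium an absolute path, as suggested in the comments. Below is a minimal sketch of that idea, not a tested Lambda deployment. `resolve_driver_path` is a hypothetical helper (not part of Selenium or AWS), and the search directories are assumptions: layer contents are extracted under `/opt`, and the deployment package lives under `/var/task`, as the stack trace in the question shows. It also assumes the geckodriver binary is a Linux build and keeps its execute permission inside the zip.

```python
import os


def resolve_driver_path(name="geckodriver", search_dirs=None):
    """Return the absolute path of the first executable file called `name`.

    Hypothetical helper for illustration; the default directories are
    assumptions about where the binary was packaged.
    """
    if search_dirs is None:
        search_dirs = [
            "/opt/bin",  # assumed location inside a Lambda layer
            "/opt",      # root of extracted layer contents
            os.environ.get("LAMBDA_TASK_ROOT", "/var/task"),  # deployment package
            os.getcwd(),  # local runs
        ]
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        # The file must exist and be executable, otherwise Selenium cannot start it.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.abspath(candidate)
    raise FileNotFoundError(
        "%s not found as an executable in any of: %s" % (name, search_dirs)
    )


# Usage inside the handler, keeping the call from the question:
# driver = webdriver.Firefox(executable_path=resolve_driver_path(),
#                            options=options, service_log_path='/dev/null')
```

If the helper raises `FileNotFoundError`, the binary is either not where the package or layer was expected to put it, or it lost its execute bit when it was zipped; both would produce the "executable needs to be in PATH" message from the question.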
