
The website is https://www.jao.eu/auctions#/

It has an 'OUT AREA' dropdown (I see a lot of ReactSelect... in the markup).

I need to get the full list of items contained in that dropdown [AT, BDL-GB, BDL-NL, BE...].

Can you please help me? This is what I have so far:

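from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()  # assumption: the original does not show which driver is used
wait = WebDriverWait(driver, 20)
driver.get('https://www.jao.eu/auctions#/')

first = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.css-1739xgv-control')))
first.click()
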
second = wait.until(......

2 Answers

2

Logging one's network traffic reveals that the page makes several requests to REST APIs. One endpoint, getcorridors, returns JSON containing all the values from the dropdown(s). All you need to do is imitate that HTTP POST request. No Selenium required:

import sys
from operator import itemgetter

import requests


def get_corridors():
    url = "https://www.jao.eu/api/v1/auction/calls/getcorridors"

    headers = {
        "Accept": "application/json",
        "Accept-Encoding": "gzip, deflate",
        "Content-Type": "application/json",
        "User-Agent": "Mozilla/5.0"
    }

    # The endpoint expects a POST with an empty JSON object as the body
    response = requests.post(url, headers=headers, json={})
    response.raise_for_status()

    # Each item in the JSON list carries the dropdown entry under "value"
    return list(map(itemgetter("value"), response.json()))


def main():
    for corridor in get_corridors():
        print(corridor)

    return 0


if __name__ == "__main__":
    sys.exit(main())

Output:

IT-CH
HU-SK
ES-PT
FR-IT
SK-CZ
NL-DK
IT-FR
HU-HR
FR-ES
IT-GR
CZ-AT
DK-NL
SI-AT
CH-DE
...
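
If a sorted list without duplicates is preferred (the question lists the dropdown alphabetically), the parsing step can be kept separate from the HTTP call. This is only a minimal sketch: it assumes the JSON body is a list of objects with a "value" key, as the code above implies, and the name extract_values and the sample payload are hypothetical, not part of the site's API.

```python
from typing import Any, Iterable, List


def extract_values(payload: Iterable[Any]) -> List[str]:
    """Return the sorted, unique "value" strings found in payload."""
    values = set()
    for item in payload:
        # Skip entries that are not dicts or that lack a string "value".
        if isinstance(item, dict) and isinstance(item.get("value"), str):
            values.add(item["value"])
    return sorted(values)


# Hypothetical sample shaped like the response described above.
sample = [
    {"value": "IT-CH"},
    {"value": "HU-SK"},
    {"value": "IT-CH"},          # duplicate entry
    {"label": "no value key"},   # ignored
]
print(extract_values(sample))  # ['HU-SK', 'IT-CH']
```

With the function above, extract_values(response.json()) could replace the final return line of get_corridors().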

1

Try the following to fetch the required list of items from that site using the requests module:

import requests

link = 'https://www.jao.eu/api/v1/auction/calls/getcorridors'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
    res = s.post(link, json={})
    items = [item['value'] for item in res.json()]
    print(items)

The output looks like this (truncated):

'IT-CH', 'HU-SK', 'ES-PT', 'FR-IT', 'SK-CZ', 'NL-DK', 'IT-FR', 'HU-HR'
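
For repeated use, it may help to add a timeout and fail loudly on an HTTP error or an unexpected payload. The following is a rough sketch rather than a definitive implementation: fetch_corridors, CORRIDORS_URL and the 10-second default are hypothetical choices, and the check that the body is a JSON list is an assumption based on how the snippet above iterates over res.json().

```python
from typing import Any, List

# URL taken from the snippet above.
CORRIDORS_URL = "https://www.jao.eu/api/v1/auction/calls/getcorridors"


def fetch_corridors(session: Any, url: str = CORRIDORS_URL,
                    timeout: float = 10.0) -> List[str]:
    """POST an empty JSON body via session and return each item's "value"."""
    # session is expected to behave like requests.Session (assumption).
    response = session.post(url, json={}, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    payload = response.json()
    if not isinstance(payload, list):
        raise ValueError(f"expected a JSON list, got {type(payload).__name__}")
    return [item["value"] for item in payload]
```

It would be called as fetch_corridors(s) inside the with requests.Session() as s: block above, after setting the User-Agent header on the session. Passing the session in keeps the function easy to exercise with a stand-in object that has a post method.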

3 Comments

@paul-m Thank you, this is awesome. But I also need to scrape the tables shown at jao.eu/auctions# after I select OUT AREA, IN AREA, TYPE, AUCTION ID, etc. Can you please help me?
That is beyond the scope of the question you asked in the first place. Please create another post describing your new requirement. Thanks.
Thank you. Please check my next post in this regard; I need your help.
