I have been practicing my web-scraping skills recently and came across this fantastic piece by Fábio Neves: If you like to travel, let Python help you scrape the best cheap flights!

Instead of scraping the 'Kayak' site like Fábio, I decided to try and create a bot which would scrape the Ryanair site.

My approach:

I take the user's input for their 'airport of departure'. I then select the 'From' text box, which prompts a dropdown list to appear. This dropdown list contains 234 locations.

Ryanair Filter Options

city_from = input('From which city? ')  # Takes the user's input

The next step I tried to implement was to match the user's input against the options in the dropdown list, and then click the matching option.

elements_list = driver.find_elements_by_xpath('//div [@class="core-list-ref"]') ##Finds all Elements/Cities in the dropdown list

list_pos = [value.text for value in elements_list].index(city_from)  # Reads the text (city name) of each element in the dropdown list and locates the position of the entered 'airport of departure' in the list.

elements_list[list_pos].click() #I then try to select this option.

However...

It seems that not all 234 cities appear when I use the following code:

driver.find_elements_by_xpath('//div [@class="core-list-ref"]')

Only the first 79 appear (Aalborg to Genoa); the other cities seem to be 'hidden'. I have found that when I manually scroll down to the bottom of the dropdown list and re-run the code, they appear. So I then tried to use .move_to_element(element) to make the bot scroll down to the last airport in the dropdown list, but this still only lets me scroll as far as the 79th airport (Genoa). This makes my bot crash when the user inputs an airport like 'Zurich'.

This is my first attempt at scraping. How can I overcome this issue, or is there a better way to select an 'airport of departure'? Please let me know if you need any more details.

2 Answers

Please find the solution below:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.action_chains import ActionChains


driver = webdriver.Chrome(executable_path=r"C:\New folder\chromedriver.exe")
driver.maximize_window()
wait = WebDriverWait(driver, 20)
driver.get("https://www.ryanair.com/ie/en/cheap-flights/?from=DUB&out-from-date=2020-03-31&out-to-date=2021-03-31&budget=150")
inputBox = wait.until(EC.element_to_be_clickable((By.XPATH, "//div[@name='departureInput']//div[@class='disabled-overlay']")))

actionChains = ActionChains(driver)
actionChains.move_to_element(inputBox).click().perform()

airport_lists = wait.until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='core-list']")))

# Print the text of every matching list container
for element in airport_lists:
    print(element.text)

Output: [screenshot not available]

If you scroll all the way down the From list, you will see that 256 elements match the XPath from the question, //div [@class="core-list-ref"], and only 253 of these are unique airports (look closely in the dev console and you will see what I mean). To deal with this, and for the sake of creativity, here is a different angle on the problem: get all airports from the Map View.

WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.XPATH,"//*[@icon-id='glyphs.earth']")))
driver.find_element_by_xpath("//*[@icon-id='glyphs.earth']").click() #click Map View

WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.CLASS_NAME,"airports")))
airports_root=driver.find_element_by_class_name('airports')
airport_tags=airports_root.find_elements_by_tag_name('text')

airport_names=[]
for airport in airport_tags:
    airport_names.append(airport.get_property('innerHTML'))

Note that these are airport names, which are not necessarily the same as city names. For example, type in "Murcia" and it will autocomplete to "Murcia International".
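Since the dropdown returned a few duplicate entries, it may be worth cleaning the scraped names before using them. Below is a minimal sketch in plain Python; the helper name clean_airport_names and the sample list are made up for illustration, and it assumes the scraped values are ordinary strings that may carry stray whitespace.

```python
def clean_airport_names(raw_names):
    """Strip whitespace, drop blanks, and remove duplicates while keeping order."""
    seen = set()
    cleaned = []
    for name in raw_names:
        name = name.strip()
        key = name.casefold()  # compare case-insensitively
        if name and key not in seen:
            seen.add(key)
            cleaned.append(name)
    return cleaned


# Hypothetical sample standing in for the scraped airport_names list
sample = ["Aalborg", " Genoa ", "", "Aalborg", "Zurich"]
print(clean_airport_names(sample))  # ['Aalborg', 'Genoa', 'Zurich']
```

You could then call clean_airport_names(airport_names) on the list built above before comparing it with the user's input.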

You can compare your user input against this list (avoid == due to the note above amongst other reasons) to ensure it's valid, and enter it in the From/To fields. Note the code below does not include the data validation check:

#From
valid_city_from = input('From which city? ')
departure=driver.find_element_by_xpath("//div[@name='departureInput']//div[@class='disabled-wrap']/input")
driver.execute_script("arguments[0].value = arguments[1];", departure, valid_city_from)  # pass the value as an argument so quotes in the name cannot break the script

# To
valid_city_to = input('To which city? ')
destination=driver.find_element_by_xpath("//div[@name='destinationInput']//div[@class='disabled-wrap']/input")
driver.execute_script("arguments[0].value = arguments[1];", destination, valid_city_to)  # pass the value as an argument so quotes in the name cannot break the script
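The validation step left out of the snippet above might look something like the following. This is only a sketch: match_airports is a hypothetical helper, the sample names are illustrative, and it assumes a case-insensitive substring match is good enough (so "Murcia" finds "Murcia International"), which is one way of avoiding a strict == comparison.

```python
def match_airports(user_input, airport_names):
    """Return every airport name that contains the user's input, ignoring case."""
    query = user_input.strip().casefold()
    if not query:
        return []
    return [name for name in airport_names if query in name.casefold()]


# Hypothetical sample standing in for the scraped airport_names list
names = ["Murcia International", "Zurich", "Dublin"]
print(match_airports("murcia", names))  # ['Murcia International']
```

If the result has exactly one entry, you could use it as valid_city_from; if it is empty or has several entries, asking the user again is probably the safer choice.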

By the way, you need these imports for WebDriverWait:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
