
I am trying to scrape boat, departure date and price information from the PADI liveaboard page. I was able to copy an XPath from the Chrome developer console and have Selenium find it, but I want to improve it by using a relative path and I am not sure how to do that. This is what I have so far:

from selenium import webdriver
import pdb

browser = webdriver.Chrome()


browser.get('https://travel.padi.com/s/liveaboards/caribbean/')
assert 'Caribbean' in browser.title


elem2 = browser.find_elements_by_xpath('//*[@id="search-la"]/div/div[3]/div/div[2]/div[3]/div')

print(elem2)
print(len(elem2))

browser.close()

As you can see, the code goes to the PADI page, finds the card for each dive boat and returns the cards as a list. The XPath starts at the nearest available id, but from that point on it is an absolute chain of div/div/div and so on. I was wondering if I could turn that into a relative path somehow.

Thanks.

  • You can also use a CSS selector. Please give us the website link so we can answer you better! Commented Nov 30, 2019 at 6:15
  • Use classes and ids instead of a long list of divs. You can also use // to skip some elements in an XPath. To use a relative path, start it with a dot (./) and call the search on another element instead of browser, e.g. elem2[0].find_elements_by_xpath('.//div'). Commented Nov 30, 2019 at 6:23

2 Answers


Use class and/or id attributes to make the XPath shorter.

Once you have found the cards, you can search inside each card with an XPath that starts with ./ so that it is relative to that element and matches only inside it.

You can also use // in any part of an XPath to skip tags that are not important.

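The other find_element_by_ and find_elements_by_ methods can also be called on a card, and they likewise search only inside that element, so they are relative too. (Selenium 4 replaced these helpers with find_element(By.XPATH, ...) and find_elements(By.XPATH, ...), with By imported from selenium.webdriver.common.by.)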

import selenium.webdriver

driver = selenium.webdriver.Chrome() # Firefox()

driver.get('https://travel.padi.com/s/liveaboards/caribbean/')

all_cards = driver.find_elements_by_xpath('//div[@class="boat search-page-item-card "]')

for card in all_cards:
    title = card.find_element_by_xpath('.//a[@class="shop-title"]/span')
    desc  = card.find_element_by_xpath('.//p[@class="shop-desc-text"]')
    price = card.find_element_by_xpath('.//p[@class="cur-price"]/strong/span')

    print('title:', title.text)
    print('desc:',  desc.text)
    print('price:', price.text)

    all_dates = card.find_elements_by_css_selector('.cell.date')

    for date in all_dates:
        day, month = date.find_elements_by_tag_name('span')
        print('date:', day.text, month.text)

    print('---')

Example result (your prices may be shown in a different currency):

title: CARIBBEAN EXPLORER II
desc: With incredible, off-the-beaten path itineraries that take guests to St Kitts, Saba and St Maarten, this leading liveaboard spoils divers with five dives each day, scenic geography and a unique slice of Caribbean culture.
Dates do not match your search criteria
price: PLN 824
date: 7 DEC
date: 14 DEC
date: 21 DEC
date: 28 DEC
---
title: BAHAMAS AGGRESSOR
desc: Featuring five dives a day, the well-regarded Bahamas Aggressor liveaboard is the ideal choice for divers who want to spend as much time under the water as possible then relax in an onboard Jacuzzi.
Dates do not match your search criteria
price: PLN 998
date: 7 DEC
date: 14 DEC
date: 21 DEC
date: 28 DEC
---
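
The relative-search idea can also be checked offline, without a browser. The following is only a minimal sketch: it uses the standard library's xml.etree.ElementTree (which supports a subset of XPath, not Selenium's full engine), and the markup and the "card" class are made up for illustration rather than taken from the real PADI page.

```python
import xml.etree.ElementTree as ET

# Made-up markup loosely modelled on the cards above; not the real PADI page.
HTML = """
<div id="search-la">
  <div class="card">
    <a class="shop-title"><span>BOAT ONE</span></a>
    <p class="cur-price"><strong><span>PLN 824</span></strong></p>
  </div>
  <div class="card">
    <a class="shop-title"><span>BOAT TWO</span></a>
    <p class="cur-price"><strong><span>PLN 998</span></strong></p>
  </div>
</div>
"""

root = ET.fromstring(HTML)

# Find every card once, starting from the document root.
cards = root.findall('.//div[@class="card"]')

rows = []
for card in cards:
    # Paths starting with "./" are evaluated relative to this card only,
    # so each iteration sees its own title and price.
    title = card.find('.//a[@class="shop-title"]/span').text
    price = card.find('.//p[@class="cur-price"]/strong/span').text
    rows.append((title, price))

print(rows)
```

Each card yields its own title and price because the inner searches never leave the card element, which is the same behaviour the Selenium loop relies on.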

3 Comments

So prefix './/' is the answer?
@SamGinrich Yes, the prefix .// (or rather ./) is the answer. Without the leading dot, the search always starts from the beginning of the document and always finds the same element(s). The prefix .// has two parts: the dot, which makes the path relative to the current element, and //, which skips any number of levels in the path.
Thank you for the clarification! Would this search still be recursive? I ask because in some WebDriver implementations of XPath only immediate descendants seem to be considered with './/'.
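
On the recursion question: in XPath 1.0, .// is an abbreviation that covers descendants at any depth, while ./ followed by a tag name covers direct children only. The sketch below checks that distinction with the standard library's xml.etree.ElementTree on a made-up fragment; it does not verify how any particular WebDriver implementation behaves.

```python
import xml.etree.ElementTree as ET

# Made-up fragment: one <span> is a direct child, one is nested deeper.
card = ET.fromstring(
    '<div class="card">'
    '<span>direct child</span>'
    '<p><strong><span>nested</span></strong></p>'
    '</div>'
)

direct = [e.text for e in card.findall('./span')]   # direct children only
nested = [e.text for e in card.findall('.//span')]  # descendants at any depth

print(direct)
print(nested)
```

Here ./span finds only the first span, while .//span finds both, in document order.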

You need to use classes, and search inside each item with a path that starts with ./

Here is some simple code you can try:

from selenium import webdriver

browser = webdriver.Chrome()

browser.get('https://travel.padi.com/s/liveaboards/caribbean/')

items = browser.find_elements_by_xpath('//div[@class="boat-info"]')

for item in items:
    title = item.find_element_by_xpath('.//a[@class="shop-title"]/span')
    description = item.find_element_by_xpath('.//p[@class="shop-desc-text"]')
    price = item.find_element_by_xpath('.//p[@class="cur-price"]/strong/span')
    print('TITLE: ', title.text)
    print('DESCRIPTION: ', description.text)
    print('PRICE: ', price.text)
    print('------------------NEW-RECORD------------------------')
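
A note on the class tests used in both answers: a comparison such as @class="boat search-page-item-card " matches the whole attribute string, including the trailing space, so it may stop matching if the site adds or reorders classes. A common XPath 1.0 idiom is to match a single class token instead. The helper below is a hypothetical sketch (has_class is not part of Selenium); it only builds the XPath strings, which could then be passed to the find methods shown above.

```python
def has_class(name: str) -> str:
    """Return an XPath 1.0 predicate that matches one class token."""
    # Reject input that would not be a single, safely quotable class name.
    if not name or any(ch.isspace() for ch in name) or "'" in name:
        raise ValueError("expected a single class name without spaces or quotes")
    # Pad the normalised class list with spaces so that ' name ' can only
    # match a complete token, never a substring of another class.
    return f"contains(concat(' ', normalize-space(@class), ' '), ' {name} ')"


# Class names taken from the answers above.
cards_xpath = f"//div[{has_class('boat')}][{has_class('search-page-item-card')}]"
title_xpath = f".//a[{has_class('shop-title')}]/span"

print(cards_xpath)
print(title_xpath)
```

The card expression is still absolute (it starts with //), while the title expression starts with a dot so that it stays relative to whichever card it is evaluated on.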
