I am building a basic web scraper and cannot retrieve the elements that hold each product's price. Every price listed on the Google search results page has the class name qptdjc.

Here is the HTML tag for the price: <div class="qptdjc">$179.99</div>

import selenium
from selenium import webdriver as wb
import pandas as pd
import time

browser = wb.Chrome(executable_path='C:/Users/ethan/Downloads/chromedriver_win32(1)/chromedriver')
browser.get('https://www.google.com/search?source=hp&ei=nnZ6X8SDO4WE9PwPj5KC4AQ&q=144hz+monitor&oq=144hz+monitor&gs_lcp=CgZwc3ktYWIQAzIFCAAQsQMyBQgAELEDMgIIADICCAAyAggAMgIIADICCAAyAggAMgIIADICCAA6DggAEOoCELQCEJoBEOUCOgsILhDHARCvARCTAjoFCC4QsQM6CAguEMcBEK8BOgsILhCxAxDHARCjAjoICAAQsQMQgwE6CAguELEDEIMBOg4ILhCxAxCDARDHARCvAVDyDligHmDsH2gBcAB4AIABiwGIAZMHkgEEMTIuMZgBAKABAaoBB2d3cy13aXqwAQY&sclient=psy-ab&ved=0ahUKEwjEyoykppzsAhUFAp0JHQ-JAEwQ4dUDCAk&uact=5')

productInfoList = browser.find_elements_by_class_name('qptdjc')
prices = browser.find_elements_by_xpath('//td[@class="qptdjc"]')

prices_list = []
for p in range(len(prices)):
    prices_list.append(prices[p].text)

print(len(productInfoList))
print(*prices_list, sep = ", ")
print(*prices, sep = ", ")
  • I have a small question: do you want the previous price or not? Commented Oct 5, 2020 at 3:28

1 Answer


I waited for all of the elements to be present, collected them, then looped over them and read each one's inner HTML.

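# `browser` is the WebDriver instance created in the question.
# Wait up to 20 seconds for the product entries and the price <div> elements to be present.
productInfoList = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.r4awE > span")))
prices = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "div.qptdjc")))

prices_list = []
for price in prices:
    # Keep only the text before the first nested tag, then trim whitespace.
    prices_list.append(price.get_attribute('innerHTML').split('<')[0].strip())
pprint(prices_list)
print(len(productInfoList))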

Outputs

['$229.99',
 '$187.52',
 '$249.99']
3
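To show what the split('<')[0].strip() step does, here is a minimal sketch that runs without a browser. The helper name leading_text and the second sample string are assumptions made up for illustration; the real markup of a price element with nested children may look different.

```python
def leading_text(inner_html: str) -> str:
    """Return the text that comes before the first tag in an innerHTML string."""
    return inner_html.split('<')[0].strip()


samples = [
    "$179.99",                        # inner HTML of the div shown in the question
    " $229.99 <span>$299.99</span>",  # hypothetical: price followed by a nested element
]

cleaned = [leading_text(s) for s in samples]
print(cleaned)
```

If the price element only ever contains plain text, price.text would likely give the same result; splitting the inner HTML matters when a child element (for example a previous price) follows the current price.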

Imports

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC
from pprint import pprint
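One likely reason the original XPath returned nothing is that it asks for td elements, while the tag shown in the question is a div. The sketch below illustrates that with the standard library's xml.etree.ElementTree as a stand-in for the browser's XPath engine (an assumption for illustration; it supports only a subset of XPath, and the wrapping body element is made up).

```python
import xml.etree.ElementTree as ET

# The price tag from the question, wrapped in a made-up <body> so it can be parsed.
snippet = '<body><div class="qptdjc">$179.99</div></body>'
root = ET.fromstring(snippet)

# Same class, wrong tag name: nothing matches.
td_matches = root.findall(".//td[@class='qptdjc']")

# Same class, correct tag name: the price element is found.
div_matches = root.findall(".//div[@class='qptdjc']")

print(len(td_matches))
print([el.text for el in div_matches])
```

In Selenium the equivalent change would be '//div[@class="qptdjc"]' for XPath, or "div.qptdjc" as the CSS selector used above.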

2 Comments

Thank you. Why do you use By.CSS_SELECTOR instead of By.XPATH? All of the tutorials I have been following use the latter.
CSS selectors tend to be faster than XPath.
