
I'm trying to scrape a webpage that has a React element which hides a dropdown after a few seconds.

This is what you see when you first get to the page and the tab I would like to scrape.

The dropdown element I want to scrape, specifically the "24 people are viewing this event" line.

I'm trying to scrape the part that says "Don't miss out! 24 people are viewing this event".

After a few seconds, the tab disappears and is replaced by another dropdown element that says "Get notified at the right price!"

The new dropdown that replaces the one I want to scrape. It hides the previous dropdown.

The page source shows that the view-count dropdown is hidden after a few seconds. The top of the markup is the new dropdown, while the element at the bottom, the one with 'hide' in its div class, is the dropdown I want to scrape.

The source code showing the hidden dropdown code

I've tried getting the div with class "urgency-component-container", but because it is hidden, it returns nothing. I've also tried getting the div with class "dropdown-header-item", but that returned nothing as well.

I've also tried the XPath to the dropdown-header-item (//*[@id="dropdown-header"]/div/div[1]), but that didn't work either.

How can I scrape the dropdown that "hides" after a few seconds? Thanks.

EDIT:

The website URL is: https://www.stubhub.com/anaheim-ducks-tickets-anaheim-ducks-anaheim-honda-center-11-14-2019/event/104217448/?sort=price%20asc

The code I used was:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Chrome()
url = 'https://www.stubhub.com/anaheim-ducks-tickets-anaheim-ducks-anaheim-honda-center-11-14-2019/event/104217448/?sort=price+asc'
driver.get(url)

content = driver.find_element_by_class_name('dropdown-header-item')

If I execute the code straight away, I get an error:

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".dropdown-header-item"}

But if I wait a few seconds and then run it, I get the text of the replacement dropdown instead:

Get notified at the right price!Set price alert
  • Can I get the application URL? Commented Oct 8, 2019 at 20:01
  • Please read why a screenshot of code is a bad idea. Paste the code and properly format it instead. Commented Oct 8, 2019 at 21:08
  • @YosuvaA I pasted the url in the body edit Commented Oct 8, 2019 at 22:20
  • @JeffC Thanks for the info. I've made some edit and added my code Commented Oct 8, 2019 at 22:20
  • @uclaastro I have added my answer with sample code and it works fine for me. Please have a look. Commented Oct 8, 2019 at 22:32

2 Answers


Please try this and let me know how it goes.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Chrome('/usr/local/bin/chromedriver')  # Optional argument, if not specified will search path.
driver.delete_all_cookies()
driver.implicitly_wait(15)
driver.maximize_window()
url = 'https://www.stubhub.com/anaheim-ducks-tickets-anaheim-ducks-anaheim-honda-center-11-14-2019/event/104217448/?sort=price+asc'
driver.get(url)
driver.refresh()

content = driver.find_element_by_xpath("//div[@class='urgency-wrapper']//div[@class='dropdown-header-item']").text
print(content)  # print() call form, so the script also runs on Python 3

driver.quit()

output:

Don't miss out. 28 people are viewing this event.
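
Since the question is specifically after the number in that line, the count can be pulled out of the scraped string with a regular expression. This is only a sketch: the pattern assumes the banner keeps the "N people are viewing" wording quoted in this thread, and the names parse_viewer_count and VIEWERS_RE are made up for illustration.

```python
import re

# Assumed wording, based only on the banner text quoted in this thread.
VIEWERS_RE = re.compile(r"(\d[\d,]*)\s+people\s+are\s+viewing", re.IGNORECASE)


def parse_viewer_count(text):
    """Return the viewer count as an int, or None if the text does not match."""
    # Normalise non-breaking spaces, which can show up in scraped page text.
    match = VIEWERS_RE.search(text.replace("\xa0", " "))
    if match is None:
        return None
    # Drop thousands separators before converting, e.g. "1,204" becomes 1204.
    return int(match.group(1).replace(",", ""))


print(parse_viewer_count("Don't miss out. 28 people are viewing this event."))
print(parse_viewer_count("Get notified at the right price!Set price alert"))
```

The first call prints 28 and the second prints None, so the replacement "price alert" dropdown is not mistaken for the viewer banner.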


10 Comments

I'm getting a blank return. When I print the content it's printing blank
@uclaastro It looks like that message appears only the first time. It did not show when I ran it a second time. Also, when I checked the HTML the second time, I couldn't find those tags.
How to get that message again?
You have to refresh the page
@uclaastro I have updated my code. Can you try it now?

It looks like the site doesn't like scraping and throws up a captcha for starters. If there weren't a captcha, you would want the textContent property of the last element matched by the class dropdown-header-item. Unlike .text, textContent is still populated when the element is hidden.

from selenium import webdriver

d = webdriver.Chrome()
d.get('https://www.stubhub.com/anaheim-ducks-tickets-anaheim-ducks-anaheim-honda-center-11-14-2019/event/104217448/?sort=price%20asc')
elems = d.find_elements_by_css_selector('.dropdown-header-item')
if elems:
    # Take the last match and read textContent; '\xa0' is a non-breaking space.
    print(elems[-1].get_attribute('textContent').replace('\xa0', ' '))
else:
    print('Nada')
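
Because the banner is not there immediately after load and is hidden a few seconds later, one option is to poll for it and read textContent through JavaScript. The following is a rough sketch, not a tested fix for this site: hidden_texts and wait_for_text are hypothetical helper names, the selector and search phrase are taken from this thread, and it assumes no captcha is shown and that the hidden element stays in the DOM. It only relies on driver.execute_script, so it does not depend on the find_elements_by_* helpers, which newer Selenium releases no longer provide.

```python
import time

# JavaScript run in the page: collect textContent of every element matching
# the CSS selector passed as arguments[0], whether it is visible or hidden.
_COLLECT_TEXT_JS = """
return Array.from(document.querySelectorAll(arguments[0]))
            .map(function (el) { return el.textContent; });
"""


def hidden_texts(driver, selector):
    """Return the cleaned textContent of all elements matching `selector`."""
    texts = driver.execute_script(_COLLECT_TEXT_JS, selector) or []
    # Replace non-breaking spaces and trim; tolerate null entries.
    return [(t or "").replace("\xa0", " ").strip() for t in texts]


def wait_for_text(driver, selector, needle, timeout=10.0, interval=0.25):
    """Poll until a matching element's text contains `needle`.

    Returns the matching text, or None if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        for text in hidden_texts(driver, selector):
            if needle in text:
                return text
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

With a live session this would be called right after d.get(...), for example wait_for_text(d, '.dropdown-header-item', 'viewing this event'). A None result would suggest the banner was never rendered (for instance on a repeat visit or behind a captcha) rather than merely hidden.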

2 Comments

This returns 'Nada'. I tried printing elems and it's returning a blank result
Did you observe the loaded page? Did it hit a captcha?
