I started using Selenium yesterday to help scrape some data, and I'm having a difficult time wrapping my head around its selector engine. I know lxml, BeautifulSoup, jQuery and Sizzle have similar engines. What I'm trying to do is:

  1. Wait up to 10 seconds for the page to load completely
  2. Make sure ten or more span.eN elements are present (two load on the initial page load and more are injected afterwards)
  3. Then start processing the data with BeautifulSoup

I am struggling with the Selenium conditions for either finding the nth element or locating specific text that only exists in the nth element. I keep getting errors (TimeoutException, NoSuchElementException, etc.).

    url = "http://someajaxiandomain.com/that-injects-html-after-pageload.aspx"
    wd = webdriver.Chrome()
    wd.implicitly_wait(10)
    wd.get(url)
    # what I've tried
    # .find_element_by_xpath("//span[@class='eN'][10]"))
    # .until(EC.text_to_be_present_in_element(By.CSS_SELECTOR, "css=span[class='eN']:contains('foo')"))
  • It's hard to provide any solution without knowing the HTML. Please provide some HTML if possible. Commented May 11, 2015 at 21:11
  • Here is an example of the prettified HTML: paste.ee/p/hR3f6 - I am after the count of span.eN or tbody.EventBody elements being greater than 10 OR a span.eN containing "Triple Jump" (usually the last to load). It's really just the tabular data I'm interested in. Initially only 4 or 5 tbody elements load, and the rest are injected after the initial page load. Commented May 11, 2015 at 21:37

1 Answer

You need to understand the concepts of Explicit Waits and Expected Conditions.
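
To make the idea concrete, here is a minimal sketch of what an explicit wait does conceptually: it calls a condition repeatedly until the condition returns something truthy or a timeout expires. This is not Selenium's implementation; `wait_until` and `WaitTimeout` are hypothetical names used only for illustration, and the default timeout and polling interval are assumptions.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, subject, timeout=10.0, poll=0.5):
    """Call condition(subject) repeatedly until it returns something truthy.

    Returns the truthy result, or raises WaitTimeout once `timeout`
    seconds have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition(subject)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f s" % timeout)
        time.sleep(poll)  # pause before polling again
```

In Selenium, `WebDriverWait(driver, timeout).until(condition)` plays the role of `wait_until`, and an Expected Condition is simply a callable that takes the driver.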

In your case, you can write a custom Expected Condition that waits until the number of elements found by a locator is at least n:

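from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.support import expected_conditions as EC

class wait_for_n_elements_to_be_present(object):
    """Expected Condition: true once at least `count` elements match `locator`."""

    def __init__(self, locator, count):
        self.locator = locator  # tuple such as (By.CSS_SELECTOR, 'span.eN')
        self.count = count

    def __call__(self, driver):
        try:
            # find_elements returns an empty list (no exception) when nothing matches
            elements = driver.find_elements(*self.locator)
            return len(elements) >= self.count
        except StaleElementReferenceException:
            return False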

Usage:

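from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

n = 10  # specify how many elements to wait for

wait = WebDriverWait(driver, 10)  # poll for up to 10 seconds
wait.until(wait_for_n_elements_to_be_present((By.CSS_SELECTOR, 'span.eN'), n))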

You could probably also just use a built-in Expected Condition such as presence_of_element_located or visibility_of_element_located and wait for a single span.eN element to be present or visible, for example:

wait = WebDriverWait(driver, 10)
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'span.eN')))
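
For the case described in the question's comments (ten or more span.eN elements, or one span.eN containing "Triple Jump"), a combined condition might look like the sketch below. The class name `count_or_text_present` is hypothetical, and the code assumes only that the driver exposes `find_elements(by, value)` and that elements have a `.text` attribute, as Selenium's do. The string "css selector" is the value of `By.CSS_SELECTOR`. The 30-second timeout in the usage comment is an assumption, not a recommendation from the original answer.

```python
class count_or_text_present(object):
    """True once `count` elements match `locator`, or any match contains `text`."""

    def __init__(self, locator, count, text):
        self.locator = locator  # e.g. ("css selector", "span.eN")
        self.count = count
        self.text = text

    def __call__(self, driver):
        elements = driver.find_elements(*self.locator)
        if len(elements) >= self.count:
            return True
        # Otherwise succeed as soon as the marker text shows up in any match
        return any(self.text in element.text for element in elements)


# Usage with Selenium (not executed here; `wd` is the question's driver):
#   from selenium.common.exceptions import StaleElementReferenceException
#   from selenium.webdriver.support.ui import WebDriverWait
#   wait = WebDriverWait(wd, 30, ignored_exceptions=[StaleElementReferenceException])
#   wait.until(count_or_text_present(("css selector", "span.eN"), 10, "Triple Jump"))
```

If a TimeoutException still occurs, it may be worth checking whether the injected content lives inside an iframe or simply takes longer than the timeout to arrive; both are guesses, since the page itself is not shown here.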

8 Comments

@Saifur I really hope you are not leveraging your self appointed prowess or calling someone a "wrong person" to be a troll -- it's an honest question and I am grateful for those who are trying to help.
@user1645914 no, no, we just exchanged a couple of jokes and removed the off-topic comments; that one is the only one left, so it's out of context. Saifur is definitely here on SO to help.
@user1645914 My apology. I personally respect alecxe and all his efforts and of course ANYONE who asks questions on SO. It's the best place on the earth to get help when you are ALONE in dark.
@alecxe Thank you for your help. I am putting this into my code and still running into selenium.common.exceptions.TimeoutException: Message: '' -- it's probably on my end still but I will accept your detailed answer once I get it to work on my end. Thank you.
@Saifur thank you! Sorry for leaving your comment alone. It was a funny coincidence, though: the comment totally changed its meaning without context :)