
I'm trying to use Selenium (in Python) to extract some information from a website. I've been selecting elements with XPaths but am having trouble using the following-sibling selector. The HTML is as follows:

<span class="metadata">
    <strong>Photographer's Name: </strong>
    Ansel Adams
</span>

I can select "Photographer's Name" with

In [172]: metaData = driver.find_element_by_class_name('metadata')

In [173]: metaData.find_element_by_xpath('strong').text
Out[173]: u"Photographer's Name:"

I'm trying to select the text that follows the <strong> tag ('Ansel Adams' in the example). I assumed I could use the following-sibling axis, but I receive the following error:

In [174]: metaData.find_element_by_xpath('strong/following-sibling::text()')
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (328, 0))
... [NOTE: Omitted the traceback for brevity] ...
InvalidSelectiorException: Message: u'The given selector strong/following-sibling::text() is either invalid or does not result in a WebElement. The following error occurred:\n[InvalidSelectorError] The result of the xpath expression "strong/following-sibling::text()" is: [object Text]. It should be an element.' 

Any ideas as to why this isn't working?

3 Answers

8

@RossPatterson is correct. The trouble is that the text 'Ansel Adams' is not a WebElement, so you cannot use find_element or find_elements. If you change your HTML to

<span class="metadata">
    <strong>Photographer's Name: </strong>
    <strong>Ansel Adams</strong>
</span>

then find_element_by_xpath('strong/following-sibling::*[1]').text returns 'Ansel Adams'.


3 Comments

Unfortunately, I don't have control over the HTML content. It's strange, though, since the same XPath works in online [XPath testers]. This leads me to a second question: is it possible to get all of the contents of <span class="metadata"> (tags and text)? I can select it with find_elements_by_class_name('metadata') but cannot figure out how to get the text together with the <strong> tags intact.
You could always use driver.page_source to get the HTML of the whole page, and then use something other than webdriver to parse it.
Great, I didn't know about driver.page_source, this makes my day, thanks!
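
For readers who want to try the driver.page_source route suggested above, here is a minimal sketch of the "parse it with something else" step. It assumes Python 3 and uses only the standard library's html.parser; the names MetadataTextParser and metadata_values are made up for this example and are not part of Selenium. It collects only the text that sits directly inside <span class="metadata">, skipping anything nested in child tags such as <strong>.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "wbr"}


class MetadataTextParser(HTMLParser):
    """Collect text directly inside <span class="metadata"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # 0 = outside a metadata span, 1 = directly inside it
        self.values = []    # one cleaned string per metadata span found
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1  # entering a child element such as <strong>
        elif tag == "span" and "metadata" in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:  # the metadata span itself just closed
            self.values.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self.depth == 1:  # ignore text nested inside child elements
            self._chunks.append(data)


def metadata_values(html):
    """Return the direct text of every metadata span in an HTML string."""
    parser = MetadataTextParser()
    parser.feed(html)
    parser.close()
    return parser.values


# With a live browser this would be called as metadata_values(driver.page_source).
sample = """
<span class="metadata">
    <strong>Photographer's Name: </strong>
    Ansel Adams
</span>
"""
print(metadata_values(sample))
```

This sketch assumes reasonably well-nested markup; for messy real-world pages a dedicated HTML parsing library is likely the more robust choice.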
3

This is documented in this Selenium bug report: http://code.google.com/p/selenium/issues/detail?id=5459

"Your xpath doesn't return an element; it returns a text node. While this might have been perfectly acceptable in Selenium RC (and by extension, Selenium IDE), the methods on the WebDriver WebElement interface require an element object, not just any DOM node object. WebDriver is working as intended. To fix the issue, you'd need to change the HTML markup to wrap the text node inside an element, like a ."

1 Comment

Unfortunately it's hard to find actual documentation that documents the intention that "the methods on the WebDriver WebElement interface require an element object, not just any DOM node object," contrary to the case with Selenium RC. I finally found something here: seleniumhq.github.io/selenium/docs/api/java/org/openqa/selenium/… WebElement, the type returned by findElement, "Represents an HTML element".
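
If the markup cannot be changed, one possible workaround is to let the browser read the text node and hand back a plain string, since the restriction described above applies to WebElement lookups and not to script results. The sketch below assumes a WebDriver object exposing execute_script(script, *args); the helper name following_text and the JavaScript snippet are illustrative, not taken from the bug report.

```python
# JavaScript run in the browser: walk the siblings after the given element and
# concatenate the text nodes (nodeType 3), returning a plain string.
JS_FOLLOWING_TEXT = """
var node = arguments[0].nextSibling;
var text = '';
while (node) {
    if (node.nodeType === 3) { text += node.textContent; }
    node = node.nextSibling;
}
return text.trim();
"""


def following_text(driver, element):
    """Return the text that follows `element` within its parent (hypothetical helper)."""
    return driver.execute_script(JS_FOLLOWING_TEXT, element)


# Intended usage with the objects from the question (requires a live browser):
#   strong = metaData.find_element_by_xpath('strong')
#   following_text(driver, strong)
```

Because the script returns a string rather than a DOM node, WebDriver has nothing to convert into a WebElement, which is what the original XPath ran into.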
2

To get the text "Ansel Adams", just use metaData.get_text(). I don't believe find_element_by_* will allow you to find a text node.

1 Comment

Seems like metaData.get_text() would give you Photographer's Name: Ansel Adams. According to the documentation at release.seleniumhq.org/selenium-remote-control/0.9.2/doc/dotnet/…, "This command uses either the textContent (Mozilla-like browsers) or the innerText (IE-like browsers) of the element, which is the rendered text shown to the user."
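
Building on that observation, a simple workaround is to take the full text of the span and strip the label off the front. This is only a sketch: text_after_label is a made-up helper name, and it assumes the label is the first thing in the span, as in the question's HTML.

```python
def text_after_label(full_text, label_text):
    """Return what remains of full_text after a leading label_text.

    Falls back to the (stripped) full text if it does not start with the label.
    """
    full_text = full_text.strip()
    label_text = label_text.strip()
    if full_text.startswith(label_text):
        return full_text[len(label_text):].strip()
    return full_text


# Intended usage with the elements from the question (requires a live browser):
#   label = metaData.find_element_by_xpath('strong').text
#   text_after_label(metaData.text, label)
print(text_after_label(u"Photographer's Name: Ansel Adams", u"Photographer's Name:"))
```

This uses only the rendered text of the two elements, so it needs no XPath that targets a text node.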
