2

In the following example:

<tr>
    <td>
    </td>

    <td>
    </td>

    <td>
    </td>

    <td>
    </td>

    <td>
        text1
        <br>
        <img>
        <br>
        text2
    </td>
</tr>

When I try to get the text in the 5th td like so:

something = elem.find_element_by_xpath('./td[5]').text

I get both texts in the same variable. I can split them, but I was wondering whether I can get them in separate variables so that I don't have to bother with a split. However, when I try something like this:

something = elem.find_element_by_xpath('./td[5]/text()[1]')

I get the following error message:

InvalidSelectorException: invalid selector: 
The result of the xpath expression "./td[5]/text()[1]" is: [object Text]. 
It should be an element.

Can I get around this error somehow?

1
  • Selenium requires that find_element return an element node, but ./td[5]/text()[1] returns a text node, which is why you get the error. For the difference between element and text nodes, see the HTML DOM documentation: the DOM tree has several node types, and Element and Text are two of them. Commented Mar 28, 2018 at 10:32

2 Answers

4

You can try the code below to get the two text nodes separately:

something = elem.find_element_by_xpath('./td[5]')
text1 = driver.execute_script('return arguments[0].firstChild.textContent;', something).strip()
text2 = driver.execute_script('return arguments[0].lastChild.textContent;', something).strip()

4 Comments

Thanks, your solution worked wonderfully. If you have time and don't mind, I'd love it if you walked me through the code. It would help immensely. In particular, what do the script part and the strip part do?
Each variable holds the result of a JavaScript snippet. arguments[0] is a placeholder for the something element passed to execute_script. So the first line returns the text content of the first child node of the td, and the second does the same for the last child node. Note that the child nodes of the td are: 1. "text1", 2. br, 3. whitespace, 4. img, 5. whitespace, 6. br, 7. "text2". strip() just removes leading and trailing newline characters and spaces. Also, firstChild/lastChild can be replaced with an explicit index, e.g. childNodes[0], childNodes[6].
My wholehearted thanks. How would it be written for the second and fourth child instead of the first and last, for which there are dedicated properties?
I've updated my previous comment. You can use arguments[0].childNodes[N] to get the node at zero-based index N.
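The index-based approach from the comments above can be generalised so that the positions of the text nodes don't have to be counted by hand. The following is only a minimal sketch: direct_texts and JS_TEXT_NODES are hypothetical names, and it assumes driver is a Selenium WebDriver and element is the td found earlier. It asks the browser for the content of every direct text-node child and drops the whitespace-only ones in Python.

```python
# JavaScript run in the browser: keep only the direct children that are
# text nodes (nodeType 3) and return their raw text as an array.
JS_TEXT_NODES = """
return Array.from(arguments[0].childNodes)
    .filter(function (node) { return node.nodeType === 3; })
    .map(function (node) { return node.textContent; });
"""


def direct_texts(driver, element):
    """Return the stripped, non-empty direct text nodes of element.

    Hypothetical helper: driver is assumed to expose execute_script()
    the way a Selenium WebDriver does.
    """
    raw_texts = driver.execute_script(JS_TEXT_NODES, element)
    # Whitespace-only nodes (the indentation between tags) strip to ''
    # and are filtered out.
    return [text.strip() for text in raw_texts if text.strip()]
```

With the td from the question, something like text1, text2 = direct_texts(driver, something) should then unpack the two values, provided the cell contains exactly two non-empty text nodes.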
1

In your first attempt, when you used:

something = elem.find_element_by_xpath('./td[5]').text

you got both text1 and text2, because both text nodes are children of the fifth <td> and .text returns all of the element's visible text.

In your next attempt, when you used:

something = elem.find_element_by_xpath('./td[5]/text()[1]')

it raised InvalidSelectorException because, although ./td[5]/text()[1] is a valid XPath expression, it selects a text node, and Selenium's find_element can only return elements.

To extract text1 and text2 from the HTML you have provided, you can use the str.splitlines method as follows:

# innerHTML keeps the source line breaks, so look the element up once and split its markup into lines
lines = driver.find_element_by_xpath("//tr//following-sibling::td[5]").get_attribute("innerHTML").splitlines()
text1 = lines[1].strip()  # strip() removes the indentation around the text
text2 = lines[5].strip()

3 Comments

Thanks for the answer, it's clearer to me than the other one, but as far as I understand, it will only work with HTML laid out exactly like this. If, let's say, the texts were on the same line as the tags, what would happen then? Would it break?
I'm afraid so. Since you tagged the question Python rather than JavaScript, I went for a more Pythonic solution. You may want to look at str.splitlines.
@cybera Yes, you are right that it only works when the HTML is laid out this way. innerHTML returns the markup with its original line breaks, so the indices depend on how the page was written. If the texts were on the same line as the tags, we would have to fine-tune the indices, but the approach would stay the same and remain Pythonic.
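The concern raised in the comments above, that line-based splitting depends on the layout of the markup, can be avoided by parsing the innerHTML string instead of splitting it. This is a sketch using the standard library's html.parser rather than the answerer's method; TextChunks and split_inner_html are hypothetical names. Note that it collects text from nested tags too, not only from the direct children of the td.

```python
from html.parser import HTMLParser


class TextChunks(HTMLParser):
    """Collect every non-empty run of text found between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for the text between tags; skip whitespace-only runs.
        text = data.strip()
        if text:
            self.chunks.append(text)


def split_inner_html(inner_html):
    """Return the stripped text runs of an innerHTML string, in document order."""
    parser = TextChunks()
    parser.feed(inner_html)
    parser.close()  # flush any text still buffered at the end of the input
    return parser.chunks
```

Assuming inner_html comes from get_attribute("innerHTML") on the fifth td, text1, text2 = split_inner_html(inner_html) should work whether or not the texts share a line with the tags.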
