I am trying to extract specific text using a CSS selector. Here's a screenshot of the part that I would like to extract


I tried

div[id="Section3"]:first-child

but this doesn't return anything. I can't depend on locating the element by the text because I need to extract that text as shown.

This is the relevant HTML

<div class="ad24123fa4-c17c-4dc5-9aa5-ea007a8db30e-5" style="top:8px;left:218px;width:124px;height:31px;text-align:center;">
    <table width="113px" border="0" cellpadding="0" cellspacing="0">
        <tbody>
            <tr>
                <td>
                    <table width="100%" border="0" cellpadding="0" cellspacing="0">
                        <tbody>
                            <tr>
                                <td align="center">
                                    <span class="fcb900b29f-64d7-453d-babf-192e86f17d6f-7">نظامي</span>
                                </td>
                            </tr>
                        </tbody>
                    </table>
                </td>
            </tr>
        </tbody>
    </table>
</div>

The full HTML is here.

This is my attempt:

        On Error Resume Next
            Set ele = .FindElementByXPath("//span[text()='منازل']")
            If ele Is Nothing Then sStatus = "نظامي" Else sStatus = "منازل"
        On Error GoTo 0

While inspecting the element, I noticed a hint in the console about using $0. Could this be useful?

The two possible texts are "نظامي" and "منازل".

  • What does the HTML look like at that point? Commented Dec 12, 2018 at 20:37
  • I have updated the post Commented Dec 12, 2018 at 20:46
  • That's some pretty unfriendly HTML for automation. I'm assuming you've stripped out a lot of the related HTML because those two tables are pretty empty. Are there any text labels around the desired text that you can use as an anchor? I can't read Arabic but something like... First name: John where you want the text 'John'? Commented Dec 12, 2018 at 21:16
  • I am using selenium for this part .. Commented Dec 12, 2018 at 21:20

1 Answer

To match any of several possible text values with XPath, combine the conditions with "or":

//*[text()='نظامي' or text()='منازل']
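As a rough sketch of how that expression could be used from VBA without relying on On Error Resume Next: the function name ReadStatus and the late-bound driver parameter are hypothetical, and the code assumes SeleniumBasic is installed, the driver is already on the report page, and the VBA editor can store the Arabic literals (for example, under an Arabic system code page). FindElementsByXPath is used rather than FindElementByXPath so that a missing element yields an empty collection instead of an error.

```vba
Option Explicit

' Sketch only. "driver" is assumed to be a SeleniumBasic driver object
' (e.g. ChromeDriver) that has already loaded the report page.
Public Function ReadStatus(ByVal driver As Object) As String
    Dim matches As Object

    ' Match a span whose text is either of the two expected values.
    Set matches = driver.FindElementsByXPath( _
        "//span[text()='نظامي' or text()='منازل']")

    If matches.Count > 0 Then
        ' SeleniumBasic collections are assumed to be 1-based here.
        ReadStatus = matches.Item(1).Text
    Else
        ' Nothing matched: return an empty string and let the caller decide.
        ReadStatus = vbNullString
    End If
End Function
```

A call such as sStatus = ReadStatus(driver) would then return whichever of the two texts is present on the page. If the expression matches too broadly, prefixing it with the viewer container, for example //*[@id='ctl00_ContentPlaceHolder1_CrystalReportViewer1']//span[...], may narrow it, assuming that id is stable.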

CSS selectors (that work for me):

driver.findElementByCss("#ctl00_ContentPlaceHolder1_CrystalReportViewer1 div.ad071889d2-8e6f-4755-ad7d-c44ae0ea9fca-5 table span").text

which is an abbreviation of the full selector:

#ctl00_ContentPlaceHolder1_CrystalReportViewer1 > tbody > tr > td > div > div.crystalstyle > div.ad071889d2-8e6f-4755-ad7d-c44ae0ea9fca-5 > table > tbody > tr > td > table > tbody > tr > td > span

You can also index into the NodeList of matching tables:

Set matches = html.querySelectorAll("#ctl00_ContentPlaceHolder1_CrystalReportViewer1 div.crystalstyle table")
ActiveSheet.Cells(1, 1) = matches.item(80).innerText

Otherwise, when reading from a saved HTML file, I can select by class and take the last item of the matches. With Selenium, you would switch to the following to check how many elements match:

driver.FindElementsByCss(".fc180999a8-04b5-46bc-bf86-f601317d19c8-7").count

VBA:

Option Explicit
Public Sub test()
    Dim html As HTMLDocument, matches As Object
    Dim fStream  As ADODB.Stream
    Set html = New HTMLDocument
    Set fStream = New ADODB.Stream
    With fStream
        .Charset = "UTF-8"
        .Open
        .LoadFromFile "C:\Users\User\Desktop\Output6.html"
        html.body.innerHTML = .ReadText
        .Close
    End With

    Set matches = html.querySelectorAll(".fc180999a8-04b5-46bc-bf86-f601317d19c8-7")

    ActiveSheet.Cells(1, 1) = matches.item(matches.Length - 1).innerText
End Sub

7 Comments

Thanks a lot for the reply. With .count I got 0, and the class '.fc180999a8-04b5-46bc-bf86-f601317d19c8-7' is not fixed; it changes each time. Would XPath be easier?
There are two possible texts. Is it possible to include both and check which element exists? I have posted my attempt, but it doesn't work properly and I get incorrect results for the string variable.
This is perfect: //*[text()='نظامي' or text()='منازل']. Is it possible to make it more specific, though? Thanks a lot for that solution.
I mean adding some other anchors that refer to that part or the area around it. Also, have you seen the tip about $0? Might that be useful?
Thank you very much my tutor. I am very grateful for all this awesome help