I want to extract the project status of a project from a website. The relevant HTML is shown below. I want to extract the text Start, which sits between <td> and </td>. The HTML and my code follow.

 <div id="ProjectStatus">
 <tr>
 <th>
 <span id="ProjectStatus_Label1" title="De status van het project">Projectstatus</span>
 </th>
 <td>Start</td>
 </tr>

Below you'll find the code that I have at this moment. This code only gives me the string "Projectstatus", which is not what I want. How can I extract the word "Start"?

Private Sub btnClick()

Dim ieApp As InternetExplorer
Set ieApp = New InternetExplorer
Set ieApp = CreateObject("internetexplorer.application")

With ieApp
 .Navigate "url"
 .Visible = True
End With

Do While ieApp.Busy
    DoEvents
Loop 

Set getStatus = ieApp.Document.getElementById("ProjectStatus_Label1")

strStatus = getStatus.innerText

MsgBox (strStatus) 'gives me the text "Projectstatus", but I need the text "Start"

ieApp.Quit
Set ieApp = Nothing

End Sub

1 Answer

Getting there from ProjectStatus_Label1 requires some DOM navigation: the span sits inside the th, and the td you want is a sibling of that th within the same tr.

Use the following:

' Wait for the page to finish loading
Do While ieApp.Busy
    DoEvents
Loop
Set labelSpan = ieApp.Document.getElementById("ProjectStatus_Label1")
' Walk up: span -> th -> tr (if .Parent raises an error in your setup, try .parentElement)
Set tableHeader = labelSpan.Parent
Set tableRow = tableHeader.Parent
For Each child In tableRow.Children
    If child.tagName = "TD" Then 'This is the element you're looking for
         Debug.Print child.innerText
         Exit For
    End If
Next

Of course, I highly recommend you revise this code and use explicit declarations and Option Explicit, but you haven't in your question so I won't in my answer.
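
For reference, a version with Option Explicit and explicit declarations might look roughly like the sketch below. It assumes late binding (everything is declared As Object, so no project references are needed), keeps the "url" placeholder from the question, and uses parentElement, the standard DOM property name (swap in .Parent if that is what your IE version exposes). PrintProjectStatus is a made-up name, and the readyState check is an extra precaution, not something the original code had.

```vba
Option Explicit

' Sketch only: late-bound, so no references to Microsoft Internet Controls are required.
Private Sub PrintProjectStatus()
    Dim ieApp As Object
    Dim labelSpan As Object
    Dim tableRow As Object
    Dim child As Object

    Set ieApp = CreateObject("InternetExplorer.Application")
    ieApp.Visible = True
    ieApp.Navigate "url" ' placeholder taken from the question

    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do While ieApp.Busy Or ieApp.ReadyState <> 4
        DoEvents
    Loop

    Set labelSpan = ieApp.Document.getElementById("ProjectStatus_Label1")
    If labelSpan Is Nothing Then
        MsgBox "Element ProjectStatus_Label1 was not found"
    Else
        ' Walk up: span -> th -> tr
        Set tableRow = labelSpan.parentElement.parentElement
        For Each child In tableRow.Children
            If child.tagName = "TD" Then
                Debug.Print child.innerText
                Exit For
            End If
        Next child
    End If

    ieApp.Quit
    Set ieApp = Nothing
End Sub
```

With Option Explicit in place, a misspelled variable name becomes a compile error instead of silently creating a new empty Variant.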

Also, I've used several intermediate variables (labelSpan, tableHeader) for demonstration purposes. You can use Set tableRow = ieApp.Document.getElementById("ProjectStatus_Label1").Parent.Parent and drop them.

Or you can use the code-golfy, harder-to-understand approach, starting from the ProjectStatus div:

Debug.Print ieApp.Document.getElementById("ProjectStatus").GetElementsByTagName("td")(0).innerText
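
The one-liner raises a run-time error if the div or the td is missing. A slightly more defensive variant, as a sketch: FirstCellText is a hypothetical helper name, and it assumes the document object is passed in late-bound.

```vba
' Hypothetical helper: returns the text of the first <td> inside the element
' with the given id, or an empty string if the element or cell is missing.
Private Function FirstCellText(ByVal doc As Object, ByVal containerId As String) As String
    Dim container As Object
    Dim cells As Object

    Set container = doc.getElementById(containerId)
    If container Is Nothing Then Exit Function

    Set cells = container.getElementsByTagName("td")
    If cells.Length > 0 Then
        FirstCellText = cells.Item(0).innerText
    End If
End Function
```

Usage, once the page has loaded: Debug.Print FirstCellText(ieApp.Document, "ProjectStatus")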

2 Comments

Hi Erik, the code you provided works. I had to change .Parent into .ParentElement. Thanks!
That's somewhat strange. We must be using a different version of IE (I was working with IE11, and using ShDocVw.InternetExplorer with late binding instead of CreateObject). It certainly is .Parent for me. Documentation on IE automation is sparse, and there's a lot of version incompatibility, so it's mostly trial and error. Glad it worked.
