I'm working on extracting information from an HTML file using VB. The HTML file looks like this:

<div class='titlebar'><h1>Event Log started at 02/06/2015 13:07:30</h1></div>
<div class='Information'><h1>02/06/2015 13:09:30 | Log has opened</h1></div>
<div class='Interest'><h1>02/06/2015 13:13:03 | finished!</h1></div>
<div class='Interest'><h1>02/06/2015 13:17:12 | finished!</h1></div>
<div class='Interest'><h1>02/06/2015 13:21:35 | finished!</h1></div>
<div class='Interest'><h1>02/06/2015 13:24:58 | finished!</h1></div>
<div class='Warning'><h1>02/06/2015 17:04:33 | Failed to stop, retrying...</h1></div>
<div class='Warning'><h1>02/06/2015 17:04:56 | Error during mix

From this, I need to extract the entries into separate list boxes for class='Interest', class='Warning' and class='Information'. Through my research I found the code below:

' Requires: Imports System.IO and Imports System.Net
Private Function getHtml(ByVal address As String) As String
    Dim wRequest As WebRequest = WebRequest.Create(address)
    ' Using blocks dispose the response and the reader even if an exception is thrown
    Using wResponse As WebResponse = wRequest.GetResponse()
        Using SR As New StreamReader(wResponse.GetResponseStream())
            ' Read the whole response body as a single string
            Return SR.ReadToEnd()
        End Using
    End Using
End Function

Private Sub btn_lookup_Click(sender As Object, e As EventArgs) Handles btn_lookup.Click
    TextBox2.Text = getHtml(TextBox1.Text)
End Sub

The above code copies the whole page source into the text box. Is it possible to copy only specific information? For example, for

 <div class='Interest'><h1>02/06/2015 13:24:58 | finished!</h1></div>

I need to copy 02/06/2015 13:24:58 | finished!

Is this possible?

Thank You

1 Answer

I would suggest using an HTML parser such as HtmlAgilityPack:

Dim html As String = File.ReadAllText("C:\Temp\html.txt")
Dim doc As New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml(html)
Dim interestDivs = doc.DocumentNode.SelectNodes("//div[contains(@class,'Interest')]")
Dim warningDivs = doc.DocumentNode.SelectNodes("//div[contains(@class,'Warning')]")
Dim informationDivs = doc.DocumentNode.SelectNodes("//div[contains(@class,'Information')]")

Dim lines = From div In interestDivs Select div.InnerText
lines = lines.Concat(From div In warningDivs Select div.InnerText)
lines = lines.Concat(From div In informationDivs Select div.InnerText)
TextBox2.Lines = lines.ToArray()

If you are more familiar with LINQ than with XPath, you can also use these queries:

Dim interests = From div In doc.DocumentNode.Descendants("div")
                Where div.GetAttributeValue("class", "") = "Interest"
                Select div.InnerText
Dim warnings = From div In doc.DocumentNode.Descendants("div")
               Where div.GetAttributeValue("class", "") = "Warning"
               Select div.InnerText
Dim infos = From div In doc.DocumentNode.Descendants("div")
            Where div.GetAttributeValue("class", "") = "Information"
            Select div.InnerText
TextBox2.Lines = interests.Concat(warnings).Concat(infos).ToArray()
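
Since the question asks for a separate list box per class, the LINQ query above could be wrapped in a small helper and called once per class. This is only a sketch: the module name LogParsing, the function name GetEntries and the list box names in the usage note are hypothetical, and it assumes the HtmlAgilityPack package is referenced.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports HtmlAgilityPack

Module LogParsing
    ' Hypothetical helper: returns the inner text of every <div> whose
    ' class attribute equals className (exact match, as in the LINQ queries above).
    ' Descendants() yields an empty sequence when nothing matches,
    ' so the returned list may be empty but is never Nothing.
    Public Function GetEntries(doc As HtmlDocument, className As String) As List(Of String)
        Return (From div In doc.DocumentNode.Descendants("div")
                Where div.GetAttributeValue("class", "") = className
                Select div.InnerText.Trim()).ToList()
    End Function
End Module
```

Inside the form, each list box could then be filled with something like lstInterest.Items.AddRange(GetEntries(doc, "Interest").ToArray()), and likewise for "Warning" and "Information", where lstInterest stands for whatever the control is actually called.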

6 Comments

Just to mention, I'm only familiar with VB at a beginner's level. I will be reading from an HTML file, so if I change it to html = "C:\Desktop\test.html", do I need to modify anything else? Doing so gives me an error because Lines is Null.
@SatvirSingh: the first line of my code shows how to read a text file. Just replace ReadAllText("C:\Temp\html.txt") with ReadAllText("C:\Desktop\test.html")
I modified your code a bit and now I can collect the data into the specific richtextboxes. Thank you for all the help
@SatvirSingh: glad that you've got it working. If you have further questions, ask. You can edit your question to include your modifications if you want, but I think it's not needed for future readers.
I have another question. Sometimes the data is not available in the HTML, so, for example, I want to verify that lines.Concat(From div In warningDivs Select div.InnerText) is not Null. How would I write that?
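
On the Null question: in HtmlAgilityPack, SelectNodes returns Nothing rather than an empty collection when no node matches, so the variable to check is warningDivs itself before it is used in a query. One possible way to handle it is a small wrapper that turns "no match" into an empty sequence. This is a sketch, not the answerer's code; the module name SafeSelect and the function name SelectTexts are made up for illustration.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports HtmlAgilityPack

Module SafeSelect
    ' Hypothetical wrapper: returns the inner text of all nodes matching xpath,
    ' or an empty sequence when SelectNodes finds nothing and returns Nothing.
    Public Function SelectTexts(doc As HtmlDocument, xpath As String) As IEnumerable(Of String)
        Dim nodes As HtmlNodeCollection = doc.DocumentNode.SelectNodes(xpath)
        If nodes Is Nothing Then
            Return Enumerable.Empty(Of String)()
        End If
        Return nodes.Select(Function(n) n.InnerText)
    End Function
End Module
```

With that in place, the results can be concatenated without further checks, for example SelectTexts(doc, "//div[contains(@class,'Interest')]").Concat(SelectTexts(doc, "//div[contains(@class,'Warning')]")). The LINQ variant based on Descendants("div") does not have this problem, because it yields an empty sequence when nothing matches.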