I'm trying to scrape a product price from a webpage using Excel VBA. The first code below works when I load the page through Internet Explorer automation (IE.navigate). However, I would like to use an XMLHTTP request instead to speed up the scraping.
In the IE code I make the application wait 3 seconds after navigating so that the page can finish loading before I read the product price. Without that wait, the price is not found.
I tried to do the same with an XMLHTTP request (see the second code), but without success: no price is found. It looks as if the code parses the page before it has fully loaded.
How can I adjust the XMLHTTP code so that it finds the product price, and only starts scraping once the page (and its scripts) have fully loaded?
The following IE code works (it prints the product price to the Immediate window):
Sub Get_Product_Price_AH_IE()
    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument
    Dim AHArticles As MSHTML.IHTMLElementCollection
    Dim AHArticle As MSHTML.IHTMLElement
    Dim AHEuros As MSHTML.IHTMLElementCollection
    Dim AHCents As MSHTML.IHTMLElementCollection
    Dim AHPriceEuro As Double
    Dim AHPriceCent As Double
    Dim AHPrice As Double

    IE.Visible = False
    IE.navigate "https://www.ah.nl/producten/product/wi3640/lu-bastogne-biscuits-original"
    Do While IE.readyState <> READYSTATE_COMPLETE
    Loop
    Set HTMLDoc = IE.document

    'wait for the page to fully load to be able to get price data
    Application.Wait Now + #12:00:03 AM#

    Set AHArticles = HTMLDoc.getElementsByTagName("article")
    For Each AHArticle In AHArticles
        If AHArticle.getAttribute("data-sku") = "wi3640" Then
            Set AHEuros = AHArticle.getElementsByClassName("price__integer")
            Set AHCents = AHArticle.getElementsByClassName("price__fractional")
            AHPriceEuro = AHEuros.Item(0).innerText
            AHPriceCent = AHCents.Item(0).innerText
            AHPrice = AHPriceEuro + (AHPriceCent / 100)
            Debug.Print AHPrice
            Exit For
        End If
    Next AHArticle

    IE.Quit
End Sub
The following XMLHTTP version does not give the desired output (nothing is printed to the Immediate window):
Sub Get_Product_Price_AH_XML()
    Dim XMLReq As New MSXML2.XMLHTTP60
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim AHArticles As MSHTML.IHTMLElementCollection
    Dim AHArticle As MSHTML.IHTMLElement
    Dim AHEuros As MSHTML.IHTMLElementCollection
    Dim AHCents As MSHTML.IHTMLElementCollection
    Dim AHPriceEuro As Double
    Dim AHPriceCent As Double
    Dim AHPrice As Double

    XMLReq.Open "GET", "https://www.ah.nl/producten/product/wi3640/lu-bastogne-biscuits-original", False
    XMLReq.send
    If XMLReq.Status <> 200 Then
        MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
        Exit Sub
    End If

    HTMLDoc.body.innerHTML = XMLReq.responseText
    Application.Wait Now + #12:00:03 AM#

    Set AHArticles = HTMLDoc.getElementsByTagName("article")
    For Each AHArticle In AHArticles
        If AHArticle.getAttribute("data-sku") = "wi3640" Then
            Set AHEuros = AHArticle.getElementsByClassName("price__integer")
            Set AHCents = AHArticle.getElementsByClassName("price__fractional")
            AHPriceEuro = AHEuros.Item(0).innerText
            AHPriceCent = AHCents.Item(0).innerText
            AHPrice = AHPriceEuro + (AHPriceCent / 100)
            Debug.Print AHPrice
            Exit For
        End If
    Next AHArticle
End Sub
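To narrow this down, it may help to check whether the price markup is present in the raw response at all. The request is synchronous, so responseText is already complete when send returns, and XMLHTTP does not execute the page's scripts. Below is a minimal diagnostic sketch, not a fix. The Sub name is made up for illustration, and late binding is used only so that the snippet runs without extra references.

```vba
' Diagnostic sketch (hypothetical helper, not part of the original code):
' reports whether the class names and SKU the scraper looks for occur
' anywhere in the HTML returned by the XMLHTTP request.
Sub Check_Price_Markup_In_Response()
    Dim XMLReq As Object
    Dim Html As String

    ' Late binding, so no reference to Microsoft XML v6.0 is required
    Set XMLReq = CreateObject("MSXML2.XMLHTTP.6.0")

    ' Same URL as in the question; False = synchronous request
    XMLReq.Open "GET", "https://www.ah.nl/producten/product/wi3640/lu-bastogne-biscuits-original", False
    XMLReq.send

    Html = XMLReq.responseText

    Debug.Print "Status: " & XMLReq.Status & " - " & XMLReq.statusText
    Debug.Print "Response length: " & Len(Html)

    ' InStr returns 0 when the search text is not found
    Debug.Print "Contains price__integer: " & (InStr(1, Html, "price__integer", vbTextCompare) > 0)
    Debug.Print "Contains price__fractional: " & (InStr(1, Html, "price__fractional", vbTextCompare) > 0)
    Debug.Print "Contains data-sku: " & (InStr(1, Html, "data-sku", vbTextCompare) > 0)
End Sub
```

If these checks print False, the price elements are presumably added by scripts after the initial HTML arrives. In that case waiting after the request would not change what getElementsByClassName can find.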



