I would like to scrape a product price from a single web page, ideally with an XMLHTTP request. Before the script runs, the correct store has to be selected (stored in a browser cookie, or passed along with the request in some other way if possible), because prices differ between stores.

I have working code, but it takes a very long time to run, so I assume there must be a faster and cleaner way :). I also had to add Application.Wait calls so that the website can keep up with each step.

My current VBA code:

  • opens the website in Internet Explorer, selects the desired store in several clicks, and saves it in a cookie (as a site user would)
  • then requests the product page in a second Internet Explorer session and extracts the data. I found out I can't use the XMLHTTP request because it doesn't send the cookie with the correct store, so it doesn't show the correct price.
  • The price I'm after (in the example below) is € 1,39 instead of € 1,48 (the price shown when no cookie is set and no store is selected).
  • The store is saved in the cookie "www.jumbo.com/cookie/HomeStore". Its content holds the store tag, which is known upfront and could be hardcoded in a request if possible.

Selecting the correct store (and saving it in a browser cookie)

   Sub SetStore()

    Dim IE As New SHDocVw.InternetExplorer
    Dim HTMLDoc As MSHTML.HTMLDocument

    Dim HTMLSearchbox As MSHTML.IHTMLElement
    Dim HTMLSearchboxes As MSHTML.IHTMLElementCollection
    Dim HTMLButton As MSHTML.IHTMLElement
    Dim HTMLButtons As MSHTML.IHTMLElementCollection
    Dim HTMLSearchButton As MSHTML.IHTMLElement
    Dim HTMLSearchButtons As MSHTML.IHTMLElementCollection
    Dim HTMLStoreID As MSHTML.IHTMLElement
    Dim HTMLStoreIDs As MSHTML.IHTMLElementCollection
    Dim HTMLSaveStore As MSHTML.IHTMLElement
    Dim HTMLSaveStores As MSHTML.IHTMLElementCollection


    'set to False to hide the IE window
    IE.Visible = True

    'navigate to url with limited content
    IE.navigate "https://www.jumbo.com/content/algemene-voorwaarden/"

    'wait until the page has loaded; DoEvents keeps Excel responsive
    Do While IE.readyState <> READYSTATE_COMPLETE
        DoEvents
    Loop
    Set HTMLDoc = IE.document

    Set HTMLButtons = HTMLDoc.getElementsByTagName("button")


    For Each HTMLButton In HTMLButtons

        If HTMLButton.getAttribute("data-jum-action") = "openHomeStoreFinder" Then
           HTMLButton.Click
            Exit For
        End If

     Next HTMLButton


       Application.Wait Now + #12:00:02 AM#

    Set HTMLSearchboxes = HTMLDoc.getElementsByTagName("input")

    For Each HTMLSearchbox In HTMLSearchboxes

        If HTMLSearchbox.getAttribute("id") = "searchTerm__DkKYx4XylsAAAFJktpb2Guy" Then

            'input field store name/location to show search results
            HTMLSearchbox.Value = "Oosterhout"

            Application.Wait Now + #12:00:03 AM#

            HTMLSearchbox.Click

            Exit For
        End If

    Next HTMLSearchbox

     Set HTMLSearchButtons = HTMLDoc.getElementsByTagName("button")

    For Each HTMLSearchButton In HTMLSearchButtons

        If HTMLSearchButton.getAttribute("data-jum-filter") = "search" Then
            HTMLSearchButton.Click

            Exit For
        End If

    Next HTMLSearchButton

    Application.Wait Now + #12:00:05 AM#

    Set HTMLStoreIDs = HTMLDoc.getElementsByTagName("li")

    For Each HTMLStoreID In HTMLStoreIDs

        'oosterhout = YC8KYx4XB88AAAFIDcIYwKxJ
        'nieuwegein = 84IKYx4XziUAAAFInSYYwKrH
        'vaassen = JYYKYx4XC1oAAAFItvcYwKxJ
        'brielle = OG8KYx4XP4wAAAFIlsEYwKxK

        If HTMLStoreID.getAttribute("data-jum-store-id") = "YC8KYx4XB88AAAFIDcIYwKxJ" Then

            HTMLStoreID.Click

            Application.Wait Now + #12:00:03 AM#

            Exit For
        End If

    Next HTMLStoreID

  Set HTMLSaveStores = HTMLDoc.getElementsByTagName("button")


  For Each HTMLSaveStore In HTMLSaveStores

        If HTMLSaveStore.getAttribute("data-jum-action") = "saveHomeStore" Then
            HTMLSaveStore.Click


            Exit For
       End If

    Next HTMLSaveStore


   'IE.Quit

End Sub

Extracting data from product page (IE HTTP request, working with cookie store value)

Sub GetJumboPriceIE()


Dim IE As New SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument
Dim JumInputs As MSHTML.IHTMLElementCollection
Dim JumInput As MSHTML.IHTMLElement
Dim JumPrice As MSHTML.IHTMLElement
Dim JumboPrice As Double
Dim Price_In_Cents_Tag As String

Dim SKU_tag As String, SKU_url As String

SKU_tag = "173140KST"
SKU_url = "https://www.jumbo.com/lu-bastogne-koeken-original-260g/173140KST/"

IE.Visible = False
IE.navigate SKU_url

'wait until the page has loaded; DoEvents keeps Excel responsive
Do While IE.readyState <> READYSTATE_COMPLETE
    DoEvents
Loop

Set HTMLDoc = IE.document

Price_In_Cents_Tag = "PriceInCents_" & SKU_tag

Set JumPrice = HTMLDoc.getElementById(Price_In_Cents_Tag)

JumboPrice = JumPrice.getAttribute("value") / 100
Debug.Print JumboPrice

'close IE only after the value has been read from the document
IE.Quit


End Sub

The code above works and prints the price of 1,39, but I would like to use an XMLHTTP request like the one below (with the correct store).

Extracting data from product page (using XML HTTP request), but cookie value is not used

Sub GetJumboPriceXML()

Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument

Dim JumInputs As MSHTML.IHTMLElementCollection
Dim JumInput As MSHTML.IHTMLElement
Dim JumPrice As MSHTML.IHTMLElement
Dim JumboPrice As Double
Dim Price_In_Cents_Tag As String

Dim SKU_tag As String, SKU_url As String

SKU_tag = "173140KST"
SKU_url = "https://www.jumbo.com/lu-bastogne-koeken-original-260g/173140KST/"


XMLReq.Open "GET", SKU_url, False
XMLReq.send

If XMLReq.Status <> 200 Then
    MsgBox "Problem" & vbNewLine & XMLReq.Status & " - " & XMLReq.statusText
    Exit Sub
End If

  HTMLDoc.body.innerHTML = XMLReq.responseText

Set JumInputs = HTMLDoc.getElementsByTagName("input")


Price_In_Cents_Tag = "PriceInCents_" & SKU_tag

Set JumPrice = HTMLDoc.getElementById(Price_In_Cents_Tag)

JumboPrice = JumPrice.getAttribute("value") / 100
Debug.Print JumboPrice



End Sub

This code does not use the correct store and prints the price I'm not after (1,48).
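One untested idea is to send the store cookie by hand. The sketch below swaps MSXML2.XMLHTTP60 for WinHttp.WinHttpRequest.5.1 (created late-bound), since XMLHTTP tends to manage cookies itself, and sets a Cookie header named HomeStore with the Oosterhout store ID from the comments in SetStore. That the site accepts a bare HomeStore=<store id> cookie is an assumption; the real cookie may need a different format or additional cookies. The Sub name and constants are only illustrative.

```vba
Sub GetJumboPriceWithCookie()

    'ASSUMPTION: the site picks the store from a cookie "HomeStore=<store id>"
    Const STORE_ID As String = "YC8KYx4XB88AAAFIDcIYwKxJ"   'oosterhout
    Const SKU_TAG As String = "173140KST"
    Const SKU_URL As String = "https://www.jumbo.com/lu-bastogne-koeken-original-260g/173140KST/"

    Dim Req As Object
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim JumPrice As MSHTML.IHTMLElement

    Set Req = CreateObject("WinHttp.WinHttpRequest.5.1")

    Req.Open "GET", SKU_URL, False
    'send the store cookie explicitly instead of relying on the browser
    Req.SetRequestHeader "Cookie", "HomeStore=" & STORE_ID
    Req.Send

    If Req.Status <> 200 Then
        MsgBox "Problem" & vbNewLine & Req.Status & " - " & Req.StatusText
        Exit Sub
    End If

    HTMLDoc.body.innerHTML = Req.ResponseText

    Set JumPrice = HTMLDoc.getElementById("PriceInCents_" & SKU_TAG)

    If JumPrice Is Nothing Then
        MsgBox "Price element not found"
        Exit Sub
    End If

    Debug.Print CDbl(JumPrice.getAttribute("value")) / 100

End Sub
```

If the printed price is still 1,48, comparing the cookies Internet Explorer holds after SetStore with the header sent here would show what is missing.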


To summarize:

When no store is selected (no cookie set), the product URL above gives a price of €1,48.

I would like the VBA script to set the store to “Jumbo Oosterhout Nieuwe Bouwlingstraat”, then scrape a predefined list of product URLs and extract the prices (the URL above gives €1,39).

Then set the store to a different local store, “Jumbo Brielle Thoelaverweg”, and scrape the same list of product URLs. For this store the URL above gives €1,48.

You can select a different store by clicking on the location pin icon at the top right of the page.
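
The flow I have in mind could be sketched roughly as below: an outer loop over the store IDs and an inner loop over the product URLs. ScrapeStores and PriceForStore are hypothetical names, the SKU is taken from the second-to-last URL segment, and the way the store is passed (a HomeStore cookie header on a WinHttp.WinHttpRequest.5.1 request) is an untested assumption, not something the site is known to accept.

```vba
Sub ScrapeStores()

    Dim Stores As Variant, Urls As Variant
    Dim Store As Variant, Url As Variant

    'store IDs taken from the comments in SetStore (oosterhout, brielle)
    Stores = Array("YC8KYx4XB88AAAFIDcIYwKxJ", "OG8KYx4XP4wAAAFIlsEYwKxK")
    Urls = Array("https://www.jumbo.com/lu-bastogne-koeken-original-260g/173140KST/")

    For Each Store In Stores
        For Each Url In Urls
            Debug.Print Store, Url, PriceForStore(CStr(Url), CStr(Store))
        Next Url
    Next Store

End Sub

'Returns the price in euros, or Empty when the request or the lookup fails
Function PriceForStore(ByVal Url As String, ByVal StoreId As String) As Variant

    Dim Req As Object
    Dim HTMLDoc As New MSHTML.HTMLDocument
    Dim JumPrice As MSHTML.IHTMLElement
    Dim Parts() As String, Sku As String

    'the URL ends with "/", so the SKU is the second-to-last segment
    Parts = Split(Url, "/")
    Sku = Parts(UBound(Parts) - 1)

    Set Req = CreateObject("WinHttp.WinHttpRequest.5.1")
    Req.Open "GET", Url, False
    'ASSUMPTION: the store is read from a cookie "HomeStore=<store id>"
    Req.SetRequestHeader "Cookie", "HomeStore=" & StoreId
    Req.Send

    If Req.Status <> 200 Then Exit Function

    HTMLDoc.body.innerHTML = Req.ResponseText
    Set JumPrice = HTMLDoc.getElementById("PriceInCents_" & Sku)

    If Not JumPrice Is Nothing Then
        PriceForStore = CDbl(JumPrice.getAttribute("value")) / 100
    End If

End Function
```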

Thanks a lot for your help

  • Hi Maurice. You have to put @qharr for me to be notified of a message in chat. I have only just seen your message. Last I remember I had reached the point that unless the API designers have a way then I would probably go the long winded way of automating setting the local store between each set of price extractions. Commented Sep 30, 2018 at 12:27
  • I raised an issue here but no answer as yet. Commented Sep 30, 2018 at 12:47
  • @QHarr, sorry, I didn't know that :) The question about Albert Heijn was solved by your other reply in a different post and works perfectly now using the REST API request, stackoverflow.com/questions/52181293/… The current open question is about Jumbo, a different supermarket where prices differ between stores. Please let me know if the question is still clear to you, or if I should elaborate. Thanks! Commented Oct 1, 2018 at 13:38
  • Oopsy there! I am losing track Commented Oct 1, 2018 at 13:39
  • Hi @Qharr, is there any way you could help me further in this project? Sorry to bother you again! I summarized the question again in the edited topic (at the bottom of the post) with actual prices to illustrate the question. Please let me know if I can help with additional info. Thanks! Commented Oct 15, 2018 at 18:29
