Can someone help me find a way to load a public web page that requires JavaScript and blocks access from developer tools? I had an automated process that worked as follows.

$TdyDate = $(get-date -f yyyyMMdd)
$wsjurl = "https://www.wsj.com/print-edition/$TdyDate/frontpage"
$wsjweb = Invoke-WebRequest -Uri $wsjurl -UseBasicParsing

This recently started generating "Please enable JS and disable any ad blocker" errors.

Based on this Stack Overflow post, I tried the following. It gets me past these errors, but it only pulls down an "Access Blocked" landing page instead of the full web page that renders in my browser.

Set-Alias msedge 'C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe'
msedge --headless --dump-dom --disable-gpu $wsjurl

If anyone could help me figure out a way around this, it would be greatly appreciated. The web page I'm targeting is publicly accessible.

  • Try making the request with Postman and see whether you get the same error. Postman is robust and adds HTTP headers to the request automatically. If Postman works, check the Postman Console for the raw request, then add any HTTP headers that Postman added to your PowerShell request. Issues like this are often caused by the User-Agent header in the PowerShell request differing from the one Postman sends. Commented Mar 16 at 17:17
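
As a minimal sketch of the suggestion in this comment, the request parameters can be collected in a hashtable and splatted onto Invoke-WebRequest, with a browser-like User-Agent and a couple of extra headers. The header values below are illustrative assumptions, not headers the site is known to require, and nothing here guarantees the request gets past the block.

```powershell
# Hypothetical header values for illustration only;
# replace them with whatever the Postman Console shows.
$TdyDate = Get-Date -Format 'yyyyMMdd'
$wsjurl  = "https://www.wsj.com/print-edition/$TdyDate/frontpage"

$requestParams = @{
    Uri             = $wsjurl
    # Browser-like User-Agent string that ships with PowerShell
    UserAgent       = [Microsoft.PowerShell.Commands.PSUserAgent]::Chrome
    Headers         = @{
        'Accept'          = 'text/html,application/xhtml+xml'
        'Accept-Language' = 'en-US,en;q=0.9'
    }
    UseBasicParsing = $true
}
```

The request would then be sent with `$wsjweb = Invoke-WebRequest @requestParams`.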

1 Answer

The following code snippet could help:

$wsjDate = Get-Date
# The pages are not defined on Sundays, so fall back to Saturday's page.
if ( $wsjDate.DayOfWeek -eq [System.DayOfWeek]::Sunday ) {
    $TdyDate = "{0:yyyyMMdd}" -f $wsjDate.AddDays( -1 )  # Sunday -> Saturday
} else {
    $TdyDate = "{0:yyyyMMdd}" -f $wsjDate
}

$wsjurl = "https://www.wsj.com/print-edition/$TdyDate/frontpage"
$wsjweb = Invoke-WebRequest -Uri $wsjurl -Method Options -UseBasicParsing

Explanation:

  • the seemingly complicated calculation of $TdyDate accounts for the fact that the pages are not defined on Sundays,
  • -Method Options circumvents the "Please enable JS and disable any ad blocker" error, so that
  • $wsjweb.Content contains the full web page code: <!DOCTYPE html><html lang="en-US"> … … … </script></body></html>
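
The points above could be wrapped into two small helpers, sketched here under some assumptions: Get-WsjPrintDate and Test-WsjBlocked are hypothetical names, and the check for a block page assumes the returned HTML literally contains one of the two messages quoted in the question, which may not hold if the site changes its wording.

```powershell
function Get-WsjPrintDate {
    # Hypothetical helper: returns the print-edition date as yyyyMMdd,
    # falling back from Sunday to the preceding Saturday.
    param([datetime]$Date = (Get-Date))

    if ($Date.DayOfWeek -eq [System.DayOfWeek]::Sunday) {
        $Date = $Date.AddDays(-1)
    }
    $Date.ToString('yyyyMMdd', [System.Globalization.CultureInfo]::InvariantCulture)
}

function Test-WsjBlocked {
    # Hypothetical helper: $true when the HTML contains one of the block
    # messages quoted in the question (assumed wording, case-insensitive).
    param([string]$Content)

    foreach ($marker in 'Please enable JS', 'Access Blocked') {
        if ($Content -like "*$marker*") { return $true }
    }
    return $false
}

$wsjurl = "https://www.wsj.com/print-edition/$(Get-WsjPrintDate)/frontpage"
```

After the request, `Test-WsjBlocked $wsjweb.Content` gives a quick way to tell whether the real page came back before any further parsing.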

Moreover, $wsjweb.Headers may shed light on the problem (see the X-XSS-Protection and X-Content-Type-Options headers):

$wsjweb.Headers # truncated

Key                       Value
---                       -----
…
X-XSS-Protection          1; mode=block
X-Content-Type-Options    nosniff
…
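
Rather than scanning the whole header dump, the headers of interest can be picked out by name. This is only a sketch: Get-HeaderSubset is a hypothetical helper, and it assumes the headers object behaves like a dictionary, as $wsjweb.Headers does.

```powershell
function Get-HeaderSubset {
    # Hypothetical helper: emits Key/Value objects for the requested
    # header names that are present in the dictionary.
    param(
        [System.Collections.IDictionary]$Headers,
        [string[]]$Name
    )

    foreach ($n in $Name) {
        if ($Headers.Contains($n)) {
            [pscustomobject]@{ Key = $n; Value = $Headers[$n] }
        }
    }
}
```

For example, `Get-HeaderSubset -Headers $wsjweb.Headers -Name 'X-XSS-Protection', 'X-Content-Type-Options'` would list just the two headers mentioned above.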