
I've used Invoke-WebRequest to pull a page into a variable, $content. I then assign the result of $content.ParsedHtml.getElementsByTagName to another variable, $x. $x[1] returns several lines of HTML. However, I am unable to split those lines of HTML into an array.

$content = Invoke-WebRequest -Uri $Uri
$x = $content.ParsedHtml.getElementsByTagName('TR')
$x[1].outerHTML

If I write the HTML out to a text file, I can then read it back into an array, but I was hoping to skip that step. Any suggestions would be most appreciated.

  • Looks like you are trying to read a web table. In my experience, that can be pretty difficult. You'd think there would be a simple way, but... Anyway, Lee Holmes built a tool I've used many times over the last few years that does exactly what I think you are trying to do: "Extracting Tables from PowerShell’s Invoke-WebRequest". I can't tell you exactly how to get the data in your particular case without seeing it, but the utility is pretty straightforward. Commented Apr 3, 2018 at 19:12
  • Thanks Brendan - will definitely look into the article you linked. Commented Apr 3, 2018 at 19:17
  • Any chance the site exposes the data you need via a WSDL/REST API? Commented Apr 3, 2018 at 19:37
  • Dude - I'm like a total n00b. I really have no idea. Commented Apr 3, 2018 at 20:04
  • We've all been there. See how far you can get with Lee Holmes' code (a name every n00b should learn!) and post back if you have problems. What you are after is kind of involved, but you'll figure it out. Also, try to remember to upvote/downvote and otherwise indicate helpful answers on SO! We appreciate the points :) Commented Apr 3, 2018 at 20:13

2 Answers


Found a solution, though I am open to alternative answers. This is what works for me:

$z = $x[1].innerHTML.ToString() -split([Environment]::NewLine)

Thanks all for the input I received.
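
A variation on the split above, as a minimal sketch: [Environment]::NewLine is CRLF on Windows, so the split only takes effect when the HTML actually contains CRLF line endings. Because -split accepts a regular expression, the pattern '\r?\n' covers both CRLF and LF. The $html string here is a hypothetical stand-in for $x[1].innerHTML, so the snippet runs without a web request; the trimming and blank-line filtering are optional extras, not part of the original answer.

```powershell
# Hypothetical sample standing in for $x[1].innerHTML (mixed CRLF and LF endings)
$html = "<td>Alfreds</td>`r`n<td>Maria Anders</td>`n`n<td>Germany</td>"

# -split takes a regular expression; '\r?\n' matches both CRLF and LF
$lines = @($html -split '\r?\n' |
    ForEach-Object { $_.Trim() } |      # strip leading/trailing whitespace
    Where-Object { $_ -ne '' })         # drop empty lines

$lines.Count   # 3
$lines[1]      # <td>Maria Anders</td>
```

Wrapping the pipeline in @( ) keeps $lines an array even when only one line survives the filter.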


I tried the following which seems to work (I've included the URL I tested against).

Each <tr>...</tr> and everything between the tags will be an entry in the Array.

Essentially just expand the outerHTML and cast it to an array.

$uri = "https://www.w3schools.com/html/html_tables.asp"
$content = Invoke-WebRequest -Uri $Uri
[array]$x = $content.ParsedHtml.getElementsByTagName('TR') | Select-Object -ExpandProperty outerHTML

1 Comment

Thank you for the response. In my case, $x[n] returns several lines of HTML. Where I am running into the problem is in taking the contents of $x[n] and loading them into a new array.
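
One way to picture the step described in this comment is to split every row's HTML into its own array of lines. This is only a sketch under assumptions: the $x below is a hypothetical hard-coded stand-in for the array of outerHTML strings, and the '\r?\n' pattern assumes the rows are broken on CRLF or LF.

```powershell
# Hypothetical stand-in for the [array]$x of outerHTML strings
$x = @(
    "<tr>`n<th>Company</th>`n<th>Country</th>`n</tr>",
    "<tr>`n<td>Alfreds</td>`n<td>Germany</td>`n</tr>"
)

# Build one array of lines per row; the leading comma keeps each
# inner array intact instead of letting the pipeline flatten it
$rows = @($x | ForEach-Object { , @($_ -split '\r?\n') })

$rows.Count    # 2
$rows[1][1]    # <td>Alfreds</td>
```

With this shape, $rows[n] is the new array for row n, and $rows[n][m] is a single line within it.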
