0

I am trying to process the results of a web query. I am doing

$data = Invoke-WebRequest $uri $data.RawContent

which returns this

HTTP/1.1 200 OK Transfer-Encoding: chunked Expires: Thu, 01 Jan 1970 00:00:00 GMT Set-Cookie: JSESSIONID=19gbg0sejwxXXXXXXpzwlasaaeer;Path=/informer;Secure Server: Jetty(8.1.8.v20121106) <table><tr><td>username</td><td>firstname</td><td>lastname</td><td>idnumber</td><td>auth</td><td>course1</td></tr><tr valign="top"><td style="mso-number-format:'\@';">d_smith</td><td style="mso-number-format:'\@';">Daniel</td><td style="mso-number-format:'\@';">smith</td><td style="mso-number-format:'\@';">2221840</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-101</td></tr><tr valign="top"><td style="mso-number-format:'\@';">d_rutherford</td><td style="mso-number-format:'\@';">Daniel</td><td style="mso-number-format:'\@';">rutherford</td><td style="mso-number-format:'\@';">2223039</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-101</td></tr><tr valign="top"><td style="mso-number-format:'\@';">j_smithe21</td><td style="mso-number-format:'\@';">James</td><td style="mso-number-format:'\@';">smithe</td><td style="mso-number-format:'\@';">2221844</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-101</td></tr><tr valign="top"><td style="mso-number-format:'\@';">h_phillipsiii</td><td style="mso-number-format:'\@';">Harvey</td><td style="mso-number-format:'\@';">phillips III</td><td style="mso-number-format:'\@';">2221845</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-101</td></tr><tr valign="top"><td style="mso-number-format:'\@';">h_phillipsiii</td><td style="mso-number-format:'\@';">Harvey</td><td style="mso-number-format:'\@';">phillips III</td><td style="mso-number-format:'\@';">2221845</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-102</td></tr><tr valign="top"><td style="mso-number-format:'\@';">d_rutherford</td><td style="mso-number-format:'\@';">Daniel</td><td style="mso-number-format:'\@';">rutherford</td><td style="mso-number-format:'\@';">2223039</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-102</td></tr><tr valign="top"><td style="mso-number-format:'\@';">j_smithe21</td><td style="mso-number-format:'\@';">James</td><td style="mso-number-format:'\@';">smithe</td><td style="mso-number-format:'\@';">2221844</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-102</td></tr><tr valign="top"><td style="mso-number-format:'\@';">d_smith</td><td style="mso-number-format:'\@';">Daniel</td><td style="mso-number-format:'\@';">smith</td><td style="mso-number-format:'\@';">2221840</td><td style="mso-number-format:'\@';">ldap</td><td style="mso-number-format:'\@';">2017CE1-CTP-8001-102</td></tr><tr valign="top"><td></td></tr></table>

notice there is no HTML or body tags. As best I can figure, this is why I am not seeing the ParsedHTML option. These are the only properties $data has are shown below $data Properties.

$data.Content returns the character codes to make up the text $data.RawContent produces. This is the table when it is rendered in a web browser rendered table

How can I convert this table to a CSV or PSCustomObject or something similar so that I can iterate through the lines and filter and process the data?

1

1 Answer 1

1

Since the response is valid XML, you can use PowerShells XML parsing ability, like this:

$data = Invoke-WebRequest $uri
[xml]$x = $data.RawContent

$first = $true
$OldOFS = $OFS
$OFS = '","'

foreach ($tr in $x.table.tr)
{
    if ($first)
    {
        $first = $false
        """$($tr.td)""" | Out-File "out.csv"
    }
    elseif ($tr.td)
    {
        """$($tr.td.'#text')""" | Out-File "out.csv" -Append
    }
}

$OFS = $OldOFS

notepad out.csv

The $OFS hack is just quick and dirty to get the arrays converted to something csv-like. You could for example also use the -join operator.

The $tr.td in the first row produces an array of header strings, the $tr.td.'#text' produces the array of values for each remaining, non-empty row.

Hope that helps.

Sign up to request clarification or add additional context in comments.

2 Comments

YAAASSSSSSSSSSS
When I put in the table, I wasn't 100% correct. there was some additional HTML headers and such before it. i.imgur.com/ixc7X7O.png I took the lazy way and removed the content before <table> with [xml]$x = $data.RawContent.Remove(0,$data.RawContent.IndexOf("<table>") -1 ) The rest was just like you had, but your answer is 100% correct before the edit

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.