I have a PowerShell script I'm using to parse each row in a file, reformat it, and write the new string to an output file. It works fine with an input file of a few hundred lines. However, I ultimately need to run it against a file with a few million lines; I have been waiting hours and it still hasn't finished. Following this post, I think I need to put Write-Output outside of the loop, but I've been unsuccessful so far.

This is my current code:

Foreach ($line in Get-Content $logFile) {

    $arr = $line.Split()

    $port1 = $arr[9].Split(":")

    $port2 = $arr[11].Split(":")

    $connstring = '|' + $port1[0] + "|" + $port1[1] + "|" + $port2[0] + "|" + $port2[1] + "|" + $arr[4] + "|"

    Write-Output $connstring | Out-File "C:\logging\output\logout.txt" -Append 
}

An example of an input string is:

06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043

And I need to reformat it to this:

|67.202.196.92|80|192.168.1.105|55043|other|

Any help is much appreciated!

  • Do you only need to capture the IP/port and content after that? A regex might be able to accomplish what you want faster. Commented Jul 1, 2017 at 23:49
  • Yes, correct. The IPs, ports and the tag (in this case 'other'). Commented Jul 2, 2017 at 0:03
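
The regex idea from the comment above could look roughly like this. It is only a sketch: the pattern and the group names (tag, ip1, port1, ip2, port2) are my own assumptions, and it presumes every line has the same `[n:n:n] tag [**] ... {PROTO} ip:port -> ip:port` shape as the sample in the question.

```powershell
# Sample line from the question.
$line = '06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043'

# Assumed layout: "[n:n:n] tag [**] ... {PROTO} ip:port -> ip:port".
$pattern = '\[\d+:\d+:\d+\]\s+(?<tag>\S+)\s+\[\*\*\].*\}\s+(?<ip1>[\d.]+):(?<port1>\d+)\s+->\s+(?<ip2>[\d.]+):(?<port2>\d+)'

if ($line -match $pattern) {
    # $Matches holds the named groups captured by -match.
    '|{0}|{1}|{2}|{3}|{4}|' -f $Matches.ip1, $Matches.port1, $Matches.ip2, $Matches.port2, $Matches.tag
}
```

For the sample line this should give the string asked for in the question. Whether it is actually faster than splitting would need to be measured on real data.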

3 Answers

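If you pipe Get-Content into ForEach-Object, the file is streamed one row at a time rather than being read entirely into memory first, which is what foreach ($line in Get-Content ...) does (-ReadCount 1 is the default, so specifying it only makes that explicit). I suspect that moving the write operation outside of your loop will be faster, since Out-File -Append reopens the output file on every iteration. Fewer variables and steps inside your loop will probably help too.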
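
For what it's worth, the original loop can also be kept and only the write moved outside it. A minimal sketch, assuming the same $logFile and output path as in the question; the `& { ... }` wrapper is just one way to make the output of a foreach statement pipeable.

```powershell
# Wrap the foreach statement in a script block so its output can be piped;
# Out-File then opens the output file once instead of once per line.
& {
    foreach ($line in Get-Content $logFile) {
        $arr   = $line.Split()
        $port1 = $arr[9].Split(':')
        $port2 = $arr[11].Split(':')
        # A bare string expression is emitted to the pipeline.
        "|$($port1[0])|$($port1[1])|$($port2[0])|$($port2[1])|$($arr[4])|"
    }
} | Out-File "C:\logging\output\logout.txt"
```

Note that the foreach statement still collects every line from Get-Content before the loop starts, so this variant only addresses the per-line file write, not memory use.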

Assuming the element at index 4 after the split doesn't contain a colon (you didn't supply an example of your file), something like this should do the trick:

Get-Content $logFile -ReadCount 1 | % {
    '|' + (($_.Split()[9, 11, 4] -replace ':', '|') -join '|') + '|' 
} | Out-File "C:\logging\output\logout.txt"
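
To see why indices 9, 11 and 4 are the right ones, the one-liner can be taken apart on the sample line from the question. This is only a walk-through; $parts and $picked are names I made up for illustration.

```powershell
$line  = '06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043'
$parts = $line.Split()        # no arguments: split on whitespace, empty entries are kept

# The double space after the timestamp produces an empty element at index 1,
# which is why the tag sits at index 4 and the two endpoints at 9 and 11.
$picked = $parts[9, 11, 4]    # array slice, returned in this order

# -replace works element-wise on arrays, so each "ip:port" becomes "ip|port".
'|' + (($picked -replace ':', '|') -join '|') + '|'
```

If the real log ever has a different number of spaces or fields before the addresses, the indices shift, so it is worth checking a few lines this way first.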

2 Comments

Thank you Dave; so far, my initial testing with your code seems a little faster but unfortunately not by much. I've updated my original post with an example of an input string and the reformatted string. Perhaps my code logic needs tweaked?
Update, this code was substantially faster. Thank you Dave!

It might help to remove the addition in your string construction:

$connstring = "|$($port1[0])|$($port1[1])|$($port2[0])|$($port2[1])|$($arr[4])|"

Try using Measure-Command to test with sample data sets.
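
A rough sketch of such a timing run, assuming $logFile from the question. The sample size of 10,000 lines and the temp file names are arbitrary choices of mine, not anything from the original post.

```powershell
# Hypothetical sample: the first 10,000 lines of the real log.
$tmp    = [System.IO.Path]::GetTempPath()
$sample = Join-Path $tmp 'sample.log'
Get-Content $logFile -TotalCount 10000 | Set-Content $sample

# Time one candidate; swap the body of the script block to compare others.
$elapsed = Measure-Command {
    Get-Content $sample | ForEach-Object {
        '|' + (($_.Split()[9, 11, 4] -replace ':', '|') -join '|') + '|'
    } | Out-File (Join-Path $tmp 'logout-test.txt')
}

# Measure-Command returns a TimeSpan.
$elapsed.TotalSeconds
```

Running each variant a few times on the same sample gives a fairer comparison, since the first run may include disk caching effects.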


Try something like this:

$test="06/14-04:40:11.371923  [**] [1:4:0] other [**] [Priority: 0] {TCP} 67.202.196.92:80 -> 192.168.1.105:55043"

$template=@"
{Row:06/14-04:40:11.371923  [**] [1:4:0] {Text:other} [**] [Priority: 0] \{TCP\} {IPIN:67.202.196.92}:{PORTIN:80} -> {IPOUT:192.168.1.105}:{PORTOUT:55043}}
"@

$test| ConvertFrom-String -TemplateContent $template |%{"|{0}|{1}|{2}|{3}|{4}|" -f $_.Row.IPIN, $_.Row.PORTIN, $_.Row.IPOUT , $_.Row.PORTOUT , $_.Row.Text }

But you could export directly to CSV like this:

$template=@"
{Row:06/14-04:40:11.371923  [**] [1:4:0] {Text:other} [**] [Priority: 0] \{TCP\} {IPIN:67.202.196.92}:{PORTIN:80} -> {IPOUT:192.168.1.105}:{PORTOUT:55043}}
"@

Get-Content $logFile | ConvertFrom-String -TemplateContent $template | % {
    [pscustomobject]@{
        IPIN    = $_.Row.IPIN
        PORTIN  = $_.Row.PORTIN
        IPOUT   = $_.Row.IPOUT
        PORTOUT = $_.Row.PORTOUT
        Text    = $_.Row.Text
    }
} | Export-Csv "C:\logging\output\logout.csv" -Append -NoTypeInformation

1 Comment

In the second code the last line is an alternative foreach and should IMO be commented out/deleted. Otherwise +1 I think ConvertFrom-String with a template is heavily underestimated
