Parse multiple lines of text with PowerShell and export to CSV

Question

I have multiple large log files that I'd like to export to CSV. To start with, I just want to split each entry into two parts, Date and Event. The problem I'm having is that not every line starts with a date.

Here is a sample chunk of the log. The date/time is always 23 characters. The rest varies with the log and event description.

[Image: sample log excerpt]

I'd like the end result to look like this in Excel.

[Image: desired result in Excel]

Here's what I've tried so far, but it just returns the first 23 characters of each line.

$content = Get-Content myfile.log -TotalCount 50
for ($i = 0; $i -lt $content.Length; $i++) {
    $a = $content[$i].ToCharArray()
    $b = ([string]$a[0..23]).replace(" ","")
    Write-Host $b
}

Could you post part of the log as text, please, so I can try something?
– ArcSet, Commented Sep 7, 2017 at 20:01
2017-09-04 12:31:11.343 General BOECD:: ProcessStartTime: Word: Length 3 [0917 1204 3029 ] Hex: Length 6 [17 09 04 12 29 30] . Display: False 2017-09-04 12:31:11.479 General MelsecIoWrapper: Scan ended: device: 1, ScanStart: 9/4/2017 12:31:10 PM Display: False 2017-09-04 12:31:11.705 General BOECD:: ProcessEndTime: Word: Length 3 [0917 1204 0931 ] Hex: Length 6 [17 09 04 12 31 09] . Display: False 2017-09-04 12:31:13.082 General BOECD:: DV Data:
– Jeremyn Horsley, Commented Sep 7, 2017 at 21:05
Note: In the actual log file, the date always starts a line, as in the picture above. When I pasted the sample, it just wrapped everything together.
– Jeremyn Horsley, Commented Sep 7, 2017 at 21:08
You should edit your question and put the sample text in there rather than responding to it, if for no other reason than the formatting issues you just encountered.
– TheMadTechnician, Commented Sep 7, 2017 at 21:09

TheMadTechnician · Accepted Answer · 2017-09-07 21:29:07Z

3

Read the file in raw as a single multi-line string, then use a regex to split it at each date pattern. For each chunk, create a custom object with the two properties you want: the first value is the first 23 characters, and the second is the rest of the string, trimmed.

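# Read the whole log as one string, then split just before each timestamp that starts a line.
(Get-Content C:\Path\To\File.log -Raw) -split '(?m)(?=^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})' |
    Where-Object { $_ } |    # drop the empty chunk produced before the first timestamp
    ForEach-Object {
        [PSCustomObject]@{
            'Col1' = $_.Substring(0, 23)        # the 23-character date/time
            'Col2' = $_.Substring(23).Trim()    # the rest of the event, which may span lines
        }
    }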

Then you can pipe that to Export-Csv, or do whatever you want with the data. Because the whole file is held in memory, this may not be viable if the files are truly massive, but I would expect it to work OK on files up to a few hundred megabytes. Using your sample text, that outputs:

Col1                    Col2
----                    ----
2017-09-04 12:31:11.343 General BOECD:: ProcessStartTime: ...
2017-09-04 12:31:11.479 General MelsecIoWrapper: Scan ended: device: 1, ScanStart: 9/4/2017 12:31:10 PM Display: False
2017-09-04 12:31:11.705 General BOECD:: ProcessEndTime: ...
2017-09-04 12:31:13.082 General BOECD:: DV Data:

The ... at the end of two of the lines marks where the console truncated the multi-line value in order to display it on screen; the full value is still there, intact.
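As a rough sketch of the "pipe that to a CSV" step, the objects can be sent to Export-Csv. The sample text, the property names Date and Event (used here in place of Col1 and Col2), and the temp-file output path are all assumptions made for illustration, not part of the original log or answer.

```powershell
# Hypothetical sample standing in for the real log; the first event spans two lines.
$log = @'
2017-09-04 12:31:11.343 General BOECD:: ProcessStartTime:
Word: Length 3 [0917 1204 3029 ]
2017-09-04 12:31:11.479 General MelsecIoWrapper: Scan ended: device: 1
'@

# Same split as above, with the columns named Date and Event.
$rows = @(
    $log -split '(?m)(?=^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})' |
        Where-Object { $_ } |
        ForEach-Object {
            [PSCustomObject]@{
                Date  = $_.Substring(0, 23)
                Event = $_.Substring(23).Trim()
            }
        }
)

# Assumed output location. -NoTypeInformation suppresses the "#TYPE" header
# line that Windows PowerShell would otherwise write as the first row.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'log-events.csv'
$rows | Export-Csv -Path $csvPath -NoTypeInformation

# Read the file back and list every property in full, without console truncation.
$roundTrip = @(Import-Csv -Path $csvPath)
$roundTrip | Format-List
```

Format-List is used for the check at the end because the default table view is what shortens long or multi-line values with "..." on screen.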

(?=...) is a so-called "positive lookahead assertion". Such assertions cause a regular expression to match the given pattern without actually including it in the returned match/string. In this case the match returns the empty string before a timestamp, so the string can be split there without removing the timestamp.
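A small illustration of that difference, using a made-up two-line string and a shortened date-only pattern (both are assumptions chosen to keep the example brief): splitting on the pattern itself consumes the date, while splitting on the lookahead keeps it.

```powershell
# Hypothetical two-event sample; `n is a newline inside a double-quoted string.
$text = "2017-09-04 12:31:11.343 first`n2017-09-04 12:31:11.479 second"

# Plain pattern: the matched date is treated as the delimiter and removed.
$consumed = @($text -split '(?m)^\d{4}-\d{2}-\d{2} ' | Where-Object { $_ })

# Lookahead: the match is the empty position before the date, so nothing is removed.
$kept = @($text -split '(?m)(?=^\d{4}-\d{2}-\d{2} )' | Where-Object { $_ })

"Consumed: " + ($consumed | ForEach-Object { $_.Trim() } | Out-String)
"Kept:     " + ($kept | ForEach-Object { $_.Trim() } | Out-String)
```

Both splits produce two chunks, but only the lookahead version leaves each chunk starting with its date.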

edited Sep 7, 2017 at 21:29

answered Sep 7, 2017 at 21:17

2 Comments

Ansgar Wiechers Over a year ago

I would make the pattern (?m)(?=^\d{4}-...) to match timestamps at the beginning of a line specifically. The hyphens and colons don't need to be escaped, BTW.

TheMadTechnician Over a year ago

Thanks. I have a hard time remembering what all counts as a reserved character in RegEx, so I tend to over-escape sometimes. I have also updated the answer to reflect your suggestion of only matching date/times at the beginning of a line; that is an excellent idea.
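To see what the (?m) option changes, here is a minimal sketch on a made-up two-line string with a shortened date-only pattern (both are assumptions for illustration). Without (?m), ^ matches only at the very start of the input, so the text is never split; with it, ^ also matches after each newline.

```powershell
# Hypothetical two-event sample; `n is a newline inside a double-quoted string.
$text = "2017-09-04 12:31:11.343 first`n2017-09-04 12:31:11.479 second"

# Without (?m): ^ matches only at position 0, so the whole text stays in one chunk.
$withoutM = @($text -split '(?=^\d{4}-\d{2}-\d{2} )' | Where-Object { $_ })

# With (?m): ^ also matches right after each newline, so each line-initial date starts a chunk.
$withM = @($text -split '(?m)(?=^\d{4}-\d{2}-\d{2} )' | Where-Object { $_ })

"Without (?m): $($withoutM.Count) chunk(s)"
"With (?m):    $($withM.Count) chunk(s)"
```

The hyphens in the pattern are left unescaped, as suggested in the comment above, since a hyphen only has special meaning inside a character class.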
