0

I've a task and don't have an easy way to parse some data into correct format. The text file i have is in following format

#N Last Name: Joe
#D First Name: Doe
#P Middle Name: A
Some Data:
#C ID Number: (1) 12345
#S Status: (1) Active 

#N Last Name: Jane
#D First Name: Doee
#P Middle Name: 
Some Data:
#C ID Number: (1) 11111
#S Status: (1) Active 
ID Number: (2) 1231
Status: (2) Active

Here is the code i was trying to using.

$A = Select-String -Pattern "#N" MYFILE.txt;
$B = Select-String -Pattern "#D" MYFILE.txt;
$C = Select-String -Pattern "#P" MYFILE.txt;
$D = Select-String -Pattern "#C" MYFILE.txt;
$E = Select-String -Pattern "#S" MYFILE.txt;

$wrapper = New-Object PSObject -Property @{ FirstColumn = $A; SecondColumn = $B; ThirdColumn = $C; FourthColumn = $D; FifthColumn = $E }
Export-Csv -InputObject $wrapper -Path .\output.csv -NoTypeInformation

This is the result i'm getting

"SecondColumn","ThirdColumn","FifthColumn","FourthColumn","FirstColumn"
"System.Object[]","System.Object[]","System.Object[]","System.Object[]","System.Object[]"

The output i'm looking for is; #N,#D,#P,#C,#S

 Joe, Doe, A, 12345, Active
 Jane, Doee, , 11111, Active

Any and all help is really appreciated.

4
  • You list N P D but the sample output is N D P. Can you clarify which is desired? Commented Aug 7, 2020 at 1:27
  • @DougMaurer thank you for pointing that out. Is should be N D P C S Commented Aug 7, 2020 at 1:49
  • A quick note... @Veebster, you already asked this question less that five hours previously. It was closed because you had initially missed important information. When you edited it to a better standard, I voted to reopen it, however it appears that nobody else has revewed it or voted for it to be reopened too. I apologise on behalf of the site, because it has clearly not worked well enough for you on this occasion. Please could you delete that closed question, if nothing else, it will save others from potentially reopening a now duplicate question. Commented Aug 7, 2020 at 11:39
  • @Compo Thank you, and no worries. I've deleted that post. Commented Aug 7, 2020 at 12:23

3 Answers 3

2

here's another way to parse that block of data. i changed the user info to make it more obvious what was going on. [grin]

what it does ...

  • creates a single multiline string to work with
    when ready to do this for real, replace the entire #region/#endregion block with a call to Get-Content -Raw.
  • defines the delimiter between blocks of user data
    in this case it is 2 newlines - one at the end of the last data line and one for the blank line.
  • splits the single multiline string into multiple such strings
  • iterates thru the resulting blocks of text
  • initializes the $Vars used to build the PSCO
  • splits the text blocks into lines of text
  • filters out any line that does NOT start with a #, then a letter, and a final space
  • iterates thru the remaining lines
  • runs a switch on the 2nd char in each line
  • when it matches one of the code letters, parse the line & set the value for the equivalent $Var
  • finishes iterating thru the current set of strings
  • builds a [PSCustomObject] to hold the values
  • sends that out to the $Result collection
  • finishes iterating thru the text blocks [the outer foreach]
  • displays the collection on screen
  • saves the collection to a CSV

if you want to remove the quotes from the CSV, please don't. [grin] if you MUST risk wrecking the CSV, then you can use Get-Content to load the lines from the file and replace the quotes with nothing.

the code ...

#region >>> fake reading in a text file as a single multiline string
#    in real life, use "Get-Content -Raw"
$InStuff = @'
#N Last Name: ALast
#D First Name: AFirst
#P Middle Name: AMid
Some Data:
#C ID Number: (1) 11111
#S Status: (1) Active 

#N Last Name: BLast
#D First Name: BFirst
#P Middle Name: 
Some Data:
#C ID Number: (1) 22222
#S Status: (1) Active 
ID Number: (2) 1231
Status: (2) Active
'@
#endregion >>> fake reading in a text file as a single multiline string

$BlockDelim = ([System.Environment]::NewLine) * 2

$Result = foreach ($Block in ($InStuff -split $BlockDelim))
    {
    # initialize stuff to $Null
    #    this handles non-matches [such as a missing middle name] 
    $FirstName = $MidName = $LastName = $IdNumber = $Status = $Null

    # the "-match" filters for lines that start with a "#", a single letter, and a space
    foreach ($Line in ($Block -split [System.Environment]::NewLine -match '^#\w '))
        {
        switch ($Line[1])
            {
            'N' {
                $LastName = $Line.Split(':')[-1].Trim()
                break
                }
            'D' {
                $FirstName = $Line.Split(':')[-1].Trim()
                break
                }
            'P' {
                $MidName = $Line.Split(':')[-1].Trim()
                break
                }
            'C' {
                $IdNumber = $Line.Split(':')[-1].Trim().Split(' ')[-1].Trim()
                break
                }
            'S' {
                $Status = $Line.Split(':')[-1].Trim().Split(' ')[-1].Trim()
                break
                }
            } # end >>> switch ($Line[1])
        } # end >>> foreach ($Line in ($Block -split [System.Environment]::NewLine))

        # create a custom object and send it out to the collection
        [PSCustomObject]@{
            FirstName = $FirstName
            LastName = $LastName
            MidName = $MidName
            IdNumber = $IdNumber
            Status = $Status
            }
    } # end >>> foreach ($Block in ($InStuff -split $BlockDelim))

# display on screen
$Result

# send to a CSV file
$Result |
    Export-Csv -LiteralPath "$env:TEMP\Veebster_ParsedResult.csv" -NoTypeInformation

the on-screen output ...

FirstName : AFirst
LastName  : ALast
MidName   : AMid
IdNumber  : 11111
Status    : Active

FirstName : BFirst
LastName  : BLast
MidName   : 
IdNumber  : 22222
Status    : Active

the content of the CSV file ...

"FirstName","LastName","MidName","IdNumber","Status"
"AFirst","ALast","AMid","11111","Active"
"BFirst","BLast","","22222","Active"

please note that there is no error detection OR error handling. [grin]

Sign up to request clarification or add additional context in comments.

2 Comments

Thank you very much. This was exactly what i needed. Its perfectly ignoring the extra lines that is the file.
@Veebster - you are most welcome! glad to have helped a bit ... [grin]
1

Here is yet another recommendation. What I'm suggesting is to use ConvertFrom-String.

First we'll make a template, I accounted for the junk two lines in your sample data. I'm really hoping that's a typo/copyo.

$template = @'
#N Last Name {[string]Last*:last1}
#D First Name: {[string]First:first1}
#P Middle Name: {[string]Middle:A}
#C ID Number: (1) {[int]ID:11111}
#S Status: (1) {[string]Status:status1}

#N Last Name: {[string]Last*:Jane}
#D First Name: {[string]First:Doee}
#P Middle Name: {[string]Middle: \s}
#C ID Number: (1) {[int]ID:11111}
#S Status: (1) {[string]Status:Active}
{!Last*:ID Number: (2) 1231
Status: (2) Active}
'@

Now we apply that template to your data. First we will parse a here-string.

@'
#N Last Name: Joe
#D First Name: Doe
#P Middle Name: A
Some Data:
#C ID Number: (1) 12345
#S Status: (1) Active 

#N Last Name: Jane
#D First Name: Doee
#P Middle Name: 
Some Data:
#C ID Number: (1) 11111
#S Status: (1) Active 
ID Number: (2) 1231
Status: (2) Active
'@ | ConvertFrom-String -TemplateContent $template -OutVariable results

Output

Last   : Joe
First  : Doe
Middle : A
ID     : 12345
Status : Active

Last   : Jane
First  : Doee
ID     : 11111
Status : Active

Now we can construct our object in preparation to export.

$results | foreach {
    [pscustomobject]@{
        FirstName = $_.first
        LastName  = $_.last
        MidName   = $_.middle
        IdNumber  = $_.id
        Status    = $_.status
    }
} -OutVariable export

And now we can export it

$export | Export-Csv -Path .\output.csv -NoTypeInformation

Here is what's in output.csv

PS C:\> Get-Content .\output.csv
"FirstName","LastName","MidName","IdNumber","Status"
"Doe","Joe","A","12345","Active"
"Doee","Jane",,"11111","Active"

Here's the same thing reading it from a file instead.

$template = @'
#N Last Name {[string]Last*:last1}
#D First Name: {[string]First:first1}
#P Middle Name: {[string]Middle:A}
#C ID Number: (1) {[int]ID:11111}
#S Status: (1) {[string]Status:status1}

#N Last Name: {[string]Last*:Jane}
#D First Name: {[string]First:Doee}
#P Middle Name: {[string]Middle: \s}
#C ID Number: (1) {[int]ID:11111}
#S Status: (1) {[string]Status:Active}
{!Last*:ID Number: (2) 1231
Status: (2) Active}
'@

get-content .\ndpcs.txt | 
    ConvertFrom-String -TemplateContent $template | foreach {
        [pscustomobject]@{
            FirstName = $_.first
            LastName  = $_.last
            MidName   = $_.middle
            IdNumber  = $_.id
            Status    = $_.status
        }
    } | Export-Csv -Path .\output.csv -NoTypeInformation

Let's double check the contents of our CSV just to be sure.

Get-Content .\output.csv
"FirstName","LastName","MidName","IdNumber","Status"
"Doe","Joe","A","12345","Active"
"Doee","Jane",,"11111","Active"

A couple things to note: If the datasets after this have different characteristics, you'll need to add more samples to the template. If the two extra lines (ID and status) shouldn't be there, simply remove that part of the template.

I recommend everyone use the -outvariable parameter when working out logic/building scripts as you can see the output and assign to a variable at the same time.

5 Comments

neat! [grin] one thing that i would change is this [pscustomobject][ordered]@{ ... the [ordered] is not needed with he PSCO accelerator since the items are processed in the order they are defined.
You know I didn't even test. I just figured the OP wanted it in that order and I'm old school. Thanks for the info!
you are most welcome! [grin] i started off using [ordered] when i 1st started using the PSCO accelerator - then some kind person pointed out that it was not needed.
@DougMaurer Thank you for above solution. The extra lines are there in the file, it was not a copy paste error. yours would have worked too, but looks like there are too many variations in the data to get a true sample data.
If you can provide just a few more different examples it may be achievable. As long as you have a working solution that is what's important though.
0

This will work, but it is ugly as sin.

# Get content from text file
$Txtfile = Get-Content "C:\temp\test.txt" -raw
# Add delimiter to split users
$Delimiter = "

"
$Users = $Txtfile -split $Delimiter

# Create an array list to add data to so it can be exported later.
$collectionVariable = New-Object System.Collections.ArrayList
ForEach($Grouping in $Users) {
    $temp = New-Object System.Object
    $temp | Add-Member -MemberType NoteProperty -Name "#N" -Value (([regex]::match($Grouping, '#N.*').value) -split " ")[-1]
    $temp | Add-Member -MemberType NoteProperty -Name "#D" -Value (([regex]::match($Grouping, '#D.*').value) -split " ")[-1]
    $temp | Add-Member -MemberType NoteProperty -Name "#P" -Value (([regex]::match($Grouping, '#P.*').value) -split " ")[-1]
    $temp | Add-Member -MemberType NoteProperty -Name "#C" -Value (([regex]::match($Grouping, '#C.*').value) -split " ")[-1]
    # This is [-2] due to new line at end of the groupings.
    $temp | Add-Member -MemberType NoteProperty -Name "#S" -Value (([regex]::match($Grouping, '#S.*').value) -split " ")[-2]
    $collectionVariable.Add($temp) | Out-Null   
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.