
I'm trying to write part of a script that uses a regex to replace the address on each matching line with just the ZIP code.

Here's what the input looks like:

Name, type, ADDRESSES
“Aaa”, “bbb”, “19 S 149TH $NEWPORT NEWS, WA 96332”
“Aaa”, “bbb”,  “851 16TH AVE #365$SALISH, WA 98402-4410”
“Aaa”, “bbb”,  “2445 E BROADWAY #204$YELM WA 98653”

Here's what I've tried:

$regex = '\d{5}([ \-]\d{4})?'

##get the data
$people = Get-Content 'C:\test.csv'

## let's convert the data first

foreach ($p in $people) {
    if ($p -match $regex) { $p | out-file -append C:\test.csv }
}

Here's what I expect back:

Name, type, ADDRESSES
“Aaa”, “bbb”,  “96332”
“Aaa”, “bbb”,  “98402-4410”
“Aaa”, “bbb”,  “98653”

Here's what I get back:


Name, type, ADDRESSES
“Aaa”, “bbb”, “19 S 149TH $NEWPORT NEWS, WA 96332”
“Aaa”, “bbb”,  “851 16TH AVE #365$SALISH, WA 98402-4410”
“Aaa”, “bbb”,  “2445 E BROADWAY #204$YELM WA 98653”
  • In addition to Olaf's comment (which I assume is just about a typo), $p -match $regex only checks whether the line matches; you then write the whole line ($p) to the file as-is, without manipulating it first. Commented Jul 26, 2021 at 23:56
  • I fixed the code. Commented Jul 26, 2021 at 23:56
  • The CSV data now does not "pass validation" because there are commas in the ADDRESSES column. You would need to surround your addresses with quotes so that they are parsed as a single column when using Import-Csv, or, if you stick with Get-Content, update your regex and use -replace. Commented Jul 27, 2021 at 0:40
  • Like Import-Csv and then change $_.ADDRESSES? Commented Jul 27, 2021 at 1:00
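
To illustrate the point made in the comments above: for a single string, -match only returns $true or $false, and on success the matched text is available in the automatic $Matches variable. The following is a minimal sketch, not the asker's final script. It assumes straight-quoted sample lines held in a hypothetical $lines array instead of being read from the file, and a slightly tightened regex that anchors on the closing quote.

```powershell
# Sample lines standing in for the file contents (assumption: straight quotes).
$lines = @(
    '"Aaa", "bbb", "19 S 149TH $NEWPORT NEWS, WA 96332"'
    '"Aaa", "bbb",  "851 16TH AVE #365$SALISH, WA 98402-4410"'
)

# ZIP or ZIP+4 sitting right before the closing quote at the end of the line.
$regex = '\d{5}(-\d{4})?(?="$)'

$zips = foreach ($line in $lines) {
    if ($line -match $regex) {
        # -match only tests; the matched text itself is in $Matches[0].
        $Matches[0]
    }
}

$zips
```

Emitting $Matches[0] (or using -replace) is what actually changes the output; piping $line onward would reproduce the original line unchanged.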

4 Answers


This should work.

$text = @'
Name, type, ADDRESSES
Aaa, bbb, 19 S 149TH $NEWPORT NEWS, WA 96332
Aaa, bbb,  851 16TH AVE #365$SALISH, WA 98402-4410
Aaa, bbb,  2445 E BROADWAY #204$YELM WA 98653
'@ -split '\r?\n' | Select-Object -Skip 1

$result = $text.ForEach({
    $name, $type, $addresses = $_.Split(',',3)
    $addresses = [regex]::Matches($addresses, '[\d-]+(?=$)').Value

    [pscustomobject]@{
        Name = $name
        Type = $type
        Addresses = $addresses
    }
})

# display the result
$result

Output:

Name Type Addresses 
---- ---- --------- 
Aaa   bbb 96332     
Aaa   bbb 98402-4410
Aaa   bbb 98653     

4 Comments

Ok, or you can just do this. My regex is more fun though. :)
@Daniel I think I didn't understand the concept lol, just posted for fun. I didn't expect OP to provide much feedback given he didn't accept any answers on his previous questions hehe
@Daniel [regex]::Matches($addresses,'(?<=WA).+$').Value.Trim() should do it too assuming all lines have WA no? I wanna get better at regex, it's so fun :P
It would work, but not for all lines. You're open to false matches, like the 'WA' in 'BROADWAY'. regex101.com/r/7TpOlc/1 The version in your answer works perfectly. I think your solution is better than mine and my crazy (unnecessary) regex lol.
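
The false match mentioned in the last comment can be reproduced with a short sketch. The variable names here are illustrative only, and the word-boundary variant is one possible tightening, not something proposed in the thread.

```powershell
$address = '2445 E BROADWAY #204$YELM WA 98653'

# A lookbehind on a bare 'WA' fires inside 'BROADWAY' first.
$loose = [regex]::Match($address, '(?<=WA).+$').Value.Trim()

# Requiring word boundaries around WA skips 'BROADWAY'.
$strict = [regex]::Match($address, '(?<=\bWA\b).+$').Value.Trim()

$loose   # Y #204$YELM WA 98653
$strict  # 98653
```

Even the stricter version still assumes every address contains the state code WA, which is why matching the digits at the end of the string is the more general approach.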

To continue from comments, since the csv data is not in good form it might be better to use a different regex and -replace to modify the data.


$file = 'c:\temp\test.csv'

# add test data to a file
@'
Name, type, ADDRESSES
Aaa, bbb, 19 S 149TH $NEWPORT NEWS, WA 96332
Aaa, bbb,  851 16TH AVE #365$SALISH, WA 98402-4410
Aaa, bbb,  2445 E BROADWAY #204$YELM WA 98653
'@ | Set-Content $file

$regex = ',[ \w$#]+,?[ \w]+(\d{5}(?:\-\d+)?)$'

# This line will read in the file, skipping the header line.
# Then it will perform a replace using the regex above 
# substituting whatever is matched with the first capture group (\d{5}(?:\-\d+)?).
# Finally the lines are appended to the end of the file
(Get-Content $file | Select-Object -Skip 1) -replace $regex, ', $1' | Add-Content -Path $file

# Get-Content to check our file
Get-Content $file

Output

Name, type, ADDRESSES
Aaa, bbb, 19 S 149TH $NEWPORT NEWS, WA 96332
Aaa, bbb,  851 16TH AVE #365$SALISH, WA 98402-4410
Aaa, bbb,  2445 E BROADWAY #204$YELM WA 98653
Aaa, bbb, 96332
Aaa, bbb, 98402-4410
Aaa, bbb, 98653
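
One detail of the -replace call above that is easy to trip over is the quoting of the replacement string. The sketch below reuses the regex from this answer on a single sample line; the variable names $single, $double and $escaped are illustrative, and it assumes strict mode is off and no variable named 1 has been set in the session.

```powershell
$line  = 'Aaa, bbb, 19 S 149TH $NEWPORT NEWS, WA 96332'
$regex = ',[ \w$#]+,?[ \w]+(\d{5}(?:\-\d+)?)$'

# Single quotes: the regex engine receives ', $1' and substitutes group 1.
$single = $line -replace $regex, ', $1'

# Double quotes: PowerShell expands $1 as a (normally unset) variable first,
# so the regex engine only receives ', ' and the ZIP code is lost.
$double = $line -replace $regex, ", $1"

# Double quotes with the dollar sign escaped by a backtick behave like single quotes.
$escaped = $line -replace $regex, ", `$1"

$single   # 'Aaa, bbb, 96332'
$double   # 'Aaa, bbb, ' (ZIP missing)
$escaped  # 'Aaa, bbb, 96332'
```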


This works for me. Just replace everything up to and including the last run of 5 digits with only those 5 digits. Because .* is greedy, it still works if there's a 5-digit number at the beginning of the address. https://javascript.info/regexp-greedy-and-lazy

Import-Csv file.csv |
  ForEach-Object { $_.addresses = $_.addresses -replace '.*(\d{5})', '$1'; $_ }

Name type ADDRESSES
---- ---- ---------
Aaa  bbb  96332
Aaa  bbb  98402-4410
Aaa  bbb  98653

6 Comments

This worked! Can you tell me how I can edit this to remove the -#### instead of capturing it?
Replace '.*(\d{5}).*' instead.
If there's an address like "12345 W Main St, BFE WA 98899-0023" will that capture the first group only?
Nope, regex is greedy.
I googled that but I'm unsure what it means in this context. Can you elaborate? Is it greedy because it captures the final instance of the match or the largest number? I'm trying to test ways I can break the match with a poorly formatted address field.
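
As a hedged illustration of what "greedy" means for the address from the comments: the leading .* consumes as much as it can and backs off only far enough for \d{5} to fit, so the capture lands on the last five-digit run, not the largest number. The sketch below also shows the lazy form and an input with no ZIP code; the second sample string is made up for the test.

```powershell
$address = '12345 W Main St, BFE WA 98899-0023'

# Greedy .* grabs as much as it can, then backs off only until \d{5} fits,
# so the capture lands on the LAST five-digit run.
$greedy = $address -replace '.*(\d{5}).*', '$1'

# Lazy .*? grabs as little as it can, so the capture lands on the FIRST run.
$lazy = $address -replace '.*?(\d{5}).*', '$1'

# No five-digit run at all: nothing matches and -replace returns the input unchanged.
$noZip = 'PO BOX 12, YELM WA' -replace '.*(\d{5}).*', '$1'

$greedy  # 98899
$lazy    # 12345
$noZip   # PO BOX 12, YELM WA
```

Because an unmatched line passes through untouched, a poorly formatted address does not raise an error; it simply keeps its original text, which is worth checking for when validating the output.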

Here's my 2 cents on this:

$csv = Import-Csv -Path 'theOriginal.csv' | ForEach-Object {
    $_ | Select-Object *, @{Name = 'ADDRESSES'; Expression = { $_.ADDRESSES -replace '.*\s([-\d]+)$', '$1'}} -ExcludeProperty ADDRESSES
}

# output on screen
$csv

# write to file
$csv | Export-Csv -Path 'theUpdated.csv' -NoTypeInformation

Result:

Name type ADDRESSES 
---- ---- --------- 
Aaa  bbb  96332     
Aaa  bbb  98402-4410
Aaa  bbb  98653

Regex details:

.             Match any single character that is not a line break character
   *          Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\s            Match a single character that is a “whitespace character” (spaces, tabs, line breaks, etc.)
(             Match the regular expression below and capture its match into backreference number 1
   [-\d]      Match a single character present in the list below
              The character “-”
              A single digit 0..9
      +       Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)            
$             Assert position at the end of the string (or before the line break at the end of the string, if any)
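
The breakdown above can also be kept inside the pattern itself. This is a sketch that assumes the .NET inline option (?x), which ignores unescaped whitespace in the pattern and treats # as the start of a comment; the $pattern and $zip names are illustrative.

```powershell
# Same pattern as above, written in extended mode so the notes live in the regex.
$pattern = @'
(?x)        # ignore pattern whitespace and allow comments
.*          # anything, greedily
\s          # one whitespace character
( [-\d]+ )  # group 1: a run of digits and hyphens
$           # end of the string
'@

$zip = '851 16TH AVE #365$SALISH, WA 98402-4410' -replace $pattern, '$1'
$zip  # 98402-4410
```

The single-quoted here-string keeps PowerShell from interpreting the $ and backslashes before the regex engine sees them.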
