I have a PowerShell 5.1 script that reads one CSV file, looks up each value it reads in a second CSV file (which contains characters such as Ω and ±), and writes the result to a third CSV file. The lookup file comes from China as an Excel workbook, and I save it from Excel as UTF-8 CSV.

It all works fine, except that my regex searches, which work at regex101.com and on the command line, do not seem to work inside the Where-Object cmdlet, where I need them.

These work as expected. Note the Ω and ±1:

PS C:\Users\grefgarg> $u = "14.3kΩ ±1% 0.1W ±100ppm/? 0603 Chip Resistor - Surface Mount RoHS" 
PS C:\Users\grefgarg> $u -match "(^14.3kΩ |  14.3KOhm )"
PS C:\Users\grefgarg> $true

But this does not. Here $b is the lookup CSV and $c is the name of the column to search, for example "Description":

$b= import-csv $bfile -Encoding 'utf8'
$r = "(^14.3kΩ |  14.3KOhm )"
$a= $b | where-object {($_.$($c) -match $r  )}

$a.Count is 0. If, however, I replace the Ω with \D, it works again:

$r = "(^14.3k\D |  14.3KOhm )"

I would like to use the Ω and ±1 in my regex, but \D works for now. I am asking in order to better understand how the pipeline, regex, and encoding interact.

I did try:

$PSDefaultParameterValues = @{ '*:Encoding' = 'utf8' }

I also searched for this specific issue of command line vs. Where-Object but I didn't see anything.

Thanks, Gregory

  • Are you sure the spaces in your regex string are correct? Did you try [regex]::Escape() on the strings? (For one thing, the dot should be \.) Commented Jun 4, 2021 at 18:07
  • Since the problem occurs with a string literal in your source code, the likeliest explanation is that your script file is misinterpreted by PowerShell, which happens if the script is saved as UTF-8 without a BOM. Try saving your script as UTF-8 with BOM; see this answer for more information. Commented Jun 4, 2021 at 18:20
  • Try $r = "(^14.3kΩ |^14.3kΩ |14.3KOhm )" (Ω U+03A9 Greek Capital Letter Omega or Ω U+2126 Ohm Sign). Note that a literal space character works the same way as \s character class (whitespace) Commented Jun 4, 2021 at 18:20
  • Thanks @Theo I will add \. Commented Jun 4, 2021 at 19:17
  • Saving as UTF-8 with BOM from VS Code did the trick. Thanks @mklement0 Commented Jun 4, 2021 at 19:18
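
The explanation in the comments can be reproduced without touching any files. The sketch below is illustrative only. It assumes the system ANSI code page is Windows-1252 (yours may differ), builds the characters from code points so the snippet's own encoding does not matter, and uses made-up names ($desc, $asTyped, $asParsed, $fallback) with a shortened description string. Windows PowerShell 5.1 decodes a script file that has no BOM with the ANSI code page, so the two UTF-8 bytes of Ω reach the regex as two different characters, while the CSV data imported with -Encoding utf8 still holds the real Ω.

```powershell
# Build the characters from code points so the result does not depend on
# how this snippet itself is saved.
$omega = [string][char]0x03A9                        # U+03A9 Greek capital omega
$desc  = "14.3k$omega " + [char]0x00B1 + "1% 0.1W"   # stands in for one CSV cell

# The bytes a UTF-8 script file stores for the omega in the pattern: 0xCE 0xA9.
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($omega)

# Assumption: ANSI code page 1252. Decoding the same bytes that way yields
# two characters (U+00CE and U+00A9) instead of one omega.
$misread = [System.Text.Encoding]::GetEncoding(1252).GetString($utf8Bytes)

$asTyped  = '(^14\.3k' + $omega   + ' |14\.3KOhm )'  # what the author typed
$asParsed = '(^14\.3k' + $misread + ' |14\.3KOhm )'  # what a BOM-less script yields
$fallback = '(^14\.3k\D |14\.3KOhm )'                # the \D workaround

$misread.Length          # 2
$desc -match $asTyped    # True
$desc -match $asParsed   # False
$desc -match $fallback   # True
```

The \D workaround succeeds because it is pure ASCII and so survives the misreading, and a single \D matches the one real Ω in the data. On the command line the pattern is never read from a file, which would explain why the same regex matched there.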
