I'm having difficulty formulating the regex, and working out where to place it in a PowerShell script, to extract a value that appears in brackets after a command in a series of files. (The documentation is missing, so we're extracting the possible passed values from a zillion files. Don't ask, it's my pain.)

What I currently have is:

Get-ChildItem -Recurse -Include *.* | Select-String "getBackOfficeCmdObject\(" | Out-File C:\work\found.txt

Now, this selects all the lines that contain "getBackOfficeCmdObject(", but I was hoping to get the unique/distinct values contained in the brackets.

So for clarity,

blah blah getBackOfficeCmdObject(val1) blah blah
blah blah getBackOfficeCmdObject(val2) blah blah
blah blah getBackOfficeCmdObject(val3) blah blah
blah blah getBackOfficeCmdObject(val1) blah blah
blah blah getBackOfficeCmdObject(val4) blah blah
blah blah getBackOfficeCmdObject(val2) blah blah

as the data set to work with would result in a file containing

val1
val2
val3
val4

selecting the unique values from the list.

Thanks

3 Answers

EDITED to return unique values only.
A more succinct answer using regex lookbehind and lookahead: it grabs anything preceded by getBackOfficeCmdObject( and followed by ), stopping at the first closing bracket.

(?<=getBackOfficeCmdObject\()[^)]*(?=\))

Lookarounds are supported by PowerShell, which uses the .NET regex engine:

Get-ChildItem -Recurse -File |
    Get-Content |
    ForEach-Object { [regex]::Matches($_, '(?<=getBackOfficeCmdObject\()[^)]*(?=\))') } |
    ForEach-Object { $_.Value } |
    Sort-Object -Unique |
    Out-File C:\work\found.txt
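
To check the pattern without touching the file system, the same idea can be tried on the sample lines from the question. This is only a minimal sketch: the $lines, $pattern and $values names are made up for illustration, the last sample line is a hypothetical addition that is not in the question, and the pattern uses [^)]* rather than .* so that a match stops at the first closing bracket even when the line contains another one later on.

```powershell
# Sample input from the question, standing in for the file contents.
$lines = @(
    'blah blah getBackOfficeCmdObject(val1) blah blah'
    'blah blah getBackOfficeCmdObject(val2) blah blah'
    'blah blah getBackOfficeCmdObject(val3) blah blah'
    'blah blah getBackOfficeCmdObject(val1) blah blah'
    'blah blah getBackOfficeCmdObject(val4) blah blah'
    'blah blah getBackOfficeCmdObject(val2) blah blah'
    # Hypothetical extra line: a second pair of brackets later on the line.
    'blah getBackOfficeCmdObject(val5) blah (other) blah'
)

# The lookbehind and lookahead keep the brackets out of the match itself.
$pattern = '(?<=getBackOfficeCmdObject\()[^)]*(?=\))'

$values = $lines |
    ForEach-Object { [regex]::Matches($_, $pattern) } |
    ForEach-Object { $_.Value } |
    Sort-Object -Unique

$values   # val1 through val5, one per line
```

With a greedy .* the hypothetical last line would yield "val5) blah (other" instead of "val5", which is why the character class is used here.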

3 Comments

This almost does everything I'm looking for. Is there any way we could sort the results and filter them down to unique values? PS: how on earth did your regex skills get this good? It's like (xkcd.com/208)
OK, so what I did was add a Get-Unique to the pipe, and that hasn't solved getting the unique results. My PowerShell command now looks like this: Get-ChildItem -Recurse -Include *.* | cat | % { ([regex]::matches($_,"(?<=getBackOfficeCmdObject\().*(?=\))")).value} | Get-Unique | Out-File C:\Work\found.txt
Almost there: Get-Unique works on a sorted list only. I amended the answer (and thanks for the xkcd comparison).

I think this should work:

$ht = @{}
Get-ChildItem -Recurse -Include *.* |
    Get-Content -ReadCount 1000 |
    ForEach-Object { $_ -match 'getBackOfficeCmdObject\(' -replace '^.*getBackOfficeCmdObject\(([^)]+).*', '$1' } |
    ForEach-Object { $ht[$_] = $true }

$ht.Keys | Out-File C:\work\found.txt

Use -ReadCount with Get-Content to handle 1000 lines at a time, using -match and -replace to extract the values. Send the values to a hash table so they get de-duplicated in stream, and then save the keys.
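
The way -match and -replace behave on a whole batch of lines can be seen in a small sketch like the following. The $batch array is illustrative data standing in for one chunk from Get-Content -ReadCount, and $seen and $unique are names chosen here for the example only.

```powershell
# One batch of lines, as Get-Content -ReadCount would pass it down the pipeline.
$batch = @(
    'blah blah getBackOfficeCmdObject(val1) blah blah'
    'a line without the call'
    'blah blah getBackOfficeCmdObject(val2) blah blah'
    'blah blah getBackOfficeCmdObject(val1) blah blah'
)

$seen = @{}

# On an array, -match returns only the matching elements,
# and -replace then rewrites each of those elements to the captured value.
$batch -match 'getBackOfficeCmdObject\(' -replace '^.*getBackOfficeCmdObject\(([^)]+).*', '$1' |
    ForEach-Object { $seen[$_] = $true }

# Hashtable keys come back in no particular order, so sort them for display.
$unique = $seen.Keys | Sort-Object
$unique   # val1, val2
```

Because a hashtable does not preserve insertion order, piping the keys through Sort-Object before writing the file may be worth it if a stable order matters.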


You might want to give this regex capture a try:

^(?:.*)\s(?:.*)\s(?:getBackOfficeCmdObject\((val\d)\))\s(?:.*)\s(?:.*)$

It captures only values of the form val followed by a single digit (val1, val2, ...), so adjust the (val\d) group if the real values look different.
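
One possible way to use that capture group from PowerShell is the -match operator, which fills the automatic $Matches variable when it is applied to a single string. The sketch below assumes the sample lines from the question. The $lines and $captured names are invented for the example.

```powershell
# Sample lines in the same shape as the question's data.
$lines = @(
    'blah blah getBackOfficeCmdObject(val1) blah blah'
    'blah blah getBackOfficeCmdObject(val3) blah blah'
    'blah blah getBackOfficeCmdObject(val1) blah blah'
)

$pattern = '^(?:.*)\s(?:.*)\s(?:getBackOfficeCmdObject\((val\d)\))\s(?:.*)\s(?:.*)$'

# For a single string, -match returns a boolean and sets $Matches;
# index 1 is the only capturing group, i.e. the text inside the brackets.
$captured = foreach ($line in $lines) {
    if ($line -match $pattern) { $Matches[1] }
}

$captured = $captured | Sort-Object -Unique
$captured   # val1, val3
```

Note that this pattern also requires whitespace-separated text on both sides of the call, so lines that begin or end with the call would not match.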
