I'm struggling to write a PowerShell command that does the following. Assume a folder that contains a bunch of files with random names, each of which contains a part that matches a regex pattern. I would like to capture the part that matches the pattern and rename the file to that part only.

E.g. "asdjlk-c12aa13-.pdf" should become "c12aa13.pdf" if the pattern is \w\d+\w+\d+ (or similar).

My current idea looks something like this:

Get-ChildItem | Rename-Item -NewName { $_.Name -match $pattern ... } -WhatIf

where ... needs to be replaced with something that sets the "value" of the codeblock (i.e. the NewName) to the matched group. I.e. I don't know how to access $matched directly after the -match command.

Also, I wonder if it's possible to do lazy matching using -match, .*? doesn't seem to do the trick.

  • For the regex you could capture in a group what you want and use those groups in the replacement example Commented Feb 20, 2018 at 15:19
  • exactly what i had in mind, yes. the powershell poses more of a problem :/ Commented Feb 20, 2018 at 17:14
  • Heh... random names that match a regular expression. I'm not positive that you know what random means. :) Commented Feb 20, 2018 at 18:22
  • @EBGreen possibly poor choice of words, granted. Random in the sense that there is a random part in the filename not governed by the pattern and that the pattern could be anything and that the symbols constituting the pattern can be random. Commented Feb 26, 2018 at 12:26

4 Answers

tl;dr

Use -replace rather than -match to match and extract the parts of interest in a single operation, which requires you to:

  • design your regex to match the whole input string,

  • enclose the subexpressions that match the parts of interest in (…), i.e. capture groups

  • and refer to those parts in the substitution operand; $1 refers to what the first capture group captured, $2 the second, and so on.

Get-ChildItem |
  Rename-Item -NewName { $_.Name -replace '^.*\b(\w\d+\w+\d+)\b.*(\.pdf)$', '$1$2' } -WhatIf

Note: As in your own code, the -WhatIf common parameter in the command above previews the operation. Remove -WhatIf and re-execute once you're sure the operation will do what you want.

Note that input files that don't match the regex are left untouched.
Read on for details.
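
As a minimal sketch of why non-matching files are left alone: -replace returns its input string unchanged when the regex doesn't match, so the computed new name equals the existing one. The file name notes.txt is a hypothetical example, not from the question.

```powershell
# Same regex as in the command above.
$regex = '^.*\b(\w\d+\w+\d+)\b.*(\.pdf)$'

# A name that matches: only the captured parts are kept.
$matching = 'asdjlk-c12aa13-.pdf' -replace $regex, '$1$2'

# A hypothetical name that does not match: returned as-is.
$nonMatching = 'notes.txt' -replace $regex, '$1$2'

$matching      # -> c12aa13.pdf
$nonMatching   # -> notes.txt
```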

As for:

I wonder if it's possible to do lazy matching using -match, .*? doesn't seem to do the trick.

The above uses \b (word-boundary assertions) as the more robust alternative to lazy matching, but .*? does work in principle, as the following simplified example shows:

# -> 'c12aa13.pdf'
'asdjlk-c12aa13-.pdf' -replace '^.*?(\w\d+\w+\d+).*(\.pdf)$', '$1$2'

That is, the ? after .* makes the prefix match as few characters as possible, so that the c is "given up" to the following subexpression, (\w\d+\w+\d+), which thereby matches as early as possible - go to this regex101.com page and experiment with removing the ? to see the difference in behavior.
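
The difference can also be checked locally. The following is a small sketch that runs the simplified regex with and without the ?; the variable names are only illustrative. Without the ?, the greedy .* also swallows the c, because the rest of the pattern can still match starting at the 1.

```powershell
$name = 'asdjlk-c12aa13-.pdf'

# Lazy prefix (.*?): the capture group starts as early as possible.
$lazy = $name -replace '^.*?(\w\d+\w+\d+).*(\.pdf)$', '$1$2'

# Greedy prefix (.*): the capture group starts as late as possible.
$greedy = $name -replace '^.*(\w\d+\w+\d+).*(\.pdf)$', '$1$2'

$lazy     # -> c12aa13.pdf
$greedy   # -> 12aa13.pdf
```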


Explanation of the -replace technique and regex:

While you could follow the -match operation in your own attempt with subsequent extraction of the matched part(s) via the automatic $Matches variable, it's often easier to combine the two operations with the help of the -replace operator:

To return only the parts of interest, make sure that the regex matches the input string in full, and then leave the parts you don't care about out of the replacement operand, as shown in this simplified example:

# -> 'c12aa13.pdf'
'asdjlk-c12aa13-.pdf' -replace '^.*\b(\w\d+\w+\d+)\b.*(\.pdf)$', '$1$2'

For a more detailed explanation of the regex and the option to experiment with it, see this regex101.com page.

  • .*\b matches the prefix before the part of interest; \b makes sure that the following subexpression only matches at a word boundary (i.e. only where a word character, meaning an alphanumeric one or _, adjoins a non-word character or the start or end of the string).

  • (\w\d+\w+\d+) matches the part of interest, wrapped in a capture group; since it is the 1st capture group in the regex, you can refer to what it captured as $1 in the replacement operand.

  • \b.* matches, starting at a word boundary, everything that follows the part of interest, up to the .pdf filename extension.

  • (\.pdf)$ matches filename extension .pdf at the end of the name and, as the 2nd capture group, can be referenced as $2 in the replacement operand.

    • Note that an alternative to operating on the full .Name value including the extension is to match against the .BaseName property only and append the .Extension property afterwards, along the lines of:

      ($_.BaseName -replace '…','…') + $_.Extension
      
  • $1$2 simply concatenates the 2 capture-group matches to output the desired name.

    • Note: Generally, use single-quoted strings for both the regex and the replacement operand, so that $ isn't accidentally interpreted by PowerShell beforehand.

    • For more information about -replace and the syntax of the replacement operand, see this answer.
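
For comparison, the -match / $Matches alternative mentioned above could look roughly like the following sketch. Get-MatchedName is a hypothetical helper name, not part of PowerShell or of the original answer. The sketch assumes that the part of interest sits in the base name and that the original extension should be kept.

```powershell
# Hypothetical helper (an illustration only): returns the part of the base
# name captured by $Pattern plus the original extension, or the unchanged
# name if the pattern does not match.
function Get-MatchedName {
    param(
        [string] $Name,
        [string] $Pattern = '\b(\w\d+\w+\d+)\b'
    )
    $base = [System.IO.Path]::GetFileNameWithoutExtension($Name)
    $ext  = [System.IO.Path]::GetExtension($Name)
    if ($base -match $Pattern) {
        # After a successful -match, $Matches[1] holds what the
        # first capture group matched.
        return $Matches[1] + $ext
    }
    return $Name
}

Get-MatchedName 'asdjlk-c12aa13-.pdf'   # -> c12aa13.pdf
```

Inside the pipeline, such a helper would be called from the script block, e.g. Rename-Item -NewName { Get-MatchedName $_.Name } -WhatIf. The -replace form shown earlier stays shorter because it needs no separate extraction step.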



A safer method is to do so with a test run first (similar to -WhatIf). This example renames files from DSC12345 - X-1.jpg => DSC12345-X1.jpg:

# first verify what your files will convert to
# - gets files
# - pipes to % (foreach)
# - creates $a variable for replacement
# - echo replacement
# note: "$($_.Name)" is needed to expand a property inside a string, and \. matches a literal dot
Get-ChildItem . | % { $a = $_.Name -replace '^DSC(\d+)\s-\s([A-Z])-(\d)\.jpg$','DSC$1-$2$3.jpg'; echo "$($_.Name) => $a" }

# example output (for files named like DSC04975 - W-1.jpg):
# DSC04975 - W-1.jpg => DSC04975-W1.jpg
# DSC04976 - W-2.jpg => DSC04976-W2.jpg
# DSC04977 - W-3.jpg => DSC04977-W3.jpg
# ...

# use the same command and replace "echo" with "ren"
Get-ChildItem . | % { $a = $_.Name -replace '^DSC(\d+)\s-\s([A-Z])-(\d)\.jpg$','DSC$1-$2$3.jpg'; ren $_.Name $a }

This is much safer as renames can be disastrous when run incorrectly.

4 Comments

Using -WhatIf is safe - and there's no reason to re-invent it. Combining -WhatIf with a -replace operation to preview the results, as in the accepted answer, is sufficient.
i think the difference is i prefer to get a visual log of the change.
-WhatIf provides display output by design - or are you referring to wanting to control the specific format of these what-if messages (which -WhatIf gives you no control over)? If so, I suggest framing your answer accordingly. The "A safer method" part is misleading.
P.S.: To visualize what happens when you perform the actual renaming, i.e. without -WhatIf, you can use -Verbose (which in essence prints the same message as -WhatIf).

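You can put as many statements as you like in the script block, including one that suppresses the Boolean output of -match (here via a [void] cast). The "?" makes the regex lazy, so .+? matches only the first character of the name.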

Get-ChildItem | Rename-Item -NewName { [void]($_.Name -match '.+?'); $matches.0 } -WhatIf

What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/afile Destination: /Users/js/foo/a".
What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/bfile Destination: /Users/js/foo/b".
What if: Performing the operation "Rename File" on target "Item: /Users/js/foo/cfile Destination: /Users/js/foo/c".


To be honest, I am not sure if your line above will work. If "\w\d+\w+\d+" is the pattern you are looking for, I would do something like this:

[regex]$regex = '\w\d+\w+\d+'
Get-ChildItem | ?{ $_.Name -match $regex } | %{ Rename-Item -LiteralPath $_.FullName -NewName "$($regex.Match($_.Name).Value).pdf" }

In this case, you pipe the output of Get-ChildItem to the Where-Object filter (?{...}), and then pipe its output to the ForEach-Object loop (%{...}) to rename each matching file.
