I want to count duplicates of files with PowerShell. My files have a special separator ('#') and I can only compare the part before the separator.

Mode        LastWriteTime   Length Name
----        -------------   ------ ----
-a----   23.09.2016 09:44        0 AnotherDuplicateOffer_#1265473v1.DOCX
-a----   23.09.2016 09:44        0 AnotherDuplicateOffer_#89798798546v1.DOCX
-a----   23.09.2016 09:44        0 AnotherDuplicateOffer_#98769876v1.DOCX
-a----   23.09.2016 09:44        0 DuplicateOffer_#1254798v1.DOCX
-a----   23.09.2016 09:44        0 DuplicateOffer_#34987094587v1.DOCX
-a----   23.09.2016 09:44        0 DuplicateOffer_#4986598v1.DOCX
-a----   23.09.2016 09:44        0 DuplicateOffer_#567809v1.DOCX
-a----   23.09.2016 09:44        0 WordFilesAlthoug_#89798798546v1.DOCX

The part after the separator is a unique ID, and ultimately I want to rename the files by removing this ID. So the new filename should be something like 'string (x).docx', where 'x' is a counter for the duplicates.

I'm stuck on counting the duplicates:

foreach ($file in (Get-ChildItem -Path $path -Recurse | Where {!$_.PSIsContainer})) {
    $file.Name
    $file.Name.IndexOf("#")
    $file.Name.Substring(0, ($file.Name.IndexOf("#")))
    (dir *.* | group -Property Name | Where {($_.Name.Substring(0,($_.Name.IndexOf("#")))) -match ($_.Name.Substring(0,($_.Name.IndexOf("#"))))}).Count
}

I get the right index of '#' with $file.Name.IndexOf("#"), and the string from $file.Name.Substring(0,($file.Name.IndexOf("#"))) is right as well. But when I use the same expression in the pipeline, Substring throws exceptions because of its second argument: the length must not be negative, but it appears to be 0 or less.

For clarity: $_ is the same as $file; it is the current object in the pipeline.

2 Answers

Simply group the files by the first part of their name and select those groups that have more than one element.

Get-ChildItem -Path $path -Recurse |
    Where-Object { -not $_.PSIsContainer } |
    Group-Object { ($_.Name -split '#')[0] } |
    Where-Object { $_.Count -ge 2 }
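
To see what the grouping step produces without touching the file system, the same pipeline can be run on plain strings. This is only a sketch: the names are copied from the listing in the question, and the variables $names and $duplicates are introduced here for illustration.

```powershell
# Sample names taken from the listing in the question; no file system access needed.
$names = @(
    'AnotherDuplicateOffer_#1265473v1.DOCX'
    'AnotherDuplicateOffer_#89798798546v1.DOCX'
    'AnotherDuplicateOffer_#98769876v1.DOCX'
    'DuplicateOffer_#1254798v1.DOCX'
    'DuplicateOffer_#34987094587v1.DOCX'
    'DuplicateOffer_#4986598v1.DOCX'
    'DuplicateOffer_#567809v1.DOCX'
    'WordFilesAlthoug_#89798798546v1.DOCX'
)

# Group by everything before the first '#', then keep groups with two or more members.
$duplicates = $names |
    Group-Object { ($_ -split '#')[0] } |
    Where-Object { $_.Count -ge 2 }

# Each group exposes the shared prefix as Name and the number of members as Count.
$duplicates | Select-Object Name, Count
```

With these names, two groups should remain: 'AnotherDuplicateOffer_' with 3 members and 'DuplicateOffer_' with 4. 'WordFilesAlthoug_' is dropped because it occurs only once.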

Rename the files by processing each group separately:

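... | ForEach-Object {
    $i = 0    # counter restarts for each group of duplicates
    $_.Group | ForEach-Object {
        # Replace the '#<id>v<version>' part of the name with the counter, e.g. '(0)'
        $newname = $_.Name -replace '#\d+v\d+', "($i)"
        Rename-Item -Path $_.FullName -NewName $newname
        $i++
    }
}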
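
Before renaming real files, it may help to preview the names this approach would generate. The following is a dry-run sketch on plain strings, assuming the same regular expression as above; $names and $plan are illustrative variables, and nothing is renamed.

```powershell
# Two duplicates and one unique name, copied from the listing in the question.
$names = @(
    'DuplicateOffer_#1254798v1.DOCX'
    'DuplicateOffer_#34987094587v1.DOCX'
    'WordFilesAlthoug_#89798798546v1.DOCX'
)

# Build a list of old/new name pairs instead of calling Rename-Item.
$plan = $names |
    Group-Object { ($_ -split '#')[0] } |
    Where-Object { $_.Count -ge 2 } |
    ForEach-Object {
        $i = 0   # counter restarts for every group
        foreach ($name in $_.Group) {
            [pscustomobject]@{
                OldName = $name
                NewName = $name -replace '#\d+v\d+', "($i)"
            }
            $i++
        }
    }

$plan
```

Here the two 'DuplicateOffer_' files map to 'DuplicateOffer_(0).DOCX' and 'DuplicateOffer_(1).DOCX', and the unique file is left out. On real files, adding -WhatIf to the Rename-Item call gives a similar preview.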

1 Comment

Thanks, that works too. Anyway, I solved my problem myself; see the update above.

Finally I got it working. The point was to give dir the right path. I didn't do this at first because I thought the path was given by my pointer $file, but it is not. The right path is passed with the -Path parameter and $file.Directory. This way dir looks in the directory that contains the current $file.

(dir -Path $file.Directory *.* | group -Property Name | Where{($_.Name.Substring(0,($_.Name.IndexOf("#")))) -match ($_.Name.Substring(0,($_.Name.IndexOf("#"))))}).Count
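
A likely reason the wrong path caused the Substring exceptions is that dir then returned names without a '#': IndexOf returns -1 for those, and Substring rejects a negative length. The sketch below guards against that case. Get-NamePrefix is a hypothetical helper, not part of the original posts, and 'notes.txt' is a made-up name for a file without the separator.

```powershell
# Hypothetical helper: returns the part of a name before the first '#',
# or the whole name when there is no '#'.
function Get-NamePrefix {
    param([string]$Name)
    $index = $Name.IndexOf('#')
    if ($index -lt 0) { return $Name }   # IndexOf returns -1 when '#' is absent
    return $Name.Substring(0, $index)
}

# Two names from the question's listing plus one made-up name without '#'.
$names = @(
    'DuplicateOffer_#1254798v1.DOCX'
    'DuplicateOffer_#4986598v1.DOCX'
    'notes.txt'
)

# Count how many names share the prefix of the first file.
$target = Get-NamePrefix $names[0]
$count  = @($names | Where-Object { (Get-NamePrefix $_) -eq $target }).Count
$count
```

Comparing each prefix against the prefix of one specific file, rather than against itself, is what makes the count meaningful. Wrapping the pipeline in @( ) keeps .Count reliable even when only one name matches.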
