33

I want to write a PowerShell script that will recursively search a directory, but exclude specified files (for example, *.log, and myFile.txt), and also exclude specified directories, and their contents (for example, myDir and all files and folders below myDir).

I have been working with the Get-ChildItem CmdLet, and the Where-Object CmdLet, but I cannot seem to get this exact behavior.

6 Answers 6

60

I like Keith Hill's answer except it has a bug that prevents it from recursing past two levels. These commands manifest the bug:

New-Item level1/level2/level3/level4/foobar.txt -Force -ItemType file
cd level1
GetFiles . xyz | % { $_.fullname }

With Hill's original code you get this:

...\level1\level2
...\level1\level2\level3

Here is a corrected, and slightly refactored, version:

function GetFiles($path = $pwd, [string[]]$exclude)
{
    foreach ($item in Get-ChildItem $path)
    {
        if ($exclude | Where {$item -like $_}) { continue }

        $item
        if (Test-Path $item.FullName -PathType Container)
        {
            GetFiles $item.FullName $exclude
        }
    }
} 

With that bug fix in place you get this corrected output:

...\level1\level2
...\level1\level2\level3
...\level1\level2\level3\level4
...\level1\level2\level3\level4\foobar.txt

I also like ajk's answer for conciseness though, as he points out, it is less efficient. The reason it is less efficient, by the way, is because Hill's algorithm stops traversing a subtree when it finds a prune target while ajk's continues. But ajk's answer also suffers from a flaw, one I call the ancestor trap. Consider a path such as this that includes the same path component (i.e. subdir2) twice:

\usr\testdir\subdir2\child\grandchild\subdir2\doc

Set your location somewhere in between, e.g. cd \usr\testdir\subdir2\child, then run ajk's algorithm to filter out the lower subdir2 and you will get no output at all, i.e. it filters out everything because of the presence of subdir2 higher in the path. This is a corner case, though, and not likely to be hit often, so I would not rule out ajk's solution due to this one issue.

Nonetheless, I offer here a third alternative, one that does not have either of the above two bugs. Here is the basic algorithm, complete with a convenience definition for the path or paths to prune--you need only modify $excludeList to your own set of targets to use it:

$excludeList = @("stuff","bin","obj*")
Get-ChildItem -Recurse | % {
    $pathParts = $_.FullName.substring($pwd.path.Length + 1).split("\");
    if ( ! ($excludeList | where { $pathParts -like $_ } ) ) { $_ }
}

My algorithm is reasonably concise but, like ajk's, it is less efficient than Hill's (for the same reason: it does not stop traversing subtrees at prune targets). However, my code has an important advantage over Hill's--it can pipeline! It is therefore amenable to fit into a filter chain to make a custom version of Get-ChildItem while Hill's recursive algorithm, through no fault of its own, cannot. ajk's algorithm can be adapted to pipeline use as well, but specifying the item or items to exclude is not as clean, being embedded in a regular expression rather than a simple list of items that I have used.

I have packaged my tree pruning code into an enhanced version of Get-ChildItem. Aside from my rather unimaginative name--Get-EnhancedChildItem--I am excited about it and have included it in my open source Powershell library. It includes several other new capabilities besides tree pruning. Furthermore, the code is designed to be extensible: if you want to add a new filtering capability, it is straightforward to do. Essentially, Get-ChildItem is called first, and pipelined into each successive filter that you activate via command parameters. Thus something like this...

Get-EnhancedChildItem –Recurse –Force –Svn
    –Exclude *.txt –ExcludeTree doc*,man -FullName -Verbose 

... is converted internally into this:

Get-ChildItem | FilterExcludeTree | FilterSvn | FilterFullName

Each filter must conform to certain rules: accepting FileInfo and DirectoryInfo objects as inputs, generating the same as outputs, and using stdin and stdout so it may be inserted in a pipeline. Here is the same code refactored to fit these rules:

filter FilterExcludeTree()
{
  $target = $_
  Coalesce-Args $Path "." | % {
    $canonicalPath = (Get-Item $_).FullName
    if ($target.FullName.StartsWith($canonicalPath)) {
      $pathParts = $target.FullName.substring($canonicalPath.Length + 1).split("\");
      if ( ! ($excludeList | where { $pathParts -like $_ } ) ) { $target }
    }
  }
} 

The only additional piece here is the Coalesce-Args function (found in this post by Keith Dahlby), which merely sends the current directory down the pipe in the event that the invocation did not specify any paths.

Because this answer is getting somewhat lengthy, rather than go into further detail about this filter, I refer the interested reader to my recently published article on Simple-Talk.com entitled Practical PowerShell: Pruning File Trees and Extending Cmdlets where I discuss Get-EnhancedChildItem at even greater length. One last thing I will mention, though, is another function in my open source library, New-FileTree, that lets you generate a dummy file tree for testing purposes so you can exercise any of the above algorithms. And when you are experimenting with any of these, I recommend piping to % { $_.fullname } as I did in the very first code fragment for more useful output to examine.

Sign up to request clarification or add additional context in comments.

12 Comments

+1 just noticed this answer because of a comment on my own. Really nice job! I knew my way had a bit of a duct tape and bubble gum flavor to it, but you've taken the solution to the next level. Thanks for pointing out the ancestor trap as well. While you're right that it's unlikely to come up very often, it's something you should be aware of before it bites you.
Thanks for the kind words @ajk. But do not sell yourself short; your answer definitely has merit for its brevity.
I have set up your test folder hierarchy, then tried your "corrected, and slightly refactored" version, but it still produces the same output as Keith Hill's version. That is, only level2 and level3 are displayed. I tried your "third alternative" and that one works. It displays all levels and the foobar.txt file. I am using PS version 2 on Win 7. FYI.
Yes, they are different. This is a sub-pipeline: $excludeList | where { $pathParts -like $_ } so $_ takes on the value of each member of the exclusion list. Now let's rewrite the original line abstractly as if (not_on_exclusion_list) { $_ }. That is, if it passes the condition, output the member of the current pipeline, i.e. the current item returned by Get-ChildItem.
@Tariq: The statement you referenced is not from my code above; that is from Keith Hill's original answer and is, in fact, one of the issues my code addresses. If you look above at my GetFiles function you will see that I use $item.FullName rather than just $item as the first argument to Test-Path, which should be all you need to make it work for you.
|
28

The Get-ChildItem cmdlet has an -Exclude parameter that is tempting to use but it doesn't work for filtering out entire directories from what I can tell. Try something like this:

function GetFiles($path = $pwd, [string[]]$exclude) 
{ 
    foreach ($item in Get-ChildItem $path)
    {
        if ($exclude | Where {$item -like $_}) { continue }

        if (Test-Path $item.FullName -PathType Container) 
        {
            $item 
            GetFiles $item.FullName $exclude
        } 
        else 
        { 
            $item 
        }
    } 
}

4 Comments

I like the way you use the if with a pipeline inside, excellent concise syntax and like @jonZ says you forgot the $exclude parameter in the recursive call
@jonZ, yes that arg shoud get passed down with the recursive call. Good catch.
An old post but may be worth clarifying. The directory check above didn't work in my case i.e. Test-Path $item -PathType Container. I had to use $item.PSIsContainer instead. PS: I am using Get-ChildItem with -Recurse switch (in case it has any effect).
This answer was mentioned in the blog post Practical PowerShell: Pruning File Trees and Extending Cmdlets (including having a bug(?) - same bug as mentioned in Michael Sorens' answer (same author)(?) .
12

Here's another option, which is less efficient but more concise. It's how I generally handle this sort of problem:

Get-ChildItem -Recurse .\targetdir -Exclude *.log |
  Where-Object { $_.FullName -notmatch '\\excludedir($|\\)' }

The \\excludedir($|\\)' expression allows you to exclude the directory and its contents at the same time.

Update: Please check the excellent answer from msorens for an edge case flaw with this approach, and a much more fleshed out solution overall.

3 Comments

+1 Can you please explain what the expression \\excludedir($|\\) does? Thank you.
Sure thing! It is a regular expression that matches any file or folder whose full path contains \excludedir. The ($|\\) part means the pattern matches the end of the full path name or a trailing backslash. So it would match \dir1\dir2\excludedir or dir1\excludedir\dir2. I highly recommend checking out the answer from @msorens. Aside from being an excellent answer in general, he points out a shortcoming in my approach.
+1 Thanks, really like this expression, have put it in my notebook. Please also see my comments to msorens' answer. FYI: Your solution also only goes two levels deep. Don't understand why this is.
1

A bit late, but try this one.

function Set-Files($Path) {
    if(Test-Path $Path -PathType Leaf) {
        # Do any logic on file
        Write-Host $Path
        return
    }

    if(Test-Path $path -PathType Container) {
        # Do any logic on folder use exclude on get-childitem
        # cycle again
        Get-ChildItem -Path $path | foreach { Set-Files -Path $_.FullName }
    }
}

# call
Set-Files -Path 'D:\myFolder'

Comments

0

Recently, I explored the possibilities to parameterize the folder to scan through and the place where the result of recursive scan will be stored. At the end, I also did summarize the number of folders scanned and number of files inside as well. Sharing it with community in case it may help other developers.

    ##Script Starts
    #read folder to scan and file location to be placed

    $whichFolder = Read-Host -Prompt 'Which folder to Scan?'  
    $whereToPlaceReport = Read-Host -Prompt 'Where to place Report'
    $totalFolders = 1
    $totalFiles = 0

    Write-Host "Process started..."

    #IMP separator ? : used as a file in window cannot contain this special character in the file name

    #Get Foldernames into Variable for ForEach Loop
    $DFSFolders = get-childitem -path $whichFolder | where-object {$_.Psiscontainer -eq "True"} |select-object name ,fullName

    #Below Logic for Main Folder
    $mainFiles = get-childitem -path "C:\Users\User\Desktop" -file
    ("Folder Path" + "?" + "Folder Name" + "?" + "File Name " + "?"+ "File Length" )| out-file "$whereToPlaceReport\Report.csv" -Append

    #Loop through folders in main Directory
    foreach($file in $mainFiles)
    {

    $totalFiles = $totalFiles + 1
    ("C:\Users\User\Desktop" + "?" + "Main Folder" + "?"+ $file.name + "?" + $file.length ) | out-file "$whereToPlaceReport\Report.csv" -Append
    }


    foreach ($DFSfolder in $DFSfolders)
    {
    #write the folder name in begining
    $totalFolders = $totalFolders + 1

    write-host " Reading folder C:\Users\User\Desktop\$($DFSfolder.name)"
    #$DFSfolder.fullName | out-file "C:\Users\User\Desktop\PoC powershell\ok2.csv" -Append
    #For Each Folder obtain objects in a specified directory, recurse then filter for .sft file type, obtain the filename, then group, sort and eventually show the file name and total incidences of it.

    $files = get-childitem -path "$whichFolder\$($DFSfolder.name)" -recurse

    foreach($file in $files)
    {
    $totalFiles = $totalFiles + 1
    ($DFSfolder.fullName + "?" + $DFSfolder.name + "?"+ $file.name + "?" + $file.length ) | out-file "$whereToPlaceReport\Report.csv" -Append
    }

    }


    # If running in the console, wait for input before closing.
    if ($Host.Name -eq "ConsoleHost")
    {

    Write-Host "" 
    Write-Host ""
    Write-Host ""

    Write-Host  "                            **Summary**"  -ForegroundColor Red
    Write-Host  "                            ------------" -ForegroundColor Red

    Write-Host  "                           Total Folders Scanned = $totalFolders "  -ForegroundColor Green
    Write-Host  "                           Total Files   Scanned = $totalFiles "     -ForegroundColor Green

    Write-Host "" 
    Write-Host "" 
        Write-Host "I have done my Job,Press any key to exit" -ForegroundColor white
        $Host.UI.RawUI.FlushInputBuffer()   # Make sure buffered input doesn't "press a key" and skip the ReadKey().
        $Host.UI.RawUI.ReadKey("NoEcho,IncludeKeyUp") > $null
    }

##Output

enter image description here

##Bat Code to run above powershell command

@ECHO OFF
SET ThisScriptsDirectory=%~dp0
SET PowerShellScriptPath=%ThisScriptsDirectory%MyPowerShellScript.ps1
PowerShell -NoProfile -ExecutionPolicy Bypass -Command "& {Start-Process PowerShell -ArgumentList '-NoProfile -ExecutionPolicy Bypass -File ""%PowerShellScriptPath%""' -Verb RunAs}";

Comments

0

Commenting here as this seems to be the most popular answer on the subject for searching for files whilst excluding certain directories in powershell.

To avoid issues with post filtering of results (i.e. avoiding permission issues etc), I only needed to filter out top level directories and that is all this example is based on, so whilst this example doesn't filter child directory names, it could very easily be made recursive to support this, if you were so inclined.

Quick breakdown of how the snippet works

$folders << Uses Get-Childitem to query the file system and perform folder exclusion

$file << The pattern of the file I am looking for

foreach << Iterates the $folders variable performing a recursive search using the Get-Childitem command

$folders = Get-ChildItem -Path C:\ -Directory -Name -Exclude Folder1,"Folder 2"
$file = "*filenametosearchfor*.extension"

foreach ($folder in $folders) {
   Get-Childitem -Path "C:/$folder" -Recurse -Filter $file | ForEach-Object { Write-Output $_.FullName }
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.