All,
There is an application which generates its export dumps. I need to write a script that compares the previous day's dump against the latest one, and if they differ I have to do some basic manipulation such as moving and deleting files.
I have tried to find a suitable way of doing this, and the method I tried was:
$var_com = diff (Get-Content D:\local\prodexport1 -Encoding Byte) (Get-Content D:\local\prodexport2 -Encoding Byte)
I tried the Compare-Object cmdlet as well. I notice very high memory usage, and after a few minutes I eventually get a System.OutOfMemoryException. Has any of you done something similar? Some thoughts, please.
There was a thread that mentioned a hash comparison, but I have no idea how to go about it.
Thanks in advance folks
Osp
5 Answers
With PowerShell 4 you can use native cmdlets to do this:
function CompareFiles {
    param(
        [string]$Filepath1,
        [string]$Filepath2
    )
    if ((Get-FileHash $Filepath1).Hash -eq (Get-FileHash $Filepath2).Hash) {
        Write-Host 'Files Match' -ForegroundColor Green
    } else {
        Write-Host 'Files do not match' -ForegroundColor Red
    }
}
PS C:\> CompareFiles .\20131104.csv .\20131104-copy.csv
Files Match
PS C:\> CompareFiles .\20131104.csv .\20131107.csv
Files do not match
You could easily modify the above function to return a $true or $false value if you want to use this programmatically on a large scale.
EDIT
After seeing this answer, I just wanted to supply a larger-scale version that simply returns true or false:
function CompareFiles
{
    param
    (
        [parameter(
            Mandatory = $true,
            HelpMessage = "Specifies the 1st file to compare. Make sure it's an absolute path with the file name and its extension."
        )]
        [string]
        $file1,
        [parameter(
            Mandatory = $true,
            HelpMessage = "Specifies the 2nd file to compare. Make sure it's an absolute path with the file name and its extension."
        )]
        [string]
        $file2
    )
    ( Get-FileHash $file1 ).Hash -eq ( Get-FileHash $file2 ).Hash
}
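To tie this back to the original question, a boolean hash check like the one above could drive the move/delete step roughly as follows. This is only a sketch: the function name Update-ExportDump, its parameters, and the idea of an archive folder are hypothetical, and the Get-FileHash comparison is inlined so the block runs on its own.

```powershell
# Hypothetical helper: delete the latest dump when it is an identical
# duplicate, otherwise move the previous dump into an archive folder.
function Update-ExportDump {
    param(
        [Parameter(Mandatory = $true)] [string] $Previous,   # yesterday's dump
        [Parameter(Mandatory = $true)] [string] $Latest,     # today's dump
        [Parameter(Mandatory = $true)] [string] $ArchiveDir  # assumed archive folder
    )

    # Equal hashes are taken to mean equal content (Get-FileHash defaults to SHA256).
    $same = (Get-FileHash -LiteralPath $Previous).Hash -eq (Get-FileHash -LiteralPath $Latest).Hash

    if ($same) {
        # Nothing changed, so the new dump is redundant.
        Remove-Item -LiteralPath $Latest
        return 'Unchanged'
    }

    # The dumps differ: keep the old one in the archive folder.
    Move-Item -LiteralPath $Previous -Destination $ArchiveDir
    return 'Changed'
}
```

Which file gets moved or deleted is purely an assumption here; swap the Remove-Item and Move-Item calls for whatever the real workflow needs.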
You could use fc.exe. It comes with Windows. Here's how you would use it:
fc.exe /b d:\local\prodexport2 d:\local\prodexport1 > $null
if (!$?) {
"The files are different"
}
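The $? check above treats any non-zero exit code as "different", which would include the case where fc.exe could not open one of the files. A slightly more defensive sketch inspects $LASTEXITCODE instead. It assumes fc.exe's usual exit codes (0 for identical files, 1 for different files, anything else for an error), and the function name Test-FcEqual is made up for illustration.

```powershell
# Hypothetical wrapper around fc.exe (Windows only).
function Test-FcEqual {
    param(
        [Parameter(Mandatory = $true)] [string] $Path1,
        [Parameter(Mandatory = $true)] [string] $Path2
    )

    # /b forces a binary comparison; the textual report is discarded.
    fc.exe /b $Path1 $Path2 > $null

    switch ($LASTEXITCODE) {
        0       { return $true }    # assumed meaning: files are identical
        1       { return $false }   # assumed meaning: files differ
        default { throw "fc.exe failed with exit code $LASTEXITCODE" }
    }
}
```

Throwing on any other exit code keeps a missing file from being silently reported as a difference.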
3 Comments
... if (!$?) and replace it with if ($LastExitCode -eq 0). Check out stackoverflow.com/q/10666101 and all the answers.
... $null = fc.exe ...?
Another method is to compare the MD5 hashes of the files:
$Filepath1 = 'c:\testfiles\testfile.txt'
$Filepath2 = 'c:\testfiles\testfile1.txt'
$hashes =
foreach ($Filepath in $Filepath1,$Filepath2)
{
    $MD5 = [Security.Cryptography.HashAlgorithm]::Create( "MD5" )
    $stream = ([IO.StreamReader]"$Filepath").BaseStream
    -join ($MD5.ComputeHash($stream) |
        ForEach { "{0:x2}" -f $_ })
    $stream.Close()
}
if ($hashes[0] -eq $hashes[1])
    {'Files Match'}
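One caveat when handing paths to .NET classes such as StreamReader: .NET resolves relative paths against the process working directory, which Set-Location (cd) does not update. One possible way around this is to convert the path to an absolute one first, as in the sketch below. Get-Md5Hex is a hypothetical helper name, and Convert-Path is just one of several cmdlets that could do the resolving.

```powershell
# Hypothetical helper: MD5 of a file as a lowercase hex string,
# safe to call with a path relative to PowerShell's current location.
function Get-Md5Hex {
    param([Parameter(Mandatory = $true)] [string] $Path)

    # Convert-Path resolves the path against PowerShell's current location
    # and returns an absolute file system path that .NET can open reliably.
    $fullPath = Convert-Path -LiteralPath $Path

    $md5 = [Security.Cryptography.MD5]::Create()
    $stream = [IO.File]::OpenRead($fullPath)
    try {
        # Format every hash byte as two lowercase hex digits and join them.
        -join ($md5.ComputeHash($stream) | ForEach-Object { '{0:x2}' -f $_ })
    }
    finally {
        # Release the file handle and the hash object even if hashing fails.
        $stream.Dispose()
        $md5.Dispose()
    }
}
```

Two files can then be compared with (Get-Md5Hex $Filepath1) -eq (Get-Md5Hex $Filepath2).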
3 Comments
... cd somewhere and then $FilePath1 = 'testfile.txt') but the StreamReader doesn't pick up PowerShell's change of folder and thinks it is relative to my home folder instead. The fix is to use $Filepath1 = Get-Item 'testfile.txt' instead, and then PowerShell passes the correct absolute path to StreamReader.
A while back I wrote an article on a buffered comparison routine to compare two files with PowerShell:
function FilesAreEqual {
    param(
        [System.IO.FileInfo] $first,
        [System.IO.FileInfo] $second,
        [uint32] $bufferSize = 524288)

    # Files of different length can never be equal, so skip reading them.
    if ($first.Length -ne $second.Length) { return $false }
    # Fall back to the default buffer size if 0 was passed.
    if ( $bufferSize -eq 0 ) { $bufferSize = 524288 }

    $fs1 = $first.OpenRead()
    $fs2 = $second.OpenRead()
    $one = New-Object byte[] $bufferSize
    $two = New-Object byte[] $bufferSize
    $equal = $true

    # Compare the files one buffer at a time and stop at the first difference.
    do {
        $bytesRead = $fs1.Read($one, 0, $bufferSize)
        $fs2.Read($two, 0, $bufferSize) | out-null
        if ( -Not [System.Linq.Enumerable]::SequenceEqual($one, $two)) {
            $equal = $false
        }
    } while ($equal -and $bytesRead -eq $bufferSize)

    $fs1.Close()
    $fs2.Close()
    return $equal
}
You can use it like this:
FilesAreEqual c:\temp\test.html c:\temp\test.html
A hash (like MD5) needs to traverse the entire file to do the hash calculation. This script returns as soon as it sees a difference in the buffer. It compares the buffers using LINQ, which is faster than native PowerShell.
9 Comments
... foreach that contains however many files of varying sizes, is yours more optimized than the 4.0 Get-FileHash?
... $BYTES_TO_READ to some higher value than 8. On my system reading 8 bytes per iteration was extremely slow. I don't know what the best value is, but increasing the buffer size to 32768 (32 KB) certainly made the file compare a lot snappier.
... $BYTES_TO_READ is not enough, because inside the loop the BitConverter calls only compare the first 8 bytes (= one Int64) of the buffer. After some deliberation I settled for a second, inner loop that iterates over the byte arrays and individually compares every byte. This is reasonably fast, and it's especially much faster than the ultra-slow Compare-Object cmdlet.
if ( (Get-FileHash c:\testfiles\testfile1.txt).Hash -eq (Get-FileHash c:\testfiles\testfile2.txt).Hash ) {
    Write-Output "Files match"
} else {
    Write-Output "Files do not match"
}
... -Raw parameter of Get-Content without any -Encoding, the comparison went much faster and easier.
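For small text files, the -Raw approach mentioned above could look like the sketch below. It assumes PowerShell 3 or later (where Get-Content -Raw is available) and that both files fit comfortably in memory, so it is probably not a fit for the large dumps in the question. It also compares the decoded text rather than the raw bytes. Test-TextFilesEqual is a hypothetical name.

```powershell
# Hypothetical helper: compare two text files by reading each as one string.
function Test-TextFilesEqual {
    param(
        [Parameter(Mandatory = $true)] [string] $Path1,
        [Parameter(Mandatory = $true)] [string] $Path2
    )

    # -Raw returns each file as a single string instead of an array of lines.
    $text1 = Get-Content -LiteralPath $Path1 -Raw
    $text2 = Get-Content -LiteralPath $Path2 -Raw

    # Ordinal comparison is case-sensitive and culture-independent;
    # the plain -eq operator would treat 'ABC' and 'abc' as equal.
    [string]::Equals($text1, $text2, [StringComparison]::Ordinal)
}
```

The explicit ordinal comparison is a deliberate choice here, since a case-insensitive match would hide real differences between two dumps.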