I use PowerShell as much as possible for quick and easy scripting tasks. At work I often use it for data parsing, sifting through log files, or creating CSV and text files.
One thing I can't figure out is why certain data and I/O tasks are so inefficient in it. I assume it has to do with how PowerShell handles pipelines under the hood, or with something I haven't understood yet.
If you take the following logic to generate ABC123-style IDs (three letters followed by a zero-padded number from 001 to 999), compile it as C# from within PowerShell using Add-Type, and execute it, it takes less than 1 minute to complete:
$source = @'
public static System.Collections.Generic.List<String> GetIds()
{
    System.Collections.Generic.List<String> retValue = new System.Collections.Generic.List<String>();
    for (int left = 97; left < 123; left++)
    {
        for (int middle = 97; middle < 123; middle++)
        {
            for (int right = 97; right < 123; right++)
            {
                for (int i = 1; i < 1000; i++)
                {
                    String tmp = String.Format("{0}{1}{2}000", (char)left, (char)middle, (char)right);
                    retValue.Add(String.Format("{0}{1}", tmp.Substring(0, tmp.Length - i.ToString().Length), i));
                }
            }
        }
    }
    return retValue;
}
'@
$util = Add-Type -Name "Utils" -MemberDefinition $source -PassThru -Language CSharp
$start = get-date
$ret = $util::GetIds()
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)
Now take the same logic and run it as plain PowerShell, without compiling it into an assembly, and it takes hours to complete:
$start = Get-Date
$retValue = @()
for ($left = 97; $left -lt 123; $left++)
{
    for ($middle = 97; $middle -lt 123; $middle++)
    {
        for ($right = 97; $right -lt 123; $right++)
        {
            for ($i = 1; $i -lt 1000; $i++)
            {
                $tmp = ("{0}{1}{2}000" -f [char]$left, [char]$middle, [char]$right)
                $retValue += ("{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i)
            }
        }
    }
}
Write-Host ("Time: {0} minutes" -f ((get-date)-$start).TotalMinutes)
Why is that? Is there some sort of excessive type casting or inefficient operation I am using that slows down performance?
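One difference between the two versions that may be worth isolating: the C# code appends to a List<String>, while the PowerShell code uses += on an array, which creates a new array and copies every existing element on each append. Below is a minimal sketch for testing that hypothesis, keeping the loops in PowerShell but swapping the array for a generic list. The function name New-IdList and its -LastLetter parameter are hypothetical and only there so a small slice of the alphabet can be timed first. This is not a confirmed explanation of the whole slowdown.

```powershell
# Hypothetical helper (not part of the original script): same ID logic,
# but appending to a generic List[string] instead of using += on an array.
function New-IdList {
    param(
        # Character code of the last letter to loop to; 97..99 is 'a'..'c'.
        # The original loops run to 122 ('z').
        [int]$LastLetter = 99
    )

    $ids = New-Object 'System.Collections.Generic.List[string]'
    for ($left = 97; $left -le $LastLetter; $left++) {
        for ($middle = 97; $middle -le $LastLetter; $middle++) {
            for ($right = 97; $right -le $LastLetter; $right++) {
                for ($i = 1; $i -lt 1000; $i++) {
                    $tmp = "{0}{1}{2}000" -f [char]$left, [char]$middle, [char]$right
                    $id = "{0}{1}" -f $tmp.Substring(0, $tmp.Length - $i.ToString().Length), $i
                    # List.Add returns nothing, so no stray output reaches the pipeline.
                    $ids.Add($id)
                }
            }
        }
    }
    # The unary comma keeps PowerShell from unrolling the list on return.
    return ,$ids
}

$start = Get-Date
$ids = New-IdList -LastLetter 99
Write-Host ("Time: {0} seconds for {1} ids" -f ((Get-Date) - $start).TotalSeconds, $ids.Count)
```

If this version scales roughly linearly as -LastLetter is raised toward 122 while the += version does not, that would point at the array append rather than type casting as the main cost.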