I have an SSIS Script Task written in C# that I want to port to PowerShell so it can be run as a standalone script. The C# version runs in 12.1s, but the PowerShell version takes 100.5s, almost an order of magnitude slower. I'm processing 11 text files (CSV), each with about 3-4 million rows, in the following format:
<TICKER>,<DTYYYYMMDD>,<TIME>,<OPEN>,<HIGH>,<LOW>,<CLOSE>,<VOL>
AUDJPY,20010102,230100,64.30,64.30,64.30,64.30,4
AUDJPY,20010102,230300,64.29,64.29,64.29,64.29,4
<snip>
I simply want to write out to a new file the rows whose date column (<DTYYYYMMDD>) is 20110101 or later. Here's my C# version:
private void ProcessFile(string fileName)
{
    string outfile = fileName + ".processed";
    StringBuilder sb = new StringBuilder();
    using (StreamReader sr = new StreamReader(fileName))
    {
        string line;
        int year;
        while ((line = sr.ReadLine()) != null)
        {
            // Characters 7-10 hold the year: they follow the six-character ticker and the comma.
            year = Convert.ToInt32(line.Substring(7, 4));
            if (year >= 2011)
            {
                // Append the line that was just tested, not a freshly read one.
                sb.AppendLine(line);
            }
        }
    }
    using (StreamWriter sw = new StreamWriter(outfile))
    {
        sw.Write(sb.ToString());
    }
}
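To make the year check concrete, here is a minimal sketch of what Substring(7, 4) picks out of the first sample row above. It assumes every ticker is exactly six characters long, as in the sample data; the class name YearFilterSketch is just a placeholder for illustration.

```csharp
using System;

class YearFilterSketch
{
    static void Main()
    {
        // First sample row from the data above (assumes a six-character ticker).
        string line = "AUDJPY,20010102,230100,64.30,64.30,64.30,64.30,4";

        // Index 0-5 is the ticker, index 6 is the comma, index 7-10 is the year.
        int year = Convert.ToInt32(line.Substring(7, 4));

        Console.WriteLine(year);          // prints 2001
        Console.WriteLine(year >= 2011);  // prints False, so this row is skipped
    }
}
```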
Here's my PowerShell version:
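foreach ($file in ls $PriceFolder\*.txt) {
    $outFile = $file.FullName + ".processed"
    $sr = New-Object System.IO.StreamReader($file.FullName)
    $sw = New-Object System.IO.StreamWriter($outFile)
    # Parenthesize the assignment so $line holds the text, not the result of -ne.
    while (($line = $sr.ReadLine()) -ne $null)
    {
        # Same test as the C# version: keep rows from 2011 onwards.
        if ([int]$line.Substring(7, 4) -ge 2011) { $sw.WriteLine($line) }
    }
    # Close both streams so the output is flushed and the file handles are released.
    $sr.Close()
    $sw.Close()
}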
How can I get the same performance in PowerShell that I get from my C# Script Task in SSIS?