I'm writing a C# program to update the leading comment (usually the license header) of Java source files. The following snippet does the job.
foreach (string r in allfiles)
{
    // GC.Collect();
    string thefile = System.IO.File.ReadAllText(r);
    var pattern = @"/\*(?s:.*?)\*/[\s\S]*?package";
    Regex regex1 = new Regex(pattern /*,RegexOptions.Compiled */);
    var replaced = regex1.Replace(thefile, newheader + "package");
    System.IO.File.WriteAllText(r, replaced);
}
The problem is that after hundreds of source files have been processed, the process hangs at .Replace
It's not a garbage collection problem, as forcing a collection doesn't solve the issue. It also doesn't matter whether RegexOptions.Compiled is used or not.
I'm fairly sure the pattern is at fault, because the hang appears on specific files: if I remove them from processing, the job continues to the end of about a thousand source files. But if I process these files on their own, it works, and it also works in online testing tools such as http://regexstorm.net/tester and https://www.myregextester.com/index.php
Please let me know if there is a better way to write the search pattern for finding the first comment in a Java file.
Thank you in advance.
newheader?
The pattern is poorly written. You can rewrite it as @"(?s)/\*[^*]*\*/.*?package", and it can be improved further, but without sample input (the string it hangs on) it is difficult to help efficiently.
Try the @"/\*[^*]*(?:\*(?!/)[^*]*)*\*/\s*package" regex. Or even @"/\*[^*]*(?:\*(?!/)[^*]+)*\*/\s*package".
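As a rough sketch of how the second suggested pattern could be wired into the loop: the class name HeaderReplacer, the method ReplaceHeader, and the 2-second timeout are all hypothetical choices for illustration, not something stated in the thread. The sketch builds the regex once instead of once per file, replaces only the first match, and uses a match timeout so that a slow file raises a RegexMatchTimeoutException rather than hanging silently.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderReplacer
{
    // Pattern taken from the comment above: a block comment, optional
    // whitespace, then the "package" keyword. Each pass through the inner
    // group must consume a '*', which keeps backtracking bounded.
    // The 2-second timeout is an arbitrary value chosen for this sketch.
    private static readonly Regex HeaderPattern = new Regex(
        @"/\*[^*]*(?:\*(?!/)[^*]*)*\*/\s*package",
        RegexOptions.None,
        TimeSpan.FromSeconds(2));

    // Replaces the first "comment + package" match with newHeader + "package".
    // A MatchEvaluator is used so that '$' characters in newHeader are
    // inserted literally instead of being read as substitution tokens.
    // Throws RegexMatchTimeoutException if matching exceeds the timeout.
    public static string ReplaceHeader(string source, string newHeader)
    {
        return HeaderPattern.Replace(source, m => newHeader + "package", 1);
    }
}
```

Inside the original foreach, the call would be HeaderReplacer.ReplaceHeader(thefile, newheader), wrapped in a try/catch for RegexMatchTimeoutException that logs r. That would at least identify which files are slow, so one of them can be posted as the sample input asked for above.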