1

I have a 12 MB file whose contents I load into RichTextBox1 for processing. It takes about 4 seconds to finish. Someone told me to use RegexOptions.Compiled to make it faster, but I don't see any difference between the two versions.

Debug.Print(ParseData2(RichTextBox1.Text, "start", "end"))

This is the function; the commented-out line is the variant I am testing:

Function ParseData2(strData As String, ByVal sStart As String, ByVal sStop As String)
    'Dim r As New Regex(sStart & "(.*?)(" & sStop & "|$)", RegexOptions.Multiline Or RegexOptions.IgnoreCase Or RegexOptions.Compiled)
    Dim r As New Regex(sStart & "(.*?)(" & sStop & "|$)", RegexOptions.Multiline Or RegexOptions.IgnoreCase)
    Dim matches = r.Matches(strData)
    Dim i As Integer = 1
    For Each m As Match In matches
        'Debug.Print("    match #" & i & ": " & m.Groups(1).Value)
        i += 1
    Next
    Return matches.Count
End Function

2 Answers

3

Regular expressions are rarely the fastest option; for fixed delimiters, a simple string split/substring is usually faster than a regular expression.
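As a rough illustration of the split/substring idea, here is a minimal sketch that counts the same kind of spans with IndexOf instead of a regex. The module and function names (DelimiterScan, CountSpans) are hypothetical and not from the question. It assumes the delimiters are literal text rather than regex patterns, and it has not been benchmarked against the 12 MB file.

```vb
Imports System
Imports Microsoft.VisualBasic

Module DelimiterScan
    ' Hypothetical helper: counts spans that begin with sStart and run to the
    ' next sStop or to the end of the line, whichever comes first (ignoring case).
    ' Assumption: sStart and sStop are literal strings, not regex patterns.
    Function CountSpans(strData As String, sStart As String, sStop As String) As Integer
        If String.IsNullOrEmpty(sStart) Then
            Throw New ArgumentException("sStart must not be empty.", NameOf(sStart))
        End If

        Dim count As Integer = 0
        Dim pos As Integer = 0

        While pos < strData.Length
            Dim startIdx As Integer = strData.IndexOf(sStart, pos, StringComparison.OrdinalIgnoreCase)
            If startIdx < 0 Then Exit While

            ' First character after the start delimiter.
            Dim bodyIdx As Integer = startIdx + sStart.Length
            Dim stopIdx As Integer = strData.IndexOf(sStop, bodyIdx, StringComparison.OrdinalIgnoreCase)
            Dim eolIdx As Integer = strData.IndexOf(ControlChars.Lf, bodyIdx)

            If stopIdx >= 0 AndAlso (eolIdx < 0 OrElse stopIdx < eolIdx) Then
                ' The stop delimiter appears before the line ends.
                pos = stopIdx + sStop.Length
            ElseIf eolIdx >= 0 Then
                ' No stop delimiter on this line: the span ends at the line break.
                pos = eolIdx
            Else
                ' Last line of the input with no stop delimiter.
                pos = strData.Length
            End If

            count += 1
        End While

        Return count
    End Function
End Module
```

Calling CountSpans(RichTextBox1.Text, "start", "end") would then stand in for the ParseData2 call in the question, under the assumptions above.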

From MSDN: http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regexoptions.aspx

RegexOptions.Compiled:

Specifies that the regular expression is compiled to an assembly. This yields faster execution but increases startup time. This value should not be assigned to the Options property when calling the CompileToAssembly method.

From the MSDN forums: http://social.msdn.microsoft.com/Forums/en-US/2b1dd1ad-2ea9-46df-a15a-61a40efcf113/regexoptionscompiled?forum=regexp

When you specify the RegexOptions.Compiled option, the framework will create a dynamic assembly with a custom method that will handle the regular expression (a pre-compiled version of the regex).

The problem is that compiling the regular expression to a dynamic assembly takes a long time, so the first time a Regex object is created with the Compiled option, it will take a very long time. Subsequent calls to Match() or Replace() will execute a little bit faster than a non-compiled regex.

Pre-compiling a Regex is only useful if you create the Regex object early in your application, and re-use it very often.
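To make the "create early, re-use often" advice concrete, here is a minimal sketch that builds the compiled Regex once and reuses it on every call. The names ParserCache, SpanRegex and CountMatches are hypothetical, and the pattern is hard-coded with the "start" and "end" delimiters from the question; if the delimiters change between calls, you would need one cached instance per pattern.

```vb
Imports System.Text.RegularExpressions

Module ParserCache
    ' Built once, when the module is first used, so the compilation cost
    ' is paid a single time instead of on every call.
    Private ReadOnly SpanRegex As New Regex(
        "start(.*?)(end|$)",
        RegexOptions.Multiline Or RegexOptions.IgnoreCase Or RegexOptions.Compiled)

    ' Hypothetical wrapper: every call reuses the same compiled instance.
    Function CountMatches(strData As String) As Integer
        Return SpanRegex.Matches(strData).Count
    End Function
End Module
```

In the question's function the Regex is constructed inside ParseData2, so a Compiled version is rebuilt on each call, which may explain why no speed-up is visible for a single run.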

I'm not an expert, but I don't think you can do much more than you have already done to gain speed with regular expressions. Writing the debug output to a stream might display it "faster", but probably by a second or less.

To improve the code a little, and so the compiler does not have to infer types, you can declare the return type of the function:

 Private Function ParseData2(...) As Integer

And the type of the matches variable:

 Dim matches As MatchCollection = r.Matches(strData)

Also, .NET collections are indexed from 0, not from 1, so you may want to start the counter at 0:

 Dim i As Integer = 0
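Putting the typing suggestions together, the function might look like the sketch below. The TypedParse module wrapper and Option Strict On are additions for illustration only; the pattern and options are copied from the question, and the unused counter is left out. This is a tidier version, not a faster one.

```vb
Option Strict On
Imports System.Text.RegularExpressions

Module TypedParse
    ' Same pattern and options as the question, with the return type
    ' and the type of the matches variable declared explicitly.
    Function ParseData2(strData As String, sStart As String, sStop As String) As Integer
        Dim r As New Regex(sStart & "(.*?)(" & sStop & "|$)",
                           RegexOptions.Multiline Or RegexOptions.IgnoreCase)
        Dim matches As MatchCollection = r.Matches(strData)
        Return matches.Count
    End Function
End Module
```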

0

I have no idea about RegexOptions.Compiled, but may I ask: do you just need to count the occurrences? If so, maybe you should try this:

match.Groups("Digits").Captures.Count

See more here: http://blogs.msdn.com/b/ericgu/archive/2006/03/06/544553.aspx
