Compare two strings using Regex

Question

I am using two strings for a matching program like this:

string s1 = "5-4-6-+1-+1+1+3000+12+21-+1-+1-+1-+2-3-4-5-+1-+10+1-+1-+";
string s2 = "6-+1-+1+1+3000+12+21-+1-+1-+1-+1-+1-+1+1-+1-+";

I want to write a Regex matching function that compares the parts of the two strings between each "+" separately and calculates the match percent, that is, the number of parts that match out of the total number of parts.

In this example 13 of the 15 parts match, so the match percent is 13*100/15 ≈ 87%.
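The tally behind that figure can be reconstructed from the two strings. This is a worked sketch that assumes a part counts as a match when the two sides share at least one number (the rule the function below applies with Intersect) and that the empty piece after the trailing "+" is ignored:

```text
 #   s1 part      s2 part      match
 1   5-4-6-       6-           yes (6 is shared)
 2   1-           1-           yes
 3   1            1            yes
 4   1            1            yes
 5   3000         3000         yes
 6   12           12           yes
 7   21-          21-          yes
 8   1-           1-           yes
 9   1-           1-           yes
10   1-           1-           yes
11   2-3-4-5-     1-           no
12   1-           1-           yes
13   10           1            no
14   1-           1-           yes
15   1-           1-           yes

13 matches out of 15 parts: 13 * 100 / 15 = 86.67, which rounds to 87%.
```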

Currently I am using the function below, but I think it is not optimized and using Regex may be faster.

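// Requires: using System; using System.Linq;
public double MatchPercent(string s1, string s2) {
    int percent = 0;
    // Split on '+' and drop the empty piece left by the trailing '+'.
    string[] User = s1.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
    string[] Policy = s2.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
    if (User.Length == 0) return 0;

    // Compare parts position by position; stop at the end of the shorter list.
    for (int i = 0; i < Math.Min(User.Length, Policy.Length); i++) {
        int[] U = User[i].Split('-').Where(a => a != "").Select(n =>
                      Convert.ToInt32(n)).Distinct().ToArray();
        int[] P = Policy[i].Split('-').Where(a => a != "").Select(n =>
                      Convert.ToInt32(n)).Distinct().ToArray();
        // A part matches when the two sides share at least one number.
        var Co = U.Intersect(P);
        if (Co.Any()) {
            percent += 1;
        }
    }
    // Divide by the number of parts (not characters), in floating point.
    return Math.Round(percent * 100.0 / User.Length);
}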

I don't understand what you want to do. In your for loop, you don't use the iterator value, so you should always get either a 98% match or a 0% match.
– Kirill Bestemyanov, Commented Jun 8, 2013 at 9:29
I don't think regular expressions will work. Specifically, I don't think you can maintain state (i.e. the sameness count) over a regex this way. And calculating this after the match would require a variable number of capture groups.
– jpaugh, Commented Jun 8, 2013 at 16:05
This function first splits the two strings on "+" and then finds the matching numbers in each part. @KirillBestemyanov I edited the function again; it was a typing mistake on my part.
– Kamran, Commented Jun 8, 2013 at 17:02
This is essentially an alignment problem. You need a suitable sequence alignment algorithm here, not regular expressions.
– Konrad Rudolph, Commented Jun 8, 2013 at 18:28
Konrad's right; instead of making your job easier, switching to a regex solution will make it much more difficult, if not impossible.
– Alan Moore, Commented Jun 8, 2013 at 19:00

Andras Sebo · Accepted Answer · 2013-06-12 08:46:40Z

2

A better solution would be the Levenshtein word distance algorithm. Some C# samples:

From the matching characters you can also calculate the percentages.
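As a rough illustration of that idea, here is a minimal sketch. It assumes the "words" are the "+"-separated parts of each string and that two parts are equal only when their text is identical, so the question's looser "shares at least one number" rule is not applied. The names SegmentDistance, Levenshtein and SimilarityPercent are made up for this example and do not come from the linked samples.

```csharp
using System;

public static class SegmentDistance
{
    // Dynamic-programming Levenshtein distance, applied to whole parts
    // (tokens) instead of single characters.
    public static int Levenshtein(string[] a, string[] b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i; // delete every part of a
        for (int j = 0; j <= b.Length; j++) d[0, j] = j; // insert every part of b

        for (int i = 1; i <= a.Length; i++)
        {
            for (int j = 1; j <= b.Length; j++)
            {
                // Substitution is free when the two parts are identical.
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        return d[a.Length, b.Length];
    }

    // Hypothetical helper: converts the distance into a similarity percentage,
    // normalised by the longer list of parts (an assumption, not from the answer).
    public static double SimilarityPercent(string s1, string s2)
    {
        string[] a = s1.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
        string[] b = s2.Split(new[] { '+' }, StringSplitOptions.RemoveEmptyEntries);
        int longest = Math.Max(a.Length, b.Length);
        if (longest == 0) return 100.0;
        return 100.0 * (longest - Levenshtein(a, b)) / longest;
    }
}
```

Unlike a position-by-position comparison, an edit distance tolerates a part that was inserted or dropped in the middle: SegmentDistance.SimilarityPercent("1+2+3+4+", "1+2+4+") needs one deletion across four parts and returns 75.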

answered Jun 12, 2013 at 8:06 by Andras Sebo

edited Jun 12, 2013 at 8:46 by user447356
