0

If I have two values, e.g. ABC001 and ABC100, or A0B0C1 and A1B0C0, is there a regex I can use to make sure the two values have the same pattern?

11
  • 2
    Can you give us more examples or explain the pattern better? Commented Dec 9, 2010 at 15:46
  • 1
    Do you know the pattern in advance? Is the pattern constant? Or do you want to be able to match them, if they're the same "pattern" even if you haven't seen that pattern before? Commented Dec 9, 2010 at 15:47
  • 2
    What defines the "same pattern"? Do you mean that they have digits at the same places in both strings, and letters at the same places in both strings? So AA1 has the same "pattern" as AA0 but not A1A? A little more clarification would be helpful. Commented Dec 9, 2010 at 15:47
  • The issue is I'm not sure what the pattern is. It could be different. I have 2 values that contain alphanumeric characters and I want to make sure the first value has the same pattern as the second. Commented Dec 9, 2010 at 15:47
  • 3
    What do you mean by "same pattern"? I can come up with dozens of patterns which will match any of these strings. Commented Dec 9, 2010 at 15:49

5 Answers 5

2

Well, here's my shot at it. This doesn't use regular expressions, and assumes s1 and s2 only contain letters or digits:

public static bool SamePattern(string s1, string s2)
{
   if (s1.Length == s2.Length)
   {
      char[] chars1 = s1.ToCharArray();
      char[] chars2 = s2.ToCharArray();

      for (int i = 0; i < chars1.Length; i++)
      {
         if (!Char.IsDigit(chars1[i]) && chars1[i] != chars2[i])
         {
            return false;
         }
         else if (Char.IsDigit(chars1[i]) != Char.IsDigit(chars2[i]))
         {
            return false;
         }
      }

      return true;
   }
   else
   {
      return false;
   }
}

A description of the algorithm is as follows:

  1. If the strings have different lengths, return false.
  2. Otherwise, check the characters in the same position in both strings:
    1. If they are both digits, move on to the next iteration.
    2. If they aren't digits and aren't the same character, return false.
    3. If one is a digit and the other is a letter, return false.
  3. If all characters in both strings were checked successfully, return true.
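The steps above can also be sketched more compactly with LINQ. This is only an illustration, not the answerer's code: the class name PatternCheck is an assumption, and it follows the rule clarified in the comments (letters must be identical, digits may differ).

```csharp
using System;
using System.Linq;

public static class PatternCheck
{
    // Hypothetical helper (not from the original answer): true when both strings
    // have the same length, digits sit at the same positions, and every
    // non-digit character is identical.
    public static bool SamePattern(string s1, string s2)
    {
        if (s1 == null || s2 == null || s1.Length != s2.Length)
        {
            return false;
        }

        // Pair up the characters position by position and check each pair.
        return s1.Zip(s2, (a, b) => char.IsDigit(a) ? char.IsDigit(b) : a == b)
                 .All(ok => ok);
    }
}
```

With this sketch, PatternCheck.SamePattern("ABC001", "ABC002") is true, while PatternCheck.SamePattern("EFG001", "ABC002") and PatternCheck.SamePattern("AA1", "A1A") are false.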

6 Comments

The problem is if you test SamePattern("EFG001", "ABC002"); the result is true but I want it to return false as the letters are different
@jon string1 == string2; I think we need a more detailed description of your rules of what a matching pattern is. Do the letters need to be the same, but the numbers may change?
Letters need to be the same but the numbers can change. I have a feeling I can't see the wood for the trees.
Ok, I've updated it. This should work for you, SamePattern("ABC001", "ABC002") returns true while SamePattern("EFG001", "ABC002") returns false.
If I may, you can greatly improve readability by using c1 >= '0', or Char.IsDigit.
2

If you don't know the pattern in advance, but are only going to encounter two groups of characters (alpha and digits), then you could do the following:

Write some C# that parses the first string, looks at each char to determine whether it's a letter or a digit, and then generates a regex from that pattern.

You may find that there's no point writing code to generate a regex, as it could be just as simple to check the second string against the first.
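For illustration, the regex-generation idea might look like the sketch below. The names PatternRegex and FromTemplate are assumptions, as is the rule that each digit becomes \d while every other character must match literally.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class PatternRegex
{
    // Hypothetical helper: builds an anchored regex from a template string.
    // Each digit becomes \d; every other character is escaped so that it
    // has to match literally.
    public static Regex FromTemplate(string template)
    {
        var sb = new StringBuilder(@"\A");   // \A anchors at the start of the input
        foreach (char c in template)
        {
            sb.Append(char.IsDigit(c) ? @"\d" : Regex.Escape(c.ToString()));
        }
        sb.Append(@"\z");                    // \z anchors at the very end of the input
        return new Regex(sb.ToString());
    }
}
```

For example, PatternRegex.FromTemplate("ABC001") yields the pattern \AABC\d\d\d\z, which matches "ABC100" but not "EFG001" or "ABC0010".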

Alternatively, without regex:

First check that the strings are the same length. Then loop through both strings at the same time, char by char. If char[x] from string 1 is a letter, char[x] from string 2 must be the same letter; if it is a digit, char[x] from string 2 must also be a digit. If every position passes, your patterns match.

Try this, it should cope if a string sneaks in some symbols. Edited to compare character values ... and use Char.IsLetter and Char.IsDigit

private bool matchPattern(string string1, string string2)
{
    // Strings of different lengths can never match; returning early also
    // avoids indexing past the end of the shorter string.
    if (string1.Length != string2.Length)
    {
        return false;
    }

    for (int i = 0; i < string1.Length; i++)
    {
        char c1 = string1[i];
        char c2 = string2[i];

        // Letters and digits must appear at the same positions.
        if (Char.IsLetter(c1) != Char.IsLetter(c2) || Char.IsDigit(c1) != Char.IsDigit(c2))
        {
            return false;
        }
        // Letters must be identical; digits may differ.
        if (Char.IsLetter(c1) && c1 != c2)
        {
            return false;
        }
    }
    return true;
}

1

Consider using Char.GetUnicodeCategory
You can write a helper class for this task:

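using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Mask
{
    public Mask(string originalString)
    {
        OriginalString = originalString;
        // Record the Unicode category (uppercase letter, decimal digit, ...) of each character.
        CharCategories = originalString.Select(Char.GetUnicodeCategory).ToList();
    }

    public string OriginalString { get; private set; }
    public IEnumerable<UnicodeCategory> CharCategories { get; private set; }

    public bool HasSameCharCategories(Mask other)
    {
        if (other == null)
        {
            return false;
        }
        // Equal only if both strings have the same categories in the same order.
        return CharCategories.SequenceEqual(other.CharCategories);
    }
}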

Use as

Mask mask1 = new Mask("ab12c3");
Mask mask2 = new Mask("ds124d");
MessageBox.Show(mask1.HasSameCharCategories(mask2).ToString());

4 Comments

I haven't run this but I would expect the result to be false as although the pattern matches the letters used are different.
+1 for GetUnicodeCategory, although I confess I found the code hard to follow, I think the addition of a bit of explicit typing would help? Such as List<UnicodeCategory> CharCategories
@Jon - then I've misunderstood your question completely, and may delete the answer promptly. So AA0 and AA1 are the same, but AA0 and BB0 are not?
@Andrew - It isn't my best work, I see that :) Just a quick demo.
0

I don't know C# syntax, but here is some pseudocode:

  • split each string on '' (into single characters)
  • sort the 2 arrays
  • join each array with ''
  • compare the 2 strings
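In C#, the pseudocode above might be sketched as follows. The names SortedCompare and SameCharacters are assumptions. Note that this checks whether the two strings use the same characters in any order, which is not the same as the positional rule clarified in the comments on the first answer.

```csharp
using System;
using System.Linq;

public static class SortedCompare
{
    // Hypothetical helper: sorts the characters of each string and compares
    // the results, so it is true when one string is a rearrangement of the other.
    public static bool SameCharacters(string s1, string s2)
    {
        if (s1 == null || s2 == null)
        {
            return false;
        }

        string sorted1 = new string(s1.OrderBy(c => c).ToArray());
        string sorted2 = new string(s2.OrderBy(c => c).ToArray());
        return sorted1 == sorted2;
    }
}
```

SortedCompare.SameCharacters("ABC001", "ABC100") is true, but so is SortedCompare.SameCharacters("ABC001", "CBA100"), and SortedCompare.SameCharacters("ABC001", "ABC002") is false.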

0

A general-purpose solution with LINQ can be achieved quite easily. The idea is:

  1. Sort the two strings (reordering the characters).
  2. Compare the two sorted strings as character sequences using SequenceEqual.

This scheme enables a short, graceful and configurable solution, for example:

// We will be using this in SequenceEqual
class MyComparer : IEqualityComparer<char>
{
    public bool Equals(char x, char y)
    {
        return x.Equals(y);
    }

    public int GetHashCode(char obj)
    {
        return obj.GetHashCode();
    }
}

// and then:
var s1 = "ABC0102";
var s2 = "AC201B0";

Func<char, double> orderFunction = char.GetNumericValue;
var comparer = new MyComparer();
var result = s1.OrderBy(orderFunction).SequenceEqual(s2.OrderBy(orderFunction), comparer);

Console.WriteLine("result = " + result);

As you can see, it all fits in three lines of code (not counting the comparer class), and it is easily configurable.

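  • The code as it stands orders by char.GetNumericValue, which returns -1 for every non-numeric character. Because OrderBy is a stable sort, the letters keep their original relative order and only the digits are sorted, so it checks that the digits of s1 are a permutation of the digits of s2 while the letters appear in the same order (for the sample values above the result is False). For a full permutation check, order by the character itself (c => c).
  • Do you want to check if s1 has the same number and kind of characters as s2, but not necessarily the same characters (e.g. "ABC" to be equal to "ABB")? No problem, change MyComparer.Equals to return char.GetUnicodeCategory(x).Equals(char.GetUnicodeCategory(y));.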
  • By changing the values of orderFunction and comparer you can configure a multitude of other comparison options.

And finally, since I don't find it very elegant to define a MyComparer class just to enable this scenario, you can also use the technique described in this question:

Wrap a delegate in an IEqualityComparer

to define your comparer as an inline lambda. This would result in a configurable solution contained in 2-3 lines of code.
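A minimal sketch of such a delegate-wrapping comparer is shown below. The names LambdaComparer, LambdaComparerDemo and SameCategories are assumptions for illustration and are not taken from the linked question. The demo compares the strings position by position, without the sorting step, using the Unicode-category rule mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical adapter: wraps a lambda so it can be passed wherever an
// IEqualityComparer<T> is expected.
public sealed class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;

    public LambdaComparer(Func<T, T, bool> equals)
    {
        if (equals == null)
        {
            throw new ArgumentNullException(nameof(equals));
        }
        _equals = equals;
    }

    public bool Equals(T x, T y)
    {
        return _equals(x, y);
    }

    // A constant hash code is consistent with any Equals implementation.
    // That is adequate for SequenceEqual, but it would make hash-based
    // collections slow.
    public int GetHashCode(T obj)
    {
        return 0;
    }
}

public static class LambdaComparerDemo
{
    // True when both strings have the same Unicode category at every position.
    public static bool SameCategories(string s1, string s2)
    {
        var comparer = new LambdaComparer<char>(
            (x, y) => char.GetUnicodeCategory(x) == char.GetUnicodeCategory(y));
        return s1.SequenceEqual(s2, comparer);
    }
}
```

Here LambdaComparerDemo.SameCategories("ABC", "ABB") is true, while LambdaComparerDemo.SameCategories("AB1", "ABC") is false.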
