Using Regex.Replace with LINQ

Question

I have a small helper that converts a Unicode string to a non-Unicode (unaccented) string. It looks like this:

using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StringHelper
{
    public static string ToNonUnicode(this string source)
    {
        if (!string.IsNullOrEmpty(source))
        {
            source = source.Trim().Replace(".", "");

            #region rule
            IDictionary<string, string> dict = new Dictionary<string, string>
            {
                { @"\-|\,", "" },
                { @"\s{2}", " " },

                { "à|á|ả|ã|ạ|ă|â|ấ|ầ|ẩ|ẫ|ậ|ằ|ẳ|ắ|ẵ|ặ", "a" },
                { "á|à|ả|ã|ạ|â|ă|ấ|ầ|ẩ|ẫ|ậ|ắ|ằ|ẳ|ẵ|ặ", "a" },

                { "À|Á|Ả|Ã|Ạ|Ă|Â|Ầ|Ấ|Ẩ|Ẫ|Ậ|Ằ|Ắ|Ẳ|Ẵ|Ặ", "A" },
                { "Á|À|Ả|Ã|Ạ|Â|Ă|Ấ|Ầ|Ẩ|Ẫ|Ậ|Ắ|Ằ|Ẳ|Ẵ|Ặ", "A" },

                { "ò|ó|ỏ|õ|ọ|ô|ơ|ồ|ố|ổ|ỗ|ộ|ờ|ớ|ở|ỡ|ợ", "o" },
                { "ó|ò|ỏ|õ|ọ|ô|ơ|ố|ồ|ổ|ỗ|ộ|ớ|ờ|ở|ỡ|ợ", "o" },

                { "Ò|Ó|Ỏ|Õ|Ọ|Ô|Ơ|Ồ|Ố|Ổ|Ỗ|Ộ|Ờ|Ớ|Ở|Ỡ|Ợ", "O" },
                { "Ó|Ò|Ỏ|Õ|Ọ|Ô|Ơ|Ố|Ồ|Ổ|Ỗ|Ộ|Ớ|Ờ|Ở|Ỡ|Ợ", "O" },

                { "è|é|ẻ|ẽ|ẹ|ê|ề|ế|ể|ễ|ệ", "e" },
                { "é|è|ẻ|ẽ|ẹ|ê|ế|ề|ể|ễ|ệ", "e" },

                { "È|É|Ẻ|Ẽ|Ẹ|Ê|Ề|Ế|Ể|Ễ|Ệ", "E" },
                { "É|È|Ẻ|Ẽ|Ẹ|Ê|Ế|Ề|Ể|Ễ|Ệ", "E" },

                { "ù|ú|ủ|ũ|ụ|ư|ừ|ứ|ử|ữ|ự", "u" },
                { "ú|ù|ủ|ũ|ụ|ư|ứ|ừ|ử|ữ|ự", "u" },

                { "Ù|Ú|Ủ|Ũ|Ụ|Ư|Ừ|Ứ|Ử|Ữ|Ự", "U" },
                { "Ú|Ù|Ủ|Ũ|Ụ|Ư|Ứ|Ừ|Ử|Ữ|Ự", "U" },

                { "ì|í|ỉ|ĩ|ị|í|ì|ỉ|ĩ|ị", "i" },
                { "Ì|Í|Ỉ|Ĩ|Ị|Í|Ì|Ỉ|Ĩ|Ị", "I" },

                { "ỳ|ý|ỷ|ỹ|ỵ|ý|ỳ|ỷ|ỹ|ỵ", "y" },
                { "Ỳ|Ý|Ỷ|Ỹ|Ỵ|Ý|Ỳ|Ỷ|Ỹ|Ỵ", "Y" },

                { "đ", "d" }, { "Đ", "D" }
            };
            #endregion

            foreach (var d in dict)
            {
                var matches = Regex.Matches(source, d.Key);
                foreach (Match match in matches)
                {
                    source = Regex.Replace(source, match.Value, d.Value);
                }
            }                
        }            
        return source;
    }
}

Test:

string str = "Làm người yêu em nhé baby...";
string res = str.ToNonUnicode(); // "Lam nguoi yeu em nhe baby"

To achieve that, I have to loop twice: once to find the matches and once to replace them. I'm looking for a shorter way to write this. I think LINQ could be one way, but I don't know where to start.

Can you give me some tips? Thank you!

Wouldn't foreach (var d in dict) source = Regex.Replace(source, d.Key, d.Value) do the same thing without an extra Regex.Matches? From MSDN: "If pattern is not matched in the current instance, the method returns the current instance unchanged."
– MarcinJuraszek, Commented Jan 13, 2017 at 22:27

Callback Kid · Accepted Answer · 2017-01-13 22:28:40Z

2

You don't need the Matches loop. Regex.Replace already returns the input unchanged when the pattern doesn't match, so call it directly for each rule:

foreach (var d in dict)
{
    source = Regex.Replace(source, d.Key, d.Value);
}
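Since the question asks about LINQ specifically, the same loop can also be written as a fold with Enumerable.Aggregate. This is only a sketch: the names StringHelperSketch, ToNonUnicodeSketch and Rules are made up for illustration, and the rule list is a shortened stand-in for the full dictionary in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StringHelperSketch
{
    // Hypothetical, shortened rule list (pattern -> replacement).
    // The full set of rules from the question would go here instead.
    private static readonly KeyValuePair<string, string>[] Rules =
    {
        new KeyValuePair<string, string>(@"\-|\,", ""),
        new KeyValuePair<string, string>("é|è|ẻ|ẽ|ẹ|ê|ế|ề|ể|ễ|ệ", "e"),
        new KeyValuePair<string, string>("đ", "d"),
        new KeyValuePair<string, string>("Đ", "D"),
    };

    public static string ToNonUnicodeSketch(this string source)
    {
        // Same guard as the original: null or empty input is returned as-is.
        if (string.IsNullOrEmpty(source))
        {
            return source;
        }

        // Aggregate threads the string through every rule in turn:
        // the seed is the trimmed input, and each step applies one Regex.Replace.
        return Rules.Aggregate(
            source.Trim().Replace(".", ""),
            (current, rule) => Regex.Replace(current, rule.Key, rule.Value));
    }
}
```

For example, "  Đi-đi, đi...  ".ToNonUnicodeSketch() yields "Didi di". The sketch uses an array rather than a Dictionary because the enumeration order of a Dictionary is documented as undefined, and an array keeps the rules in the order they are written. Aggregate is not faster than the foreach loop; it is just a more compact way to express it.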


1 Comment

Callback Kid Over a year ago

Please accept the answer if it works for you. It helps other users see that this problem is solved, and it gives me sweet, sweet rep :)
