
I'm using the following regex:

documentText = Regex.Replace(documentText, "\\\\|\\^|\\+|\\*|~|#|=|\"", "");

and it works. But when I split the resulting string with:

wordsInText = documentText.ToLower().Split(' ').ToList();

I get elements that are empty strings (""). I can remove them manually by iterating through the collection, but there must be a way to prevent this behaviour in the first place.

3 Comments

  • You mean, the original documentText didn't contain consecutive blanks but after replacing it does? Then just use the regex string "(\\\\|\\^|\\+|\\*|~|#|=|\") ?" instead Commented Oct 25, 2012 at 19:28
  • Independently of your question, I suggest that for your regex you use a verbatim string for readability: @"\\|\^|\+|\*|~|#|=|""" Commented Oct 25, 2012 at 19:29
  • Bergi, you're right. I tried to use your regex pattern but it still leaves blanks. Commented Oct 25, 2012 at 19:39

1 Answer

wordsInText = documentText.ToLower().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).ToList();
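
To illustrate why the empty entries appear and how the option removes them, here is a minimal sketch. The class name SplitDemo, the helper SplitWords, and the sample input are made up for illustration; the pattern is the one from the question, written as a verbatim string. Removing a symbol that sits between two spaces leaves two adjacent spaces, and Split(' ') reports the gap between them as an empty string unless StringSplitOptions.RemoveEmptyEntries is passed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: strip the symbols from the question's pattern,
    // then split on spaces and discard the empty entries.
    public static List<string> SplitWords(string documentText)
    {
        string cleaned = Regex.Replace(documentText, @"\\|\^|\+|\*|~|#|=|""", "");
        return cleaned.ToLower()
                      .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
                      .ToList();
    }

    public static void Main()
    {
        // After the replace, "Foo # Bar = baz" becomes "Foo  Bar  baz"
        // (two spaces where each symbol used to be).
        List<string> words = SplitWords("Foo # Bar = baz");
        Console.WriteLine(string.Join("|", words)); // prints foo|bar|baz
    }
}
```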

3 Comments

Just out of curiosity, why do you use a char array if it is only one delimiter character?
@m.buettner Because there is no overloaded method that takes a char and StringSplitOptions
Thanks, L.B. This works, but is it possible to solve this issue inside the regex?
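
One possible regex-only approach, offered as a sketch and not as something proposed in the thread: keep the original replace, then collect the words with Regex.Matches instead of splitting. The pattern "[^ ]+" matches each run of non-space characters, so consecutive, leading, and trailing spaces never produce empty entries. The class name RegexWordsDemo, the helper ExtractWords, and the sample input are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexWordsDemo
{
    // Hypothetical helper: remove the symbols, then match runs of
    // non-space characters instead of splitting on spaces.
    public static List<string> ExtractWords(string documentText)
    {
        string cleaned = Regex.Replace(documentText, @"\\|\^|\+|\*|~|#|=|""", "");
        return Regex.Matches(cleaned.ToLower(), "[^ ]+")
                    .Cast<Match>()          // MatchCollection is non-generic on older frameworks
                    .Select(m => m.Value)
                    .ToList();
    }

    public static void Main()
    {
        // After the replace, " a ~ b  c# " becomes " a  b  c ".
        List<string> words = ExtractWords(" a ~ b  c# ");
        Console.WriteLine(string.Join("|", words)); // prints a|b|c
    }
}
```

Using "[^ ]+" keeps the behaviour equivalent to splitting on a single space character; switching to @"\S+" would additionally treat tabs and newlines as separators.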
