
My situation is not about removing empty spaces, but keeping them. I have strings of the form >[database values] that I would like to find. I wrote this regex to find them and then go in and remove the >, [, and ] characters. The code below takes a string that comes from a document. The first pattern looks for anything of the form >[some stuff]; the second pattern then "removes" the >, [, and ].

  // requires: using System; using System.Text.RegularExpressions;
  string decoded = "document in string format";
  string pattern = @">\[[A-z, /, \s]*\]";   // finds >[some stuff]
  string pattern2 = @"[>, \[, \]]";         // meant to match only >, [ and ]
  Regex rgx = new Regex(pattern);
  Regex rgx2 = new Regex(pattern2);
  foreach (Match match in rgx.Matches(decoded))
  {
    string replacedValue = rgx2.Replace(match.Value, "");
    Console.WriteLine(match.Value);
    Console.WriteLine(replacedValue);
  }

What I get from the first Console.WriteLine is correct, so I see things like >[123 sesame St]. But the second output shows that the replace removes not just those characters but also the spaces, so I get something like 123sesameSt. I don't see any space being replaced in my regex. Am I forgetting something? Is it perhaps implicit in Replace?

  • To complete ScoJo's answer: [A-z] is different from [A-Za-z], since the hyphen defines a range. Take a look at the ASCII table: the characters [, \, ], ^, _ and ` sit between Z and a, so [A-z] matches those too. Commented Oct 15, 2014 at 18:12
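To make the range point concrete, here is a small sketch that probes both character classes with the characters around Z and a. The class name RangeDemo, the helper MatchedChars, and the probe string are made up for illustration; they are not from the question.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class RangeDemo
{
    // Hypothetical helper: concatenates every single-character match
    // of the given character class found in the input.
    public static string MatchedChars(string charClass, string input)
    {
        var sb = new StringBuilder();
        foreach (Match m in Regex.Matches(input, charClass))
        {
            sb.Append(m.Value);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // ASCII 90..97: Z [ \ ] ^ _ ` a
        string probe = "Z[\\]^_`a";

        Console.WriteLine(MatchedChars("[A-z]", probe));    // Z[\]^_`a
        Console.WriteLine(MatchedChars("[A-Za-z]", probe)); // Za
    }
}
```

So with [A-z] the brackets themselves count as "letters", which is rarely what is intended.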

3 Answers


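The character classes [A-z, /, \s] and [>, \[, \]] in your patterns also match commas and spaces, because everything between the square brackets is taken literally, including the delimiters. Just list the characters without delimiting them, like this: [A-Za-z/\s]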

string pattern = @">\[[A-Za-z/\s]*\]";
string pattern2 = @"[>\[\]]";

Edit to include Casimir's tip.
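As a quick check, here is a minimal sketch that runs the original and the corrected second pattern side by side. The class name StripDemo, its two helper methods, and the sample text are assumptions for illustration, not part of the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripDemo
{
    // Original class from the question: also contains ',' and ' '.
    public static string StripOriginal(string token)
    {
        return Regex.Replace(token, @"[>, \[, \]]", "");
    }

    // Corrected class: only '>', '[' and ']'.
    public static string StripFixed(string token)
    {
        return Regex.Replace(token, @"[>\[\]]", "");
    }

    public static void Main()
    {
        // Hypothetical document text.
        string decoded = "Name: >[John Smith] Street: >[sesame St]";

        foreach (Match match in Regex.Matches(decoded, @">\[[A-Za-z/\s]*\]"))
        {
            Console.WriteLine(match.Value);                // >[John Smith], then >[sesame St]
            Console.WriteLine(StripOriginal(match.Value)); // JohnSmith, then sesameSt
            Console.WriteLine(StripFixed(match.Value));    // John Smith, then sesame St
        }
    }
}
```

Note that [A-Za-z/\s] does not include digits, so a value such as 123 sesame St would need \d (or 0-9) added to the class.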


After rereading your question (if I understand it correctly), I realize that your two-step approach is unnecessary. You only need one replacement using a capture group:

string pattern = @">\[([^]]*)]";
Regex rgx = new Regex(pattern);

string result = rgx.Replace(yourtext, "$1");

pattern details:

>\[         # literals: >[
(           # open the capture group 1
    [^]]*   # all that is not a ]
)           # close the capture group 1
]           # literal ]

The replacement string refers to capture group 1 with $1, so each match is replaced by just the text between the brackets.
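For what it's worth, a self-contained sketch of that single replacement could look like this. The class name CaptureDemo, the method Unwrap, and the sample text are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Replaces every >[...] with its inner text in one pass.
    // "$1" in the replacement stands for capture group 1.
    public static string Unwrap(string text)
    {
        return Regex.Replace(text, @">\[([^]]*)]", "$1");
    }

    public static void Main()
    {
        // Hypothetical document text.
        string yourtext = "Address: >[123 sesame St], City: >[New York]";

        Console.WriteLine(Unwrap(yourtext));
        // Address: 123 sesame St, City: New York
    }
}
```

Because [^]]* accepts anything except a closing bracket, digits and spaces inside the brackets survive untouched.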

1 Comment

It's a bit more complicated than that; this is just a generalized version that contains the functionality I am looking for.

By defining [>, \[, \]] in pattern2 you define a character class that matches any single one of the characters listed between the square brackets: >, the comma, the space, [, and ]. But I guess you don't want to match the space and the comma. If you don't want to match them, leave them out, like this:

string pattern2 = @"[>\[\]]";

Alternatively, you could use

string pattern2 = @"(>\[|\])";

This matches either >[ or ], which better expresses your intention.
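The two variants differ when a > appears on its own. Here is a rough sketch of that difference; the class name AlternationDemo, its methods, and the input string are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // Character class: removes every '>', '[' and ']' wherever it occurs.
    public static string StripWithClass(string text)
    {
        return Regex.Replace(text, @"[>\[\]]", "");
    }

    // Alternation: removes only the pair ">[" and any ']'.
    public static string StripWithAlternation(string text)
    {
        return Regex.Replace(text, @"(>\[|\])", "");
    }

    public static void Main()
    {
        // Hypothetical input with a lone '>' in front of the marker.
        string text = "a > b >[x y]";

        Console.WriteLine(StripWithClass(text));       // a  b x y
        Console.WriteLine(StripWithAlternation(text)); // a > b x y
    }
}
```

If the replacement only ever runs on the text of a >[...] match, as in the question, both variants give the same result.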
