Regex removing empty spaces when using replace

Question

My problem is not about removing spaces but about keeping them. I have strings of the form >[database values] that I would like to find. I created a regex to find each one and then strip the >, [, and ] characters. The code below takes a string read from a document. The first pattern matches anything of the form >[some stuff]; the second then "removes" the >, [, and ].

  string decoded = "document in string format";
  string pattern = @">\[[A-z, /, \s]*\]";
  string pattern2 = @"[>, \[, \]]";
  Regex rgx = new Regex(pattern);
  Regex rgx2 = new Regex(pattern2);
  foreach (Match match in rgx.Matches(decoded))
  {
    string replacedValue = rgx2.Replace(match.Value, "");
    Console.WriteLine(match.Value);
    Console.WriteLine(replacedValue);
  }

The output of the first Console.WriteLine is correct, for example >[123 sesame St]. The second output, however, shows that the replace removes not just those characters but also the spaces, so I get something like 123sesameSt. I don't see any space being replaced in my regex. Am I forgetting something? Is it perhaps implicit in Replace?

To complete ScoJo's answer: [A-z] is different from [A-Za-z], since the hyphen defines a range. Take a look at the ASCII table.
– Casimir et Hippolyte, Commented Oct 15, 2014 at 18:12
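
The point about the range can be checked directly. The following is a minimal sketch (written with C# 9 top-level statements; the variable names are illustrative and not from the question) that counts how many of the six ASCII characters lying between Z and a are accepted by each class:

```csharp
using System;
using System.Text.RegularExpressions;

// ASCII places six punctuation characters between 'Z' (90) and 'a' (97):
// [ \ ] ^ _ `
string between = "[\\]^_`";

// [A-z] spans code points 65..122, so it accepts all six of them.
int wideCount = Regex.Matches(between, "[A-z]").Count;

// [A-Za-z] covers only the two letter ranges, so it accepts none of them.
int strictCount = Regex.Matches(between, "[A-Za-z]").Count;

Console.WriteLine(wideCount);   // 6
Console.WriteLine(strictCount); // 0
```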

ScoJo · Accepted Answer · 2014-10-15 18:54:09Z

3

Inside a character class every character is taken literally, so [A-z, /, \s] and [>, \[, \]] in your patterns also match commas and spaces. Just list the characters without delimiting them, like this: [A-Za-z/\s]

string pattern = @">\[[A-Za-z/\s]*\]";
string pattern2 = @"[>\[\]]";

Edit to include Casimir's tip.
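
To see the effect of the delimiters, here is a small sketch that applies the question's original replacement class and an undelimited one to the sample value from the question. It assumes C# 9 top-level statements; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample value taken from the question.
string sample = ">[123 sesame St]";

// Original class: the commas and spaces between the items are members too,
// so every space in the input is removed as well.
string original = Regex.Replace(sample, @"[>, \[, \]]", "");

// Undelimited class: only >, [ and ] are members.
string corrected = Regex.Replace(sample, @"[>\[\]]", "");

Console.WriteLine(original);   // 123sesameSt
Console.WriteLine(corrected);  // 123 sesame St
```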

edited Oct 15, 2014 at 18:54

answered Oct 15, 2014 at 18:03

ScoJo


Casimir et Hippolyte · Answer · 2014-10-15 18:20:57Z

1

After rereading your question (if I understand it correctly), I realize that your two-step approach is unnecessary. You only need one replacement using a capture group:

string pattern = @">\[([^]]*)]";
Regex rgx = new Regex(pattern);

string result = rgx.Replace(yourtext, "$1");

pattern details:

>\[         # literals: >[
(           # open the capture group 1
    [^]]*   # all that is not a ]
)           # close the capture group 1
]           # literal ]

The replacement string refers to capture group 1 with $1.
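
As a runnable sketch of this single-step approach: the input string below is hypothetical (the real document text is not shown in the question), and the code assumes C# 9 top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical input; the real document text is not shown in the question.
string text = "Address: >[123 sesame St] City: >[New York]";

// >\[      literal ">["
// ([^]]*)  group 1: everything up to the next "]"
// ]        literal "]"
// Each match is replaced by the content of group 1 alone.
string result = Regex.Replace(text, @">\[([^]]*)]", "$1");

Console.WriteLine(result);  // Address: 123 sesame St City: New York
```

Because the group captures anything that is not a ], spaces, digits and punctuation inside the brackets are all preserved without having to be listed.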

answered Oct 15, 2014 at 18:20

Casimir et Hippolyte

1 Comment

Jack Thor Over a year ago

It's a bit more complicated; this is just a generalized version that contains the functionality I am looking for.

participant · Answer · 2014-10-15 19:59:00Z

1

By defining [>, \[, \]] in pattern2 you define a character class that matches any single one of the characters listed inside the square brackets: >, the comma, the space, [, and ]. I guess you don't want to match the space and the comma, so leave them out:

string pattern2 = @"[>\[\]]";

Alternatively, you could use

string pattern2 = @"(>\[|\])";

That way you match either >[ or ], which better expresses your intention.
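
The two variants differ when a > appears on its own. A minimal sketch of that difference, assuming a made-up input string and C# 9 top-level statements:

```csharp
using System;
using System.Text.RegularExpressions;

// Made-up input containing a stand-alone ">" as well as a ">[...]" marker.
string sample = "a > b >[value]";

// Character class: removes every >, [ and ], including the stand-alone ">".
string byClass = Regex.Replace(sample, @"[>\[\]]", "");

// Alternation: removes only the two-character sequence ">[" and any "]".
string byAlternation = Regex.Replace(sample, @"(>\[|\])", "");

Console.WriteLine(byClass);        // "a  b value" (two spaces where ">" was)
Console.WriteLine(byAlternation);  // "a > b value"
```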

edited Oct 15, 2014 at 19:59

answered Oct 15, 2014 at 18:09

participant
