Delphi StringReplace a string between two chars

Question

I have a TMemo that displays text from a query. I would like to remove everything between '{' and '}', including the braces themselves, so that the string '{color:black}😊{color}{color:black}{color}' ends up as 😊.

MemoComments.Lines.Text :=  StringReplace(MemoComments.Lines.Text, '{'+ * +'}', '', rfReplaceAll);

I know that the * in my code is wrong. It's just a placeholder. How can I do this the right way?

Is this possible, or do I have to create a complicated loop?

Andreas Rejbrand · Accepted Answer · 2020-07-15 15:18:29Z

6

This is a case where you can use a regular expression. I trust someone will publish such an answer for you very shortly.

However, just for the sake of completeness, I want to show that a loop-based approach isn't complicated at all, but rather straightforward:

function ExtractContent(const S: string): string;
var
  i, c: Integer;
  InBracket: Boolean;
begin
  // Allocate once at the maximum possible length; trimmed at the end.
  SetLength(Result, S.Length);
  InBracket := False;
  c := 0; // number of characters copied to Result so far
  for i := 1 to S.Length do
  begin
    if S[i] = '{' then
      InBracket := True
    else if S[i] = '}' then
      InBracket := False
    else if not InBracket then
    begin
      // Outside any {...} pair: keep the character.
      Inc(c);
      Result[c] := S[i];
    end;
  end;
  SetLength(Result, c);
end;

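Notice that I avoid unnecessary heap allocations: the result is allocated once at the maximum possible length and trimmed once at the end, instead of being grown character by character.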

(Personally, I have never been a huge fan of regular expressions. To me, the correctness of the above algorithm is obvious, it can only be interpreted in one way, and it is clearly written in a performant way. A regex, on the other hand, is a bit more like "magic". But I am a bit of a dinosaur, I admit that.)

answered Jul 15, 2020 at 15:18

Andreas Rejbrand


3 Comments

AmigoJack Over a year ago

On corrupted input (no closing }) the regex will leave the string as is with the opening {, while your code will silently omit anything afterwards. OP must decide on his own which outcome is preferred.

Andreas Rejbrand Over a year ago

@AmigoJack: Very true. I also remove a stray } between the "tags" and don't support nested tags. Unfortunately, the Q doesn't contain a full specification, so we don't know what the desired behaviour is. In any case, all these things can easily be changed by minor adjustments in the loop. I suspect the regex can be adjusted as well, so differences in behaviour are not indications of inherent limitations in either approach.
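
One such adjustment could look like the sketch below. It assumes the desired behaviour is to remove only complete {...} pairs, to keep an unterminated { together with the text that follows it, and to keep a stray }. The name StripClosedTags is hypothetical, and the code assumes 1-based string indexing, as in the answer above.

```delphi
program StripClosedTagsDemo;

{$APPTYPE CONSOLE}

// StripClosedTags is a hypothetical name, not part of the RTL.
// Assumed behaviour: remove only complete '{...}' pairs, keep an
// unterminated '{' plus the text after it, and keep stray '}'.
function StripClosedTags(const S: string): string;
var
  i, c, TagStart: Integer;
begin
  SetLength(Result, Length(S));
  c := 0;
  TagStart := 0; // 0 = currently outside a tag
  for i := 1 to Length(S) do
    if TagStart = 0 then
    begin
      if S[i] = '{' then
        TagStart := i // remember where the tag opened
      else
      begin
        Inc(c);
        Result[c] := S[i];
      end;
    end
    else if S[i] = '}' then
      TagStart := 0; // tag closed: everything since TagStart is dropped

  // Input ended inside a tag: copy the unterminated rest back verbatim.
  if TagStart > 0 then
    for i := TagStart to Length(S) do
    begin
      Inc(c);
      Result[c] := S[i];
    end;

  SetLength(Result, c);
end;

begin
  Writeln(StripClosedTags('a{b}c'));   // prints ac
  Writeln(StripClosedTags('a{b}c{d')); // prints ac{d
end.
```

The only state is TagStart, which doubles as the "inside a tag" flag and as the position to restore from if the closing } never arrives.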

MartynA Over a year ago

I prefer this kind of approach to regexes too, so +1. If I did it myself, though, I'd use a PChar indexer, omit the Boolean flag, and use two mutually exclusive (accept and reject) loops.
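
A rough sketch of what such a PChar-based version with separate accept and reject loops might look like follows. StripTagsPChar is a hypothetical name, and the code assumes the input contains no embedded #0 characters, since the loops stop at the terminating #0. Unlike the flag-based function in the answer, this sketch keeps a stray } that appears outside a tag.

```delphi
program PCharStripDemo;

{$APPTYPE CONSOLE}

// StripTagsPChar is a hypothetical name. Assumes S has no embedded #0,
// because the loops use the terminating #0 as their stop condition.
function StripTagsPChar(const S: string): string;
var
  Src, Dst, Start: PChar;
begin
  SetLength(Result, Length(S));
  if S = '' then
    Exit;
  Src := PChar(S);
  Start := PChar(Result);
  Dst := Start;
  while Src^ <> #0 do
  begin
    // Accept loop: copy everything up to the next '{'.
    while (Src^ <> #0) and (Src^ <> '{') do
    begin
      Dst^ := Src^;
      Inc(Dst);
      Inc(Src);
    end;
    // Reject loop: skip everything up to the next '}'.
    while (Src^ <> #0) and (Src^ <> '}') do
      Inc(Src);
    if Src^ = '}' then
      Inc(Src); // step over the closing brace itself
  end;
  // Number of characters actually written.
  SetLength(Result, Dst - Start);
end;

begin
  Writeln(StripTagsPChar('{color:black}:-){color}')); // prints :-)
end.
```

As with the flag-based loop, an unterminated { causes the rest of the input to be dropped.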

GolezTrol · Accepted Answer · 2020-07-15 16:01:43Z

3

Looks like you want a sort of regular expression, which Delphi fortunately offers in its RTL (unit System.RegularExpressions).

s := TRegEx.Replace('{color:black}😊{color}{color:black}{color}', '{.*?}', '', []);

or using the memo:

MemoComments.Lines.Text := TRegEx.Replace(MemoComments.Lines.Text, '{.*?}', '', []);

In this expression, {.*?}, .*? means any number (*) of any character (.), but as few as possible to match the rest of the expression (*?). That last bit is very powerful. By default, regexes are 'greedy', which means that .* would just match as many characters as possible, so it would take everything up to the last }, including the smiley and all the other color codes in between.
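
The difference between the lazy and the greedy form can be seen with a small console sketch. The program name and the Sample constant are made up for illustration; TRegEx.Replace is the same call as above.

```delphi
program GreedyVsLazyDemo;

{$APPTYPE CONSOLE}

uses
  System.RegularExpressions;

const
  // Illustrative input, not taken from the question.
  Sample = '{color:black}hello{color} world';

begin
  // Lazy: each match stops at the first '}' it can reach.
  Writeln(TRegEx.Replace(Sample, '{.*?}', '', [])); // prints "hello world"

  // Greedy: one match runs from the first '{' to the last '}'.
  Writeln(TRegEx.Replace(Sample, '{.*}', '', []));  // prints " world"
end.
```

With the greedy pattern the text between the two tags is swallowed along with them, which is exactly what the question wants to avoid.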

Pitfalls/cons

Like Andreas, I'm not a huge fan of regular expressions either. The awkward syntax can be hard to decipher, especially if you don't use them a lot.

Also, a seemingly simple regex can be expensive to execute, which sometimes makes it very slow, especially when working with larger strings. I recently bumped into one that was so magical, it was stuck for minutes on verifying whether a string of about 1000 characters matched a certain pattern.

The expression used above is actually an example of that. After each character taken by the .*? part, the engine has to look ahead to check whether it can already satisfy the rest of the expression. If not, it goes back, takes another character, and looks ahead again. For this expression that's not an issue, but if an expression has multiple parts of variable length, this can be a CPU-intensive process!

My earlier version, {[^}]*} is, theoretically at least, more efficient, because instead of any character, it just matches all characters that are not a }. Easier to execute, but harder to read. In the answer above I went for readability over performance, but it's always something to keep in mind.
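
The negated-class pattern drops into the same call. The sketch below also illustrates a behavioural difference that may matter for multi-line memo text: by default . does not match a line feed, so {.*?} leaves a tag that spans a line break untouched unless roSingleLine is passed, while [^}] matches line breaks as well. The program name and the Sample constant are made up for illustration.

```delphi
program NegatedClassDemo;

{$APPTYPE CONSOLE}

uses
  System.RegularExpressions;

const
  // Illustrative input: a tag that spans a line break.
  Sample = 'a{tag' + #13#10 + 'more}b';

begin
  // [^}] also matches CR and LF, so the whole tag is removed.
  Writeln(TRegEx.Replace(Sample, '{[^}]*}', '', []) = 'ab');

  // '.' stops at the line feed by default, so nothing matches here
  // and the input comes back unchanged.
  Writeln(TRegEx.Replace(Sample, '{.*?}', '', []) = Sample);

  // With roSingleLine, '.' matches line breaks too.
  Writeln(TRegEx.Replace(Sample, '{.*?}', '', [roSingleLine]) = 'ab');
end.
```

All three comparisons are expected to hold, so which pattern to use depends on whether tags may contain line breaks.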

Note that my first version, \{[^\}]*\}, looked even more convoluted. I was using \ to escape the braces, since they also have a special meaning in regexes (they delimit repetition quantifiers such as {2,5}), but that doesn't seem necessary in this case.

Lastly, there are different regex dialects, which is not helpful either.

That said

Fortunately Delphi wraps the PCRE library, which is open source, highly optimized, well maintained, well documented, and implements the most commonly used dialect.

And for operations like this they can be brief and easy to write, fast enough to use, and if you use them more often, it also becomes easier to read and write them, especially if you use a tool like regex101.com, where you can try out and debug regexes.

edited Jul 15, 2020 at 16:01

answered Jul 15, 2020 at 15:27

GolezTrol


3 Comments

Gerry Coll Over a year ago

Note that older versions of Delphi (pre 2010?) don't include these classes, but they are based on some open source (MPL) components by Jan Goyvaerts, available from regular-expressions.info/delphi.html

GolezTrol Over a year ago

Thanks for the addition, @GerryColl! I have used those in the past, and I think they are even the base for the now included library. I must say I omitted it from the answer deliberately, since it was introduced in Delphi XE, 10 years ago and about as many major versions.

Gean Over a year ago

Thank you everyone for the amazing answers. Very well explained, and I learned a lot from both answers.
