how to replace a string in an efficient way in C

Question

I have a string which is generated by linux uuid generation code (libc):

1b4e28ba-2fa1-11d2-883f-b9a761bde3fb

I need to replace some of the characters in this string:

- with _
2 with f
4 with x

I am generating the 200 UUIDs using a loop.

So for every uuid I need to replace using a custom function, so that function must be maximum optimised to do so, how can I attain that?

200? That's not a large number, it would take some real work to not make this almost instant on a modern computer. Also, what do you have so far? — Collin
– Collin, Commented Apr 20, 2013 at 3:20
You can attain that by writing a program which does that. It's best to start early. If you have any problems, please post here. — user93353
– user93353, Commented Apr 20, 2013 at 3:31
@SchighSchagh , is it a good way "switch on each char, and replace it accordingly" for 200 uuids? — Kajal
– Kajal, Commented Apr 20, 2013 at 3:33
@RobertHarvey OP is replacing individual characters not substrings per se so this question is not a duplicate. — Matt Phillips
– Matt Phillips, Commented Apr 20, 2013 at 3:33

johnchen902 · Accepted Answer · 2013-04-20 05:20:57Z

4

I suppose you're using char[] str

char *c;
for(c = str; *c != '\0'; ++c){
    if( *c == '-' ) *c = '_';
    else if( *c == '2' ) *c = 'f';
    else if( *c == '4' ) *c = 'x';
}

switch version

char *c;
for(c = str; *c != '\0'; ++c){
    switch(*c){
        case '-': *c = '_'; break;
        case '2': *c = 'f'; break;
        case '4': *c = 'x'; break;
    }
}

edited Apr 20, 2013 at 5:20

answered Apr 20, 2013 at 3:34

johnchen902

9,6191 gold badge29 silver badges70 bronze badges

Sign up to request clarification or add additional context in comments.

6 Comments

Kajal Over a year ago

what about replacing with regular expression

johnchen902 Over a year ago

@Kajal replacing via regular expression is slower

Nicu Stiurca Over a year ago

@Kajal reg ex is overkill. I think writing it with a switch as I said in comment above instead of if/else if would be more readable, but it likely compiles to the same or very similar code.

Matt Phillips Over a year ago

You may be interested in the result of my testing.

johnchen902 Over a year ago

@MattPhillips did you use -O3 in your compiler argument?

|

Yongwei Wu · Accepted Answer · 2013-04-21 12:41:47Z

4

Is something as trivial as this what you want?

void my_replace(char* str)
{
    while (*str) {
        switch (*str) {
        case '-':
            *str = '_';
            break;
        case '2':
            *str = 'f';
            break;
        case '4':
            *str = 'x';
            break;
        default:
            break;
        }
        ++str;
    }
}

It is really fast and simple. I really can't see how you can make it faster.

EDIT: I am aware of some optimizations in certain string operations, but I do not see how they can be applicable here. For example, in the case of memcpy, one may copy 4 or more bytes at a time, depending on the processor. In case of comparing strings that is properly aligned, comparing as integers may be possible and more efficient. I just can't see an applicable technique.

edited Apr 21, 2013 at 12:41

answered Apr 20, 2013 at 4:05

Yongwei Wu

5,6421 gold badge39 silver badges51 bronze badges

4 Comments

Matt Phillips Over a year ago

High performance libraries such as LAPACK and BLAS use finely tuned, processor-specific code (often in Assembly) to achieve speeds many times faster than what one can do in a high-level language, even for operations as simple as matrix multiplication. I would expect the same is true of the C standard library as well.

autistic Over a year ago

@MattPhillips Your expectation is invalid. Have you tested the optimisations that your compiler implements, lately?

Matt Phillips Over a year ago

Have you? What makes you certain that the C lib is not optimized?

jweyrich Over a year ago

As an example of what @Matt is saying, glibc employs SSE to optimise memcpy, strnlen, and other functions. In fact, several libc implementations do employ optimisations like this.

Matt Phillips · Accepted Answer · 2013-04-20 05:41:42Z

1

C library functions may be optimized and significantly faster than a hand-coded iteration.

char* uuid; // = ...
//    size_t uuid_len; // = ... length of uuid


char* ptr = strpbrk(uuid, "-24");
while (ptr)
{
   switch(*ptr)
   {
      case '-':
          *ptr = '_';
          break;
      case '2':
          *ptr = 'f';
          break;
      case '4':
          *ptr = 'x';
          break;
   }
//       if (ptr-uuid == uuid_len) break;

   ptr = strpbrk(ptr+1, "-24");
}

Edit: Took out the range checking, on the basis of the example here that doesn't seem necessary.

Edit: So I decided to test out the 3 algorithms here to see which is faster. I had a loop of 100000 strings, on a vintage 2006 Mac Pro, compiling with gcc, -O3. I took the average of 1000 runs and did 5 cycles.

AND THE WINNER IS...

@johnchen by a hair with an average time of 7.85ms.

@YongweiWu just behind with an average time with 7.89ms. The difference looks significant; going in and doing a proper statistical test is not going to happen tonight unfortunately. :)

...and strpbrk a distant third at 32ms. (Glad I qualified all my optimization claims with 'might', 'may' etc....)

Edit: There is a big difference with Clang--j @ WY's algorithms take 10ms under Clang (looks tied between them), mine is unchanged.

edited Apr 20, 2013 at 5:41

answered Apr 20, 2013 at 3:46

Matt Phillips

9,7918 gold badges48 silver badges77 bronze badges

1 Comment

Matt Phillips Over a year ago

@user93353 No, look at the last line. After a character find strpbrk restarts at the next character, it does not start over at the beginning.

Collectives™ on Stack Overflow

how to replace a string in an efficient way in C

3 Answers 3

6 Comments

4 Comments

AND THE WINNER IS...

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

6 Comments

4 Comments

AND THE WINNER IS...

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related