1

I'm getting a heap-buffer-overflow error on this code:

// ast.c
char *not_last_prefix = malloc(strlen(next_prefix) + 4); // line 204

sprintf(not_last_prefix, "%s│  ", next_prefix); // line 206
=================================================================
==3394==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000279 at pc 0x7f0d9e6d7715 bp 0x7fff975bcf60 sp 0x7fff975bc6f0
WRITE of size 11 at 0x602000000279 thread T0
    #0 0x7f0d9e6d7714 in vsprintf (/lib/x86_64-linux-gnu/libasan.so.5+0x9e714)
    #1 0x7f0d9e6d7bce in sprintf (/lib/x86_64-linux-gnu/libasan.so.5+0x9ebce)
    #2 0x55708e40b909 in print_ast_impl src/ast.c:206
    #3 0x55708e40b7ef in print_ast src/ast.c:192
    #4 0x55708e4112ad in main src/main.c:50
    #5 0x7f0d9e46f1e2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x271e2)
    #6 0x55708e40a5cd in _start (/home/michael/Code/Baby-C/debug/bcc+0x65cd)

0x602000000279 is located 0 bytes to the right of 9-byte region [0x602000000270,0x602000000279)
allocated by thread T0 here:
    #0 0x7f0d9e746ae8 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dae8)
    #1 0x55708e40b8cd in print_ast_impl src/ast.c:204
    #2 0x55708e40b7ef in print_ast src/ast.c:192
    #3 0x55708e4112ad in main src/main.c:50
    #4 0x7f0d9e46f1e2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x271e2)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/lib/x86_64-linux-gnu/libasan.so.5+0x9e714) in vsprintf
Shadow bytes around the buggy address:
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff8000: fa fa 00 fa fa fa 02 fa fa fa 00 00 fa fa 00 00
  0x0c047fff8010: fa fa 02 fa fa fa 00 00 fa fa 00 00 fa fa 02 fa
  0x0c047fff8020: fa fa 00 00 fa fa 00 00 fa fa 02 fa fa fa 02 fa
  0x0c047fff8030: fa fa 02 fa fa fa 02 fa fa fa 02 fa fa fa 02 fa
=>0x0c047fff8040: fa fa 02 fa fa fa fd fa fa fa 00 01 fa fa 00[01]
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8070: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==3394==ABORTING

Everything I can find suggests that I'm not allocating enough space for the result of the sprintf, but I can't see how that could be the case. I allocate space for the length of next_prefix, 3 bytes for the "│  " that follows it, and 1 for the null terminator. The resulting string should fit. What am I missing here?

7
  • Oh my gosh, you're right. There is some unicode nonsense in that string, that's not a normal vertical bar. I feel pretty dumb, I think that must be the problem. Commented Apr 27, 2020 at 8:20
  • For such a simple formatting, why don't you use strncpy() and strncat()? It would have the same problem, though. Commented Apr 27, 2020 at 8:27
  • @thebusybee I considered it, but sprintf seemed to have a clearer syntax - is there a disadvantage to doing it this way? Commented Apr 27, 2020 at 8:29
  • snprintf() has the clever syntax and an insurance against buffer overflow. Commented Apr 27, 2020 at 8:56
  • @wildplasser I considered snprintf as well - that would have prevented a buffer overflow, but the resulting bug would have actually been harder to catch, since it would have resulted in incorrect behavior with no error. Commented Apr 27, 2020 at 9:04
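
The trade-off raised in these comments can be made concrete. snprintf() returns the number of bytes it would have written given enough space, so truncation is only silent if the caller ignores that return value. The following is a minimal sketch, not the asker's code: format_prefix is a hypothetical helper name, and the bar is spelled as explicit UTF-8 bytes so the example does not depend on the source file's encoding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original code): formats
 * "<prefix>│  " into buf. Returns 0 on success, or -1 if the
 * output was truncated or snprintf reported an error. */
static int format_prefix(char *buf, size_t size, const char *prefix)
{
    assert(buf != NULL && prefix != NULL);

    /* "\xE2\x94\x82" is the UTF-8 encoding of U+2502, followed by two spaces. */
    int needed = snprintf(buf, size, "%s\xE2\x94\x82  ", prefix);

    /* snprintf returns the length the full result would have had,
     * excluding the terminator; needed >= size means truncation. */
    if (needed < 0 || (size_t)needed >= size)
        return -1;
    return 0;
}
```

With a 5-byte prefix, a 9-byte buffer makes this helper report truncation, while an 11-byte buffer succeeds. Those are the same two sizes that appear in the ASan report above (a 9-byte region and a write of size 11).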

2 Answers

2

The problem is that the string literal is 5 bytes long, not 3. The vertical bar is not the standard ASCII character but a Unicode character that UTF-8 encodes as three bytes, and it is followed by two spaces.

To avoid problems like this, assign the literal to a const char * and take its length with strlen(), like this:

const char *separator = "│  ";
char *not_last_prefix = malloc(strlen(next_prefix) + strlen(separator) + 1);
sprintf(not_last_prefix, "%s%s", next_prefix, separator);
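
A self-contained variant of the same idea, offered as a sketch rather than a definitive fix: it lets the compiler count the separator's bytes with sizeof instead of counting them by hand. The names SEPARATOR and make_not_last_prefix are assumptions introduced for illustration, and the bar is written as explicit UTF-8 bytes so the byte count does not depend on the source file's encoding.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* U+2502 (box-drawing vertical bar) as UTF-8 bytes, then two spaces. */
#define SEPARATOR "\xE2\x94\x82  "

/* Hypothetical helper: returns a newly allocated "<prefix><SEPARATOR>"
 * string, or NULL if allocation fails. The caller frees the result. */
static char *make_not_last_prefix(const char *prefix)
{
    assert(prefix != NULL);

    /* sizeof on a string literal counts every byte plus the
     * terminating '\0', so it is 6 here (3 + 2 + 1). */
    size_t size = strlen(prefix) + sizeof(SEPARATOR);

    char *result = malloc(size);
    if (result == NULL)
        return NULL;

    /* snprintf is bounded by size, so a miscount cannot overflow. */
    snprintf(result, size, "%s%s", prefix, SEPARATOR);
    return result;
}
```

Because the size is derived from the literal itself, changing the separator later does not require updating a hand-counted constant.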
1

The problem, as was pointed out to me, was that my format string contained a Unicode character. I wrongly assumed that mallocing one more byte would solve the problem, but it turns out UTF-8 characters can be as many as 4 bytes long! The good news is that you can check exactly how many bytes they take up by checking this simple table (found here).

Character code (decimal) | Bytes used
-------------------------|------------
0-127                    | 1 byte
128-2047                 | 2 bytes
2048-65535               | 3 bytes
65536-1114111            | 4 bytes

In my case, the vertical bar character I was using (│) is Unicode "\u2502". That is 9474 in decimal, which falls in the 2048-65535 range, so it takes up 3 bytes!
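
The table can also be expressed as a small function. This is a sketch under stated assumptions: utf8_encoded_length is a hypothetical name, and unlike the table it returns 0 for surrogate code points (U+D800 to U+DFFF) and for values above U+10FFFF, since UTF-8 cannot validly encode those.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes UTF-8 uses to encode a code
 * point, or 0 if the value is not a valid Unicode scalar value. */
static size_t utf8_encoded_length(uint32_t code_point)
{
    if (code_point <= 0x7F)                          /* 0-127 */
        return 1;
    if (code_point <= 0x7FF)                         /* 128-2047 */
        return 2;
    if (code_point >= 0xD800 && code_point <= 0xDFFF)
        return 0;                                    /* surrogates: not encodable */
    if (code_point <= 0xFFFF)                        /* 2048-65535 */
        return 3;
    if (code_point <= 0x10FFFF)                      /* 65536-1114111 */
        return 4;
    return 0;                                        /* beyond the Unicode range */
}
```

Calling utf8_encoded_length(0x2502) gives 3, which matches the three bytes the bar occupies in the format string.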

1 Comment

Apparently StackOverflow doesn't support Markdown tables??
