
Based on an AVL tree, I implemented a memory manager that does best-fit allocation by size. I originally expected it to be fast and to keep the heap small, since every request is matched against the best-fitting free block. However, when I used the valgrind tool to compare its performance with the standard library's allocator, I found it was much slower. I'd be interested to know what algorithms the standard library uses to implement malloc.

  • sourceware.org/glibc/wiki/MallocInternals ? Commented Apr 4, 2023 at 14:42
  • Memory management is an art. Best-fit matching rapidly leaves small holes which are harder to fill later. Commented Apr 4, 2023 at 20:57
  • Which standard library? GLibC has a different one than musl which has a different one than dietlibc libc which has a different one than µClibc which has a different one than FreeBSD libc which has a different one than NetBSD libc which has a different one than bionic libc which has a different one than Solaris libc which has a different one than AIX libc which has a different one than HP-UX libc which has a different one than … etc. Also, there's lots of alternative implementations that are not part of any C library, like Hoard, mimalloc, tcmalloc, etc. Commented Apr 4, 2023 at 21:59
  • Even though Poul-Henning Kamp's version of malloc in FreeBSD has since been replaced, there is still a really interesting description of its implementation and the thoughts behind it here: papers.freebsd.org/1998/phk-malloc.files/phk-malloc-paper.pdf Commented Apr 4, 2023 at 22:23
  • May I request respondents and voters to be a little bit gentler? It's abundantly clear that the OP is not aware that the C spec and implementation are not an integrated monolith. Their question is misconstrued, but valid. It's totally non-obvious that one language can have multiple implementation approaches, especially if you come from a language background with only one implementation (or at least those with a single overwhelmingly popular one, like Rust, CPython, MRI Ruby, etc.) Commented Apr 6, 2023 at 13:44

2 Answers


Each implementation of the Standard Library can use whatever implementation it wants, as long as it conforms to the interface described in section 7.22.3. The Standard doesn't mandate any particular algorithm, so the question is unanswerable.

Allocator performance is important to the overwhelming majority of workloads, so implementations are constantly being optimised for speed and efficiency. Don't expect that the implementation you're using today will be the same in five years' time, for example.


macOS / iOS assume that you allocate many small malloc blocks. So the allocator reserves one large region for allocations up to 16 bytes, another for allocations up to 32 bytes, and so on. Allocating then only requires setting a single bit in a bitmap that records which blocks are in use.

There is a bit of space wasted, but if you use 100 MB of memory it's not much: a 16-byte block costs 16 bytes plus one bitmap bit on average.

And they assume that malloc/realloc/free are often performed on the same thread. So some data structures are duplicated per thread, which means most allocations don't need to be synchronised between threads.
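A minimal sketch of the per-thread idea, assuming C11 `_Thread_local`: each thread keeps its own free list for one size class, so a malloc/free pair on the same thread never takes a lock. The names `tcache_alloc`/`tcache_free` are made up for illustration, and the slow path just delegates to the system allocator:

```c
#include <stdlib.h>

/* One free list per thread for a single hypothetical 32-byte size class.
   Freed blocks are pushed here and reused by the same thread without any
   synchronisation; only a cache miss falls back to the shared allocator. */
typedef struct node { struct node *next; } node;

static _Thread_local node *tcache = NULL;

void *tcache_alloc(void) {
    if (tcache) {                 /* fast path: pop from this thread's list */
        node *n = tcache;
        tcache = n->next;
        return n;
    }
    return malloc(32);            /* slow path: shared (locked) allocator */
}

void tcache_free(void *p) {
    node *n = p;                  /* fast path: push, no locking needed */
    n->next = tcache;
    tcache = n;
}
```

Freeing on a different thread than the one that allocated is the hard case a real allocator must also handle; this sketch ignores it.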

So what exactly does malloc need to do? If the allocation size is large, it requests a block directly from the OS and returns it (this is rare; the number of large allocations is bounded by memory size). Otherwise it rounds the request up to a multiple of 16, uses a table to find a large memory block for that size class together with a bitmap of used malloc blocks within it, finds an unused bit in the bitmap, sets it, and returns the address of the corresponding malloc block. If that large block is full, it keeps a list of large blocks with unused malloc blocks and uses one of those; if no such block exists, it allocates a new one.

So assume you have 256 malloc blocks in a large memory block. Then 256 allocations will be very fast, and each freed block can be re-allocated very quickly as well. Say 200 malloc()s, then 200 free()s, then 200 malloc()s again: all of them will be very fast.
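The allocation path described above can be sketched for a single size class. This is a toy model under the answer's assumptions, not Apple's actual code: one slab of 256 16-byte blocks with a 256-bit bitmap of used blocks, and `__builtin_ctzll` (GCC/Clang) to find the first free bit in a word:

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK_SIZE 16
#define NBLOCKS    256

/* One slab for one size class: 256 blocks of 16 bytes, plus a bitmap
   recording which blocks are in use (bit set = used). */
static unsigned char slab[NBLOCKS * BLOCK_SIZE];
static uint64_t used[NBLOCKS / 64];               /* 256-bit bitmap */

void *slab_alloc(void) {
    for (int w = 0; w < NBLOCKS / 64; w++) {
        if (used[w] != UINT64_MAX) {              /* word with a free bit */
            int bit = __builtin_ctzll(~used[w]);  /* lowest clear bit */
            used[w] |= 1ULL << bit;
            return slab + ((size_t)w * 64 + bit) * BLOCK_SIZE;
        }
    }
    return NULL;   /* slab full: a real allocator grabs another slab */
}

void slab_free(void *p) {
    size_t i = ((unsigned char *)p - slab) / BLOCK_SIZE;
    used[i / 64] &= ~(1ULL << (i % 64));
}
```

Both paths are a handful of instructions with no searching through free lists, which is where the speed advantage over a best-fit tree comes from: an AVL lookup costs O(log n) comparisons and pointer chases, while the bitmap scan is a couple of word operations.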
