5

I've been curious about this for a while now, when using structures inside of arrays, as far as memory allocation is concerned, is it better to allocate a new structure for each entry in the array, or is it better to allocate enough space in the array for N structures.

//pointer based:
struct myStructure ** tmp = malloc(sizeof(struct myStructure *) * N);
tmp[0] = malloc(sizeof(struct myStructure));
tmp[0]->whatever = true;

//or structure in the array:
struct myStructure * tmp = malloc(sizeof(struct myStructure) * N);
tmp[0].whatever = true

Are there any benefits over one or the other? I feel like using the second form is better practice because you end up with fewer small malloc calls, but there might be cases where you can only use the first method.

Any insight into this?

Thanks!

4 Answers 4

5

In general I'd use the second way, since, if you use all the slots, it:

  • it uses slightly less memory (N times the size of a pointer);
  • fragments less the heap;
  • avoids N calls to malloc/free (=>it's faster and simpler to allocate/deallocate);
  • avoids double indirection when accessing each structure (very small improvement).

On the other hand, the first way may be convenient if you are not going to use all the slots of the array (but you must be able to store many structs on demand) and your struct is very big, so saving that memory is worth the effort. Also, it may be worth if you need to change the order of your structs cheaply (although you could do that also with the second method by using a separated array of pointers).

Sign up to request clarification or add additional context in comments.

Comments

3

Usually, the second is better [IMO] - it will probably be cheaper [both memory and time], and definetly will be easier to maintain.

However, at some cases, the second approach might fail - when the first will succeed. This might happen due to memory fragmentation - there is enough memory for all your structs, it is just not in "one place" in your memory.

Comments

2

There are currently three very good answers why you should use the second approach, so I won't duplicate their answers. I do want to say that the first approach has several benefits:

  • The first array is far easier to grow and shrink based on their actual need in the system. Growing the first array of pointers is pretty easy -- each element is four or eight bytes long total so doubling the size of the array won't cost too much.

    The second array of actual structures might be significantly larger (by sizeof struct foo times the number of elements), and growing the array even slightly might run out of memory if realloc(3) does not have sufficient free space to work with.

  • The first array gives you the ability to refer to objects in the system by "handles" and re-arrange their memory as needs dictate. You could allocate the underlying objects in page-sized slabs and re-compact objects towards nearly-full slabs -- allowing you to return pages to the operating system for other use later. Other objects in the system would have to go through another layer of indirection to get to referenced objects, but those client objects ("referents"?) would not need to have their pointers updated when you move objects.

  • The lifetime of objects in the first array is "decoupled" -- some of those objects might live for a very long time and others might live for mere milliseconds. In the second array, the entire array has the same lifetime. You could add an additional data structure to the second array to manage which objects are live and which are dead, or add new fields in the structures to indicate which are live and dead, but both approaches require more work. With the first array, if the pointer is non-NULL, then the object is live.

Both approaches have their benefits. Pick the right one for the job at hand.

Comments

1

Structure in the array is usually better; you should use that unless you have a reason to use the other form.

From a code correctness point of view, you should handle malloc() fails. If you have 1000 malloc()s in a loop, you're more likely to make a programming error in your error-handling code. Similarly, if your data structure is more complex you're more likely to leak something. So the single malloc() is easier.

From a allocation speed point of view, malloc() obviously takes time to run, so a single malloc() will usually be faster.

From a memory size point of view, malloc() usually has some overhead on each allocation. And obviously the pointers are an extra cost. So if you allocate 1000 16-byte structures, you might end up with 16 bytes per structure in malloc overhead and an 8-byte pointer, total 40,016 bytes. Doing a single allocation will only take up 8,016 bytes.

From an access speed point of view, the single array is likely to be faster, especially if the structures are small or you read the structures in order. If the structures are small, then several of them will fit in a single cache line so they can be read/written as a group. If you read the structures in order, the CPU is likely to notice the linear access to the big array and preload it into the cache. If you use a pointer array and separate allocations, then the memory layout is more random and neither of these optimizations will work. Also, since you're accessing more data (the pointers), your algorithm will fall out of data cache sooner.

From a memory fragmentation point of view, it depends. If you have large-enough structures (a significant fraction of your total RAM), then you might get into a situation where there's not enough contiguous free memory to allocate a single big array, but there is enough to allocate an array of pointers and the individual structures. (E.g. if you have a 32-bit program on an OS that limits you to 2GB of memory, and your memory allocator has allocated something else half way through the memory, then you can't do a single 1.5GB allocation but you could do 15 100MB allocations). This kind of scenario is rare, because people don't usually work with data that large.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.