
Removed the C tag, seeing as it was causing some confusion (it shouldn't have been there to begin with; sorry for any inconvenience). C answers are still welcome though :)

In a few things I've done, I've found the need to create objects that have both a static and a dynamic part: the static part is your basic object members, while the dynamic part is an array/buffer appended directly onto the object, keeping the memory contiguous. This decreases the number of allocations needed (these are non-reallocatable objects) and reduces fragmentation (as a downside, it may be harder to find a large enough block, but that should be far rarer - if it occurs at all - than general heap fragmentation). This is also helpful on embedded devices where memory is at a premium (though I don't currently do anything for embedded devices), and where things like std::string need to be avoided or can't be used, as in the case of trivial unions.

Generally the way I'd go about this would be to (ab)use malloc (std::string is deliberately not used, for various reasons):

struct TextCache
{
    uint_32 fFlags;
    uint_16 nXpos;
    uint_16 nYpos;
    TextCache* pNext;
    char pBuffer[0];
};

TextCache* pCache = (TextCache*)malloc(sizeof(TextCache) + (sizeof(char) * nLength));

This however doesn't sit too well with me: firstly, I would like to do this using new, and thus in a C++ manner; and secondly, it looks horrible :P

So the next step was a templated C++ variant:

template <const size_t nSize> struct TextCache
{
    uint_32 fFlags;
    uint_16 nXpos;
    uint_16 nYpos;
    TextCache<nSize>* pNext;
    char pBuffer[nSize];
};

This however has the problem that storing a pointer to a variable-sized object becomes 'impossible' (TextCache<16> and TextCache<32> are unrelated types), so then the next workaround:

class DynamicObject {};
template <const size_t nSize> struct TextCache : DynamicObject {...};

This however still requires casting, and having pointers to DynamicObject all over the place becomes ambiguous when more than one dynamically sized object derives from it (it also looks horrible, and can suffer from a bug that forces empty classes to still have a size, although that's probably an archaic, extinct bug...).

Then there was this:

class DynamicObject
{
public:
    void* operator new(size_t nSize, size_t nLength)
    {
        return malloc(nSize + nLength);
    }
};

struct TextCache : DynamicObject {...};

which looks a lot better, but would interfere with objects that already have overloads of new (it could even affect placement new...).

Finally I came up with abusing placement new:

inline TextCache* CreateTextCache(size_t nLength)
{
    char* pNew = new char[sizeof(TextCache) + nLength];
    return new(pNew) TextCache;
}

This however is probably the worst idea so far, for quite a few reasons.

So are there any better ways to do this? Or would one of the above versions be better, or at least improvable? And is doing this even considered safe, and/or is it bad programming practice?


As I said above, I'm trying to avoid double allocations, because this shouldn't need two allocations, and because a single block makes writing (serializing) these things to files a lot easier. The only exception I make to the no-double-allocation rule is when it's basically zero overhead. The only case where I have encountered that is when sequentially allocating memory from a fixed buffer (using a system I came up with), but that's also a special exception, to prevent superfluous copying.

  • Is there any reason why you don't want to use STL? Commented Sep 7, 2010 at 15:20
  • @Kornel: maybe because of the 'C' tag - but the examples suggest that maybe the C tag is superfluous (even though my answer concentrates on C). Commented Sep 7, 2010 at 15:28
  • @Kornel: a few of the things I have in mind can't use STL, because either a) they use allocators that aren't too friendly with STL, or b) they are allocators themselves, and using STL would defeat the whole purpose of their existence. Commented Sep 7, 2010 at 15:35

6 Answers


I'd go for a compromise with the DynamicObject concept. Everything that doesn't depend on the size goes into the base class.

struct TextBase 
{ 
    uint_32 fFlags; 
    uint_16 nXpos; 
    uint_16 nYpos; 
    TextBase* pNext;
}; 

template <const size_t nSize> struct TextCache : public TextBase
{ 
    char pBuffer[nSize]; 
}; 

This should cut down on the casting required.


C99 blesses the 'struct hack' - aka flexible array member.

§6.7.2.1 Structure and union specifiers

¶16 As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. With two exceptions, the flexible array member is ignored. First, the size of the structure shall be equal to the offset of the last element of an otherwise identical structure that replaces the flexible array member with an array of unspecified length.106) Second, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.

¶17 EXAMPLE Assuming that all array members are aligned the same, after the declarations:

struct s { int n; double d[]; };
struct ss { int n; double d[1]; };

the three expressions:

 sizeof (struct s)
 offsetof(struct s, d)
 offsetof(struct ss, d)

have the same value. The structure struct s has a flexible array member d.

106) The length is unspecified to allow for the fact that implementations may give array members different alignments according to their lengths.

Otherwise, use two separate allocations - one for the core data in the structure and the second for the appended data.

2 Comments

@DeadMG: He is, but he tagged it [c] as well, so C answers are in; if the OP doesn't like it, he should have thought better in the first place. There is no excuse for using [c] [c++] unless you mean it.
My apologies, the C tag shouldn't have been there. However, it is good to know the method I use in C programs is 'in the clear' (though in C there really isn't any other way to do this).

That may look heretical to the economy-minded C or C++ programmer, but the last time I had a similar problem to solve, I chose to put a fixed-size static buffer in my struct and access it through a pointer indirection. If a given instance grew larger than my static buffer, the indirection pointer was instead set to a dynamically allocated buffer (and the internal buffer went unused). That was very simple to implement and solved the kinds of issues you raised, like fragmentation: the static buffer covered more than 95% of actual use cases, with the remaining 5% needing really large buffers, so I didn't care much about the small waste of the unused internal buffer in those cases.

1 Comment

This principle of a small static buffer plus heap allocation when required is often referred to as the "Small String Optimization" for string classes. It's used (for example) in Visual Studio 2010's std::string. I don't know if there's a more generic name.

I believe that, in C++, technically this is Undefined Behavior (due to alignment issues), although I suspect it can be made to work for probably every existing implementation.

But why do that anyway?


You could use placement new in C++:

char *buff = new char[sizeof(TextCache) + (sizeof(char) * nLength)];
TextCache *pCache = new (buff) TextCache;

The only caveats are that you need to delete [] buff instead of deleting pCache, and that if TextCache has a destructor you'll have to call it manually before freeing.

If you are intending to access this extra area using pBuffer I'd recommend doing this:

struct TextCache
{
...
    char *pBuffer;
};
...
char *buff = new char[sizeof(TextCache) + (sizeof(char) * nLength)];
TextCache *pCache = new (buff) TextCache;
pCache->pBuffer = new (buff + sizeof(TextCache)) char[nLength];
...
delete [] buff;


There's nothing wrong with managing your own memory.

template<typename DerivedType, typename ElemType> struct appended_array {
    ElemType* buffer;
    int length;
    ~appended_array() {
        for(int i = 0; i < length; i++)
            buffer[i].~ElemType();
        // The allocation starts at the derived object, which begins at the
        // same address as this base subobject.
        char* ptr = (char*)this;
        delete[] ptr;
    }
    static inline DerivedType* Create(int extra) {
        char* newbuf = new char[sizeof(DerivedType) + (extra * sizeof(ElemType))];
        DerivedType* ptr = new (newbuf) DerivedType();
        ElemType* extrabuf = (ElemType*)(newbuf + sizeof(DerivedType));
        for(int i = 0; i < extra; i++)
            new (&extrabuf[i]) ElemType();
        ptr->length = extra;
        ptr->buffer = extrabuf;
        return ptr;
    }
};

struct TextCache : appended_array<TextCache, char>
{
    uint_32 fFlags;
    uint_16 nXpos;
    uint_16 nYpos;
    TextCache* pNext;
    // IT'S A MIRACLE! A char buffer of size 'length', pointed to by 'buffer',
    // automagically appears for us via the Create function.
};

You should consider, however, that this optimization may be premature, and that there are better ways of doing it, like having an object pool or managed heap. Also, I didn't account for alignment; it's my understanding that sizeof() returns the aligned size, but the appended ElemType array may still need stricter alignment than the end of DerivedType. Also, this will be a pain to maintain for non-trivial construction. Also, this is totally untested. A managed heap is a way better idea. But you shouldn't be afraid of managing your own memory - if you have custom memory requirements, you need to manage your own memory.

It just occurred to me that I destructed, but never deleted, the "extra" memory.

3 Comments

Actually I do have a managed heap; however, the fewer calls the better, seeing as there are thousands of allocs/frees per second just from my above example. This was actually implemented because I had a huge drop in fps (this is part of a text caching engine for a render engine) with only text being drawn, and a small amount of text to boot; after doing this optimization the fps tripled, which fixed the bottleneck that had slowed everything almost to a halt.
The point of having a managed heap is that allocs are virtually free and memory allocated on it is contiguous. I think that if a managed heap doesn't work out for you (with this specific sample, I'm not saying they're great for all purposes) then you need to look again at your managed heap.
Sequential allocs aren't guaranteed to be contiguous, at least when one uses something like a binary buddy system, which is what I'm using; but yes, allocs are meant to be (almost) free.
