1

I am currently working on a C++ DLL that returns a string to the caller. To keep the DLL independent of the rest of the build process, and because C++ lacks a standard ABI, I have to pass the strings across the boundary as C-style character pointers.

As far as I recall, there are two possibilities of returning strings in plain C fashion:

//Method 1
bool Foo1(wchar_t* s1, int len)
{
    //Needs space for 6 chars + null terminator
    if (len < 7)
        return false;

    wcscpy(s1, L"Hello1");

    return true;
}

//Method 2
wchar_t* Foo2()
{
    wchar_t* s2 = new wchar_t[10];

    wcscpy(s2, L"Hello2");

    return s2;
}

//Caller
int _tmain(int argc, _TCHAR* argv[])
{
    wchar_t s1[10];
    bool res = Foo1(s1, sizeof(s1) / sizeof(WCHAR));

    wchar_t* s2 = Foo2();

    delete[] s2; // memory from new[] must be released with delete[]

    return 0;
}

Is there any guideline that favors one of these two solutions? I've seen Method 1 used predominantly in the Windows API, probably for historical reasons. However, I don't see any negative impact in using the second method, which removes the need for the caller to allocate memory beforehand. The only drawback would be that the responsibility for freeing the allocated memory now lies with the caller. Thanks for your suggestions.

10
  • Do you know the maximum length of the strings? Commented Jun 15, 2014 at 21:02
  • The drawback that you mention for Method 2 becomes bigger when you consider that the same DLL should perform that delete. blogs.msdn.com/b/oldnewthing/archive/2006/09/15/755966.aspx Commented Jun 15, 2014 at 21:04
  • The code: wchar_t s1[10]; bool res = Foo1(s1, sizeof(s1) / sizeof(WCHAR)); is dubious. Since the element type of s1 is wchar_t, you should use that rather than WCHAR in the second sizeof. Alternatively, and generally better, you can use: wchar_t s1[10]; bool res = Foo1(s1, sizeof(s1) / sizeof(s1[0]));, dividing the size of the array by the size of a single element in the array. The advantage of this is that it stays correct regardless of the type of s1. Commented Jun 15, 2014 at 21:15
  • @self, the length of the strings varies from one case to another. However, I am able to define a maximum limit. Commented Jun 15, 2014 at 21:21
  • @ Cheers and hth, Right now I am using the same version of Visual C++ for all modules. However, I can't guarantee that this will remain the case. Therefore, I need to make sure that compiler changes won't break interoperability in the future. COM seems to be a viable way, but the hassle of registering the DLL or embedding a RegFree COM manifest makes me reluctant to use it. Commented Jun 15, 2014 at 21:35

3 Answers

3

You don't want to allocate strings with new in the DLL and just return them. You also need to provide a deallocation function that the caller can use to free them. If the caller's delete[] implementation were guaranteed to match the DLL's new[] implementation, you wouldn't need to go down to the C level at all and could just use std::string.
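As a minimal sketch of that pairing, the DLL could export both the allocating function and a matching release function. The names Foo2_Alloc and Foo2_Free are hypothetical, and the export decoration (for example __declspec(dllexport) on Windows) is left out to keep the sketch short:

```cpp
#include <cstddef>
#include <cwchar>
#include <new>

// Hypothetical exports for illustration; extern "C" keeps the names unmangled.
// Returns a heap-allocated copy of the string, or nullptr if allocation fails.
extern "C" wchar_t* Foo2_Alloc()
{
    static const wchar_t text[] = L"Hello2";
    const std::size_t count = sizeof(text) / sizeof(text[0]); // includes the terminator

    // nothrow, so that no C++ exception has to cross the C boundary
    wchar_t* s = new (std::nothrow) wchar_t[count];
    if (s != nullptr)
        std::wmemcpy(s, text, count);
    return s;
}

// Lives in the same module as Foo2_Alloc, so this delete[] always pairs with
// the new[] of the same runtime, whatever the caller was built with.
extern "C" void Foo2_Free(wchar_t* s)
{
    delete[] s; // delete[] on a null pointer is a no-op
}
```

The caller then never calls delete itself; it hands every pointer it received from Foo2_Alloc back to Foo2_Free.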

As you noted, an alternative is to let the caller provide a buffer.
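One common variation on the caller-provided buffer, sketched here under assumptions, reports the required size so that the caller does not have to guess it. Foo1_Copy and its return convention are hypothetical and not part of the question's code:

```cpp
#include <cstddef>
#include <cwchar>

// Hypothetical signature: returns the number of wchar_t elements required,
// including the null terminator. The string is copied only when `buffer` is
// non-null and `capacity` is large enough; otherwise nothing is written and
// the caller can retry with a buffer of the returned size.
extern "C" std::size_t Foo1_Copy(wchar_t* buffer, std::size_t capacity)
{
    static const wchar_t text[] = L"Hello1";
    const std::size_t required = sizeof(text) / sizeof(text[0]);

    if (buffer != nullptr && capacity >= required)
        std::wmemcpy(buffer, text, required);

    return required;
}
```

A caller can first call Foo1_Copy(nullptr, 0) to learn the size, allocate with whatever allocator it likes, and then call again with the real buffer. All memory stays on the caller's side of the boundary.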

A third alternative is to let the caller provide the allocation machinery.

And a fourth alternative is to use the allocation machinery that Windows dedicates to precisely this task, namely BSTR strings, SysAllocString and friends.

Rather than a pure C-style DLL, if that seemed necessary, I would make it a COM server. That is a hassle to do with g++, though. So if I also had to support g++, I'd probably write a pure C DLL and use the BSTR functions mentioned above.


In passing, there is no reason to use Microsoft’s archaic and nonstandard _tmain and _TCHAR stuff for anything other than hardcore legacy code. This macro stuff, for a kind of weak Windows 9x compatibility, was already obsolete around the year 2000 with the introduction of the Microsoft Layer for Unicode. So it's bad bad bad and about 15 years obsolete.


5 Comments

You're right, code nowadays should be exclusively Unicode. Unfortunately Visual C++ still uses the default _tmain template when creating new projects. As for your suggestion with BSTR strings, is it guaranteed that these will stay compatible with future versions of the C runtime?
COM seems to be a viable choice, however the hassle of registering the DLL or embedding a RegFree manifest has so far made me reluctant to use it.
@Aurora: BSTR strings don't use the runtime library's allocator. They're handled by the Windows API. That's why you can allocate one in a DLL and deallocate it in client code. Visual C++ provides some smart pointer classes to handle them, e.g. _bstr_t. If you need to support g++ then it's no big deal to define such a class.
COM registration is a royal pain in the butt, but it's good when it works :) A nice advantage is that you can make your server remotely-hosted easily (DCOM wrapper) and take advantage of the various threading models.
@Aurora just backing up what Alf said, you DO NOT want to use the allocation and deallocation functions of your compiler's runtime library across a DLL boundary (e.g. what if MSVC code news it and some Visual Basic code loading your DLL tries to delete it). So either the caller allocates and deallocates, or the library allocates and deallocates, or you use the Win32 API allocation and deallocation functions.
1

A third approach:

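// Ready-made callback: returns a zero-initialized heap buffer of n bytes (needs <cstdlib>).
inline char* calloc_buffer(void*, unsigned n) { return static_cast<char*>(calloc(1, n)); }
// The library calls make_buff(context, n) to have the caller allocate n bytes.
char* get(char* (*make_buff)(void*, unsigned n), void* context);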

which includes a callback to have the caller allocate the buffer, and a ready-made function the caller can pass to have it allocated with calloc. This allows maximum efficiency. Callers can also use a local buffer if they like, and fall back on calloc if it is not big enough. A version with offset and length is also useful.

Next, for a high quality C++ portable DLL, we include optional header-only wrappers:
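To illustrate the local-buffer-with-fallback idea, here is a rough sketch of both sides. The body of get, the "Hello3" payload, the LocalOrHeap struct and local_or_calloc are all assumptions made up for this example; only the signature of get comes from the declaration above:

```cpp
#include <cstdlib>
#include <cstring>

// Library side: one possible body for get(). The payload is a placeholder.
char* get(char* (*make_buff)(void*, unsigned), void* context)
{
    static const char text[] = "Hello3";
    const unsigned n = static_cast<unsigned>(sizeof(text)); // includes the terminator

    char* out = make_buff(context, n); // the caller decides where the bytes live
    if (out != nullptr)
        std::memcpy(out, text, n);
    return out;
}

// Caller side: hypothetical context that prefers a local buffer and records
// whether the heap had to be used instead.
struct LocalOrHeap
{
    char local[16];
    bool on_heap;
};

char* local_or_calloc(void* context, unsigned n)
{
    LocalOrHeap* ctx = static_cast<LocalOrHeap*>(context);
    ctx->on_heap = n > sizeof(ctx->local);
    return ctx->on_heap ? static_cast<char*>(std::calloc(1, n)) : ctx->local;
}
```

After char* s = get(local_or_calloc, &ctx); the caller frees s with std::free only when ctx.on_heap is set. Because both the allocation and the release happen in the caller's code, no runtime mismatch can occur.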

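#include <vector>

// Callback that resizes the caller's std::vector<char> and exposes its storage.
inline char* from_vector(void* v_, unsigned n) {
  auto v = static_cast<std::vector<char>*>(v_);
  v->resize(n);
  return v->data();
}
// C++ convenience overload: collects the result in a std::vector<char>.
inline std::vector<char> get() {
  std::vector<char> buf;
  get(from_vector, &buf);
  return buf;
}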

which gives you near optimal C++ with near optimal C code.


0

It depends. The second makes it harder to return error conditions, which you might want to consider. Of course, you could use errno to overcome this. Bottom line: there is no right or wrong; it's whatever matches the style of the code you are working with, or your personal preference if it's your own application.
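As an alternative to errno, Method 2 can return a status code and hand the string back through an out-parameter. This is only a sketch; FooStatus, Foo2_Create and Foo2_Destroy are hypothetical names, and the release function is included because the DLL should free what it allocates:

```cpp
#include <cstddef>
#include <cwchar>
#include <new>

// Hypothetical status codes for illustration.
enum FooStatus { FOO_OK = 0, FOO_INVALID_ARGUMENT = 1, FOO_OUT_OF_MEMORY = 2 };

// On success, stores a newly allocated string in *out and returns FOO_OK.
// On failure, *out is left null (when out itself is valid) and the status
// says what went wrong.
extern "C" FooStatus Foo2_Create(wchar_t** out)
{
    if (out == nullptr)
        return FOO_INVALID_ARGUMENT;
    *out = nullptr;

    static const wchar_t text[] = L"Hello2";
    const std::size_t count = sizeof(text) / sizeof(text[0]); // includes the terminator

    wchar_t* s = new (std::nothrow) wchar_t[count];
    if (s == nullptr)
        return FOO_OUT_OF_MEMORY;

    std::wmemcpy(s, text, count);
    *out = s;
    return FOO_OK;
}

// Releases a string obtained from Foo2_Create inside the same module.
extern "C" void Foo2_Destroy(wchar_t* s)
{
    delete[] s;
}
```

This keeps the convenience of Method 2 for the caller while still giving every call an explicit success or failure result.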
