0

I need to store a lot of short strings of constant length. I noticed that a string object allocates 8 bytes even if it holds only a few chars, which got me into memory trouble.

Is there a way to tell C++ that it should allocate only n (constant) bytes for the string? Or do I have to use char arrays?

6
  • 3
    This is a matter of quality of implementation. Recent libstdc++ and libc++ strings use small-string optimization that doesn't allocate any memory at all. Commented Dec 12, 2015 at 17:49
  • 1
    Just in case you wonder, sizeof(some_std_string_object) != some_std_string_object.length() Commented Dec 12, 2015 at 17:51
  • 1
    This looks like a low-level optimisation. C++ practice recommends avoiding such optimisations in early development stages and only using them once you are sure they are worth it, by profiling or measuring memory usage. You should hide the implementation in a dedicated class that uses as little of the string functionality as possible and, at optimisation time, check whether replacing strings with char arrays is justified. Commented Dec 12, 2015 at 17:57
  • You might be a good candidate for Boost.Flyweight since there's often a lot of repetition in short strings--it could give you some extra memory savings. Commented Dec 12, 2015 at 18:00
  • Thx for the recommendation, but it's homework for university. We're not allowed to use non-standard libraries. Commented Dec 12, 2015 at 18:03

3 Answers

1

Since the strings are constant size, you may want to allocate a two-dimensional array (each row is a string). Allocate the array once, at initialization. This is the most compact form.

If the quantity of strings is unknown, consider using a std::vector of character arrays. I recommend reserving a large size when the vector is created, to reduce the number of reallocations.
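As a rough sketch of the vector-of-arrays idea, assuming C++11 and a made-up length of 6 (the names `kLen`, `Record`, `make_record`, `record_to_string` and `make_store` are only for illustration):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical fixed length; the question only says "n (constant)".
constexpr std::size_t kLen = 6;

// One record holds the characters only, with no terminator and no pointer.
// In practice sizeof(Record) == kLen on the common implementations.
using Record = std::array<char, kLen>;

// Copy exactly kLen characters from s into a Record.
inline Record make_record(const std::string& s) {
    assert(s.size() == kLen);  // caller must pass a string of the right length
    Record r{};
    std::memcpy(r.data(), s.data(), kLen);
    return r;
}

// Convert back to std::string when the full string interface is needed.
inline std::string record_to_string(const Record& r) {
    return std::string(r.data(), r.size());
}

// Create the store and reserve space up front to avoid reallocations.
inline std::vector<Record> make_store(std::size_t expected_count) {
    std::vector<Record> store;
    store.reserve(expected_count);
    return store;
}
```

Usage would be something like `auto store = make_store(1000000); store.push_back(make_record("ABC123"));`. The per-string cost is then just the n characters, plus whatever spare capacity the vector holds.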

Also, ask yourself if the strings need to be stored in memory for the duration of the program. Will you be accessing (searching) them? Can the data be placed into a file or database?


0

std::string does have a constructor that takes a char* and a size, but some std::string implementations do not use small-string optimisation, so they always contain a pointer to the actual data, which takes some space. If you use GCC, it uses copy on write to save space, which also requires a pointer. If there are many duplicates among the strings, you could maybe save space by using a map or a set of strings to eliminate duplicates. Or you could just use a char array, which is the most basic form with the lowest overhead but is also less comfortable.
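A minimal sketch of the set-based deduplication, assuming that duplicates really are frequent (`StringPool` and `intern` are hypothetical names, not anything from the standard library):

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Stores each distinct string once and hands out pointers to the stored copy.
class StringPool {
public:
    // Returns a pointer to the single stored copy of s.
    // Pointers to std::set elements stay valid when other elements are inserted.
    const std::string* intern(const std::string& s) {
        const std::string* p = &*pool_.insert(s).first;
        assert(*p == s);  // the stored copy always equals the input
        return p;
    }

    // Number of distinct strings held by the pool.
    std::size_t unique_count() const { return pool_.size(); }

private:
    std::set<std::string> pool_;
};
```

Note that each handle is itself a pointer (4 or 8 bytes) and every set node carries its own overhead, so this probably only pays off when the same string occurs many times.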

2 Comments

AFAIK copy on write was used maybe in older libstdc++; today it has disappeared from almost every standard library (also due to threading issues).
@MatteoItalia: Nope, GCC up to and including 4.9 still uses COW according to info.prelert.com/blog/…
0

A pointer and a count take 8 bytes on a 32-bit system anyway. It's hard to make a string smaller than that. A string with SSO will likely be larger than 8 bytes. (Perhaps 4 bytes minus 1 bit for the count, 1 bit to indicate small string or not, and then either 12 bytes of small-string buffer or a 4-byte pointer plus a 4-byte capacity, for a total of 16 bytes.)

If they are constant length (and given you probably don't need most of std::string's functionality), you probably need to write your own text-id class. (Probably with conversions to and from std::string.)

Possibly a suitable guide: https://akrzemi1.wordpress.com/2015/05/14/handling-short-codes-part-i/
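A minimal sketch of what such a text-id class might look like, assuming C++11 (the name `FixedString` and its interface are my own illustration, not taken from the linked guide):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Holds exactly N characters inline: no pointer, no length field, no terminator.
template <std::size_t N>
class FixedString {
public:
    // Conversion from std::string; rejects strings of the wrong length.
    explicit FixedString(const std::string& s) {
        if (s.size() != N) {
            throw std::invalid_argument("FixedString: wrong length");
        }
        for (std::size_t i = 0; i < N; ++i) {
            chars_[i] = s[i];
        }
    }

    // Conversion back to std::string.
    std::string str() const { return std::string(chars_.data(), N); }

    // Read-only access to a single character.
    char operator[](std::size_t i) const {
        assert(i < N);
        return chars_[i];
    }

    // Comparisons, so the type can be sorted or used as a map/set key.
    friend bool operator==(const FixedString& a, const FixedString& b) {
        return a.chars_ == b.chars_;
    }
    friend bool operator<(const FixedString& a, const FixedString& b) {
        return a.chars_ < b.chars_;
    }

private:
    std::array<char, N> chars_;
};
```

With this, `FixedString<6> id("ABC123");` typically occupies 6 bytes, and `id.str()` gives you a std::string back whenever you need the full interface.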

1 Comment

You may notice a lot of "probably"s in the last para. That is deliberate.
