1

I'm creating a hash table. Each value is a string. I have the problem of deciding what structure to use to store the string. Intuitively I thought of std::string and char*. But,

1), std::string seems to use the stack if the string is short. That means it's not a good choice if my hash table is really big.
2), If using char* then I don't know what to return if I want to change a value, for example like in the following situation: myTable[i] = changedString; It seems in this case I'll need to implement a new string class. But I'm feeling it won't be necessary with std::string there.

Could anyone give any suggestions/comments? Thanks!

3
  • 3
    "std::string seems to use the stack if the string is short". How should it use stack if you are going to store them in a unordered_map<K,V>? I'd say, stick with string until you verify there is any problem with that. Commented Nov 2, 2013 at 13:51
  • emm I'm implementing my own hash table. But maybe I'll stick to std::string now. Thanks~ Commented Nov 2, 2013 at 14:43
  • Use neither. Instead have a template parameter to let the user of the table pass use any string class Commented Nov 2, 2013 at 14:55

3 Answers 3

1

I'll assume you are trying to implement unordered_map (H.W?) and that this is why you dont use it.

you should use std::vector, or std::string, but don't use the array.

And why is there problem of std::string using a stack?

Sign up to request clarification or add additional context in comments.

3 Comments

I think he meant "short string optimization". One of the stdlib implementations statically allocates an array of 19 characters for short strings.
Oh there's no specific reason of std::string using a stack. I think it's because I don't have much experience judging how to arrange the memory. Because I learned 'stack has a limited size and is for temporary variables', I kinda want to leave it for local small value-based variables and design most of my complex structures using the heap. Don't know how to decide what's better/necessary..
@flyinggip You should not concern yourself with that, and you are talking about the runtime-stack, en.wikipedia.org/wiki/Call_stack. The ADT stack is not limited on size - en.wikipedia.org/wiki/Stack_(abstract_data_type), as for what std::string uses internally, it should not concern you, nor you should count on it as it is a compiler specification. The public API string has is the same for all implementation and string sizes.
0

If your goal is to create a hash table, you should try to eliminate any distractions that would make that specific task more complicated. As such you should use std::string for the mutable values in your table so you don't have to spend development effort on allocating and deallocating char*

Once your hash table is functional and correct, if you have a reason to move to char*, then you can always change to that later on. Focus on your highest priority goal, the hash table, and don't spend time on trying to beat std::string performance until your first goal is reached; beating std::string may not be worthwhile in any case.

1 Comment

Thanks I think this quite fits my level of experience now. I'd vote it if I've got more reputation.
0

The overhead caused by std::string is minimal, actually AFAIK apart from the pointer to the string's internal buffer, there are just size and capacity members, both of type size_t, causing let's say (it's environment dependent) 8 bytes per string, so if you have an array of 100 000 strings, there would be about 780KB overhead, which is something I wouldn't worry about unless you are in an environment with strict memory limitations.

In case the length of the string is fixed or varies in minimal way (let's say 2 to 4 characters), then it could be more reasonable to use an array with automatic storage duration:

struct X {
    ...
    char code[4]; // up to 4 characters
};

which would work fine even while copying instances of it in a following way:

X x1, x2;
...
x2 = x1;

However in case you don't have a really good reason to worry about this right now, anything you'd do at this point is pretty much a premature optimization.

1 Comment

Thanks for the detailed answer! I think one day I'll come back to the examples again.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.