I've seen people's code as:
char *str = NULL;
and I've seen this is as well,
char *str;
I'm wonder, what is the proper way of initializing a string? and when are you supposed to initialize a string w/ and w/out NULL?
I've seen people's code as:
char *str = NULL;
and I've seen this is as well,
char *str;
I'm wonder, what is the proper way of initializing a string? and when are you supposed to initialize a string w/ and w/out NULL?
You're supposed to set it before using it. That's the only rule you have to follow to avoid undefined behaviour. Whether you initialise it at creation time or assign to it just before using it is not relevant.
Personally speaking, I prefer to never have variables set to unknown values myself so I'll usually do the first one unless it's set in close proximity (within a few lines).
In fact, with C99, where you don't have to declare locals at the tops of blocks any more, I'll generally defer creating it until it's needed, at which point it can be initialised as well.
Note that variables are given default values under certain circumstances (for example, if they're static storage duration such as being declared at file level, outside any function).
Local variables do not have this guarantee. So, if your second declaration above (char *str;) is inside a function, it may have rubbish in it and attempting to use it will invoke the afore-mentioned, dreaded, undefined behaviour.
The relevant part of the C99 standard 6.7.8/10:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
- if it has pointer type, it is initialized to a null pointer;
- if it has arithmetic type, it is initialized to (positive or unsigned) zero;
- if it is an aggregate, every member is initialized (recursively) according to these rules;
- if it is a union, the first named member is initialized (recursively) according to these rules.
I'm wonder, what is the proper way of initializing a string?
Well, since the second snippet defines an uninitialized pointer to string, I'd say the first one. :)
In general, if you want to play it safe, it's good to initialize to NULL all pointers; in this way, it's easy to spot problems derived from uninitialized pointers, since dereferencing a NULL pointer will yield a crash (actually, as far as the standard is concerned, it's undefined behavior, but on every machine I've seen it's a crash).
However, you should not confuse a NULL pointer to string with an empty string: a NULL pointer to string means that that pointer points to nothing, while an empty string is a "real", zero-length string (i.e. it contains just a NUL character).
char * str=NULL; /* NULL pointer to string - there's no string, just a pointer */
const char * str2 = ""; /* Pointer to a constant empty string */
char str3[] = "random text to reach 15 characters ;)"; /* String allocated (presumably on the stack) that contains some text */
*str3 = 0; /* str3 is emptied by putting a NUL in first position */
str2 (which points to a string literal) is not modified, and str3 is allocated on the stack (or somewhere else, depending on where that code is put), the literal is just for initialization.str3 is an array, not a pointer! The literal is used for initialization, it's copied in the buffer (str3), which is automatically sized by the compiler to accommodate that string. Look e.g. here: msdn.microsoft.com/en-us/library/7w7xccx8%28VS.80%29.aspx.str3 has type char[] which becomes complete once it is initialised with the string literal to char[38]. This type is a modifiable lvalue. You are not modifying the string literal, you are modifying an lvalue that has the same type and contents as the string literal.this is a general question about c variables not just char ptrs.
It is considered best practice to initialize a variable at the point of declaration. ie
char *str = NULL;
is a Good Thing. THis way you never have variables with unknown values. For example if later in your code you do
if(str != NULL)
doBar(str);
What will happen. str is in an unknown (and almost certainly not NULL) state
Note that static variables will be initialized to zero / NULL for you. Its not clear from the question if you are asking about locals or statics
an unitialized pointer should be considered as undefined so to avoid generating errors by using an undefined value it's always better to use
char *str = NULL;
also because
char *str;
this will be just an unallocated pointer to somewhere that will mostly cause problems when used if you forget to allocate it, you will need to allocate it ANYWAY (or copy another pointer).
This means that you can choose:
NULL (this is a sort of rule to thumb)Don't initialise all your pointer variables to NULL on declaration "just in case".
The compiler will warn you if you try to use a pointer variable that has not been initialised, except when you pass it by address to a function (and you usually do that in order to give it a value).
Initialising a pointer to NULL is not the same as initialising it to a sensible value, and initialising it to NULL just disables the compiler's ability to tell you that you haven't initialised it to a sensible value.
Only initialise pointers to NULL on declaration if you get a compiler warning if you don't, or you are passing them by address to a function that expects them to be NULL.
If you can't see both the declaration of a pointer variable and the point at which it is first given a value in the same screen-full, your function is too big.
Because free() doesn't do anything if you pass it a NULL value you can simplify your program like this:
char *str = NULL;
if ( somethingorother() )
{
str = malloc ( 100 );
if ( NULL == str )
goto error;
}
...
error:
cleanup();
free ( str );
If for some reason somethingorother() returns 0, if you haven't initialized str you will free some random address anywhere possibly causing a failure.
I apologize for the use of goto, I know some finds it offensive. :)
Your first snippet is a variable definition with initialization; the second snippet is a variable definition without initialization.
The proper way to initialize a string is to provide an initializer when you define it. Initializing it to NULL or something else depends on what you want to do with it.
Also be aware of what you call "string". C has no such type: usually "string" in a C context means "array of [some number of] char". You have pointers to char in the snippets above.
Assume you have a program that wants the username in argv[1] and copies it to the string "name". When you define the name variable you can keep it uninitialized, or initialize it to NULL (if it's a pointer to char), or initialize with a default name.
int main(int argc, char **argv) {
char name_uninit[100];
char *name_ptr = NULL;
char name_default[100] = "anonymous";
if (argc > 1) {
strcpy(name_uninit, argv[1]); /* beware buffer overflow */
name_ptr = argv[1];
strcpy(name_default, argv[1]); /* beware buffer overflow */
}
/* ... */
/* name_uninit may be unusable (and untestable) if there were no command line parameters */
/* name_ptr may be NULL, but you can test for NULL */
/* name_default is a definite name */
}
By proper you mean bug free? well, it depends on the situation. But there are some rules of thumb I can recommend.
Firstly, note that strings in C are not like strings in other languages.
They are pointers to a block of characters. The end of which is terminated with a 0 byte or NULL terminator. hence null terminated string.
So for example, if you're going to do something like this:
char* str;
gets(str);
or interact with str in any way, then it's a monumental bug. The reason is because as I have just said, in C strings are not strings like other languages. They are just pointers. char* str is the size of a pointer and will always be.
Therefore, what you need to do is allocate some memory to hold a string.
/* this allocates 100 characters for a string
(including the null), remember to free it with free() */
char* str = (char*)malloc(100);
str[0] = 0;
/* so does this, automatically freed when it goes out of scope */
char str[100] = "";
However, sometimes all you need is a pointer.
e.g.
/* This declares the string (not intialized) */
char* str;
/* use the string from earlier and assign the allocated/copied
buffer to our variable */
str = strdup(other_string);
In general, it really depends on how you expect to use the string pointer. My recommendation is to either use the fixed size array form if you're only going to be using it in the scope of that function and the string is relatively small. Or initialize it to NULL. Then you can explicitly test for NULL string which is useful when it's passed into a function.
Beware that using the array form can also be a problem if you use a function that simply checks for NULL as to where the end of the string is. e.g. strcpy or strcat functions don't care how big your buffer is. Therefore consider using an alternative like BSD's strlcpy & strlcat. Or strcpy_s & strcat_s (windows).
Many functions expect you to pass in a proper address as well. So again, be aware that
char* str = NULL;
strcmp(str, "Hello World");
will crash big time because strcmp doesn't like having NULL passed in.
You have tagged this as C, but if anyone is using C++ and reads this question then switch to using std::string where possible and use the .c_str() member function on the string where you need to interact with an API that requires a standard null terminated c string.
gets is a monumental bug whether you allocate the space or not :-)NULL since it introduces ambiguity with the null pointer constant, which is a fundamentally different concept altogether.gets_s as a replacement to gets, obviously taking an additional parameter specifying the size of the buffer.fgets(...,stdin). They should have just deprecated gets then removed it totally from c2x :-)