Pointers and Pointer Functions

Question

Studying the K&R book in C I had a few questions regarding complicated pointer declarations and pointer-array relationships.

1) What exactly is the difference between

char amessage[] = "this is a string";

and

char *pmessage;
pmessage = "this is a string";

and when would you use one or the other?

From my understanding the first one allocates some amount of memory according to the size of the string, and then stores the chars in the memory. Then when you access amessage[] you just directly access whatever char you're looking for. For the second one you also allocate memory except you just access the data through a pointer whenever you need it. Is this the correct way of looking at it?

2) The book says that arrays when passed into functions are treated as if you gave the pointer to the first index of the array and thus you manipulate the array through manipulating the pointer even though you can still do syntax like a[i]. Is this true if you just created an array somewhere and want to access it or is it only true if you pass in an array into a function? For example:

char amessage[]= "hi";
char x = *(amessage + 1); // can I do this?

3) The book says the use of static is great in this particular function:

/* month_name:  return name of n-th month */
char *month_name(int n)
{
    static char *name[] = {
       "Illegal month",
       "January", "February", "March",
       "April", "May", "June",
       "July", "August", "September",
       "October", "November", "December"
   };
   return (n < 1 || n > 12) ? name[0] : name[n];
}

I don't understand why exactly this is a good use of static. Is it because the char *name[] would get deleted after function return if it is not static (because it's a local variable)? Then does that mean in C you can't do stuff like:

int testFunction(){
    int x = 1;
    return x; 
}

Without x being deleted before you use the return value? (Sorry I guess this might not be a pointer question but it was in the pointer chapter).

4) There are some complicated declaration like

char (*(*x())[])()

I'm really confused as to what is going on. So the x() part means a function x that returns a pointer? But what kind of pointer does it return? It's just a "*" without like int or void or w/e. Or does that mean a pointer to a function (but I thought that would be like (*x)())? And then after you add brackets (because I assume brackets have the next precedence)... what is that? An array of functions?

This kind of ties to my confusion with function pointers. If you have something like

int (*func)()

That means a pointer to a function that returns an int, and the name of that pointer is func, but what does it mean when its like int (*x[3])(). I don't understand how you can replace the pointer name with an array.

Thanks for any help!

Kevin

In general, you should try to ask individual questions rather than one large multi-part question, especially when, as you say, the parts aren't really all that related.
– Chris Lutz, Commented Oct 22, 2011 at 4:15
Well done on asking the question. Most people don't go this detailed into explaining their issue. +1
– John Riselvato, Commented Oct 22, 2011 at 4:28
+1: because you did a great job asking the questions -- but please take Chris's advice and use multiple questions in the future. :)
– sarnold, Commented Oct 22, 2011 at 4:30

Alex Brooks · Accepted Answer · 2013-06-25 21:30:54Z

1) What exactly is the difference between
char amessage[] = "this is a string";
and
char *pmessage;
pmessage = "this is a string";
and when would you use one or the other?

amessage will always refer to the memory holding this is a string\0. You cannot change the address it refers to. pmessage can be updated to point to any character in memory, whether or not it is part of a string. If you assign to pmessage, you might lose your only reference to this is a string\0. (It depends if you made references anywhere else.)

I would use char amessage[] if I intended to modify the contents of amessage[] in place. You must not modify the memory that pmessage points to while it points at a string literal: doing so is undefined behavior, and on typical platforms the literal lives in read-only memory. Try this little program; comment out amessage[0]='H'; and pmessage[0]='H'; one at a time and see that pmessage[0]='H'; typically causes a segmentation violation:

#include <stdio.h>

int main(int argc, char* argv[]) {
    char amessage[]="howdy";
    char *pmessage="hello";
    amessage[0]='H';
    pmessage[0]='H';
    printf("amessage %s\n", amessage);
    printf("pmessage %s\n", pmessage);
    return 0;
}

Modifying a string that was hard-coded in the program is relatively rare; char *foo = "literal"; is probably more common, and the immutability of the string might be one reason why.
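To make the distinction concrete, here is a minimal sketch. The function names array_demo and pointer_demo are made up for illustration, and the const on the pointer is my addition rather than something from K&R; it lets the compiler reject the write that crashes in the program above.

```c
#include <stddef.h>
#include <string.h>

/* Copies a capitalized "howdy" into out (which must hold at least 6 bytes)
 * and returns the size of the local array in bytes. */
size_t array_demo(char *out)
{
    char amessage[] = "howdy";   /* array: a writable copy, 6 bytes with '\0' */

    amessage[0] = 'H';           /* well defined: the array owns its storage */
    strcpy(out, amessage);
    return sizeof amessage;      /* size of the whole array, not of a pointer */
}

/* Re-aims a pointer at a second literal and returns its first character. */
char pointer_demo(void)
{
    const char *pmessage = "hello";  /* pointer to a literal: read-only use */

    pmessage = "world";              /* fine: the pointer itself can change */
    /* pmessage[0] = 'W';  would not compile, because of the const */
    return pmessage[0];
}
```

Calling array_demo with a 6-byte buffer leaves "Howdy" in it, while pointer_demo shows that reassigning the pointer is allowed even though writing through it is not.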

2) The book says that arrays when passed into functions are treated as if you gave the pointer to the first index of the array and thus you manipulate the array through manipulating the pointer even though you can still do syntax like a[i]. Is this true if you just created an array somewhere and want to access it or is it only true if you pass in an array into a function? For example:
char amessage[]= "hi";
char x = *(amessage + 1); // can I do this?

You can do that, however it is pretty unusual:

$ cat refer.c
#include <stdio.h>

int main(int argc, char* argv[]) {
    char amessage[]="howdy";
    char x = *(amessage+1);
    printf("x: %c\n", x);
    return 0;
}

$ ./refer
x: o
$

At least, I have never seen a "production" program that did this with character strings. (And I'm having trouble thinking of a program that used pointer arithmetic rather than array subscripting on arrays of other types.)
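To answer the "only when passed to a function?" part directly: an array name is converted to a pointer to its first element in most expressions, not just in function calls (sizeof and unary & are the notable exceptions). The sketch below uses made-up function names, length_by_pointer and forms_agree, to show both situations.

```c
#include <stddef.h>

/* Counts characters with pointer arithmetic; s is a pointer here even if the
 * caller passed an array, because the array decays to &array[0]. */
size_t length_by_pointer(const char *s)
{
    const char *p = s;

    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Returns 1 if subscripting and pointer arithmetic agree for every index. */
int forms_agree(void)
{
    char amessage[] = "hi";
    size_t i;

    for (i = 0; i < sizeof amessage; i++) {
        /* a[i] is defined as *(a + i), so both (and even i[a]) must match */
        if (amessage[i] != *(amessage + i) || amessage[i] != i[amessage])
            return 0;
    }
    return 1;
}
```

Inside forms_agree the array was never passed anywhere, yet *(amessage + i) works, because amessage in that expression is already a pointer to its first element.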

3) The book says the use of static is great in this particular function:
/* month_name:  return name of n-th month */
char *month_name(int n)
{
    static char *name[] = {
       "Illegal month",
       "January", "February", "March",
       "April", "May", "June",
       "July", "August", "September",
       "October", "November", "December"
   };
   return (n < 1 || n > 12) ? name[0] : name[n];
}
I don't understand why exactly this is a good use of static. Is it because the char *name[] would get deleted after function return if it is not static (because it's a local variable)? Then does that mean in C you can't do stuff like:
int testFunction(){
    int x = 1;
    return x; 
}
Without x being deleted before you use the return value? (Sorry I guess this might not be a pointer question but it was in the pointer chapter).

In this specific case, I believe the static is not needed for correctness: the function returns one of the pointers stored in name, not a pointer to name itself, and string literals have static storage duration whether or not the array is static (GCC stores them in the .rodata read-only data segment). The static only affects the array of pointers. Your example with a primitive data type (int) also works fine because C passes everything by value, both on function calls and on function returns. However, if you're returning a pointer to an object allocated on the stack, then the static is absolutely necessary, because it determines where in memory the object lives and for how long:

$ cat stackarray.c ; make stackarray
#include <stdio.h>

struct foo { int x; };

struct foo *bar() {
    struct foo array[2];
    array[0].x=1;
    array[1].x=2;
    return &array[1];
}

int main(int argc, char* argv[]) {
    struct foo* fp;
    fp = bar();

    printf("foo.x: %d\n", fp->x);
    return 0;
}

cc     stackarray.c   -o stackarray
stackarray.c: In function ‘bar’:
stackarray.c:9:2: warning: function returns address of local variable

If you change the storage duration of array to static, then the object whose address is returned is no longer automatically allocated, so the address stays valid even after the function has returned:

$ cat staticstackarray.c ; make staticstackarray ; ./staticstackarray
#include <stdio.h>

struct foo { int x; };

struct foo *bar() {
    static struct foo array[2];
    array[0].x=1;
    array[1].x=2;
    return &array[1];
}

int main(int argc, char* argv[]) {
    struct foo* fp;
    fp = bar();

    printf("foo.x: %d\n", fp->x);
    return 0;
}

cc     staticstackarray.c   -o staticstackarray
foo.x: 2

You can see where the memory allocation changes between stackarray and staticstackarray:

$ readelf -S stackarray | grep -A 3 '\.data'
  [24] .data             PROGBITS         0000000000601010  00001010
       0000000000000010  0000000000000000  WA       0     0     8
  [25] .bss              NOBITS           0000000000601020  00001020
       0000000000000010  0000000000000000  WA       0     0     8
$ readelf -S staticstackarray | grep -A 3 '\.data'
  [24] .data             PROGBITS         0000000000601010  00001010
       0000000000000010  0000000000000000  WA       0     0     8
  [25] .bss              NOBITS           0000000000601020  00001020
       0000000000000018  0000000000000000  WA       0     0     8

The .bss section in the version without static is 8 bytes smaller than the .bss section in the version with static. Those 8 bytes in the .bss section provide the persistent address that is returned.

So you can see that the case with strings didn't really make a difference -- at least GCC doesn't care -- but for pointers to other types of objects, the static makes all the difference in the world.

However, most functions that return data in function-local static storage have fallen out of favor. strtok(3), for example, extracts tokens from a string; subsequent calls pass NULL as the first argument to indicate that the function should continue with the string passed in the first call. This is neat, but it means a program can never tokenize two separate strings simultaneously, and multithreaded programs cannot reliably use this routine. So a reentrant version is available, strtok_r(3), that takes an additional argument to store information between calls. man -k _r will show a surprising number of functions that have reentrant versions available, and the primary change is reducing static use in functions.
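As a rough illustration of what the extra argument buys you, the sketch below nests two tokenizing loops, which plain strtok cannot do. It assumes a POSIX system (strtok_r is not part of ISO C), and count_words is a made-up function name.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations such as strtok_r */
#include <string.h>

/* Counts space-separated words across all ';'-separated fields of text.
 * text is modified in place, just as it would be with strtok. */
int count_words(char *text)
{
    char *outer_state;   /* takes the place of strtok's hidden static pointer */
    char *inner_state;
    char *field;
    char *word;
    int count = 0;

    for (field = strtok_r(text, ";", &outer_state);
         field != NULL;
         field = strtok_r(NULL, ";", &outer_state)) {
        /* A nested strtok() here would clobber the outer scan's state. */
        for (word = strtok_r(field, " ", &inner_state);
             word != NULL;
             word = strtok_r(NULL, " ", &inner_state))
            count++;
    }
    return count;
}
```

Because each scan keeps its position in a caller-owned variable, the two loops do not interfere, and separate threads can each tokenize their own string.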

4) There are some complicated declaration like
char (*(*x())[])()
I'm really confused as to what is going on. So the x() part means a function x that returns a pointer? But what kind of pointer does it return? It's just a "*" without like int or void or w/e. Or does that mean a pointer to a function (but I thought that would be like (*x)())? And then after you add brackets (because I assume brackets have the next precedence)... what is that? An array of functions?

This kind of ties to my confusion with function pointers. If you have something like
int (*func)() 
That means a pointer to a function that returns an int, and the name of that pointer is func, but what does it mean when its like int (*x[3])(). I don't understand how you can replace the pointer name with an array.

First, don't panic. You'll almost never need anything this complicated. Sometimes it is very handy to have a table of function pointers and call the next one based on a state transition diagram. Sometimes you're installing signal handlers with sigaction(2). You'll need slightly complicated function pointers then. However, if you use cdecl(1) to decipher what you need, it'll make sense:

       struct sigaction {
           void     (*sa_handler)(int);
           void     (*sa_sigaction)(int, siginfo_t *, void *);
           sigset_t   sa_mask;
           int        sa_flags;
           void     (*sa_restorer)(void);
       };

cdecl(1) only understands a subset of C native types, so replace siginfo_t with void and you can see roughly what is required:

$ cdecl
Type `help' or `?' for help
cdecl> explain void     (*sa_sigaction)(int, void *, void *);
declare sa_sigaction as pointer to function
    (int, pointer to void, pointer to void) returning void

Expert C Programming: Deep C Secrets has an excellent chapter devoted to understanding more complicated declarations, and even includes a version of cdecl, in case you wish to extend it to include more types and typedef handling. It's well worth reading.
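On the int (*x[3])() part of the question, a small sketch may help: x is an array of three pointers to functions, which is exactly the "table of function pointers" case mentioned above. The handler functions (add_one, twice, negate), the typedef step_fn, and run_all are made up for illustration, and I have given the functions an int parameter, which the declaration in the question leaves unspecified.

```c
/* Three hypothetical handlers sharing one signature. */
static int add_one(int v) { return v + 1; }
static int twice(int v)   { return v * 2; }
static int negate(int v)  { return -v; }

/* x is an array of 3 pointers to functions taking int and returning int.
 * Read it inside out: x, then [3], then *, then (int), then int. */
static int (*x[3])(int) = { add_one, twice, negate };

/* The same pointer type spelled with a typedef, as it usually appears. */
typedef int (*step_fn)(int);

/* Feeds start through each function in the table, in order. */
int run_all(int start)
{
    const step_fn *table = x;   /* the array decays to a pointer to step_fn */
    int value = start;
    int i;

    for (i = 0; i < 3; i++)
        value = table[i](value);   /* call through each pointer in turn */
    return value;
}
```

So the array does not replace the pointer name: x is still the name, and [3] simply says there are three such pointers stored side by side.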

I liked the "Don't panic" part. That declaration is scary, but I've never seen something like that in a real working code. Cheers!

AusCBloke · Accepted Answer · 2011-10-22 05:59:10Z

This has to do with part 3 and is a kind of reply/addition to sarnold's comment. He's right in that with or without the static, the string literals are always going to be part of the .rodata segment and essentially only created once. However, without the use of the word static, the actual array, that is the array of char pointers, will in fact be created on the stack each time the function is called.

With the use of static:

Dump of assembler code for function month_name:
   0x08048394 <+0>:   push   ebp
   0x08048395 <+1>:   mov    ebp,esp
   0x08048397 <+3>:   cmp    DWORD PTR [ebp+0x8],0x0
   0x0804839b <+7>:   jle    0x80483a3 <month_name+15>
   0x0804839d <+9>:   cmp    DWORD PTR [ebp+0x8],0xc
   0x080483a1 <+13>:  jle    0x80483aa <month_name+22>
   0x080483a3 <+15>:  mov    eax,ds:0x8049720
   0x080483a8 <+20>:  jmp    0x80483b4 <month_name+32>
   0x080483aa <+22>:  mov    eax,DWORD PTR [ebp+0x8]
   0x080483ad <+25>:  mov    eax,DWORD PTR [eax*4+0x8049720]
   0x080483b4 <+32>:  pop    ebp
   0x080483b5 <+33>:  ret

Without the use of static:

Dump of assembler code for function month_name:
   0x08048394 <+0>:   push   ebp
   0x08048395 <+1>:   mov    ebp,esp
   0x08048397 <+3>:   sub    esp,0x40
   0x0804839a <+6>:   mov    DWORD PTR [ebp-0x34],0x8048514
   0x080483a1 <+13>:  mov    DWORD PTR [ebp-0x30],0x8048522
   0x080483a8 <+20>:  mov    DWORD PTR [ebp-0x2c],0x804852a
   0x080483af <+27>:  mov    DWORD PTR [ebp-0x28],0x8048533
   0x080483b6 <+34>:  mov    DWORD PTR [ebp-0x24],0x8048539
   0x080483bd <+41>:  mov    DWORD PTR [ebp-0x20],0x804853f
   0x080483c4 <+48>:  mov    DWORD PTR [ebp-0x1c],0x8048543
   0x080483cb <+55>:  mov    DWORD PTR [ebp-0x18],0x8048548
   0x080483d2 <+62>:  mov    DWORD PTR [ebp-0x14],0x804854d
   0x080483d9 <+69>:  mov    DWORD PTR [ebp-0x10],0x8048554
   0x080483e0 <+76>:  mov    DWORD PTR [ebp-0xc],0x804855e
   0x080483e7 <+83>:  mov    DWORD PTR [ebp-0x8],0x8048566
   0x080483ee <+90>:  mov    DWORD PTR [ebp-0x4],0x804856f
   0x080483f5 <+97>:  cmp    DWORD PTR [ebp+0x8],0x0
   0x080483f9 <+101>: jle    0x8048401 <month_name+109>
   0x080483fb <+103>: cmp    DWORD PTR [ebp+0x8],0xc
   0x080483ff <+107>: jle    0x8048406 <month_name+114>
   0x08048401 <+109>: mov    eax,DWORD PTR [ebp-0x34]
   0x08048404 <+112>: jmp    0x804840d <month_name+121>
   0x08048406 <+114>: mov    eax,DWORD PTR [ebp+0x8]
   0x08048409 <+117>: mov    eax,DWORD PTR [ebp+eax*4-0x34]
   0x0804840d <+121>: leave  
   0x0804840e <+122>: ret

As you can see in the second example (without static), the array is allocated on the stack each time:

0x08048397 <+3>:   sub    esp,0x40

and the pointers are loaded into the array:

0x0804839a <+6>:   mov    DWORD PTR [ebp-0x34],0x8048514
0x080483a1 <+13>:  mov    DWORD PTR [ebp-0x30],0x8048522
...

So there's obviously a little more to be set up each time the function is called if you decide not to use static.
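If you want the compiler to enforce that neither the table nor the strings are modified, a variant along these lines is possible. This is a sketch of my own, not the K&R version: the two const qualifiers and the const char * return type are additions.

```c
/* month_name: return name of n-th month.
 * Variant of the K&R function with const added: the first const protects the
 * characters, the second protects the pointers stored in the table. */
const char *month_name(int n)
{
    static const char *const name[] = {
        "Illegal month",
        "January", "February", "March",
        "April", "May", "June",
        "July", "August", "September",
        "October", "November", "December"
    };

    /* The table is set up once, before the program runs, so a call only
     * has to do the range check and one indexed load. */
    return (n < 1 || n > 12) ? name[0] : name[n];
}
```

Callers then receive a const char * and get a compile-time diagnostic if they try to write through it.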

Ry- · Accepted Answer · 2011-10-22 04:12:56Z


3) It has nothing to do with that - static creates the array once, as opposed to creating it every time the function runs. Since the data in the array never changes, it is more efficient not to re-create it every time. Your example function would work fine, every time. It's a value. It won't be deleted before you can return it. That would be very unintuitive.


sarnold Over a year ago

readelf --string-dump=.rodata shows identical output whether or not static is applied to the name array. I think using string literals changes the answer somewhat.

aroth Over a year ago

I wouldn't call #3 a "great" use of static, more like a necessary use of static in a language that has no scope between global and function/block-local. In most other languages, the proper place for a static value declaration is outside of the function definition and inside of the object/class definition/header. As written that C code is not intuitive to read, but it is useful in that it causes the static value to only be scoped within that one function. The alternative would make it be globally scoped.

Akronix · Accepted Answer · 2014-03-23 12:15:20Z

4) Adding some more information on point 4:
I'm learning C from the following book: C for Pascal Programmers by Norman J. Landis.
It's quite old and was meant as a bridge from Pascal to C, but I find it very useful, complete, and explained down to the lowest level of the machine. For me it's an awesome book.
Section 5.3.1 in Appendix A talks precisely about this. (The blockquotes are content extracted from the book.)
Definition of base type:

The type specifier appearing in the declaration containing the declarator is called the base type

Basically, in bool x => bool is the base type and in int x[] => the base type for the array is int and the base type for the x is array of int.

In order to interpret complex declarators, the following rules apply:

Apply asterisk operators first.

Apply the "function of base type"( () ) and "array of returning base type" ( [] ) operators afterward, from right to left. Of course, parentheses may enclose a declarator to alter the order of evaluation.

And here is the same example, with the letter x replaced by the letter w:

How I 'parse' this: char (*(*w())[])();

I go from the outside of the parentheses to the inside, following the two rules above. Steps:

Outside any parentheses, we find the function declarator. So far, then, we have a function returning a char.
Now we enter the parentheses and process first the pointer and then the array.
That pointer is a pointer to "the upper base type", which is, as we said, a function returning a char. So far, then, we have a pointer to a function returning a char.
Moving on to the array, it is an array of "the upper base type", and "the upper base type" is now a pointer to a function returning a char.
Now we go into the innermost parentheses, where we find a pointer and a function. In the same manner: first the pointer, then the function.
We process the pointer => pointer to an array of pointers to functions returning a char.
And finally the function declarator, which gives: function returning a pointer to an array of pointers to functions returning a char.

I hope it's much clearer now.

You'll need some time and practice to really understand and handle this, but once you get it, it's pretty easy ;)
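The same reading can be checked by building the type up with typedefs, one step per layer. This is only a sketch: the typedef names, the helper functions give_a and give_b, and the array size 2 are all made up (the original declaration leaves the size unspecified), and I write (void) where the original has empty parentheses.

```c
/* Step 1: a function returning char. */
typedef char char_fn(void);

/* Step 2: a pointer to such a function. */
typedef char_fn *char_fn_ptr;

/* Step 3: an array of those pointers (size 2 chosen arbitrarily). */
typedef char_fn_ptr char_fn_table[2];

static char give_a(void) { return 'a'; }
static char give_b(void) { return 'b'; }

static char_fn_table table = { give_a, give_b };

/* The raw spelling, used here as a prototype for w. */
char (*(*w(void))[2])(void);

/* Step 4: w is a function returning a pointer to that array. This
 * definition compiles against the prototype above because both spell
 * the same type. */
char_fn_table *w(void)
{
    return &table;
}

/* Calls the i-th function in the array that w points to. */
char call_entry(int i)
{
    return (*w())[i]();   /* call w, dereference to the array, index, call */
}
```

The expression (*w())[i]() mirrors the declaration itself: C declarations are written the way the object is used.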
