4

I have some confusion when it comes to multidimensional arrays. The question that came closest to helping me understand was this post:

Pointer address in a C multidimensional array

I have a multidimensional array initialized as follows: int zippo[4][2] = {{2, 4}, {6, 8}, {1, 3}, {5, 7}};

When I print zippo and *zippo, both show the same memory address, but when I print **zippo it prints 2 (the first value in the first subarray). My question is: how does the compiler know that dereferencing zippo twice should yield the first value of the first subarray? For example, for the sake of simplicity, if the memory address of zippo is 30 and the value of both zippo and *zippo is 15, shouldn't memory look like the following?

[Image: memory addresses]

It is my understanding that *zippo goes to memory location 15 to find the value at the location, which just so happens to be 15. So, shouldn't dereferencing it another time cause 15 to be printed yet again?

1
  • There are already hundreds of questions about jagged arrays (something like int **). Why do you expect a completely different type can be used?? If you have an int, you cannot printf a _Complex! Commented Dec 26, 2016 at 4:05

3 Answers

1

You're thinking too low-level. Your question pertains to variable names and types (at the language level).

When you declare int zippo[4][2] = {{2, 4}, {6, 8}, {1, 3}, {5, 7}};, you end up with an array of four arrays of two ints. They can be accessed in many ways, depending on what you need to express. Here are some of the sub-objects involved:

| 2 | 4 | 6 | 8 | 1 | 3 | 5 | 7 |    The storage for zippo: 8 contiguous ints

|<--+---+---+---+---+---+---+-->|    zippo (the whole array), is an int[4][2]
|<--+-->|                            zippo[0] (also known as *zippo) is an int[2]
                |<--+-->|            zippo[2] is also an int[2]
|<->|                                zippo[0][0] (also known as **zippo) is an int
            |<->|                    zippo[1][1] is also an int

You can see that these sub-objects can overlap, and in some cases share addresses. What still makes them distinct objects (for you, the language, and the compiler) is their type.

For example, zippo[0] and zippo[0][0] (which is its first half) have the same address, but zippo[0] is an array of two ints, while zippo[0][0] is a single int.

That is why you can't keep indexing into zippo[0][0], or try to use zippo[0] inside an integer calculation: even though they share the same storage, they're different objects with different meanings.

And even though indexing into arrays involves pointer arithmetic, there is no actual chain of pointers stored in memory, no int*** that your first understanding implies. It all comes down to names and types.


0

No, *zippo does not go to location 15 and then fetch the value stored there. If it did, printf(" * ((int **) zippo) = %p\n", * ((int **) zippo) ); would print the same thing as printf(" *zippo = %p\n", *zippo);, and that is not the case.

Consider the following code:

#include <stdio.h>

int zippo[4][2] = {{2, 4}, {6, 8}, {1, 3}, {5, 7}};

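int main(void){
    /* zippo[0], zippo and *zippo all evaluate to the address of the first element. */
    printf("zippo[0] = %p\n", (void *) (zippo[0]) );
    printf("  zippo = %p\n", (void *) zippo);
    printf("  *zippo = %p\n", (void *) (*zippo) );
    printf("  **zippo = %d\n", (int) ( **zippo) );
    /* Deliberately wrong: reads the first bytes of the array as if they held an int *.
       This is undefined behaviour and is shown only to illustrate the point. */
    printf("  * ((int **) zippo)  = %p\n", (void *) (* ((int **) zippo) ));
    return 0;
}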

This is what I obtain:

zippo = 0x804a040
*zippo = 0x804a040
**zippo = 2
* ((int **) zippo)  = 0x2

I compiled this code using gcc -Wall -Wextra -Wpedantic -pedantic to ensure that no warning is hidden, and with the option -m32 to get 32-bit addresses (the same size as int).

I actually had a hard time understanding what is happening here, so I decided to look at the corresponding assembly code. Using gcc -S file.c -o file.s, I obtained the following. (Note that the listing uses 64-bit registers such as %rax, so it comes from a build without -m32.)

First, the variable declaration:

    .globl  zippo
    .data
    .align 32
    .type   zippo, @object
    .size   zippo, 32
zippo:
    .long   2
    .long   4
    .long   6
    .long   8
    .long   1
    .long   3
    .long   5
    .long   7
    .section    .rodata
.LC0:
    .string "zippo[0] = %p\n"
.LC1:
    .string "  zippo = %p\n"
.LC2:
    .string "  *zippo = %p\n"
.LC3:
    .string "  **zippo = %d\n"
.LC4:
    .string "  * ((int **) zippo)  = %p\n"

The corresponding assembly for printf("zippo[0] = %p\n", (void *) (zippo[0]) ); :

movl    $zippo, %esi
movl    $.LC0, %edi
movl    $0, %eax
call    printf

The corresponding assembly for printf(" zippo = %p\n", (void *) zippo); :

movl    $zippo, %esi
movl    $.LC1, %edi
movl    $0, %eax
call    printf

The corresponding assembly for printf(" *zippo = %p\n", (void *) (*zippo) ); :

movl    $zippo, %esi
movl    $.LC2, %edi
movl    $0, %eax
call    printf

The corresponding assembly for printf(" **zippo = %d\n", (int) ( **zippo) ); :

movl    $zippo, %eax
movl    (%rax), %eax
movl    %eax, %esi
movl    $.LC3, %edi
movl    $0, %eax
call    printf

The corresponding assembly for printf(" * ((int **) zippo) = %p\n", (void *) (* ((int **) zippo) )); :

movl    $zippo, %eax
movq    (%rax), %rax
movq    %rax, %rsi
movl    $.LC4, %edi
movl    $0, %eax
call    printf

As you can see, for the first three printf calls the corresponding assembly is exactly the same (only .LCx, which refers to the format string, changes). The last two are also nearly identical: both load from the address of zippo and differ only in the width of the load (movl reads a 4-byte int, movq reads an 8-byte pointer).

My understanding is that the compiler is aware that zippo is a two-dimensional array, and therefore knows that *zippo is a one-dimensional array whose data starts at the address of the first element.

12 Comments

It is not ignored!
The code invokes undefined behaviour. You read an int array as a pointer to int. The fact that %p expects a void * and you have to cast is another cause of UB (the compiler should actually complain). "compiler knows that *zippo dosen't even exist in memory" is nonsense. Of course *zippo exists. It is an array.
@olaf The code does not invoke undefined behavior. If I change the last line to printf(" * ((int **) zippo) = %d\n", * ((int **) zippo) ); then the output is always 2 (and of course the compiler complains about the int * argument where an int is expected). Another thing: I converted the C code above to assembly to understand what happens exactly, and it seems like *zippo is treated as zippo. I'll be updating my answer but I would like to have your opinion on this first. Thx
Note: 2 is the first value of the array.
No idea what you mean. Never use a cast if you don't really understand all implications. Wildly casting just to silence the compiler is a guarantee for disaster.
-1

Although it may look as if zippo lives at address 30 and both zippo and *zippo hold the value 15, that is not what is happening. Think in terms of data types. Take an array with many more dimensions.

int zappo[2][3][4][5][6] = {{{{{45,55,66,77,88,99},{12,22,32....

When you define a variable like this (on the stack, not using chained malloc) the compiler does not allocate one value zappo, another for *zappo and another for **zappo etc. It writes 45,55,66,77,88,99,12,22,32 in a contiguous block of memory, say at 0xfe. Now at compile time, it knows that

zappo is a pointer, and has a value 0xfe
*zappo is a pointer, and has a value 0xfe
**zappo is a pointer, and has a value 0xfe
***zappo is a pointer, and has a value 0xfe
****zappo is a pointer, and has a value 0xfe, but this one points to an int!
*****zappo is an int, and has the value 45

The compiler thinks in terms of data types. So only the last dereference results in an int; all the others evaluate to the same address. This is not the same as declaring

int *****zappo;

and painstakingly creating the array structure manually (in the heap, with some alloc). That's where you can use your box analogy.

3 Comments

Ah this makes more sense. I am coming from Java and am used to arrays only being dynamically allocated. Thank you
You're welcome. It is a quirk of C, known as array-to-pointer decay. Further details can be found in the book Expert C Programming: Deep C Secrets, Chapter 4, for example, but they are not necessary for a working knowledge.
Don't use implementation details at this level. There is no requirement for a stack in the C language for automatic variables. *zappo is not a pointer, but an array! **zappo is not a pointer, but an array! ... And none of them has a value of 0xfe. That's just their address. What is the value of an array? Arrays are not pointers (which you actually state with the second half)!
