1

I am trying to print every fourth character of a string using pointers.

I was able to achieve it using the following code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char const *argv[]){

char* string = "WhatTheeHell";
char* p = string;

while(*p != '\0'){
    printf("%c\n", *p);
    p += 4;
}

return 0;

}

This correctly prints:

W
T
H

Now, I tried another way to do it, which according to my knowledge of pointer arithmetic, should have worked, since the size of int on my machine is 4 bytes.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char const *argv[]){


    char* string = "WhatTheeHell";
    int* p = (int*)string;

    while(*p != '\0'){
        printf("%c\n", *p);
        p += 1;
    }

    return 0;

}

When I run this, it prints W, T and H followed by four newlines.

My question: why are the four newlines printed? After the H is printed, shouldn't *p yield the null character at the end of the string literal and terminate the loop?

  • Why would *p be 0 when it includes 3 more bytes beyond the array? It's undefined behaviour. Commented Jul 19, 2019 at 9:29
  • 3 errors in one: array out of bounds, misaligned accesses, and strict aliasing violations. Commented Jul 19, 2019 at 22:49

2 Answers

4

For starters, the first program is fragile: it works only because the length of this particular string is a multiple of 4. If the number of characters is not divisible by 4, the pointer arithmetic below moves the pointer past the terminating zero, and you have undefined behavior

p += 4;

A valid approach is shown in the following demonstrative program

#include <stdio.h>

int main( void )
{
    const char *s = "WhatTheeHell";
    const size_t STEP = 4;

    if ( *s )
    {
        const char *p = s;
        do
        {
            printf( "%c\n", *p );
            for ( size_t i = 0; i < STEP && *p; ++i, ++p );
        } while ( *p );            
    }
}

Its output is

W
T
H

Your second program is also invalid. For starters, a string literal is not guaranteed to be aligned suitably for an int (moreover, in general sizeof( int ) can be greater or less than 4). So the program already has undefined behavior.

In this condition

*p != '\0'

sizeof( int ) bytes (not a single byte) are compared with the integer constant '\0'. So the comparison is true whenever any of those bytes is nonzero, which is always the case while the pointer points inside the string (and even beyond the string if the bytes there are not all zero).

By the way, note that in C sizeof( '\0' ) is equal to sizeof( int ), because character constants have type int.


3

When you compare *p to '\0', *p is not a single char but a (presumably) four-byte int; the character constant '\0' already has type int in C, so two whole ints are compared.
Nothing guarantees that the bytes after the \0 of your string are also zero, and reading them is undefined behaviour.

Thus your loop continues until it happens to reach an int whose bytes are all zero.

When printing with "%c\n", the int argument is converted to unsigned char, so only the low-order byte of the integer is printed (on a little-endian machine that is the first of the four characters); if its value is zero (nothing is guaranteed), it does not print anything visible, just the following \n

1 Comment

Yeah, I thought that was the problem too. But then, how is it printing the W too? I mean, when it dereferences the first time, it should print W followed by the next three bytes right, since W just takes one byte?
