4

I came across a interview question that asked to remove the repeated char from a given string, in-place. So if the input was "hi there" the output expected was "hi ter". It was also told to consider only alphabetic repititions and all the alphabets were lower case. I came up with the following program. I have comments to make my logic clear. But the program does not work as expectd for some inputs. If the input is "hii" it works, but if its "hi there" it fails. Please help.

#include <stdio.h>
int main() 
{
    char str[] = "programming is really cool"; // original string.
    char hash[26] = {0}; // hash table.
    int i,j; // loop counter.
// iterate through the input string char by char.
for(i=0,j=0;str[i];)
{
    // if the char is not hashed.
    if(!hash[str[i] - 'a'])
    {
        // hash it.
        hash[str[i] - 'a'] = 1;
        // copy the char at index i to index j.
        str[j++] = str[i++];
    }
    else
    {
        // move to next char of the original string.
        // do not increment j, so that later we can over-write the repeated char.
        i++;
    }
}

// add a null char.
str[j] = 0;

// print it.
printf("%s\n",str); // "progamin s ely c" expected.

return 0;

}

2
  • 2
    Have you made a typo in the expected output; should it not be either "hi there" -> "hi ter" or "Hi there" -> "Hi ther"? Commented Feb 11, 2010 at 11:33
  • That one functions is an associative array, but definitely not a hash table. Commented Feb 11, 2010 at 12:04

7 Answers 7

11

when str[i] is a non-alphabet, say a space and when you do:

hash[str[i] - 'a']

your program can blow.

ASCII value of space is 32 and that of a is 97 so you are effectively accessing array hash with a negative index.

To solve this you can ignore non-alphabets by doing :

if(! isalpha(str[i]) {
    str[j++] = str[i++]; // copy the char.
    continue;  // ignore rest of the loop.
}
Sign up to request clarification or add additional context in comments.

1 Comment

You might need to actually do an explicit check for lower case only as well?
2

This is going to break on any space characters (or anything else outside the range 'a'..'z') because you are accessing beyond the bounds of your hash array.

Comments

2
void striprepeatedchars(char *str)
{
    int seen[UCHAR_MAX + 1];
    char *c, *n;

    memset(seen, 0, sizeof(seen));

    c = n = str;
    while (*n != '\0') {
        if (!isalpha(*n) || !seen[(unsigned char) *n]) {
            *c = *n;
            seen[(unsigned char) *n]++;
            c++;
        }
        n++;
    }
    *c = '\0';
}

Comments

2

This is code golf, right?

d(s){char*i=s,*o=s;for(;*i;++i)!memchr(s,*i,o-s)?*o++=*i:0;*o=0;}

1 Comment

You should post that to IOCCC... ;)
1

...

// iterate through the input string char by char.
for(i=0,j=0;str[i];)
{
  if (str[i] == ' ')
  {
    str[j++] = str[i++];
    continue;
  }

    // if the char is not hashed.
    if(!hash[str[i] - 'a'])
    {

...

1 Comment

If there is a non-alphabet other than space, say like a '?', this will fail again.
1
#include <stdio.h>
#include <string.h>

int hash[26] = {0};

static int in_valid_range (char c);
static int get_hash_code (char c);

static char * 
remove_repeated_char (char *s)
{
  size_t len = strlen (s);
  size_t i, j = 0;
  for (i = 0; i < len; ++i)
    {
      if (in_valid_range (s[i]))
    {
      int h = get_hash_code (s[i]);
      if (!hash[h])
        {
          s[j++] = s[i];
          hash[h] = 1;
        }
    }
      else
    {
      s[j++] = s[i];
    }
    }
  s[j] = 0;
  return s;
}

int
main (int argc, char **argv)
{
  printf ("%s\n", remove_repeated_char (argv[1]));
  return 0;
}

static int 
in_valid_range (char c)
{
  return (c >= 'a' && c <= 'z');
}

static int 
get_hash_code (char c)
{
  return (int) (c - 'a');
}

1 Comment

Two things: in_valid_range() is just islower(), so can be removed for brevity; get_hash_code() (and all treatments of hash codes) should be unsigned, since 'char' might not be, thus (int) (c - 'a') can be negative.
1
char *s;
int i = 0;

for (i = 0; s[i]; i++)
{
    int j;
    int gap = 0;
    for (j = i + 1; s[j]; j++)
    {
        if (gap > 0)
            s[j] = s[j + gap];
        if (!s[j])
            break;
        while (s[i] == s[j])
        {
            s[j] = s[j + gap + 1];
            gap++;
        }
    }
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.