6

While reading one book named Cracking the coding interview by Gayle Laakmann, i came across this question

Design an algorithm and write code to remove the duplicate characters in a string without using any additional buffer. NOTE: One or two additional variables are fine. An extra copy of the array is not.

and this code :-

 public static void removeDuplicates(char[] str) {
        if (str == null) {
            return;
        }
        int len = str.length;
        if (len < 2) {
            return;
        }

        int tail = 1;

        for (int i = 1; i < len; ++i) {
            int j;
            for (j = 0; j < tail; ++j) {
                if (str[i] == str[j]) {
                    break;
                }
            }
            if (j == tail) {
                str[tail] = str[i];
                ++tail;
            }
        }
        str[tail] = 0;
    }

which is supposed to remove duplicate character from the array. I don't quiet seem to understand what the algorithm is doing by replacing the same character again and again. I thought it's only me who feels that the algorithm is not working but infact when i ran this code it's giving me wrong outputs. Is this serious error in book or have i not understood the question?

5 Answers 5

7

Algo seems to be working but not clearing leftover characters. Changed code to following and it works: Note: Replaced:

str[tail] = 0;

with :

    for(; tail < len;tail++){
        str[tail] = 0;
    }

public static void removeDuplicates(char[] str) {
        if (str == null) {
            return;
        }
        int len = str.length;
        if (len < 2) {
            return;
        }

        int tail = 1;

        for (int i = 1; i < len; ++i) {
            int j;
            for (j = 0; j < tail; ++j) {
                if (str[i] == str[j]) {
                    break;
                }
            }

            if (j == tail) {
                str[tail] = str[i];
                ++tail;
            }

        }
        for(; tail < len;tail++){
            str[tail] = 0;
        }

    }
Sign up to request clarification or add additional context in comments.

1 Comment

for char[] str = {'a','a'}; it gives [a, ]
2

A solution using a bit-vector.

Time: O(n), where n = length of the string

Space: O(1)

void removeduplicatas(char str[]){
    int i, checker = 0, bitvalue = 0, value = 0, tail = 0;
    i = 0;
    tail = 0;
    while(str[i]){
        value = str[i] - 'a';
        bitvalue = 1 << value;
        if((checker & bitvalue) == 0 ){
            str[tail++] = str[i];
            checker |= bitvalue;
        }
        i++;
    }
    str[tail] = '\0';
}

Comments

1

In Java arrays are of fixed size. So the called function cannot change the size of the input array if it finds any duplicates. Your function is just making the start index of the sub-array which has duplicates to 0. So when you print the array contents in the calling function the element which has been made 0 does not get printed but elements following it (if any) do get printed.

The answer by YoK makes all the elements of the sub-array that are duplicates to 0. So that when you print it in the calling function the duplicates don't get printed. But you need to remember that the size of the array is still unchanged.

Alternatively you can return the size of the sub-array which has unique characters. Which in your case is tail.

One more alternative is to pass the input as a StringBuffer and make the changes in-place as:

public static void removeDuplicates(StringBuffer str) {                        

        int len = str.length();

        // if the string as less than 2 char then it can't have duplicates.
        if (len < 2) {                         
                return;
        }

        // fist character will never be duplicate.
        // tail is the index of the next unique character.
        int tail = 1;

        // iterate from 2nd character.
        for (int i = 1; i < len; ++i) {
                int j;

                // is char at index i already in my list of uniq char?
                for (j = 0; j < tail; ++j) {
                        if (str.charAt(i) == str.charAt(j)) {
                                break;
                        }      
                }

                // if no then add it to my uniq char list.
                if (j == tail) {                       
                        str.setCharAt(tail, str.charAt(i));

                        // increment tail as we just added a new ele.
                        ++tail;
                }
        }
        // at this point the characters from index [0,tail) are unique
        // if there were any duplicates they are between [tail,input.length)
        // so truncate the length of input to tail.
        str.setLength(tail);
}

Ideone Link

Comments

1

This is one solution using C++ and recursion to loop through each character of the string and using the above method of bitstring in a fixed width char. You need to make sure that the fixed wide string is longer than the necessary k type characters to check.

#include <cstdint>
#include <iostream>

bool CheckUniqueChars(char *string, uint32_t index, uint32_t checker){

char character = string[index];

if(character=='\0'){
    return true;
}else{
    int value = character - 'a';

    if((checker&(1<<value))>0){
        return false;
    }else{
       checker |= (1<<value);
       return CheckUniqueChars(string,++index,checker);
    }
   }
}


int main(int argc, char *argv[]){

    char *string = argv[1];
    uint32_t idx=0,checker=0;

 if(CheckUniqueChars(string,idx,checker)){
        std::cout << "all characters are unique" << std::endl;
 }else{
    std::cout << "there are duplicate characters" << std::endl;
 }

 return 0;
}

Comments

0

I improvised code given by YoK to avoid using

for(; tail < len;tail++){
       str[tail] = 0;
}

Instead we can set blank in first loop itself.

public static void removeDuplicates(char[] str){
    if (str == null) {
        return;
    }
    int len = str.length;
    if (len < 2) {
        return;
    }

    int tail = 1;

    for(int i=1;i<len;++i){
        int j;
        for(j=0;j<tail;++j){
            if(str[i] == str[j]) break;
        }
        if(j==tail){
            str[tail] = str[i];
            if(i!=tail)str[i]=0;
            ++tail;
        }else{
            str[i]=0;
        }

    }
}

1 Comment

Its true that algorithm given in that book, doesn't work. It has already been corrected by other answers like answer by YoK. I improved YoK's answer to avoid using another for loop, edited my answer. Thanks.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.