2

Is there any efficient way (that not includes "contains" or "indexOf" methods) to remove duplicate characters (included) from a given String.

For example:

input: "abcdcb" output: "ad"

input: "abracadabra" output: "cd"

I have tried using RegEx but something got wrong:

public static String function(String str) {
    return str.replaceAll("([a-z]+)\\1", "");
}

Edit: Longest, non efficient solution:

public static String function(String s) {
    String result = "";

    for (int i = 0; i < s.length(); i++)
        if(s.length() - s.replace("" + s.charAt(i), "").length() == 1)
            result += "" + s.charAt(i);
    
    return result;
}
2
  • 2
    Create a Map<Character, Integer> in which you will store how many times each character appeared. Then only collect those which appeared once. Commented Jun 25, 2022 at 19:16
  • @Pshemo maybe something earlier than the 8th edition? Commented Jun 25, 2022 at 19:23

3 Answers 3

6

A single regular expression cannot do this.

As Pshemo suggested, create a Map that counts the character occurrences:

Map<Integer, Long> codePointCounts = s.codePoints().boxed().collect(
    Collectors.groupingBy(
        c -> c, LinkedHashMap::new, Collectors.counting()));

(Using LinkedHashMap will preserve the order of the code points when they are placed in the Map.)

Then discard all the Map entries which don’t have a count of one:

codePointCounts.values().retainAll(Set.of(1L));

Removing a Map value will remove its corresponding key, so any remaining Map keys are unique codepoints from the original String. You can create a String from an array of ints:

int[] codePoints = codePointCounts.keySet().stream()
    .mapToInt(Integer::intValue).toArray();

String uniqueChars = new String(codePoints, 0, codePoints.length);
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks for your answer. It stays O(1) or there will be collisions while removing the keys?
It's not O(1) because it has to visit all the characters in the string. So, it is O(n). The hash map still gives approximately O(1) even with collisions.
3

Here's another approach. I map characters to indices of an int array. The array is used to count the number of times a character appears in the string. If the count is zero, then it is the first time it's seen and so it's added to the char array r. At the end, we return a string constructed from the char array r.

    public static String function(String s) {
        int[]  a = new int[27];
        char[] r = new char[s.length()];
        int len = 0;
        for(int i=0; i<s.length(); i++)
            if (a[s.charAt(i)-'a']++ ==0) 
               r[len++] = s.charAt(i);
        return new String(r, 0, len);
    }

Comments

2

Using VGR's approach but only using java.util.stream

String value="abracadabra";             

String uniqueChars = Arrays.asList(value.split(""))   //creating a list from String
        .stream()
        .collect(Collectors.groupingBy(s->s,LinkedHashMap::new, Collectors.counting())) // grouping by all values with their no of occurrences(as count)
        .entrySet() //after creating the map traversing through the entryset 
        .stream()
        .filter(m->m.getValue()==1L) // fetch values which have count=1
        .map(Map.Entry::getKey) 
        .collect(Collectors.joining());

System.out.println(uniqueChars);
//output = cd

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.