0

I am writing a Comparator with which I can sort a String array using character count. The Strings are all lowercase ASCII characters.

Eg.

input = {"art", "bash", "tar"};
Expected sorted output = {"art", "tar", "bash"}; 

Here, it does not matter if "bash" is at the beginning, but "art" and "tar" should be one after the other, in no particular order as they have matching character count.

I have this code, where I am using a simple array to check and keep the character count and compare.

Comparator<String> comparator = new Comparator<String>() {
    
    @Override
    public int compare(String s1, String s2){
        if (s1.equals(s2)) return 0;
        
        int[] a1 = getCountArray(s1);
        int[] a2 = getCountArray(s2);
        if (Arrays.equals(a1, a2)) return 0; // character count matches

        return s1.compareTo(s2);
    }

    private int[] getCountArray(String s) {
        int[] array = new int[256];
        for (int i=0; i<s.length(); i++){
            array[s.charAt(i) - 'a'] ++;
        }
        return array;
    }

};
String[] input = {"art", "bash", "tar"};
Arrays.sort(input, comparator); 

However, it does not work. What am I doing wrong?

9
  • 4
    Can you explain what you mean by "sort a String array using character count"? From what I understand character count should make art and tar equal, but for some reason you placed bash between them which is unclear to me. Commented Dec 14, 2020 at 22:00
  • 2
    Both of your if conditions needs a closing ). In your second if condition there is a missing s in Arrays. Commented Dec 14, 2020 at 22:17
  • 1
    @MichaelChatiskatzi sorry, corrected. I removed some redundant code around it from my project to make it concise and missed a couple things. Commented Dec 14, 2020 at 22:44
  • 1
    I suspect that you may be looking for Arrays.compare method and use it like @Override public int compare(String s1, String s2){ return Arrays.compare(getCountArray(s1), getCountArray(s2)); }. If that does not give you results which you ware expecting then please point out difference between expected and actual result. Commented Dec 14, 2020 at 23:01
  • 1
    Anyway why are you using s.charAt(i) - 'a' at array[s.charAt(i) - 'a']? Since your array has 256 length why you want to shift value of index? If you will try to shift character which position is smaller than 'a' then you will end up with negative index.Shifting would make sense if you are sure that you are dealing only with characters in range a-z but for them you don't need array of length 256 (since there are less characters in a-z range). Commented Dec 14, 2020 at 23:05

1 Answer 1

2

Your comparator is not consistent:

  • when comparing "art" against "bash" it returns that "art" is smaller than "bash"
  • when comparing "bash" against "tar" it returns "bash" is smaller than "tar"
  • therefore when comparing "art" against "tar" it must return "art" is smaller than "tar" (since "art" < "bash" < "tar"), but your comparator returns "art" is the same as "tar".

When you call Arrays.sort(input, comparator); the sort method will do pair-wise comparisons and will always find that "bash" has to be before "tar".

To fix this you must replace return s1.compareTo(s2) with something that compares the character count arrays a1 and a2.

Sign up to request clarification or add additional context in comments.

3 Comments

> When you call Arrays.sort(input, comparator); the sort method will do pair-wise comparisons and will always find that "bash" has to be before "tar". I think this is what causing the problem and makes sense. Can you explain in a bit more detail, or point me to somewhere I can read about it?
@Ufder if you want to know how Arrays.sort() is implemented you can look directly into the source code. The JDK contains a file "lib/src.zip" with the source code of much of the Java standard library. The sort algorithm is implemented in the class java.util.TimSort
@Ufder there’s no sense in looking into implementation details. You expect the result order "art", "tar", "bash" despite your comparator says that "bash" is smaller than "tar", which contradicts your expected result. So it can’t work that way. Even if a particular implementation returned the expected result, it implied that you got the result from broken code by chance. You can’t expect reliable results from sorting with an inconsistent comparator.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.