Hey, I need to count the frequency of words and return a string listing them. I have to omit words that have fewer than 4 characters and words that occur fewer than 10 times. I have to order them from highest to lowest count, and alphabetically when the counts are equal. Here's the code.

import java.util.*;
import java.util.stream.*;

public class Words {

    public String countWords(List<String> lines) {

        String text = lines.toString();
        String[] words = text.split("(?U)\\W+");

        Map<String, Long> freq = Arrays.stream(words).sorted()
            .collect(Collectors.groupingBy(String::toLowerCase,
                Collectors.counting()));

        LinkedHashMap<String, Long> freqSorted = freq.entrySet().stream()
            .filter(x -> x.getKey().length() > 3)
            .filter(y -> y.getValue() > 9)
            .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
            .collect(Collectors.toMap(Map.Entry::getKey,
                Map.Entry::getValue, (oldValue, newValue) -> oldValue,
                LinkedHashMap::new));

        return freqSorted.keySet().stream()
            .map(key -> key + " - " + freqSorted.get(key))
            .collect(Collectors.joining("\n", "", ""));
    }
}

I can't change the argument of this method. I'm having trouble sorting alphabetically after sorting by value. I tried using thenComparing but couldn't make it work. Aside from that, I'd appreciate any feedback on how to reduce the number of lines so I don't have to stream 3 times.

  • Can you provide sample input where your program produces different output from what is expected? Commented Jul 2, 2022 at 17:12

3 Answers

2

Another approach that does it in one go, without collecting into intermediate maps, is to wrap your grouping collector in collectingAndThen, where you can format the final result:

public String countWords(List<String> lines) {
    String text = lines.toString();
    String[] words = text.split("(?U)\\W+");

    return Arrays.stream(words)
                .filter(s -> s.length() > 3)
                .collect(Collectors.collectingAndThen(
                         Collectors.groupingBy(String::toLowerCase, Collectors.counting()),
                         map -> map.entrySet()
                                 .stream()
                                 .filter(e -> e.getValue() > 9)
                                 .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                                         .thenComparing(Map.Entry.comparingByKey()))
                                 .map(e -> String.format("%s - %d", e.getKey(), e.getValue()))
                                 .collect(Collectors.joining(System.lineSeparator()))));
}
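
To try the method above on a small input, here is a minimal, self-contained sketch. The class name WordCountDemo and the sample words are made up for illustration, and the sketch joins the lines with String.join and uses "\n" as the separator instead of lines.toString() and System.lineSeparator(), so it is a variation on the code above rather than an exact copy.

```java
import java.util.*;
import java.util.stream.*;

class WordCountDemo {

    // Same pipeline shape as above: group, then filter, sort and format
    // inside collectingAndThen. String.join is an assumption of this sketch.
    static String countWords(List<String> lines) {
        return Arrays.stream(String.join(" ", lines).split("(?U)\\W+"))
                .filter(s -> s.length() > 3)
                .collect(Collectors.collectingAndThen(
                        Collectors.groupingBy(String::toLowerCase, Collectors.counting()),
                        map -> map.entrySet().stream()
                                .filter(e -> e.getValue() > 9)
                                // highest count first, ties broken alphabetically
                                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                                        .thenComparing(Map.Entry.comparingByKey()))
                                .map(e -> e.getKey() + " - " + e.getValue())
                                .collect(Collectors.joining("\n"))));
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.addAll(Collections.nCopies(12, "Pear apple")); // 12 each, a tie
        lines.addAll(Collections.nCopies(15, "zebra"));      // 15
        lines.addAll(Collections.nCopies(20, "cat"));        // dropped: shorter than 4
        lines.addAll(Collections.nCopies(9, "mango"));       // dropped: fewer than 10
        System.out.println(countWords(lines));
        // zebra - 15
        // apple - 12
        // pear - 12
    }
}
```

The tie between "apple" and "pear" is what exercises the thenComparing part of the comparator.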

1

Here is one approach. I am using your frequency count map as the source.

  • First, define a comparator.
  • Then sort the entries of the existing map with it.
  • toMap takes a key mapper, a value mapper, a merge function, and a map supplier; a LinkedHashMap is used to preserve the sorted order.

// requires: import java.util.Map.Entry;
// Highest count first, then alphabetically by word for equal counts.
Comparator<Entry<String, Long>> comp =
        Entry.comparingByValue(Comparator.reverseOrder());
comp = comp.thenComparing(Entry.comparingByKey());

Map<String, Long> freqSorted = freq.entrySet().stream()
        .filter(x -> x.getKey().length() > 3
                && x.getValue() > 9)
        .sorted(comp)
        .collect(Collectors.toMap(Entry::getKey,
                Entry::getValue, (a, b) -> a,
                LinkedHashMap::new));

Notes:

  • To verify that the sorting is correct, you can comment out the filter and use fewer words.
  • You do not need to sort your initial stream of words when preparing the frequency count, as the entries are sorted later when building the final map.
  • The merge function is required by this overload of toMap but is never called, since the keys are already unique.
  • I chose not to use a TreeMap because, once the stream is sorted, there is no need to sort again.
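
As a rough way to check the ordering without the filter, something like the following sketch can be used. The class name ComparatorCheck, the helper sortByCountThenWord, and the three sample entries are made up for illustration; only the comparator and the toMap call come from the code above.

```java
import java.util.*;
import java.util.Map.Entry;
import java.util.stream.*;

class ComparatorCheck {

    // Sort a frequency map by count (descending), then by word (ascending),
    // and keep that order by collecting into a LinkedHashMap.
    static Map<String, Long> sortByCountThenWord(Map<String, Long> freq) {
        Comparator<Entry<String, Long>> comp =
                Entry.comparingByValue(Comparator.reverseOrder());
        comp = comp.thenComparing(Entry.comparingByKey());

        return freq.entrySet().stream()
                .sorted(comp)
                .collect(Collectors.toMap(Entry::getKey, Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<String, Long> freq = new HashMap<>();
        freq.put("pear", 2L);
        freq.put("zebra", 3L);
        freq.put("apple", 2L);
        System.out.println(sortByCountThenWord(freq)); // {zebra=3, apple=2, pear=2}
    }
}
```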

1 Comment

It works! Yeah, I forgot to remove the sorting from the freq map after I decided to move the filtering and sorting into a separate stream that collects to a different map.

0

The problem should be your LinkedHashMap: it only keeps insertion order and does not sort its entries. You can try using a TreeMap, which keeps its entries sorted by key.

And I think you shouldn't focus on getting as few lines as possible; instead, try to make the code as readable as possible for the future. So I think what you have there is fine, because you split the streams into logical parts: counting, sorting, and joining!

To swap to TreeMap, just change the variable and collector type. It would look like this:

import java.util.*;
import java.util.stream.*;

public class Words {

    public String countWords(List<String> lines) {

        String text = lines.toString();
        String[] words = text.split("(?U)\\W+");

        Map<String, Long> freq = Arrays.stream(words).sorted()
            .collect(Collectors.groupingBy(String::toLowerCase,
                Collectors.counting()));

        TreeMap<String, Long> freqSorted = freq.entrySet().stream()
            .filter(x -> x.getKey().length() > 3)
            .filter(y -> y.getValue() > 9)
            .sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
            .collect(Collectors.toMap(Map.Entry::getKey,
                Map.Entry::getValue, (oldValue, newValue) -> oldValue,
                TreeMap::new));

        return freqSorted.keySet().stream()
            .map(key -> key + " - " + freqSorted.get(key))
            .collect(Collectors.joining("\n", "", ""));
    }
}
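
For reference, here is a small sketch of how the two map types iterate. The class name MapOrderDemo, the helper keysInIterationOrder, and the sample counts are made up for illustration. A LinkedHashMap returns entries in the order they were inserted, while a TreeMap built without a comparator returns them in natural key order, whatever order they were put in.

```java
import java.util.*;

class MapOrderDemo {

    // Returns the keys in the order the map iterates over them.
    static List<String> keysInIterationOrder(Map<String, Long> map) {
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Entries inserted in count order (highest first).
        Map<String, Long> linked = new LinkedHashMap<>();
        linked.put("zebra", 15L);
        linked.put("pear", 12L);
        linked.put("apple", 11L);

        // Copying into a TreeMap re-sorts the entries by key.
        Map<String, Long> tree = new TreeMap<>(linked);

        System.out.println(keysInIterationOrder(linked)); // [zebra, pear, apple]
        System.out.println(keysInIterationOrder(tree));   // [apple, pear, zebra]
    }
}
```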

2 Comments

Well, any advice on how to change this LinkedHashMap to a TreeMap? I'm having some trouble.
I edited my answer. Just changing the types from LinkedHashMap to TreeMap should do the job.
