6

I have instances of a Student class.

class Student {
    String name;
    String addr;
    String type;

    public Student(String name, String addr, String type) {
        super();
        this.name = name;
        this.addr = addr;
        this.type = type;
    }

    @Override
    public String toString() {
        return "Student [name=" + name + ", addr=" + addr + "]";
    }

    public String getName() {
        return name;
    }

    public String getAddr() {
        return addr;
    }
}

I also have code that creates a map, which stores the student name as the key and a list of processed addr values as the value (a List, since the same student can have multiple addr values).

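import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FilterId {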

public static String getNum(String s) {
    // should do some complex stuff, just for testing
    return s.split(" ")[1];
}

public static void main(String[] args) {
    List<Student> list = new ArrayList<Student>();
    list.add(new Student("a", "test 1", "type 1"));
    list.add(new Student("a", "test 1", "type 2"));
    list.add(new Student("b", "test 1", "type 1"));
    list.add(new Student("c", "test 1", "type 1"));
    list.add(new Student("b", "test 1", "type 1"));
    list.add(new Student("a", "test 1", "type 1"));
    list.add(new Student("c", "test 3", "type 2"));
    list.add(new Student("a", "test 2", "type 1"));
    list.add(new Student("b", "test 2", "type 1"));
    list.add(new Student("a", "test 3", "type 1"));
    Map<String, List<String>> map = new HashMap<>();

    // This will create a Map with Student names (distinct) and the test numbers (distinct List of tests numbers) associated with them.
    for (Student student : list) {
        if (map.containsKey(student.getName())) {
            List<String> numList = map.get(student.getName());
            String value = getNum(student.getAddr());

            if (!numList.contains(value)) {
                numList.add(value);
                map.put(student.getName(), numList);
            }
        } else {
            map.put(student.getName(), new ArrayList<>(Arrays.asList(getNum(student.getAddr()))));
        }
    }

    System.out.println(map.toString());

}
}

Output would be : {a=[1, 2, 3], b=[1, 2], c=[1, 3]}

How can I do the same in Java 8 in a more elegant way, maybe using streams?

I found Collectors.toMap in Java 8 but couldn't find a way to do the same with it.

I tried joining the elements as comma-separated strings, but that didn't work: I couldn't figure out an easy way to remove the duplicates, and the output is not what I need.

Map<String, String> map2 = new HashMap<>();
map2 = list.stream().collect(Collectors.toMap(Student::getName, Student::getAddr, (a, b) -> a + " , " + b));
System.out.println(map2.toString());
// {a=test 1 , test 1 , test 1 , test 2 , test 3, b=test 1 , test 1 , test 2, c=test 1 , test 3}

4 Answers

17

With streams, you could use Collectors.groupingBy along with Collectors.mapping:

Map<String, Set<String>> map = list.stream()
    .collect(Collectors.groupingBy(
        Student::getName,
        Collectors.mapping(student -> getNum(student.getAddr()),
            Collectors.toSet())));

I've chosen to create a map of sets instead of a map of lists, as it seems that you don't want duplicates in the lists.


If you do need lists instead of sets, it's more efficient to first collect to sets (which drop duplicates without a linear List.contains scan) and then convert the sets to lists:

Map<String, List<String>> map = list.stream()
    .collect(Collectors.groupingBy(
        Student::getName,
        Collectors.mapping(s -> getNum(s.getAddr()),
            Collectors.collectingAndThen(Collectors.toSet(), ArrayList::new))));

This uses Collectors.collectingAndThen, which first collects and then transforms the result.
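One thing to keep in mind is that Collectors.toSet() makes no promise about iteration order, so the resulting lists may not follow the order in which values were encountered. If that order matters, a possible variation is to collect into a LinkedHashSet first. The following is only a sketch: the class name OrderedGroupingSketch, the two-field Student stand-in and the distinctInOrder variable are assumptions made for illustration, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class OrderedGroupingSketch {

    // Minimal stand-in for the Student class from the question (assumption).
    static class Student {
        private final String name;
        private final String addr;

        Student(String name, String addr) {
            this.name = name;
            this.addr = addr;
        }

        String getName() { return name; }
        String getAddr() { return addr; }
    }

    // Same placeholder logic as getNum in the question.
    static String getNum(String s) {
        return s.split(" ")[1];
    }

    // Groups by name; each list holds the distinct numbers in first-seen order.
    static Map<String, List<String>> group(List<Student> students) {
        // Naming the downstream collector keeps the nesting shallow.
        Collector<String, ?, List<String>> distinctInOrder =
            Collectors.collectingAndThen(
                Collectors.toCollection(LinkedHashSet::new),
                ArrayList::new);

        return students.stream()
            .collect(Collectors.groupingBy(
                Student::getName,
                Collectors.mapping(s -> getNum(s.getAddr()), distinctInOrder)));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("a", "test 3"),
            new Student("a", "test 1"),
            new Student("a", "test 3"),
            new Student("b", "test 2"));
        System.out.println(group(students));
    }
}
```

The outer map is still a plain HashMap here; passing LinkedHashMap::new as the second argument of groupingBy would preserve the order of the names as well.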


Another more compact way, without streams:

Map<String, Set<String>> map = new HashMap<>(); // or LinkedHashMap
list.forEach(s -> 
    map.computeIfAbsent(s.getName(), k -> new HashSet<>()) // or LinkedHashSet
        .add(getNum(s.getAddr())));

This variant uses Iterable.forEach to iterate the list and Map.computeIfAbsent to group transformed addresses by student name.
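For reuse, the same computeIfAbsent idea could be wrapped in a small generic helper. This is a sketch under assumptions: groupDistinct and ComputeIfAbsentSketch are hypothetical names, and the students are modelled as plain {name, addr} string pairs to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class ComputeIfAbsentSketch {

    // Hypothetical helper: groups the distinct mapped values by key,
    // keeping keys and values in first-seen order.
    static <T, K, V> Map<K, Set<V>> groupDistinct(
            Iterable<T> items,
            Function<? super T, ? extends K> keyFn,
            Function<? super T, ? extends V> valueFn) {
        Map<K, Set<V>> result = new LinkedHashMap<>();
        for (T item : items) {
            // Create the set on first sight of the key, then add the value.
            result.computeIfAbsent(keyFn.apply(item), k -> new LinkedHashSet<>())
                  .add(valueFn.apply(item));
        }
        return result;
    }

    public static void main(String[] args) {
        // Each row is {name, addr}, mirroring part of the sample data in the question.
        List<String[]> rows = Arrays.asList(
            new String[] {"a", "test 1"},
            new String[] {"a", "test 1"},
            new String[] {"b", "test 1"},
            new String[] {"c", "test 3"},
            new String[] {"a", "test 2"});

        Map<String, Set<String>> map =
            groupDistinct(rows, r -> r[0], r -> r[1].split(" ")[1]);
        System.out.println(map); // prints {a=[1, 2], b=[1], c=[3]}
    }
}
```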


11 Comments

Like the use of Collectors.collectingAndThen.
Great. One small thing: how do I prepend something to the key? What if I need my output as {student-a=[1, 2, 3], student-b=[1, 2], student-c=[1, 3]}?
@prime just replace Student::getName with s -> "student-" + s.getName()
@FedericoPeraltaSchaffner great. One more thing: what if we need to exclude some elements in getNum, so that the output is {a=[1, 2], b=[1, 2], c=[1, 3]}? Here a's list does not have the 3 because it failed some condition in getNum. How can we handle a condition like that? For example, if the type is type 3, we don't need to add that value to the list.
@prime if you are on Java 9, change getNum so that it returns a stream: a one-element stream if the condition is true, an empty stream if it's false. Then, when collecting, use Collectors.flatMapping instead of Collectors.mapping. There are other ways; maybe you need to write another question for that case, linking to this question for context.
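A rough sketch of that Java 9 suggestion might look as follows. The getNums method, the three-field Student stand-in and the rule "skip type 3" are assumptions taken from the comment above, not tested production code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FlatMappingSketch {

    // Minimal stand-in for the Student class from the question (assumption).
    static class Student {
        private final String name;
        private final String addr;
        private final String type;

        Student(String name, String addr, String type) {
            this.name = name;
            this.addr = addr;
            this.type = type;
        }

        String getName() { return name; }
        String getAddr() { return addr; }
        String getType() { return type; }
    }

    // Hypothetical variant of getNum: an empty stream means "skip this student".
    static Stream<String> getNums(Student s) {
        if ("type 3".equals(s.getType())) {
            return Stream.empty();
        }
        return Stream.of(s.getAddr().split(" ")[1]);
    }

    // Requires Java 9+, where Collectors.flatMapping was added.
    static Map<String, Set<String>> group(List<Student> students) {
        return students.stream()
            .collect(Collectors.groupingBy(
                Student::getName,
                Collectors.flatMapping(FlatMappingSketch::getNums, Collectors.toSet())));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("a", "test 1", "type 1"),
            new Student("a", "test 3", "type 3"),
            new Student("b", "test 2", "type 3"));
        System.out.println(group(students));
    }
}
```

Note that a name whose values are all skipped still appears in the map, with an empty set. If such names should be dropped entirely, filtering the stream before grouping would be the alternative.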
5

First of all, the current solution is not really elegant, regardless of any streaming solution.

The pattern of

if (map.containsKey(k)) {
    Value value = map.get(k);
    ...
} else {
    map.put(k, new Value());
}

can often be simplified with Map#computeIfAbsent. In your example, this would be

// This will create a Map with Student names (distinct) and the test
// numbers (distinct List of tests numbers) associated with them.
for (Student student : list)
{
    List<String> numList = map.computeIfAbsent(
        student.getName(), s -> new ArrayList<String>());
    String value = getNum(student.getAddr());
    if (!numList.contains(value))
    {
        numList.add(value);
    }
}

(This is a Java 8 function, but it is still unrelated to streams).


Next, the data structure that you want to build there does not seem to be the most appropriate one. In general, the pattern of

if (!list.contains(someValue)) {
    list.add(someValue);
}

is a strong sign that you should not use a List, but a Set. The set will contain each element only once, and you will avoid the contains calls on the list, which are O(n) and thus may be expensive for larger lists.

Even if you really need a List in the end, it is often more elegant and efficient to first collect the elements in a Set, and afterwards convert this Set into a List in one dedicated step.
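To make the difference concrete, here is a small side-by-side sketch of the two approaches on a plain list of strings. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DistinctValuesSketch {

    // List-based version: every contains call scans the list,
    // so the loop is quadratic in the worst case.
    static List<String> distinctWithList(List<String> values) {
        List<String> result = new ArrayList<>();
        for (String value : values) {
            if (!result.contains(value)) {
                result.add(value);
            }
        }
        return result;
    }

    // Set-based version: LinkedHashSet drops duplicates and keeps first-seen
    // order; the conversion to a List happens once, at the end.
    static List<String> distinctWithSet(List<String> values) {
        Set<String> seen = new LinkedHashSet<>(values);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("1", "1", "2", "1", "3");
        System.out.println(distinctWithList(values)); // prints [1, 2, 3]
        System.out.println(distinctWithSet(values));  // prints [1, 2, 3]
    }
}
```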

So the first part could be solved like this:

// This will create a Map with Student names (distinct) and the test
// numbers (distinct List of tests numbers) associated with them.
Map<String, Collection<String>> map = new HashMap<>();
for (Student student : list)
{
    String value = getNum(student.getAddr());
    map.computeIfAbsent(student.getName(), s -> new LinkedHashSet<String>())
        .add(value);
}

It will create a Map<String, Collection<String>>. This can then be converted into a Map<String, List<String>> :

// Convert the 'Collection' values of the map into 'List' values
// (Entry is java.util.Map.Entry and needs to be imported)
Map<String, List<String>> result = 
    map.entrySet().stream().collect(Collectors.toMap(
        Entry::getKey, e -> new ArrayList<String>(e.getValue())));

Or, more generically, using a utility method for this:

private static <K, V> Map<K, List<V>> convertValuesToLists(
    Map<K, ? extends Collection<? extends V>> map)
{
    return map.entrySet().stream().collect(Collectors.toMap(
        Entry::getKey, e -> new ArrayList<V>(e.getValue())));
}

I do not recommend this, but you also could convert the for loop into a stream operation:

Map<String, Set<String>> map = 
    list.stream().collect(Collectors.groupingBy(
        Student::getName, LinkedHashMap::new,
        Collectors.mapping(
            s -> getNum(s.getAddr()), Collectors.toSet())));

Alternatively, you could do the "grouping by" and the conversion from Set to List in one step:

Map<String, List<String>> result = 
    list.stream().collect(Collectors.groupingBy(
        Student::getName, LinkedHashMap::new,
        Collectors.mapping(
            s -> getNum(s.getAddr()), 
            Collectors.collectingAndThen(
                Collectors.toSet(), ArrayList<String>::new))));

Or you could introduce a custom collector that does the List#contains call, but all this tends to be far less readable than the other solutions...

3 Comments

Upvoted! This is a very complete answer. I think you could have used replaceAll on each map's values instead of streaming on the entry sets and then collecting to new maps.
@FedericoPeraltaSchaffner Yes, replaceAll (which I admittedly did not have on the screen until now) may be an alternative, depending on which type information you want to have for the map values (Collection vs ? extends Collection - and maybe it must be a List...?). But this is only one of many degrees of freedom in the answer to this question. I think that when these degrees of freedom show up as different options of deeply nested Collector implementations, one should consider breaking it down into a for loop with simple, named operations, for the sake of readability...
Agreed, downstream collectors tend to become unreadable after the 2nd level, that's why I'd stay with the Map.computeIfAbsent solution. And using Collectors.collectingAndThen has always seemed too verbose to add a simple finishing transformation...
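For reference, the replaceAll idea mentioned in these comments could be sketched like this. It converts the values in place, but the declared value type stays Collection<String>, which is the type-information trade-off raised above. The names ReplaceAllSketch and valuesToLists are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;

class ReplaceAllSketch {

    // Replaces every value with an ArrayList copy, in place.
    // The static type of the values remains Collection<String>.
    static void valuesToLists(Map<String, Collection<String>> map) {
        map.replaceAll((name, values) -> new ArrayList<>(values));
    }

    public static void main(String[] args) {
        Map<String, Collection<String>> map = new HashMap<>();
        map.put("a", new LinkedHashSet<>(Arrays.asList("1", "2", "1")));
        valuesToLists(map);
        System.out.println(map); // prints {a=[1, 2]}
    }
}
```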
3

I think you are looking for something like below

   Map<String,Set<String>> map =  list.stream().
           collect(Collectors.groupingBy(
                    Student::getName,
                    Collectors.mapping(e->getNum(e.getAddr()), Collectors.toSet())
                ));

   System.out.println("Map : "+map);


2

Here is a version that collects everything in sets, and converts the final result to array lists:

/*
import java.util.*;
import java.util.stream.*;
import static java.util.stream.Collectors.*;
import java.util.function.*;
*/

Map<String, List<String>> map2 = list.stream().collect(groupingBy(
  Student::getName, // we will group the students by name
  Collector.of(
    HashSet::new, // for each student name, we will collect result in a hash set
    (arr, student) -> arr.add(getNum(student.getAddr())), // which we fill with processed addresses
    (left, right) -> { left.addAll(right); return left; }, // we merge subresults like this
    (Function<HashSet<String>, List<String>>) ArrayList::new // finish by converting to List
  )
));
System.out.println(map2);

// Output:
// {a=[1, 2, 3], b=[1, 2], c=[1, 3]}

EDIT: made the finisher shorter using Marco13's hint.

5 Comments

@Marco13 there's always something I can learn from Marco13 :] Thank you for the hint! I updated the answer. A separate variable declaration for the collector does not seem to be necessary. A (Function<HashSet<String>, List<String>>) cast seems to be enough to select the right constructor of ArrayList. So, it's not a cast, it's rather some kind of type ascription, to be more precise.
OK, then I'll delete the distracting comment. But I also was a bit irritated why pulling it out into a variable (or "casting" as in your case) seemed to be necessary. I wonder whether this is expected, why the type inference seemed to hit a limit there, or whether there is a "nicer" solution, and consider asking this as a separate question (if I can't figure it out on my own, but I'll first have to take a closer look at that)
I had another look at this. The reason why it cannot infer the type of the finisher is likely related to the fact that the return type cannot be derived from the Collector.of call alone: none of its arguments defines R, the return type, and it seems that it is not able to hand the target type information from the map2 declaration through the collect and groupingBy calls down to the Collector.of call. Related: stackoverflow.com/a/24798163/3182664 (BTW: The current code issues an "unchecked" warning, which can be avoided with ArrayList<String>::new)
@Marco13 openjdk version "1.8.0_144" issues nothing, compiles just fine even with -Werror ?..
You're right. I tried it in Eclipse Neon (pretty old...) and it issued a warning, but with the latest (Eclipse Oxygen) everything is fine. Seems to be an ECJ bug that has been fixed in the meantime.
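The "pull it out into a variable" option discussed in these comments can be sketched as a small factory method whose declared return type supplies T, A and R, so that no cast on the finisher is needed. To stay self-contained, this sketch collects plain strings rather than Student objects; TypedCollectorSketch and distinctToList are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Stream;

class TypedCollectorSketch {

    // The explicit return type gives the compiler all three type arguments,
    // so ArrayList::new resolves without a cast.
    static Collector<String, Set<String>, List<String>> distinctToList() {
        return Collector.of(
            HashSet::new,                                          // container per group
            (set, value) -> set.add(value),                        // accumulate, dropping duplicates
            (left, right) -> { left.addAll(right); return left; }, // merge partial results
            ArrayList::new);                                       // finish: Set -> List
    }

    public static void main(String[] args) {
        List<String> result = Stream.of("1", "2", "1").collect(distinctToList());
        // The element order comes from HashSet and is not guaranteed.
        System.out.println(result);
    }
}
```

A collector built this way can be passed as the downstream argument of Collectors.mapping inside groupingBy, in the same position as the inline Collector.of call in the answer.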
