Java 8 stream refactoring

Question

Is it possible to express the following logic more succinctly using Java 8 stream constructs:

public static Set<Pair> findSummingPairsLookAhead(int[] data, int sum){
    Set<Pair> collected = new HashSet<>();
    Set<Integer> lookaheads = new HashSet<>();

    for(int i = 0; i < data.length; i++) {
        int elem = data[i];
        if(lookaheads.contains(elem)) {
            collected.add(new Pair(elem, sum - elem));
        }
        lookaheads.add(sum - elem);
    }

    return collected;
}

Something to the effect of Arrays.stream(data).forEach(...).

Thanks in advance.

Misha · Accepted Answer · 2017-09-29 01:31:24Z

An algorithm that involves mutating a state during iteration is not well-suited for streams. However, it is often possible to rethink an algorithm in terms of bulk operations that do not explicitly mutate any intermediate state.

In your case, the task is to collect a set of Pair(x, sum - x) where sum - x appears before x in the list. So, we can first build a map of numbers to the index of their first occurrence in the list and then use that map to filter the list and build the set of pairs:

Map<Integer, Integer> firstIdx = IntStream.range(0, data.length)
                         .boxed()
                         .collect(toMap(i -> data[i], i -> i, (a, b) -> a));

Set<Pair> result = IntStream.range(0, data.length)
                         .filter(i -> firstIdx.contains(sum - data[i]))
                         .filter(i -> firstIdx.get(sum - data[i]) < i)
                         .mapToObj(i -> new Pair(data[i], sum - data[i]))
                         .collect(toSet());

You can shorten the two filters by either using && or getOrDefault if you find that clearer.

Szymon Stepniak · Accepted Answer · 2017-09-28 21:22:42Z

It's worth mentioning that your imperative style implementation is probably the most effective way to express your expectations. But if you really want to implement same logic using Java 8 Stream API, you can consider utilizing .reduce() method, e.g.

import org.apache.commons.lang3.tuple.Pair;

import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;  

final class SummingPairsLookAheadExample {

    public static void main(String[] args) {
        final int[] data = new int[]{1,2,3,4,5,6};
        final int sum = 8;

        final Set<Pair> pairs = Arrays.stream(data)
                .boxed()
                .parallel()
                .reduce(
                        Pair.of(Collections.synchronizedSet(new HashSet<Pair>()), Collections.synchronizedSet(new HashSet<Integer>())),
                        (pair,el) -> doSumming(pair, el, sum),
                        (a,b) -> a
                ).getLeft();

        System.out.println(pairs);
    }

    synchronized private static Pair<Set<Pair>, Set<Integer>> doSumming(Pair<Set<Pair>, Set<Integer>> pair, int el, int sum) {
        if (pair.getRight().contains(el)) {
            pair.getLeft().add(Pair.of(el, sum - el));
        }
        pair.getRight().add(sum - el);
        return pair;
    }
}

Output

[(5,3), (6,2)]

The first parameter in .reduce() method is accumulator's initial value. This object will be passed to each iteration step. In our case we use a pair of Set<Pair> (expected result) and Set<Integer> (same as variable lookaheads in your example). Second parameter is a lambda (BiFunction) that does the logic (extracted to a separate private method to make code more compact). And the last one is binary operator. It's pretty verbose, but it does not rely on any side effects. @Eugene pointed out that my previous example had issues with parallel execution, so I've updated this example to be safe in parallel execution as well. If you don't run it in parallel you can simply remove synchronized keyword from helper method and use regular sets instead of synchronized one as initial values for accumulator.

nice try, but there are race conditions here. try to add parallel and see how it breaks
@Eugene Good catch, thanks Eugene! I've fixed the code to run correctly in parallel execution as well.

123-xyz · Accepted Answer · 2017-09-29 05:34:38Z

0

Are you trying to get the unique Pairs which's sum equals to specified sum?

Arrays.stream(data).boxed()
        .collect(Collectors.groupingBy(i -> i <= sum / 2 ? i : sum - i, toList())).values().stream()
        .filter(e -> e.size() > 1 && (e.get(0) * 2 == sum || e.stream().anyMatch(i -> i == sum - e.get(0))))
        .map(e -> Pair.of(sum - e.get(0), e.get(0)))
        .collect(toList());

A list with unique pairs is returned. you can change it to set by toSet() if you want.

edited Sep 29, 2017 at 5:34

answered Sep 29, 2017 at 4:45

123-xyz

6374 silver badges6 bronze badges

Comments

Eugene · Accepted Answer · 2017-09-28 21:03:24Z

What you have in place is fine (and the java-8 gods are happy). The main problem is that you are relying on side-effects and streams are not very happy about it - they even mention it explicitly in the documentation.

Well I can think of this (I've replaced Pair with SimpleEntry so that I could compile)

public static Set<AbstractMap.SimpleEntry<Integer, Integer>> findSummingPairsLookAhead2(int[] data, int sum) {
    Set<Integer> lookaheads = Collections.synchronizedSet(new HashSet<>());

    return Arrays.stream(data)
            .boxed()
            .map(x -> {
                lookaheads.add(sum - x);
                return x;
            })
            .filter(lookaheads::contains)
            .collect(Collectors.mapping(
                    x -> new AbstractMap.SimpleEntry<Integer, Integer>(x, sum - x),
                    Collectors.toSet()));

}

But we are still breaking the side-effects property of map - in a safe way, but still bad. Think about people that will come after you and look at this code; they might find it at least weird.

If you don't ever plan to run this in parallel, you could drop the Collections.synchronizedSet - but do that at your own risk.

Collectives™ on Stack Overflow

Java 8 stream refactoring

4 Answers 4

Comments

Output

2 Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Output

2 Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related