8

Folks,

Consider the following example, given a list of Trade objects my code needs to return an array containing trade volume for 24 hours, 7 days, 30 days and all times.

Using plain old iterator this requires only a single iteration over the collection.

I'm trying to do the same using a Java 8 streams and Lambda expressions. I came up with this code, which looks elegant, works fine, but requires 4 iterations over the list:

public static final int DAY = 24 * 60 * 60;

public double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
    double volume = trades.stream().mapToDouble(Trade::getVolume).sum();
    double volume30d = trades.stream().filter(trade -> trade.getTimestamp() + 30 * DAY > timeStamp).mapToDouble(Trade::getVolume).sum();
    double volume7d = trades.stream().filter(trade -> trade.getTimestamp() + 7 * DAY > timeStamp).mapToDouble(Trade::getVolume).sum();
    double volume24h = trades.stream().filter(trade -> trade.getTimestamp() + DAY > timeStamp).mapToDouble(Trade::getVolume).sum();
    return new double[]{volume24h, volume7d, volume30d, volume};
}

How can I achieve the same using only a single iteration over the list ?

10
  • 1
    The overall question is: Why would you want to do it in only one iteration? Commented Jun 25, 2014 at 11:55
  • Since it's more efficient. Commented Jun 25, 2014 at 11:56
  • 2
    Nope! It's not more efficient! I asked the question as I expected that answer. And your assumption is not correct: Who says that two iterations each doing one operation must be less efficient than only one iteration doing two operations? Commented Jun 25, 2014 at 11:59
  • Consider the case where I have 1,000,000 trades and 100 different volumes to calculate and the method trade.getTimestamp() is expensive. Using an iterator I only need to call it 1,000,000 times, while using lambda I need to call it 100 million times. Commented Jun 25, 2014 at 12:04
  • I suggest you compare/measure the performance to see what difference it makes. It should be easy to put a test together. It might make a big difference in which case, use a loop, or a little difference in which case do what you feel is clearer. Commented Jun 25, 2014 at 12:26

3 Answers 3

9

This problem is similar to the "summary statistics" collector. Take a look at the IntSummaryStatistics class:

public class IntSummaryStatistics implements IntConsumer {
    private long count;
    private long sum;
    ...

    public void accept(int value) {
        ++count;
        sum += value;
        min = Math.min(min, value);
        max = Math.max(max, value);
   }

   ...

}

It is designed to work with collect(); here's the implementation of IntStream.summaryStatistics()

public final IntSummaryStatistics summaryStatistics() {
    return collect(IntSummaryStatistics::new, IntSummaryStatistics::accept,
                   IntSummaryStatistics::combine);
}

The benefit of writing a Collector like this is then your custom aggregation can run in parallel.

Sign up to request clarification or add additional context in comments.

2 Comments

That's was my initial approach but note the different filter applied before the mapping. How would you define the collector and still apply the filter ?
The accept method would conditionally increment the various sums. Basically, you're shoveling all the data through the collector, which can pick off the bits it wants. For sequential processing, this is basically equivalent to the for-loop, but it parallelizes cleanly whereas the for-loop does not.
1

Thanks Brian, I ended up implementing the code below, it's not as simple as I hoped but at least it iterates only once, its parallel ready and it passes my unit tests. Any improvements ideas are welcomed.

public double[] getTradeVolumes(List<Trade> trades, int timeStamp) {
    TradeVolume tradeVolume = trades.stream().collect(
            () -> new TradeVolume(timeStamp),
            TradeVolume::accept,
            TradeVolume::combine);
    return tradeVolume.getVolume();
}

public static final int DAY = 24 * 60 * 60;

static class TradeVolume {

    private int timeStamp;
    private double[] volume = new double[4];

    TradeVolume(int timeStamp) {
        this.timeStamp = timeStamp;
    }

    public void accept(Trade trade) {
        long tradeTime = trade.getTimestamp();
        double tradeVolume = trade.getVolume();
        volume[3] += tradeVolume;
        if (!(tradeTime + 30 * DAY > timeStamp)) {
            return;
        }
        volume[2] += tradeVolume;
        if (!(tradeTime + 7 * DAY > timeStamp)) {
            return;
        }
        volume[1] += tradeVolume;
        if (!(tradeTime + DAY > timeStamp)) {
            return;
        }
        volume[0] += tradeVolume;
    }

    public void combine(TradeVolume tradeVolume) {
        volume[0] += tradeVolume.volume[0];
        volume[1] += tradeVolume.volume[1];
        volume[2] += tradeVolume.volume[2];
        volume[3] += tradeVolume.volume[3];
    }

    public double[] getVolume() {
        return volume;
    }
}

3 Comments

You posited that getTimestamp was the expensive part, so you could factor that out within accept` rather than executing it three times.
Now after your edit, you are comparing timestamp with timestamp which makes no sense. If you name a local variable the same as the instance field you have to qualify the access to the latter with this.. And note that your conditions are related: for a positive i you can say that if x+i>y is true you know that x+7*i>y and x+30*i>y are true as well (as long as you can preclude overflow). Or if x+30*i>y is false, x+7*i>y and x+i>y must be false as well. So there’s room to improve your conditional code…
Fixed, I'm trying to simplify complex code into a simplified example thus causing these problems.
0

It might be possible to use a Collectors.groupingBy method to partition the data however the equation would be complicated and not intent revealing.

Since getTimestamp() is an expensive operation, it is probably best to keep it as a pre-Java 8 iteration so you only have to calculate the value once per Trade.

Just because Java 8 adds shiny new tools, don't try to turn it into a hammer to hammer in all nails.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.