9

I have multiple one-dimensional input arrays of float, all of equal size. I want to create a single output array that contains the element-wise average of all the input arrays.

e.g.

//input arrays
float[] array1 = new float[] { 1, 1, 1, 1 };
float[] array2 = new float[] { 2, 2, 2, 2 };
float[] array3 = new float[] { 3, 3, 3, 3 };
float[] array4 = new float[] { 4, 4, 4, 4 };

//the output should be
float[] output = new float[] { 2.5f, 2.5f, 2.5f, 2.5f };

I would also like to calculate the standard deviation of the input arrays too.

What is the fastest approach to do this task?

Thanks in advance. Pete

4 Answers

10

LINQ to the rescue

This answer details how to use LINQ to achieve the goal, with maximum reusability and versatility as the major objective.

Take 1 (package LINQ into a method for convenience)

Take this method:

float[] ArrayAverage(params float[][] arrays)
{
    // If you want to check that all arrays are the same size, something
    // like this is convenient:
    // var arrayLength = arrays.Select(a => a.Length).Distinct().Single();

    return Enumerable.Range(0, arrays[0].Length)
               .Select(i => arrays.Select(a => a[i]).Average())
               .ToArray();
}

It works by taking the range [0..arrays[0].Length-1] and, for each number i in the range, calculating the average of the ith element of each array. It can be used very conveniently:

float[] array1 = new float[] { 1, 1, 1, 1 };
float[] array2 = new float[] { 2, 2, 2, 2 };
float[] array3 = new float[] { 3, 3, 3, 3 };
float[] array4 = new float[] { 4, 4, 4, 4 };

var averages = ArrayAverage(array1, array2, array3, array4);

This can already be used on any number of arrays without modification. But you can go one more step and do something more general.

Take 2 (generalizing for any aggregate function)

float[] ArrayAggregate(Func<IEnumerable<float>, float> aggregate, params float[][] arrays)
{
    return Enumerable.Range(0, arrays[0].Length)
               .Select(i => aggregate(arrays.Select(a => a[i])))
               .ToArray();
}

This can be used to calculate any aggregate function:

var output = ArrayAggregate(Enumerable.Average, array1, array2, array3, array4);

Instead of Enumerable.Average you can substitute any method, extension method, or anonymous function. This is useful because there's no built-in standard deviation aggregate function, and it makes the ArrayAggregate function very versatile. But we can still do better.
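As a rough sketch of that idea, here is one way a standard deviation function could be plugged in. The names ArrayStats and StdDev are invented for this example, and the choice of the population standard deviation (dividing by N rather than N - 1) is an assumption about what is wanted. ArrayAggregate is restated so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ArrayStats
{
    // Restates the Take 2 method so this snippet is self-contained.
    public static float[] ArrayAggregate(Func<IEnumerable<float>, float> aggregate, params float[][] arrays)
    {
        return Enumerable.Range(0, arrays[0].Length)
                   .Select(i => aggregate(arrays.Select(a => a[i])))
                   .ToArray();
    }

    // Hypothetical helper: population standard deviation (divides by N).
    // Throws InvalidOperationException on an empty sequence, because Average() does.
    public static float StdDev(IEnumerable<float> values)
    {
        float[] v = values.ToArray();
        float mean = v.Average();

        float sumOfSquares = 0f;
        foreach (float x in v)
        {
            float d = x - mean;
            sumOfSquares += d * d;
        }

        return (float)Math.Sqrt(sumOfSquares / v.Length);
    }
}
```

It would be called the same way as the average: ArrayStats.ArrayAggregate(ArrayStats.StdDev, array1, array2, array3, array4).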

Take 3 (generalizing for any aggregate function and any type of array)

We can also make a generic version that works with any element type, as long as the aggregate function takes and returns that type:

T[] ArrayAggregate<T>(Func<IEnumerable<T>, T> aggregate, params T[][] arrays)
{
    return Enumerable.Range(0, arrays[0].Length)
               .Select(i => aggregate(arrays.Select(a => a[i])))
               .ToArray();
}

As you can probably tell, this is not the fastest code to do the job. If your program spends all day calculating averages, use something closer to the metal. However, if you want reusability and versatility, I don't think you can do much better than the above.

4

The fastest way in terms of performance, unless you'd like to unroll the for loop, is:

float[] sums = new float[4];

for(int i = 0; i < 4; i++)
{
    sums[i] = (array1[i]+ array2[i] + array3[i] + array4[i])/4;
}

5 Comments

Thanks Yuriy. I'm not that good at performance stuff - are For-Loops really the fastest way to do this? Thanks for the answer, I will implement it now.
Maybe bitshifting twice will be faster than dividing by 4? ;) Or maybe the compiler takes care of that.
@k_b doubtful, but possible to check. Unfortunately I'm too lazy.
Why did you do the division in a separate loop?
@Xenophile no idea, will fix. I like Jon's answer better anyway.
0

  static void Main()
  {
     float[] array1 = new float[] { 1, 1, 1, 1 };
     float[] array2 = new float[] { 2, 2, 2, 2 };
     float[] array3 = new float[] { 3, 3, 3, 3 };
     float[] array4 = new float[] { 4, 4, 4, 4 };  
     float[] avg = CrossAverage (array1, array2, array3, array4);
     Console.WriteLine (string.Join ("|", avg.Select(f => f.ToString ()).ToArray()));
  }

  private static float[] CrossAverage (params float [][] arrays)
  {
     int [] count = new int [arrays[0].Length];
     float [] sum = new float [arrays[0].Length];
     for (int j = 0; j < arrays.Length; j++)
     {
        for (int i = 0; i < count.Length; i++)
        {
           count[i] ++;
           sum[i] += arrays[j][i];
        }
     }
     float [] avg = new float [arrays[0].Length];
     for (int i = 0; i < count.Length; i++)
     {
        avg[i] = sum[i] / count[i];
     }
     return avg;
  }

Don't forget bounds checking and divide by 0 checking.
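
A minimal sketch of what those checks might look like. CrossAverageChecks and Validate are hypothetical names, and throwing ArgumentException is only one reasonable choice. Once at least one array is required, count[i] in CrossAverage is always at least 1, so the division is safe.

```csharp
using System;

public static class CrossAverageChecks
{
    // Hypothetical helper: throws if the arrays cannot be averaged element-wise.
    public static void Validate(float[][] arrays)
    {
        // No arrays at all would make arrays[0] fail and leave nothing to divide by.
        if (arrays == null || arrays.Length == 0)
            throw new ArgumentException("At least one array is required.", nameof(arrays));

        for (int j = 0; j < arrays.Length; j++)
        {
            if (arrays[j] == null)
                throw new ArgumentException("Array " + j + " is null.", nameof(arrays));

            // Every array must match the first one, otherwise arrays[j][i] can go out of bounds.
            if (arrays[j].Length != arrays[0].Length)
                throw new ArgumentException("Array " + j + " has a different length.", nameof(arrays));
        }
    }
}
```

Calling CrossAverageChecks.Validate(arrays) as the first line of CrossAverage would be enough to guard the loops.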

0

And for the standard deviation, after calculating the averages (stored in the sums array):

// std dev
float[] stddevs = new float[4];

for (int i = 0; i < 4; i++)
{
    stddevs[i] += (array1[i] - sums[i]) * (array1[i] - sums[i]);
    stddevs[i] += (array2[i] - sums[i]) * (array2[i] - sums[i]);
    stddevs[i] += (array3[i] - sums[i]) * (array3[i] - sums[i]);
    stddevs[i] += (array4[i] - sums[i]) * (array4[i] - sums[i]);
}

for (int i = 0; i < 4; i++)
    stddevs[i] = (float)Math.Sqrt(stddevs[i]/4);

In general, accessing the array directly rather than using LINQ will be a performance win due to allowing the compiler/JIT to optimize. At the very least, array bounds checks can be eliminated and the overhead of using an enumerator will be avoided.
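
The same direct-indexing idea extends to any number of arrays. The following is a sketch rather than a benchmarked implementation. LoopStats and MeanAndStdDev are names made up here, it computes the population standard deviation like the loop above, and it assumes at least one array and equal lengths.

```csharp
using System;

public static class LoopStats
{
    // Hypothetical helper: element-wise mean and population standard deviation.
    // Assumes arrays is non-empty and all arrays have the same length.
    public static void MeanAndStdDev(float[][] arrays, out float[] means, out float[] stddevs)
    {
        int n = arrays.Length;
        int length = arrays[0].Length;
        means = new float[length];
        stddevs = new float[length];

        // First pass: accumulate sums per position, then divide to get the means.
        for (int j = 0; j < n; j++)
            for (int i = 0; i < length; i++)
                means[i] += arrays[j][i];
        for (int i = 0; i < length; i++)
            means[i] /= n;

        // Second pass: accumulate squared deviations from the mean.
        for (int j = 0; j < n; j++)
            for (int i = 0; i < length; i++)
            {
                float d = arrays[j][i] - means[i];
                stddevs[i] += d * d;
            }
        for (int i = 0; i < length; i++)
            stddevs[i] = (float)Math.Sqrt(stddevs[i] / n);
    }
}
```

For the four arrays in the question, this would be called as LoopStats.MeanAndStdDev(new[] { array1, array2, array3, array4 }, out var means, out var stddevs).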

1 Comment

Thanks all!! I like the for-loop options as they apparently offer a performance advantage over the LINQ solutions. One last caveat to this problem - would the array manipulation be faster in C++?
