I have an array called "all_data" that looks like this:

array([[[102, 107, 111],
        [101, 106, 110],
        [100, 105, 109],
        ...,
        [221, 166,  99],
        [221, 166,  99],
        [221, 166,  99]],

       [[ 95, 100, 104],
        [ 98, 103, 107],
        [102, 107, 111],
        ...,
        [219, 165,  95],
        [218, 164,  94],
        [218, 164,  94]]])

My goal is to average the values that sit at the same position across the 2D arrays. For example, this array contains two 2D arrays (my real data can have up to 200 that need to be averaged), so the result would be a single 2D array whose first row is [98.5, 103.5, 107.5].

When I call all_data.mean(axis=2) with NumPy, I get this unexpected result instead:

array([[106.66666667, 105.66666667, 104.66666667, ..., 162.        ,
        162.        , 162.        ],
       [ 99.66666667, 102.66666667, 106.66666667, ..., 159.66666667,
        158.66666667, 158.66666667]])

I'm not sure what the problem is. I expected this to average the column values across the sublists, but something different is happening.

Any help would be appreciated.

  • Your data is 3d, and you want to average along 2nd dimension, use axis=1 instead. Commented Aug 3, 2021 at 18:53
  • So the outer level is a list of 2D arrays, the middle level is a 2D array, and the inner level is a normal list? And you want the average of each column of each 2D array? Organized how? Commented Aug 3, 2021 at 18:53

1 Answer

The solution is to set axis=0, because axis 0 is the one that runs across the stacked 2D arrays.

all_data.mean(axis=0)
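
As a quick check, here is a minimal sketch using only the first three rows of each 2D array shown in the question. The truncated all_data below is a stand-in for illustration; the real array has more rows.

```python
import numpy as np

# Stand-in for all_data: only the first three visible rows of each
# 2D array from the question (the real data has more rows).
all_data = np.array([
    [[102, 107, 111], [101, 106, 110], [100, 105, 109]],
    [[ 95, 100, 104], [ 98, 103, 107], [102, 107, 111]],
])

# Average across the stacked 2D arrays (axis 0).
averaged = all_data.mean(axis=0)

print(averaged.shape)  # (3, 3)
print(averaged[0])     # [ 98.5 103.5 107.5]
```

The first row matches the [98.5, 103.5, 107.5] expected in the question.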

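NumPy reduces along an axis by collapsing it: the values along that axis are combined and the axis disappears from the result. Axes are numbered from the outermost level of nesting inward, so axis 0 is the outermost. For 2D arrays, we have something like this: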

arr = [[x,x,x], [x,x,x]] = [[x,x,x],
                            [x,x,x]]
  • When axis 0 is collapsed, we get this shape: [x, x, x]
  • When axis 1 is collapsed, we get this shape: [x, x]
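
The two bullets above can be checked with a small 2D array. The values here are made up purely for illustration.

```python
import numpy as np

# Illustrative 2x3 array (values are arbitrary).
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

col_means = arr.mean(axis=0)  # collapses the 2 rows    -> shape (3,)
row_means = arr.mean(axis=1)  # collapses the 3 columns -> shape (2,)

print(col_means)  # [2.5 3.5 4.5]
print(row_means)  # [2. 5.]
```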

For larger multidimensional arrays the same thing applies; it's just harder to visualize. Another way to think about collapsing is that the length of that dimension becomes 1, and the dimension is then dropped. For more clarity, here's the 3D example:

arr = [[[x, x, x],
        [x, x, x],
        [x, x, x],
        [x, x, x]],

       [[x, x, x],
        [x, x, x],
        [x, x, x],
        [x, x, x]]]

arr, as a 3D array, has the shape 2x4x3.

Collapsing along axis 0, we get a 1x4x3 --> 4x3 shape:

[[x, x, x],
 [x, x, x],
 [x, x, x],
 [x, x, x]]

Collapsing along axis 1, we get a 2x1x3 --> 2x3 shape:

[[x, x, x],
 [x, x, x]]

Collapsing along axis 2, we get a 2x4x1 --> 2x4 shape:

[[x, x, x, x],
 [x, x, x, x]]
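
The three shapes can be confirmed with a short sketch on a placeholder array of zeros with shape 2x4x3. Passing keepdims=True shows the intermediate step where the collapsed axis still has length 1.

```python
import numpy as np

# Placeholder array; only its shape (2, 4, 3) matters here.
arr = np.zeros((2, 4, 3))

shapes = {}
for axis in range(arr.ndim):
    kept = arr.mean(axis=axis, keepdims=True)  # collapsed axis kept with length 1
    reduced = arr.mean(axis=axis)              # collapsed axis dropped
    shapes[axis] = (kept.shape, reduced.shape)
    print(axis, kept.shape, "->", reduced.shape)

# 0 (1, 4, 3) -> (4, 3)
# 1 (2, 1, 3) -> (2, 3)
# 2 (2, 4, 1) -> (2, 4)
```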