Calculating the averages of elements in one array based on data in another array

Question

I need to average the Y values corresponding to the values in the X array...

X=np.array([  1,  1,  2,  2,  2,  2,  3,  3 ... ])

Y=np.array([ 10, 30, 15, 10, 16, 10, 15, 20 ... ])

In other words, the X value 1 corresponds to the Y values 10 and 30, whose average is 20; the X value 2 corresponds to 15, 10, 16, and 10, whose average is 12.75; and so on.

How can I calculate these average values?

np.bincount(X-1, Y) / np.bincount(X-1), if the groups are ascending starting from 1
– Michael Szczesny, Commented Jun 14, 2022 at 19:33
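
The one-liner in the comment relies on the labels in X being the integers 1, 2, 3, and so on. Here is a minimal sketch of the same idea for arbitrary labels, assuming np.unique with return_inverse=True is used to remap the labels to 0..k-1 first. The helper name group_means is hypothetical, not part of NumPy.

```python
import numpy as np

def group_means(labels, values):
    """Return (sorted unique labels, mean of values for each label)."""
    labels = np.asarray(labels)
    values = np.asarray(values, dtype=float)
    # inverse[i] is the position of labels[i] among the sorted unique labels
    uniq, inverse = np.unique(labels, return_inverse=True)
    inverse = inverse.ravel()
    sums = np.bincount(inverse, weights=values)  # per-group sum of values
    counts = np.bincount(inverse)                # per-group size
    return uniq, sums / counts

X = np.array([1, 1, 2, 2, 2, 2, 3, 3])
Y = np.array([10, 30, 15, 10, 16, 10, 15, 20])
uniq, means = group_means(X, Y)
print(uniq)   # [1 2 3]
print(means)  # [20.   12.75 17.5 ]
```

Because the labels are remapped first, this should also cope with labels that are unsorted, non-consecutive, or not integers.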

j1-lee · Accepted Answer · 2022-06-14 19:33:43Z

5

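One option is to use a property of linear regression with categorical variables: regressing y on the one-hot (dummy) indicators of x, with no intercept, gives the group means as the coefficients: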

import numpy as np

x = np.array([  1,  1,  2,  2,  2,  2,  3,  3 ])
y = np.array([ 10, 30, 15, 10, 16, 10, 15, 20 ])

x_dummies = x[:, None] == np.unique(x)
means = np.linalg.lstsq(x_dummies, y, rcond=None)[0]
print(means) # [20.   12.75 17.5 ]
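
To see where those numbers come from, here is a sketch that solves the same least-squares problem by hand through the normal equations. It assumes the same x and y as above; the variable names dummies, counts, sums and lstsq_means are only illustrative.

```python
import numpy as np

x = np.array([1, 1, 2, 2, 2, 2, 3, 3])
y = np.array([10, 30, 15, 10, 16, 10, 15, 20])

# One-hot (dummy) matrix: row i has a single 1 in the column of x[i]'s group
dummies = (x[:, None] == np.unique(x)).astype(float)

# dummies.T @ dummies is diagonal and holds the group sizes, and
# dummies.T @ y holds the group sums, so the solution is sums / counts
counts = dummies.sum(axis=0)
sums = dummies.T @ y
means = sums / counts
print(means)  # [20.   12.75 17.5 ]

# The least-squares fit gives the same coefficients
lstsq_means = np.linalg.lstsq(dummies, y, rcond=None)[0]
```

Note that the dummy matrix has one column per group and one row per sample, so for many groups a bincount or sort-based approach is likely to use far less memory.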


It_is_Chris · Accepted Answer · 2022-06-14 19:30:42Z

4

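You can try using pandas:

import pandas as pd
import numpy as np

X = np.array([1, 1, 2, 2, 2, 2, 3, 3])
Y = np.array([10, 30, 15, 10, 16, 10, 15, 20])

# Build a two-column frame, group by X, and average Y within each group
N = pd.DataFrame(np.transpose([X, Y]),
                 columns=['X', 'Y']).groupby('X')['Y'].mean().to_numpy()
# array([20.  , 12.75, 17.5 ])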


2 Comments

mozway Over a year ago

why so complicated? pd.Series(Y).groupby(X).mean().to_numpy() ;)

It_is_Chris Over a year ago

That makes a lot more sense. I always forget that you can group by an array.
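
The shorter form from the comment returns a Series indexed by the group labels before .to_numpy() discards them. Here is a small sketch, assuming you also want to know which label each mean belongs to; the names labels and values are only illustrative.

```python
import numpy as np
import pandas as pd

X = np.array([1, 1, 2, 2, 2, 2, 3, 3])
Y = np.array([10, 30, 15, 10, 16, 10, 15, 20])

# Group the Y values by the labels in X; the result is indexed by label
means = pd.Series(Y).groupby(X).mean()

labels = means.index.to_numpy()  # group labels, here 1, 2, 3
values = means.to_numpy()        # matching means, here 20.0, 12.75, 17.5
```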

TreshUp · Accepted Answer · 2022-06-14 19:32:01Z

2

import numpy as np

X = np.array([  1,  1,  2,  2,  2,  2,  3,  3])

Y = np.array([ 10, 30, 15, 10, 16, 10, 15, 20])

# Unique group labels in X
unique_vals = np.unique(X)

# Loop over each label
for val in unique_vals:
    # Indices where X equals this label
    idx = np.where(X == val)
    # Mean of the Y values at those indices
    aver = np.mean(Y[idx])
    print(f"Average for {val}: {aver}")

Result:

Average for 1: 20.0

Average for 2: 12.75

Average for 3: 17.5


Hossein Biniazian · Accepted Answer · 2022-06-14 19:28:19Z

You can use something like the code below:

import numpy as np

X=np.array([  1,  1,  2,  2,  2,  2,  3,  3])

Y=np.array([ 10, 30, 15, 10, 16, 10, 15, 20])


def groupby(a, b):
    # Get argsort indices, to be used to sort a and b in the next steps
    sidx = b.argsort(kind='mergesort')
    a_sorted = a[sidx]
    b_sorted = b[sidx]

    # Get the group limit indices (start, stop of groups)
    cut_idx = np.flatnonzero(np.r_[True,b_sorted[1:] != b_sorted[:-1],True])

    # Split input array with those start, stop ones
    out = [a_sorted[i:j] for i,j in zip(cut_idx[:-1],cut_idx[1:])]
    return out

group_by_array=groupby(Y,X)
for item in group_by_array:
    print(np.average(item))

I based this answer on the linked question: Group numpy into multiple sub-arrays using an array of values
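
If only the means are needed, the same sort-and-cut idea can skip building the list of sub-arrays. This is a sketch under that assumption, using np.add.reduceat to sum each contiguous group; it is a variation on the answer's approach, not the original code.

```python
import numpy as np

X = np.array([1, 1, 2, 2, 2, 2, 3, 3])
Y = np.array([10, 30, 15, 10, 16, 10, 15, 20])

# Stable sort by label so that equal labels become contiguous
order = np.argsort(X, kind="stable")
x_sorted = X[order]
y_sorted = Y[order]

# Index where each group starts in the sorted arrays
starts = np.flatnonzero(np.r_[True, x_sorted[1:] != x_sorted[:-1]])

# Sum each contiguous group, then divide by the group sizes
sums = np.add.reduceat(y_sorted, starts)
counts = np.diff(np.r_[starts, len(x_sorted)])
means = sums / counts
print(x_sorted[starts])  # [1 2 3]
print(means)             # [20.   12.75 17.5 ]
```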

rsenne · Accepted Answer · 2022-06-14 19:35:28Z

1

I think this solution should work:

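import numpy as np

x = np.array([1, 1, 2, 2, 2, 2, 3, 3])
y = np.array([10, 30, 15, 10, 16, 10, 15, 20])
avg_arr = []
i = 1
# Assumes the labels in x are the consecutive integers 1..max(x)
while i <= np.max(x):
    inds = np.where(x == i)
    # Index with every match; a first:last slice would drop the last element
    my_val = np.average(y[inds])
    avg_arr.append(my_val)
    i += 1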

Definitely not the cleanest, but I was able to test it quickly and it does indeed work.
