The shape of I and D doesn't matter: you can clearly ravel the arrays without changing the outcome:
index = np.ravel(I)
data = np.ravel(D)
Now you can sort both arrays according to I:
sorter = np.argsort(index)
index = index[sorter]
data = data[sorter]
This is helpful because now index looks like this:
0, 0, 0, 1, 1, 1
And data is this:
0.1, 0.2, 0.2, 0.1, 0.4, 0.1
Adding together runs of consecutive numbers should be easier than processing random locations. Let's start by finding the indices where the runs start:
runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]
Now you can use the fact that ufuncs like np.add have a partial reduce operation called reduceat. This allows you to sum regions of an array:
a = np.add.reduceat(data, runs)
If I is guaranteed to contain all indices in [0, A.size) at least once, you're done: just assign to A instead of a. If not, you can make the mapping using the fact that the start of each run in index is the target index:
A = np.zeros(n)
A[index[runs]] = a
Algorithmic complexity analysis:
ravel is O(1) in time and space if the data is in an array. If it's a list, this is O(MN) in time and space
argsort is O(MN log MN) in time and O(MN) in space
- Indexing by
sorter is O(MN) in time and space
- Computing
runs is O(MN) in time and O(MN + M) = O(MN) in space
reduceat is a single pass: O(MN) in time, O(M) in space
- Reassigning
A is O(M) in time and space
Total: O(MN log MN) time, O(MN) space
TL;DR
def make_A(D, I, M):
index = np.ravel(I)
data = np.ravel(D)
sorter = np.argsort(index)
index = index[sorter]
if index[0] < 0 or index[-1] >= M:
raise ValueError('Bad indices')
data = data[sorter]
runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]
a = np.add.reduceat(data, runs)
if a.size == M:
return a
A = np.zeros(M)
A[index[runs]] = a
return A
0.5by doing0.1+0.4? If so how are you getting0.6?Iand addA[I[x][y]] += D[x][y]I?I