I have a matrix X of size (d,N). In other words, there are N vectors with d dimensions each. For example,
X = [[1,2,3,4],[5,6,7,8]]
Here there are N=4 vectors (the columns) of d=2 dimensions each.
Also, I have a ragged array I (a list of lists) whose entries are column indices into the matrix X. For example,
I = [ [0,1], [1,2,3] ]
The element I[0]=[0,1] indexes columns 0 and 1 of matrix X. Similarly, the element I[1] indexes columns 1, 2 and 3. Notice that the elements of I are lists of different lengths!
For each element of I, I would like to select the corresponding columns of X and sum them into a single vector. Repeating this for every element of I builds a new matrix Y, which should have as many d-dimensional vectors as there are elements in I. In my example, Y will have 2 vectors of 2 dimensions.
In my example, the element I[0] says to take columns 0 and 1 of matrix X, sum these two 2-dimensional vectors, and put the result in column 0 of Y. Then the element I[1] says to sum columns 1, 2 and 3 of matrix X and put that vector in column 1 of Y.
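To make the target concrete, here is the example written out as a small script. The expected result is worked out by hand from the columns of X; the name Y_expected is only used for illustration.

```python
import numpy as np

X = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])   # d=2 rows, N=4 columns
I = [[0, 1], [1, 2, 3]]        # ragged list of column indices

# Column 0 of Y: X[:,0] + X[:,1]          = [1,5] + [2,6]         = [3,11]
# Column 1 of Y: X[:,1] + X[:,2] + X[:,3] = [2,6] + [3,7] + [4,8] = [9,21]
Y_expected = np.array([[3, 9],
                       [11, 21]])
```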
I can do this easily with a loop, but I would like to vectorize the operation if possible. My matrix X has hundreds of thousands of columns, and the indexing array I has tens of thousands of elements (each element is a short list of indices).
My loopy code:
import numpy as np

Y = np.zeros((d, len(I)))
for i, idx in enumerate(I):
    # sum the columns of X selected by idx into column i of Y
    Y[:, i] = np.sum(X[:, idx], axis=1)
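One direction that might remove the Python loop is to flatten I into a single index array and sum contiguous segments with np.add.reduceat. This is only a sketch, not a benchmarked solution: the function name ragged_column_sums is made up for illustration, and it assumes X is a NumPy array and that every list in I is non-empty (reduceat does not return zero for an empty segment).

```python
import numpy as np

def ragged_column_sums(X, I):
    """Sum the columns of X selected by each list in I (hypothetical helper).

    Assumes X is a 2-D NumPy array and every list in I is non-empty.
    """
    lengths = np.array([len(idx) for idx in I])
    flat = np.concatenate(I)  # all column indices, back to back
    # Start offset of each list inside `flat`
    starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))
    # Gather the selected columns once, then sum each contiguous segment
    return np.add.reduceat(X[:, flat], starts, axis=1)
```

Note that X[:, flat] is a temporary array of shape (d, total number of indices), so memory use grows with the combined length of the lists in I.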