First idea: avoid repeated calls to np.arange by building the longest range once and slicing it. Also, np.concatenate should be much faster than np.hstack:
>>> import numpy as np
>>> x = np.array([5,7,2])
>>> a = np.arange(1, x.max()+1)
>>> np.hstack([a[:k] for k in x])
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 1, 2])
>>> np.concatenate([a[:k] for k in x])
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 1, 2])
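The first idea can be wrapped in a small reusable function. This is only a sketch: the name `ranges_by_slicing` is made up here for illustration, and it assumes x holds non-negative integers (a negative value would slice from the end of the base range instead).

```python
import numpy as np

def ranges_by_slicing(x):
    """Concatenate 1..k for every k in x, reusing one precomputed range."""
    x = np.asarray(x)
    if x.size == 0:
        # x.max() is undefined for empty input, so return an empty result early
        return np.arange(0)
    # Build the longest range once; every shorter range is a prefix of it
    base = np.arange(1, x.max() + 1)
    return np.concatenate([base[:k] for k in x])

print(ranges_by_slicing([5, 7, 2]))
```

Slicing returns views of the base array, so no new range is allocated per element; only the final concatenate copies data.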
If x contains many repeated values, building each range only once per unique value seems more efficient:
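>>> ua, uind = np.unique(x, return_inverse=True)
>>> a = [np.arange(1, k+1) for k in ua]
>>> # index the list directly; np.take(a, uind) needs a ragged array, which recent NumPy rejects
>>> np.concatenate([a[i] for i in uind])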
array([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 1, 2])
Some timings for your case:
x=np.random.randint(0,20,1000000)
Original code:
#Using hstack
%timeit np.hstack([np.arange(1,n+1) for n in x])
1 loops, best of 3: 7.46 s per loop
#Using concatenate
%timeit np.concatenate([np.arange(1,n+1) for n in x])
1 loops, best of 3: 5.27 s per loop
First code:
#Using hstack
%timeit a=np.arange(1,x.max()+1);np.hstack([a[:k] for k in x])
1 loops, best of 3: 3.03 s per loop
#Using concatenate
%timeit a=np.arange(1,x.max()+1);np.concatenate([a[:k] for k in x])
10 loops, best of 3: 998 ms per loop
Second code:
%timeit ua,uind=np.unique(x,return_inverse=True);a=[np.arange(1,k+1) for k in ua];np.concatenate(np.take(a,uind))
10 loops, best of 3: 522 ms per loop
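Looks like the final code gives roughly a 14x speedup over the original hstack version (7.46 s down to 522 ms), and about 10x over the original concatenate version.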
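The %timeit lines above only work inside IPython. Outside it, a comparable measurement can be sketched with the standard timeit module. The helper name `best_time` and the smaller input size are assumptions chosen here to keep the run short; absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def best_time(func, repeat=3, number=1):
    """Best time in seconds per call over several repeats, in the spirit of %timeit."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

# Smaller than the 1,000,000 elements used above, purely to keep this quick
x = np.random.randint(0, 20, 20_000)

t_hstack = best_time(lambda: np.hstack([np.arange(1, n + 1) for n in x]))
t_concat = best_time(lambda: np.concatenate([np.arange(1, n + 1) for n in x]))
print(f"hstack: {t_hstack:.4f} s, concatenate: {t_concat:.4f} s")
```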
Small sanity check (the output length should equal the sum of x):
>>> ua, uind = np.unique(x, return_inverse=True)
>>> a = [np.arange(1, k+1) for k in ua]
>>> out = np.concatenate([a[i] for i in uind])
>>> out.shape
(9498409,)
>>> np.sum(x)
9498409
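Matching lengths is a necessary condition, not a proof that the contents agree. A stricter check is to compare the variants element by element on a smaller random input. The function names, seed, and size below are hypothetical choices for this sketch; the unique-based variant indexes the Python list directly so that it also runs on recent NumPy versions.

```python
import numpy as np

def original(x):
    # One np.arange call per element of x
    return np.concatenate([np.arange(1, n + 1) for n in x])

def by_slicing(x):
    # One shared range, sliced per element
    base = np.arange(1, x.max() + 1)
    return np.concatenate([base[:k] for k in x])

def by_unique(x):
    # One range per unique value, looked up through the inverse index
    ua, uind = np.unique(x, return_inverse=True)
    pieces = [np.arange(1, k + 1) for k in ua]
    return np.concatenate([pieces[i] for i in uind])

rng = np.random.default_rng(0)
x = rng.integers(0, 20, size=10_000)

expected = original(x)
assert np.array_equal(by_slicing(x), expected)
assert np.array_equal(by_unique(x), expected)
assert expected.size == x.sum()
```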
y doesn't work because elements of the numpy array will be different lengths.
x and maximum of x? Also is x unique?