1

I want to generate the following array a:

nv = np.random.randint(3, 10+1, size=(1000000,))
a = np.concatenate([np.arange(1,i+1) for i in nv])

Thus, the output would be something like -

[1, 2, 3, 4, 1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4, 5, 6, 1, ...]

Is there a better way to do this?

2
  • Typo - np.range should be np.arange, I believe. Commented Dec 24, 2016 at 3:55
  • np.r_[:4, :5, :3, :6] is a nice compact way of generating such an array 'by-hand'. It doesn't offer any advantages if starting with a nv array. Commented Dec 24, 2016 at 6:11
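To make the comment above concrete, here is a small sketch of the np.r_ form. The second half, which builds the same index from a list of lengths, is an illustration and not something the commenter proposed; the names by_hand and from_nv are made up for this example. Note that these slices are 0-based.

```python
import numpy as np

# Each slice :n expands to arange(0, n); np.r_ concatenates the pieces.
by_hand = np.r_[:4, :5, :3, :6]
print(by_hand)

# np.r_[a, b, c] receives a tuple of slices, so the same tuple can be built
# from a list of lengths. This still loops in Python, so it is compact
# rather than fast.
nv = [4, 5, 3, 6]
from_nv = np.r_[tuple(slice(n) for n in nv)]
print(np.array_equal(by_hand, from_nv))
```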

3 Answers

2

Here's a vectorized approach using cumulative summation -

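import numpy as np

def ranges(nv, start=1):
    # End offset of each range in the flattened output
    shifts = nv.cumsum()
    # A step of +1 between consecutive output elements ...
    id_arr = np.ones(shifts[-1], dtype=int)
    # ... except at each range boundary, where the running sum drops back to `start`
    id_arr[shifts[:-1]] = -nv[:-1]+1
    id_arr[0] = start # Skip if we know the start of ranges is 1 already
    return id_arr.cumsum()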

Sample runs -

In [23]: nv
Out[23]: array([3, 2, 5, 7])

In [24]: ranges(nv, start=0)
Out[24]: array([0, 1, 2, 0, 1, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 6])

In [25]: ranges(nv, start=1)
Out[25]: array([1, 2, 3, 1, 2, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7])
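To see why the boundary assignment works, the function can be unrolled step by step on the sample input. This is only a walk-through sketch of the same logic; the variable name steps is introduced here for illustration.

```python
import numpy as np

nv = np.array([3, 2, 5, 7])
start = 0

# Cumulative lengths: the output index at which each following range begins.
shifts = nv.cumsum()                      # [3, 5, 10, 17]

# Start with a step of +1 between all consecutive output elements.
id_arr = np.ones(shifts[-1], dtype=int)

# At a boundary the value must fall from the previous range's last value
# (start + n - 1) back to start, which is a step of -(n - 1) = -n + 1.
id_arr[shifts[:-1]] = -nv[:-1] + 1
id_arr[0] = start

steps = id_arr.copy()
result = id_arr.cumsum()                  # running sum of the steps

print(steps)   # [ 0  1  1 -2  1 -1  1  1  1  1 -4  1  1  1  1  1  1]
print(result)  # [0 1 2 0 1 0 1 2 3 4 0 1 2 3 4 5 6]
```

The approach relies on nv being non-empty and containing only positive lengths; a zero length would make two boundaries land on the same index.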

Runtime test -

In [62]: nv = np.random.randint(3, 10+1, size=(100000,))

In [63]: %timeit your_func(nv) # @MSeifert's solution
10 loops, best of 3: 129 ms per loop

In [64]: %timeit ranges(nv)
100 loops, best of 3: 5.54 ms per loop
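For anyone who wants to rerun the comparison outside IPython, a minimal standalone sketch follows. The function names ranges_cumsum and ranges_chain, the fixed seed, and the repeat counts are choices made for this example; timings will vary by machine, so none are quoted here.

```python
import timeit
from itertools import chain

import numpy as np

def ranges_cumsum(nv, start=1):
    # Vectorized version: +1 steps with a negative reset at each boundary.
    shifts = nv.cumsum()
    id_arr = np.ones(shifts[-1], dtype=int)
    id_arr[shifts[:-1]] = -nv[:-1] + 1
    id_arr[0] = start
    return id_arr.cumsum()

def ranges_chain(nv):
    # Pure-Python version: chain the ranges, then convert to an array.
    return np.array(list(chain.from_iterable(range(1, i + 1) for i in nv)))

rng = np.random.default_rng(0)            # seed chosen only for repeatability
nv = rng.integers(3, 11, size=100_000)    # lengths in [3, 10]

# Both versions should produce the same array before timing them.
assert np.array_equal(ranges_cumsum(nv), ranges_chain(nv))

for func in (ranges_chain, ranges_cumsum):
    best = min(timeit.repeat(lambda: func(nv), number=3, repeat=3)) / 3
    print(f"{func.__name__}: {best * 1e3:.2f} ms per call")
```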

2 Comments

oh so much faster, now I have to upvote your answer (even though I don't fully understand the magic id_arr[shifts[:-1]] = -nv[:-1]+1 line)
@MSeifert Yeah, it's not straightforward. I might explain it later on.
1

Instead of doing this with NumPy methods, you could use normal Python ranges and just convert the result to an array:

from itertools import chain
import numpy as np

def your_func(nv):
    ranges = (range(1, i+1) for i in nv)
    flattened = list(chain.from_iterable(ranges))
    return np.array(flattened)

This avoids hard-to-understand NumPy slicing and indexing constructs. To show a sample case:

>>> import random
>>> nv = [random.randint(1, 10) for _ in range(5)]
>>> print(nv)
[4, 2, 10, 5, 3]

>>> print(your_func(nv))
[ 1  2  3  4  1  2  1  2  3  4  5  6  7  8  9 10  1  2  3  4  5  1  2  3]
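A possible variation on the same idea, not part of the answer as posted, is to skip the intermediate list and let np.fromiter consume the chained iterator directly. The name your_func_fromiter is made up for this sketch, and whether it is actually faster would need to be measured.

```python
from itertools import chain

import numpy as np

def your_func_fromiter(nv):
    ranges = (range(1, i + 1) for i in nv)
    # Passing count lets NumPy allocate the output once; it is optional.
    total = int(sum(nv))
    return np.fromiter(chain.from_iterable(ranges), dtype=int, count=total)

print(your_func_fromiter([4, 2, 10, 5, 3]))
```

For the sample nv above this yields the same values as your_func.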


0

Why two steps?

a = np.concatenate([np.arange(0,np.random.randint(3,11)) for i in range(1000000)])

3 Comments

I believe the first step was just to create a sample input.
Yes, it is just a sample input.
Got it. Well, I guess I'll leave this here in case anyone else didn't realize that the problem was to build the range array from an existing list of lengths.
