
Given the following list (or numpy array):

x = [4, 3, 1, 2]

I want to generate another list (or numpy array) with 4+3+1+2=10 elements, such as:

y = [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]

where y contains x[i] successive elements with the value i+1 (values are numbered starting from 1).

Other examples:

x = [0,3,1]
y = [2,2,2,3]

x = [2,0,2]
y = [1,1,3,3]

x = [1,1,1,1,1]
y = [1,2,3,4,5]

How can this be done efficiently? Thanks a lot.

  • Welcome. It would be great if you could also include the code that you have written. Commented Nov 17, 2020 at 11:06

4 Answers


This does the job:

x = [4,3,1,2]

y = []
for index, num in enumerate(x):
    for i in range(num):
        y.append(index + 1)
        
print(y)

Or, if you prefer, as a one-line list comprehension:

x = [4,3,1,2]

y = [index + 1 for index, num in enumerate(x) for i in range(num)]
        
print(y)

Output:

[1, 1, 1, 1, 2, 2, 2, 3, 4, 4]

3 Comments

@Burak Perfect! It would be great if you could mark the answer as accepted.
@LorenzoZane another variant would be enumerate(x, start=1)
@LorenzoZane By the way, I found the second solution to be the most economical in pure Python. numpy is in a different category (np.repeat is >20x faster on large datasets).
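To illustrate the enumerate(x, start=1) variant mentioned in the comments, here is a minimal sketch. The names label and count are only illustrative and are not from the original answer:

```python
x = [4, 3, 1, 2]

# enumerate(x, start=1) yields (1, 4), (2, 3), (3, 1), (4, 2),
# so the "+ 1" offset on the index is no longer needed.
y = [label for label, count in enumerate(x, start=1) for _ in range(count)]

print(y)  # [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]
```

The behavior should be the same as the list comprehension in the answer; only the place where the 1-based numbering happens changes.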

If you wanted to use numpy:

import numpy as np

x = np.array([4, 3, 1, 2])
a = np.arange(1, x.size+1)

np.repeat(a, x)

Output:

array([1, 1, 1, 1, 2, 2, 2, 3, 4, 4])

2 Comments

Why np.repeat(a, [*x]) instead of np.repeat(a, x)?
@mathfux - Thank you, it was redundant. For some reason, when I originally wrote it, that method didn't work - but now it's fine. I must have done something silly. Edited - thanks.
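If the input may be a plain list rather than an array, the np.repeat approach can be wrapped in a small helper. This is only a sketch: the function name expand_counts is hypothetical, and converting with np.asarray(..., dtype=np.intp) is an assumption made here so that the repeat counts are always integers.

```python
import numpy as np


def expand_counts(x):
    """Return an array in which the value i+1 appears x[i] times in a row."""
    # Accept lists or arrays; force an integer dtype for the repeat counts.
    counts = np.asarray(x, dtype=np.intp)
    # Labels 1..n, one per entry of x.
    labels = np.arange(1, counts.size + 1)
    return np.repeat(labels, counts)


print(expand_counts([4, 3, 1, 2]))  # [1 1 1 1 2 2 2 3 4 4]
print(expand_counts([0, 3, 1]))     # [2 2 2 3]
```

Call .tolist() on the result if a Python list is needed instead of an array.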

An unnecessarily complicated way to do this is the following, utilizing list comprehensions and the reduce function:

from functools import reduce

def f(ls):
    # The initial value [] makes f([]) return [] instead of raising TypeError.
    return reduce(lambda x, y: x+y, [[i+1]*ls[i] for i in range(len(ls))], [])

>>> f([0, 3, 1])
[2, 2, 2, 3]
>>> f([2, 0, 2])
[1, 1, 3, 3]
>>> f([1, 1, 1, 1, 1])
[1, 2, 3, 4, 5]

1 Comment

This is brilliant but it costs performance. It's 8 times slower than the fastest solution in pure Python.
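One likely reason for the slowdown is that each x+y inside reduce builds a brand new list, so earlier elements get copied again and again. A possible alternative, sketched here with itertools rather than reduce (so it is a different technique from the answer above, and the name expand is hypothetical), concatenates the runs in a single pass:

```python
from itertools import chain, repeat


def expand(counts):
    # repeat(label, n) lazily yields `label` n times; chain.from_iterable
    # strings those runs together without building intermediate lists.
    return list(chain.from_iterable(
        repeat(label, n) for label, n in enumerate(counts, start=1)
    ))


print(expand([0, 3, 1]))        # [2, 2, 2, 3]
print(expand([2, 0, 2]))        # [1, 1, 3, 3]
print(expand([1, 1, 1, 1, 1]))  # [1, 2, 3, 4, 5]
```

No timings are claimed for this version; it would need to be measured against the other answers.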
Assuming import itertools and import numpy as np have already been run:

In [52]: alist, reps = [1,2,3,4], [4,3,1,2]

np.repeat handles this kind of repetition nicely, but it converts everything to arrays (which takes time):

In [53]: np.repeat(alist,reps)
Out[53]: array([1, 1, 1, 1, 2, 2, 2, 3, 4, 4])

List repetition can be done with a comprehension:

In [54]: [[i]*j for i,j in zip([1,2,3,4],[4,3,1,2])]
Out[54]: [[1, 1, 1, 1], [2, 2, 2], [3], [4, 4]]

and the list of lists can be flattened with:

In [55]: [k for l in ([i]*j for i,j in zip(alist, reps)) for k in l]
Out[55]: [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]
In [56]: list(itertools.chain(*([i]*j for i,j in zip(alist, reps))))
Out[56]: [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]

Some timings:

In [57]: timeit np.repeat(alist,reps)
10.9 µs ± 398 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If we start with arrays, np.repeat is much faster:

In [58]: %%timeit a,b = np.array(alist), np.array(reps)
    ...:  np.repeat(a,b)
    ...: 
2.97 µs ± 103 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

But for this small sample, the list methods are faster:

In [59]: timeit [k for l in ([i]*j for i,j in zip(alist, reps)) for k in l]
2.33 µs ± 70.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [60]: timeit list(itertools.chain(*([i]*j for i,j in zip(alist, reps))))
2.46 µs ± 76.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

For larger cases, I expect the [58] times to scale the best. I'm not sure about the others.
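To check how the methods scale, something like the following sketch could be run on a larger input. The input size (10,000 counts between 0 and 5), the random seed, and the helper function names are all arbitrary assumptions made for illustration; no results are claimed here, and timings will vary by machine.

```python
import itertools
import timeit

import numpy as np

# Hypothetical larger test case: 10,000 repeat counts drawn from 0..5.
rng = np.random.default_rng(0)
reps_arr = rng.integers(0, 6, size=10_000)
vals_arr = np.arange(1, reps_arr.size + 1)
alist, reps = vals_arr.tolist(), reps_arr.tolist()


def with_comprehension():
    return [k for l in ([i] * j for i, j in zip(alist, reps)) for k in l]


def with_chain():
    return list(itertools.chain(*([i] * j for i, j in zip(alist, reps))))


def with_repeat_lists():
    return np.repeat(alist, reps)       # list inputs, converted on each call


def with_repeat_arrays():
    return np.repeat(vals_arr, reps_arr)  # inputs are already arrays


# Sanity check: all four methods should agree before they are timed.
expected = with_comprehension()
assert with_chain() == expected
assert with_repeat_lists().tolist() == expected
assert with_repeat_arrays().tolist() == expected

runs = 20
for fn in (with_comprehension, with_chain, with_repeat_lists, with_repeat_arrays):
    seconds = timeit.timeit(fn, number=runs) / runs
    print(f"{fn.__name__:20s} {seconds * 1e6:10.1f} us per call")
```

Changing size and the count range should show where, if anywhere, the array version overtakes the list versions.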
