Python: Create an array using values and indices of a given array

Question

Given the following list (or numpy array):

x = [4, 3, 1, 2]

I want to generate another list (or numpy array) with 4+3+1+2=10 elements, such as:

y = [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]

Here y contains x[i] successive elements with the value i, where i is the position in x counted from 1.

Other examples:

x = [0,3,1]
y = [2,2,2,3]

x = [2,0,2]
y = [1,1,3,3]

x = [1,1,1,1,1]
y = [1,2,3,4,5]

How can this be done efficiently? Thanks a lot.

Welcome. It would be great if you could also include the code that you have written.
– Neeraj Hanumante, Commented Nov 17, 2020 at 11:06

lorenzozane · Accepted Answer · 2020-11-17 10:47:40Z

1

This does the job:

x = [4,3,1,2]

y = []
for index, num in enumerate(x):
    for i in range(num):
        y.append(index + 1)
        
print(y)

Or, if you prefer, as a one-line list comprehension:

x = [4,3,1,2]

y = [index + 1 for index, num in enumerate(x) for i in range(num)]
        
print(y)

Output:

[1, 1, 1, 1, 2, 2, 2, 3, 4, 4]

answered Nov 17, 2020 at 10:47

3 Comments

lorenzozane Over a year ago

@Burak Perfect! It would be great if you could mark the answer as accepted.

mathfux Over a year ago

@LorenzoZane another variant would be enumerate(x, start=1)
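
A minimal sketch of that variant: with start=1, enumerate yields the 1-based position directly, so the `index + 1` is no longer needed. The function name `expand` is hypothetical and only there for illustration.

```python
def expand(x):
    """Return a list in which position i (counted from 1) appears x[i-1] times."""
    # enumerate(..., start=1) yields the 1-based position directly
    return [pos for pos, count in enumerate(x, start=1) for _ in range(count)]

print(expand([4, 3, 1, 2]))  # [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]
```

Positions with a count of 0 simply contribute nothing, which matches the other examples in the question.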

mathfux Over a year ago

@LorenzoZane Btw, I found the second solution the most economical in pure Python. numpy is in a different category (np.repeat is >20x faster on large datasets).

s3dev · Accepted Answer · 2020-11-17 13:23:14Z

1

If you want to use numpy:

import numpy as np

x = np.array([4, 3, 1, 2])
a = np.arange(1, x.size+1)

np.repeat(a, x)

Output:

array([1, 1, 1, 1, 2, 2, 2, 3, 4, 4])
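
For reuse, the same two steps could be wrapped in a small function and checked against the other examples from the question. This is only a sketch: the name `expand_np` is hypothetical, and it assumes x holds non-negative integers.

```python
import numpy as np

def expand_np(x):
    """Repeat each 1-based position i exactly x[i-1] times."""
    x = np.asarray(x)                      # accepts a list or an array
    positions = np.arange(1, x.size + 1)   # 1, 2, ..., len(x)
    return np.repeat(positions, x)

print(expand_np([0, 3, 1]))  # [2 2 2 3]
print(expand_np([2, 0, 2]))  # [1 1 3 3]
```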

edited Nov 17, 2020 at 13:23

answered Nov 17, 2020 at 11:41

2 Comments

mathfux Over a year ago

Why np.repeat(a, [*x]) instead of np.repeat(a, x)?

s3dev Over a year ago

@mathfux - Thank you, it was redundant. For some reason, when I originally wrote it, that method didn't work - but now it's fine. I must have done something silly. Edited - thanks.

Alex Mandelias · Accepted Answer · 2020-11-17 10:57:35Z

0

An unnecessarily complicated way to do this is the following, utilizing list comprehensions and the reduce function:

from functools import reduce

def f(ls):
    return reduce(lambda x, y: x+y, [[i+1]*ls[i] for i in range(len(ls))])

>>> f([0, 3, 1])
[2, 2, 2, 3]
>>> f([2, 0, 2])
[1, 1, 3, 3]
>>> f([1, 1, 1, 1, 1])
[1, 2, 3, 4, 5]

answered Nov 17, 2020 at 10:57

1 Comment

mathfux Over a year ago

This is brilliant but it costs performance. It's 8 times slower than the fastest solution in pure Python.
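
One likely reason for the slowdown (my reading, not something measured in this thread): each `x + y` inside reduce builds a brand-new list, so the partial result is copied again at every step. A sketch that keeps the same per-index sublists but joins them in a single pass with itertools.chain.from_iterable; the name `f_chain` is hypothetical:

```python
from itertools import chain

def f_chain(ls):
    # Build [i+1] * ls[i] lazily for each index and join the pieces in one pass
    return list(chain.from_iterable([i + 1] * n for i, n in enumerate(ls)))

print(f_chain([0, 3, 1]))  # [2, 2, 2, 3]
```

It also returns [] for an empty input, where reduce without an initial value raises a TypeError.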

hpaulj · Accepted Answer · 2020-11-17 18:01:21Z

In [52]: alist, reps = [1,2,3,4], [4,3,1,2]

np.repeat handles this kind of repetition nicely, but it first converts the lists to arrays (which takes time):

In [53]: np.repeat(alist,reps)
Out[53]: array([1, 1, 1, 1, 2, 2, 2, 3, 4, 4])

List repetition (the * operator) can be used with:

In [54]: [[i]*j for i,j in zip([1,2,3,4],[4,3,1,2])]
Out[54]: [[1, 1, 1, 1], [2, 2, 2], [3], [4, 4]]

and the list of lists can be flattened with a nested comprehension or with itertools.chain (after import itertools):

In [55]: [k for l in ([i]*j for i,j in zip(alist, reps)) for k in l]
Out[55]: [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]
In [56]: list(itertools.chain(*([i]*j for i,j in zip(alist, reps))))
Out[56]: [1, 1, 1, 1, 2, 2, 2, 3, 4, 4]

Some timings:

In [57]: timeit np.repeat(alist,reps)
10.9 µs ± 398 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If we start with arrays, the result is much faster:

In [58]: %%timeit a,b = np.array(alist), np.array(reps)
    ...:  np.repeat(a,b)
    ...: 
2.97 µs ± 103 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

But for this small sample, the list methods are faster:

In [59]: timeit [k for l in ([i]*j for i,j in zip(alist, reps)) for k in l]
2.33 µs ± 70.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [60]: timeit list(itertools.chain(*([i]*j for i,j in zip(alist, reps))))
2.46 µs ± 76.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

For larger cases, I expect the [58] times to scale the best. I'm not sure about the others.
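
To check that expectation on your own machine, a rough timing harness could look like the following. The input size, the random repeat counts, and the helper names are my assumptions, not part of the measurements above. Timings vary by machine, so no expected numbers are shown.

```python
import itertools
import timeit

import numpy as np

rng = np.random.default_rng(0)
reps_arr = rng.integers(0, 5, size=10_000)   # hypothetical repeat counts in 0..4
vals_arr = np.arange(1, reps_arr.size + 1)   # 1-based positions
alist, reps = vals_arr.tolist(), reps_arr.tolist()

def with_numpy():
    # Inputs are already arrays, as in [58]
    return np.repeat(vals_arr, reps_arr)

def with_comprehension():
    # Nested comprehension, as in [59]
    return [k for l in ([i] * j for i, j in zip(alist, reps)) for k in l]

def with_chain():
    # itertools.chain variant, as in [60], using from_iterable instead of unpacking
    return list(itertools.chain.from_iterable([i] * j for i, j in zip(alist, reps)))

for fn in (with_numpy, with_comprehension, with_chain):
    seconds = timeit.timeit(fn, number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e6:.1f} µs per call")
```

Changing `size` shows how each approach scales as the input grows.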
