2

Let A be a NumPy array such as:

A = np.array([1, 2, 3, 4, 5])

I want to find the cleanest way to produce a new array with each value repeated twice:

B = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])

Is this the simplest way to do it?

import numpy as np
B = np.tile(A,2).reshape(2,-1).flatten('F')

3 Answers

11

Use numpy.repeat():

In [1]: import numpy as np

In [2]: A = np.array([1, 2, 3, 4, 5])

In [3]: np.repeat(A,2)
Out[3]: array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
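For reference, here is a minimal sketch of a few other ways np.repeat can be called. The counts list and the 2-D array M are hypothetical examples, not part of the question.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

# Repeat every element twice (the case asked about).
B = np.repeat(A, 2)

# A per-element count is also accepted: element i is repeated counts[i] times.
counts = [1, 0, 2, 1, 3]  # hypothetical counts, for illustration only
C = np.repeat(A, counts)

# Hypothetical 2-D example: repeating along axis=1 duplicates each column.
M = np.array([[1, 2], [3, 4]])
D = np.repeat(M, 2, axis=1)

print(B)
print(C)
print(D)
```

Without an axis argument, np.repeat works on the flattened input and returns a 1-D array, which is why the 1-D case needs nothing more than np.repeat(A, 2).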

3

You can use numpy.column_stack and numpy.ndarray.flatten:

In [12]: numpy.column_stack((A, A)).flatten()                                                    
Out[12]: array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5])
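To see why this works, it may help to look at the intermediate array. The sketch below uses the variable name stacked, which is only for illustration, and checks the result against np.repeat.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])

# column_stack places each 1-D input as a column, giving shape (5, 2):
# row i is [A[i], A[i]].
stacked = np.column_stack((A, A))
print(stacked.shape)

# flatten() defaults to row-major (C) order, so it reads row by row,
# which interleaves the two copies of A.
B = stacked.flatten()
print(B)

# Same result as np.repeat(A, 2).
print(np.array_equal(B, np.repeat(A, 2)))
```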

Timing comparison:

In [27]: A = numpy.array([1, 2, 3, 4, 5]*1000)                                                   

In [28]: %timeit numpy.column_stack((A, A)).flatten()                                            
10000 loops, best of 3: 44.7 µs per loop                                                         

In [29]: %timeit numpy.repeat(A, 2)                                                              
10000 loops, best of 3: 104 µs per loop                                                          

In [30]: %timeit numpy.tile(A,2).reshape(2,-1).flatten('F')                                      
10000 loops, best of 3: 129 µs per loop     
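Outside IPython, a comparable measurement can be sketched with the standard timeit module. This is only an approximation of what %timeit does. The loop counts and the names candidates and results are assumptions chosen for this sketch, and the numbers will vary with the machine and the NumPy version.

```python
import timeit

import numpy as np

A = np.array([1, 2, 3, 4, 5] * 1000)

# The three approaches compared above, wrapped as zero-argument callables.
candidates = {
    "column_stack": lambda: np.column_stack((A, A)).flatten(),
    "repeat": lambda: np.repeat(A, 2),
    "tile": lambda: np.tile(A, 2).reshape(2, -1).flatten("F"),
}

expected = np.repeat(A, 2)
results = {}
for name, func in candidates.items():
    # Make sure each candidate really produces the interleaved result.
    assert np.array_equal(func(), expected)
    # Best of 3 runs of 200 calls each, reported per call.
    best = min(timeit.repeat(func, number=200, repeat=3)) / 200
    results[name] = best
    print(f"{name:>12}: {best * 1e6:.1f} us per call")
```

Checking the output of each candidate before timing it guards against comparing expressions that do not compute the same thing.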

4 Comments

Thanks for the timing comparison. Could you also paste the full code for this timing comparison? (with import timeit, etc.)
@Basj That's the whole code :), I used IPython shell for this. You can also try it online: pythonanywhere.com/try-ipython
Don't know why, but I get a different timing order (for an A 10 times larger):
In [19]: %timeit numpy.column_stack((A, A)).flatten()
1000 loops, best of 3: 657 us per loop
In [20]: %timeit numpy.tile(A,2).reshape(2,-1).flatten('F')
1000 loops, best of 3: 574 us per loop
I actually get: repeat 1.42 us, column_stack 4.66 us and tile 9.09 us
2

If you need to do this operation in a time-critical section, the following code is the fastest (using a NumPy 1.9 development version):

In [1]: A = numpy.array([1, 2, 3, 4, 5]*1000) 
In [2]: %timeit numpy.array([A, A]).T.ravel('F')
100000 loops, best of 3: 6.44 µs per loop

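Note that flatten always makes a copy, whereas ravel returns a view when it can, so ravel should be used instead. Be aware, however, that numpy.array([A, A]).T.ravel('F') returns A followed by A (the same order as numpy.tile(A, 2)), not the interleaved result, and it is fast because no data has to be rearranged. numpy.array([A, A]).T.ravel() or numpy.array([A, A]).ravel('F') gives 1, 1, 2, 2, ... but has to copy, so the timing above was not measured for the interleaved result.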
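The copy behaviour of flatten and ravel can be checked with np.shares_memory. This is a minimal sketch using a small array, with variable names chosen only for illustration.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])
stacked = np.column_stack((A, A))  # freshly allocated, C-contiguous, shape (5, 2)

flat_copy = stacked.flatten()  # flatten always allocates a new array
flat_view = stacked.ravel()    # C-order ravel of a C-contiguous array is a view

copy_shares = np.shares_memory(stacked, flat_copy)
view_shares = np.shares_memory(stacked, flat_view)

# ravel still has to copy when the requested order does not match
# the memory layout of the array.
flat_f = stacked.ravel("F")
f_shares = np.shares_memory(stacked, flat_f)

print(copy_shares, view_shares, f_shares)
```

So ravel saves the copy only when the requested order matches the memory layout. Otherwise it costs about the same as flatten.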

If you prefer readability, the column_stack and repeat functions are better:

In [3]: %timeit numpy.column_stack((A, A)).ravel()
100000 loops, best of 3: 15.4 µs per loop

In [4]: %timeit numpy.repeat(A, 2)
10000 loops, best of 3: 53.9 µs per loop
