
In which cases is using objects like numpy.r_ or numpy.c_ better (more efficient, more suitable) than using functions like concatenate or vstack, for example?

I am trying to understand a code where the programmer wrote something like:

return np.r_[0.0, 1d_array, 0.0] == 2

where 1d_array is an array whose values can be 0, 1 or 2. Why not use np.concatenate (for example) instead? Like:

return np.concatenate([[0.0], 1d_array, [0.0]]) == 2

It is more readable and apparently it does the same thing.
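
For instance, with a small test array standing in for 1d_array (example values I made up), both lines produce the same boolean mask:

import numpy as np

a = np.array([0, 1, 2, 1, 2])   # stand-in for 1d_array

np.r_[0.0, a, 0.0] == 2
# array([False, False, False,  True, False,  True, False])
np.concatenate([[0.0], a, [0.0]]) == 2
# array([False, False, False,  True, False,  True, False])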

  • Just a notational convenience. np.r_[1:5, 3:7] versus np.concatenate(np.arange(....). Same speed. It all ends up as a concatenate call. Commented Jun 10, 2016 at 10:30
  • Full code is in github.com/numpy/numpy/blob/master/numpy/lib/index_tricks.py. r_ is an AxisConcatenator object. Instructive reading. Commented Jun 10, 2016 at 10:56
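
For instance, the two spellings in the first comment give the same array:

import numpy as np

np.r_[1:5, 3:7]
# array([1, 2, 3, 4, 3, 4, 5, 6])
np.concatenate([np.arange(1, 5), np.arange(3, 7)])
# array([1, 2, 3, 4, 3, 4, 5, 6])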

3 Answers


np.r_ is implemented in the numpy/lib/index_tricks.py file. This is pure Python code, with no special compiled stuff. So it is not going to be any faster than the equivalent written with concatenate, arange and linspace. It's useful only if the notation fits your way of thinking and your needs.
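
You can verify that with a rough timing sketch (using timeit; exact numbers depend on your machine and the array size):

import numpy as np
import timeit

a = np.arange(1000.0)
# r_ adds Python-level argument parsing on top of concatenate
timeit.timeit(lambda: np.r_[0.0, a, 0.0], number=10000)
# plain concatenate with the scalars wrapped in lists
timeit.timeit(lambda: np.concatenate([[0.0], a, [0.0]]), number=10000)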

In your example it just saves converting the scalars to lists or arrays:

In [452]: np.r_[0.0, np.array([1,2,3,4]), 0.0]
Out[452]: array([ 0.,  1.,  2.,  3.,  4.,  0.])

concatenate raises an error with the same arguments:

In [453]: np.concatenate([0.0, np.array([1,2,3,4]), 0.0])
...
ValueError: zero-dimensional arrays cannot be concatenated

and works correctly with the added []:

In [454]: np.concatenate([[0.0], np.array([1,2,3,4]), [0.0]])
Out[454]: array([ 0.,  1.,  2.,  3.,  4.,  0.])

hstack takes care of that by passing all arguments through [atleast_1d(_m) for _m in tup]:

In [455]: np.hstack([0.0, np.array([1,2,3,4]), 0.0])
Out[455]: array([ 0.,  1.,  2.,  3.,  4.,  0.])

So at least in simple cases it is most similar to hstack.

But the real usefulness of r_ comes when you want to use ranges:

np.r_[0.0, 1:5, 0.0]
np.hstack([0.0, np.arange(1,5), 0.0])
np.r_[0.0, slice(1,5), 0.0]

r_ lets you use the : syntax that is used in indexing. That's because it is actually an instance of a class that has a __getitem__ method. index_tricks uses this programming trick several times.
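
A minimal sketch of that trick (a toy stand-in, not the real AxisConcatenator) looks like this:

import numpy as np

class Cat:
    # toy version of the __getitem__ trick behind np.r_
    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        parts = []
        for item in key:
            if isinstance(item, slice):
                # 1:5 arrives here as slice(1, 5, None)
                parts.append(np.arange(item.start, item.stop, item.step or 1))
            else:
                parts.append(np.atleast_1d(item))
        return np.concatenate(parts)

Cat()[0.0, 1:5, 0.0]
# array([0., 1., 2., 3., 4., 0.])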

They've thrown in other bells and whistles.

An imaginary step makes it expand the slice with np.linspace rather than np.arange:

np.r_[-1:1:6j, [0]*3, 5, 6]

produces:

array([-1. , -0.6, -0.2,  0.2,  0.6,  1. ,  0. ,  0. ,  0. ,  5. ,  6. ])
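
The 6j step means "6 samples", so that slice is the same as np.linspace(-1, 1, 6):

np.hstack([np.linspace(-1, 1, 6), [0]*3, 5, 6])
# array([-1. , -0.6, -0.2,  0.2,  0.6,  1. ,  0. ,  0. ,  0. ,  5. ,  6. ])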

There are more details in the documentation.

I did some time tests for many slices in https://stackoverflow.com/a/37625115/901925




I was also interested in this question and compared the speed of

numpy.c_[a, a]
numpy.stack([a, a]).T
numpy.vstack([a, a]).T
numpy.column_stack([a, a])
numpy.concatenate([a[:,None], a[:,None]], axis=1)

which all do the same thing for any input vector a. Here's what I found (using perfplot):

[perfplot timing comparison of the five stacking methods]

For smaller arrays, numpy.concatenate is the winner; for larger ones, stack/vstack.


The plot was created with

import numpy as np
import perfplot

b = perfplot.bench(
    setup=np.random.rand,
    kernels=[
        lambda a: np.c_[a, a],
        lambda a: np.stack([a, a]).T,
        lambda a: np.vstack([a, a]).T,
        lambda a: np.column_stack([a, a]),
        lambda a: np.concatenate([a[:, None], a[:, None]], axis=1),
    ],
    labels=["c_", "stack", "vstack", "column_stack", "concat"],
    n_range=[2**k for k in range(22)],
    xlabel="len(a)",
)
b.save("out.png")
b.show()

1 Comment

came for np.r_, stayed for perfplot :)

All the explanation you need:

https://sourceforge.net/p/numpy/mailman/message/13869535/

I found the most relevant part to be:

"""
For r_ and c_ I'm summarizing, but effectively they seem to be doing
something like:

r_[args]:
    concatenate( map(atleast_1d,args),axis=0 )

c_[args]:
    concatenate( map(atleast_1d,args),axis=1 )

c_ behaves almost exactly like hstack -- with the addition of range
literals being allowed.

r_ is most like vstack, but a little different since it effectively
uses atleast_1d, instead of atleast_2d.  So you have
>>> numpy.vstack((1,2,3,4))
array([[1],
       [2],
       [3],
       [4]])
but
>>> numpy.r_[1,2,3,4]
array([1, 2, 3, 4])
"""

3 Comments

You should at least describe the content of that page, in case the hyperlink breaks.
@dodell fair enough
I think the comparison of r_ and c_ to vstack and hstack is misleading, even wrong. In the case of 1,2,3,4 the 4 operations produce shapes (4,), (1,4), (4,1), (4,) respectively. In this simple case r_ and hstack produce the same thing, and c_ and vstack are the transpose of each other.
