11

In python numpy package, I am having trouble understanding the situation where an ndarray has the 2nd dimension being empty. Here is an example:

    In[1]: d2 = np.random.rand(10)
    In[2]: d2.shape = (-1, 1)

    In[3]: print d2.shape
    In[4]: print(d2)

    In[5]: print d2[::2, 0].shape
    In[6]: print d2[::2, 0]

    Out[3]:(10, 1)
    Out[4]:
[[ 0.12362278]
 [ 0.26365227]
 [ 0.33939172]
 [ 0.91501369]
 [ 0.97008342]
 [ 0.95294087]
 [ 0.38906367]
 [ 0.1012371 ]
 [ 0.67842086]
 [ 0.23711077]]

    Out[5]: (5,)
    Out[6]: [ 0.12362278  0.33939172  0.97008342  0.38906367  0.67842086]

My understanding is that d2 is a 10 rows by 1 column ndarray. Out[6] is obviously a 1 by 5 array, how can the dimensions be (5,) ? What does the empty 2nd dimension mean?

2
  • 7
    (5,) is just Python's way of stringifying a tuple with one entry, because (5) would be ambiguous (or rather just 5). Commented Apr 4, 2016 at 21:06
  • 3
    "Out[6] is obviously a 1 by 5 array" - no, there's no "1 by" on that array. It's 1-dimensional, with its only dimension having length 5. Commented Apr 4, 2016 at 22:34

3 Answers 3

13

Let me just give you one example that illustrate one important difference.

d1 = np.array([1,2,3,4,5]) # array([1, 2, 3, 4, 5])
d1.shape -> (5,) # row array.    
d1.size -> 5
# Note: d1.T is the same as d1.

d2 = d1[np.newaxis] # array([[1, 2, 3, 4, 5]]). Note extra []
d2.shape -> (1,5) 
d2.size -> 5
# Note: d2.T will give a column array
array([[1],
       [2],
       [3],
       [4],
       [5]])
d2.T.shape -> (5,1)
Sign up to request clarification or add additional context in comments.

Comments

7

I also thought ndarrays would represent even 1-d arrays as 2-d arrays with a thickness of 1. Maybe because of the name "ndarray" makes us think high dimensional, however, n can be 1, so ndarrays can just have one dimension.

Compare these

x = np.array([[1], [2], [3], [4]])
x.shape
# (4, 1)
x = np.array([[1, 2, 3, 4]])
x.shape
#(1, 4)
x = np.array([1, 2, 3, 4])
x.shape
#(4,)

and (4,) means (4).

If I reshape x and back to (4), it comes back to original

x.shape = (2,2)
x
# array([[1, 2],
#       [3, 4]])
x.shape = (4)
x
# array([1, 2, 3, 4])

Comments

1

The main thing to understand here is that indexing with an integer is different than indexing with a slice. For example, when you index a 1d array or a list with an integer you get a scalar but when you index with a slice, you get an array or a list respectively. The same thing applies to 2d+ arrays. So for example:

# Make a 3d array:
import numpy as np
array = np.arange(60).reshape((3, 4, 5))

# Indexing with ints gives a scalar
print array[2, 3, 4] == 59
# True

# Indexing with slices gives a 3d array
print array[:2, :2, :2].shape
# (2, 2, 2)

# Indexing with a mix of slices and ints will give an array with < 3 dims
print array[0, :2, :3].shape
# (2, 3)
print array[:, 2, 0:1].shape
# (3, 1)

This can be really useful conceptually, because sometimes its great to think of an array as a collection of vectors, for example I can represent N points in space as an (N, 3) array:

n_points = np.random.random([10, 3])
point_2 = n_points[2]
print all(point_2 == n_points[2, :])
# True

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.