6

Suppose I have a numpy array c constructed as follows:

import numpy as np

a = np.zeros((2,4))
b = np.zeros((2,8))
c = np.array([a,b])

I would have expected c.shape to be (2,1) or (2,) but instead it is (2,2). Additionally, what I want to do is concatenate a column vector of ones onto a, but by accessing it through c in the following way:

c0 = c[0] # I would have expected this to be 'a'
np.concatenate((np.ones((c0.shape[0], 1)), c0), axis=1)

This of course doesn't work because c[0] does not equal a as I expected, and I get

ValueError: all the input arrays must have same number of dimensions

I need some way to have an array (or list) of pairs, each pair component being a numpy array, and I need to access the first array in the pair in order to concatenate a column vector of ones to it. My application is machine learning and my data will be coming to me in the format described, but I need to modify the data at the start in order to add a bias element to it.

EDIT: I'm using Python 2.7 and Numpy 1.8.2

12
  • 4
    Your example code does not work for me, I get a ValueError: could not broadcast input array from shape (2,4) into shape (2) when assigning c. Commented Jul 18, 2015 at 16:21
  • 1
    Why not just use c = [a,b]? It is possible to make c an array of object dtype, in which you could store NumPy arrays of arbitrary shape (c = np.empty((2,), dtype='object'); c[:] = [a,b]), but object arrays do not enjoy any speed benefit over a plain Python list. You might use it for NumPy slicing syntax, but I have yet to see a compelling use case. Commented Jul 18, 2015 at 16:36
  • 1
    By the way, with NumPy v.1.9.0, np.array([a,b]) raises the same ValueError that Dux mentioned. Commented Jul 18, 2015 at 16:38
  • 1
    If the two NumPy arrays in your pair do not have the same shape, you cannot combine them into one array; they don't fit, unless you use HappyLeapSecond's solution. Commented Jul 18, 2015 at 16:40
  • 1
    If I understand correctly, c = [a, b] fits the bill. Commented Jul 18, 2015 at 16:52

2 Answers

5

I believe what you want to use is hstack:

import numpy as np

a = np.zeros((2,4))  # 4 column vectors of length 2
b = np.ones((2,1))   # 1 column vector of length 2

c = np.hstack((a, b))  # append b as an extra column
print(c)
# [[ 0.  0.  0.  0.  1.]
#  [ 0.  0.  0.  0.  1.]]

Regarding the problem of combining your a and b: this cannot be done in an obvious way. np.array([a, b]) tries to stack the two arrays along a new axis, which only works when they have the same shape. Yours have shapes (2,4) and (2,8), so they do not fit on one another.
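
To make the shape requirement concrete, here is a minimal sketch. It uses np.stack, which is my own choice for illustration and not part of the answer above; it needs NumPy 1.10 or later, so it would not run on the 1.8.2 mentioned in the question.

```python
import numpy as np

a = np.zeros((2, 4))
b = np.zeros((2, 8))

# Assumption: np.stack (NumPy >= 1.10) is used here only to illustrate the point.
# Arrays of equal shape stack along a new leading axis.
same = np.stack((a, a))
print(same.shape)  # (2, 2, 4)

# Arrays of different shape cannot be stacked this way.
try:
    np.stack((a, b))
    stacked = True
except ValueError:
    stacked = False
print(stacked)  # False
```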


4 Comments

This ignores the fact that I will essentially have an array of pairs of arrays, which is my problem. This looks like another way to do np.concatenate, but I still need to access the array in the way I described.
Ok, seems I did not understand your question after all. What is your desired output? A two-dimensional array, or a three-dimensional one?
@aconkey: you say, "I need some way to have an array (or list) of pairs", which is what this gives (by my interpretation, anyway). Could you describe specifically what it is about this "array of pairs" that does not match what you mean by an "array of pairs"?
@tom10: OP wants c[0] to return a.
4

Generally, nested NumPy arrays of NumPy arrays are not very useful. If you are using NumPy for speed, it is usually best to stick with NumPy arrays with a homogeneous, basic numeric dtype.

To place two items in a data structure such that c[0] returns the first item, and c[1] the second, a list (or tuple) such as c = [a, b] will do.
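
A minimal sketch of that approach, using the a and b from the question. The add_bias helper and the pairs list are hypothetical names introduced only for illustration, and the second pair is made-up data standing in for whatever the real input looks like.

```python
import numpy as np

def add_bias(x):
    # Hypothetical helper (not from the question): prepend a column of ones.
    return np.concatenate((np.ones((x.shape[0], 1)), x), axis=1)

a = np.zeros((2, 4))
b = np.zeros((2, 8))

# A plain list keeps each array intact, whatever its shape.
c = [a, b]
print(c[0] is a)  # True

# Assumed layout: a list of (first, second) pairs; only the first gets a bias column.
pairs = [(a, b), (np.zeros((3, 4)), np.zeros((3, 8)))]
biased = [(add_bias(first), second) for first, second in pairs]
print(biased[0][0].shape)  # (2, 5)
```

Because the list only holds references, nothing is copied until add_bias builds the new array, and the second element of each pair is passed through untouched.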


By the way, if you are using the statsmodels package, then you can add a constant column with sm.add_constant:

import numpy as np
import statsmodels.api as sm

a = np.random.randint(10, size=(2,4))
print(a)
# [[2 3 9 6]
#  [0 2 1 1]]
print(sm.add_constant(a))
# [[ 1.  2.  3.  9.  6.]
#  [ 1.  0.  2.  1.  1.]]

Note however that if a already contains a constant column, no extra column is added:

In [126]: sm.add_constant(np.zeros((2,4)))
Out[126]: 
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

1 Comment

"nested NumPy arrays of NumPy arrays are not very useful." This comment isn't very useful. Some libraries use nested NumPy arrays of NumPy arrays and there is no avoiding them.
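
For cases where a nested array really is unavoidable, a rough sketch of one way to build and use it follows. It assumes an object-dtype array filled one element at a time, which sidesteps the shape guessing that np.array([a, b]) performs; whether this matches what a given library hands you is an assumption.

```python
import numpy as np

a = np.zeros((2, 4))
b = np.zeros((2, 8))

# Assumption: an object-dtype container, filled element by element.
c = np.empty(2, dtype=object)
c[0] = a
c[1] = b

print(c.shape)    # (2,)
print(c[0] is a)  # True

# Replace the first element with a version that has a leading column of ones.
c[0] = np.hstack((np.ones((c[0].shape[0], 1)), c[0]))
print(c[0].shape)  # (2, 5)
```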
