
I was trying to do matrix dot products and transposes with NumPy, and I found that arrays can do many of the things matrices can do, such as dot products, point-wise products, and transposes.

When I want to create a matrix, I have to create an array first.

example:

import numpy as np

array = np.ones([3, 1])
matrix = np.matrix(array)

Since I can do the matrix transpose and dot product with the array type, I don't have to convert an array into a matrix to do matrix operations.

For example, the following line is valid, where A is an ndarray:

dot_product = np.dot(A.T, A)

The same operation can be expressed with a matrix-class variable A:

dot_product = A.T * A

For an ndarray, however, the operator * is the point-wise product, not the matrix product. Since ndarray and matrix are otherwise almost indistinguishable at a glance, the same expression means two different things depending on the type, and that causes confusion.
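For example, with a small (3,1) column vector and its matrix counterpart, the same expression gives completely different results (a quick sketch, values just np.arange(3)):

import numpy as np

A = np.arange(3).reshape(3, 1)   # ndarray column vector, shape (3, 1)
M = np.matrix(A)                 # same data as np.matrix

A.T * A    # element-wise with broadcasting: (1,3) * (3,1) -> (3,3) outer product
# array([[0, 0, 0],
#        [0, 1, 2],
#        [0, 2, 4]])

M.T * M    # matrix product: (1,3) x (3,1) -> (1,1)
# matrix([[5]])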

This confusion is a serious problem, as stated in PEP 465:

Writing code using numpy.matrix also works fine. But trouble begins as soon as we try to integrate these two pieces of code together. Code that expects an ndarray and gets a matrix, or vice-versa, may crash or return incorrect results. Keeping track of which functions expect which types as inputs, and return which types as outputs, and then converting back and forth all the time, is incredibly cumbersome and impossible to get right at any scale.

It would be very tempting to stick to ndarray, deprecate matrix, and in the future support ndarray with matrix-operation methods such as .inverse(), .hermitian(), outerproduct(), etc.
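(These operations already exist for ndarray, just as functions rather than methods; a quick sketch with standard NumPy calls:)

import numpy as np

A = np.array([[2.0, 0.0], [1.0, 3.0]])
v = np.array([1.0, 2.0])

np.linalg.inv(A)   # matrix inverse (what np.matrix exposes as .I)
A.conj().T         # Hermitian (conjugate) transpose (np.matrix's .H)
np.outer(v, v)     # outer product of two 1d arrays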

The major reason I still have to use the matrix class is that it handles a 1d array as a 2d array, so I can transpose it.

So far it is very inconvenient to transpose a 1d array, since a 1d array of size n has shape (n,) instead of (1, n). For example, if I have to do the inner product of two arrays:

A = [[1,1,1],[2,2,2],[3,3,3]]

B = [[1,2,3],[1,2,3],[1,2,3]]

np.dot(A, B) works fine, but if

B = [1,1,1]

then its transpose is still a row vector: transposing a 1d array of shape (3,) changes nothing, so it never becomes a column vector.

I have to handle this case separately when the dimensionality of the input is unknown.
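What I do for now is normalize the input to 2d before transposing; a sketch (the helper name as_column is just mine, for illustration):

import numpy as np

def as_column(v):
    # hypothetical helper: return v as an (n, 1) column, whether it comes in 1d or 2d
    v = np.asarray(v)
    return v[:, None] if v.ndim == 1 else v

B = np.array([1, 1, 1])
B.T.shape              # (3,)   -- transposing a 1d array changes nothing
B[:, None].shape       # (3, 1) -- explicit column
as_column(B).shape     # (3, 1)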

I hope this helps some people with the same trouble, and I would like to know whether there is a better way to handle matrix operations as in Matlab, especially with 1d arrays. Thanks.


1 Answer


Your first example is a column vector:

In [258]: x = np.arange(3).reshape(3,1)
In [259]: x
Out[259]: 
array([[0],
       [1],
       [2]])
In [260]: xm = np.matrix(x)

dot produces the inner product, and the dimensions combine as (1,3),(3,1) => (1,1):

In [261]: np.dot(x.T, x)
Out[261]: array([[5]])

the matrix product does the same thing:

In [262]: xm.T * xm
Out[262]: matrix([[5]])

(The same thing with 1d arrays produces a scalar value, np.dot([0,1,2],[0,1,2]) # 5)

Element-wise multiplication of the arrays produces the outer product (as do np.outer(x, x) and np.dot(x, x.T)):

In [263]: x.T * x
Out[263]: 
array([[0, 0, 0],
       [0, 1, 2],
       [0, 2, 4]])

For ndarray, * IS element-wise multiplication (the .* of MATLAB, but with broadcasting added). For element-wise multiplication of a matrix use np.multiply(xm, xm). (scipy sparse matrices have a multiply method, X.multiply(other).)
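For example, continuing with the xm above (a quick sketch):

np.multiply(xm, xm)        # element-wise square of the column matrix
# matrix([[0],
#         [1],
#         [4]])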

You quote from the PEP that added the @ operator (matmul). This, as well as np.tensordot and np.einsum, can handle larger dimensional arrays and other mixes of products. Those don't make sense with np.matrix, since that's restricted to 2d.
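A minimal sketch of what "larger dimensional arrays" buys you, something np.matrix cannot express:

import numpy as np

a = np.ones((4, 2, 3))                     # a stack of four (2,3) matrices
b = np.ones((4, 3, 5))                     # a stack of four (3,5) matrices

(a @ b).shape                              # (4, 2, 5) -- batched matrix product
np.einsum('ijk,ikl->ijl', a, b).shape      # (4, 2, 5) -- same product via einsum
np.tensordot(a, b, axes=([2], [1])).shape  # (4, 2, 4, 5) -- contracts only the shared axis, no batching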

With your 3x3 A and B

In [273]: np.dot(A,B)
Out[273]: 
array([[ 3,  6,  9],
       [ 6, 12, 18],
       [ 9, 18, 27]])
In [274]: C=np.array([1,1,1])
In [281]: np.dot(A,np.array([1,1,1]))
Out[281]: array([3, 6, 9])

Effectively this sums each row. np.dot(A,np.array([1,1,1])[:,None]) does the same thing, but returns a (3,1) array.

np.matrix was created years ago to make numpy (actually one of its predecessors) feel more like MATLAB. A key feature is that it is restricted to 2d. That's what MATLAB was like back in the 1990s. np.matrix and MATLAB don't have 1d arrays; instead they have single column or single row matrices.

If the fact that ndarrays can be 1d (or even 0d) is a problem, there are many ways of adding that 2nd dimension. I prefer the [None,:] kind of syntax, but reshape is also useful. ndmin=2, np.atleast_2d and np.expand_dims also work.
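For example, all of these turn a (3,) array into a 2d one (a quick sketch):

import numpy as np

v = np.array([1, 1, 1])                    # shape (3,)

v[None, :].shape                           # (1, 3) row
v[:, None].shape                           # (3, 1) column
v.reshape(1, -1).shape                     # (1, 3)
np.atleast_2d(v).shape                     # (1, 3)
np.expand_dims(v, 0).shape                 # (1, 3)
np.array([1, 1, 1], ndmin=2).shape         # (1, 3)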

np.sum and other operations that reduce dimensions have a keepdims=True parameter to counter that. The new @ gives an operator syntax for matrix multiplication. As far as I know, the np.matrix class does not have any compiled code of its own.
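A short sketch of keepdims together with the @ operator:

import numpy as np

A = np.arange(9).reshape(3, 3)

A.sum(axis=1).shape                         # (3,)   -- the reduced axis is dropped
A.sum(axis=1, keepdims=True).shape          # (3, 1) -- kept, so it still acts like a column
(A @ A.sum(axis=1, keepdims=True)).shape    # (3, 1) -- matrix-vector style product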

============

The method that implements * for np.matrix uses np.dot:

def __mul__(self, other):
    if isinstance(other, (N.ndarray, list, tuple)) :
        # This promotes 1-D vectors to row vectors
        return N.dot(self, asmatrix(other))
    if isscalar(other) or not hasattr(other, '__rmul__') :
        return N.dot(self, other)
    return NotImplemented
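That promotion is easy to see in use (a small check, assuming the (3,1) xm from above):

np.asmatrix([0, 1, 2]).shape    # (1, 3): a plain list becomes a ROW matrix
(xm * [0, 1, 2]).shape          # (3, 3): (3,1) * (1,3) works
# xm.T * [0, 1, 2] would be (1,3) * (1,3) and raises ValueError,
# unlike np.dot([0,1,2], [0,1,2]), which simply returns the scalar 5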

1 Comment

Thanks, this is really helpful. I think we can say the matrix class should be deprecated, since ndarray seems to be more versatile?
