
I'm trying to apply a vectorized function over a 2-d array in numpy row-wise, and I'm encountering ValueError: setting an array element with a sequence.

import numpy as np

X = np.array([[0, 1], [2, 2], [3, 0]], dtype=float)
coeffs = np.array([1, 1], dtype=float)

np.apply_along_axis(
    np.vectorize(lambda row: 1.0 / (1.0 + np.exp(-coeffs.dot(row)))),
    0, X
)

I don't totally know how to interpret this error. How am I setting an array element with a sequence?

When I test the lambda function on a single row, it works and returns a single float. Somehow it's failing within the scope of this vectorized function, which leads me to believe that either the vectorized function is wrong or I'm not using apply_along_axis correctly.

Is it possible to use a vectorized function in this context? If so, how? Can a vectorized function take an array or am I misunderstanding the documentation?

  • Why are you calling np.vectorize on a function that's supposed to take rows? Commented Jul 12, 2017 at 20:51
  • I ended up using the solution suggested by Divakar, but I'm interested in understanding if it's possible to use vectorize because I thought the row-wise implementation was slightly easier to interpret. Commented Jul 12, 2017 at 21:00
  • As ironic as this might sound, np.vectorize isn't a vectorized operation, which is what you seem to be asking about. From the docs: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop." Commented Jul 12, 2017 at 21:03
  • Yeah, you're right. I think vectorize is just an all around bad idea. Thanks for the help! Commented Jul 12, 2017 at 21:12

2 Answers


Your function sum-reduces the second axis of X against the only axis of coeffs, which is exactly what np.dot(X, coeffs) computes for all rows at once.

Thus, a vectorized solution would be -

1.0 / (1.0 + np.exp(-X.dot(coeffs)))

Sample run -

In [227]: X = np.array([[0, 1], [2, 2], [3, 0]], dtype=float)
     ...: coeffs = np.array([1, 1], dtype=float)
     ...: 

# Using list comprehension    
In [228]: [1.0 / (1.0 + np.exp(-coeffs.dot(x))) for x in X]
Out[228]: [0.7310585786300049, 0.98201379003790845, 0.95257412682243336]

# Using proposed method
In [229]: 1.0 / (1.0 + np.exp(-X.dot(coeffs)))
Out[229]: array([ 0.73105858,  0.98201379,  0.95257413])
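To make the sum-reduction concrete, here is a minimal sketch showing that X.dot(coeffs) matches multiplying each row by coeffs and summing over the second axis. The names z_dot, z_sum, z_einsum and probs are illustrative and not from the original post.

```python
import numpy as np

X = np.array([[0, 1], [2, 2], [3, 0]], dtype=float)
coeffs = np.array([1, 1], dtype=float)

# Three equivalent ways to reduce the second axis of X against coeffs.
z_dot = X.dot(coeffs)                       # matrix-vector product, shape (3,)
z_sum = (X * coeffs).sum(axis=1)            # broadcast multiply, then sum each row
z_einsum = np.einsum('ij,j->i', X, coeffs)  # the same reduction in index notation

# The logistic function is elementwise, so it applies to the whole vector at once.
probs = 1.0 / (1.0 + np.exp(-z_dot))
print(z_dot, probs.shape)
```

Because the reduction already yields one value per row, no Python-level loop over rows is needed.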

The correct way to use np.apply_along_axis would be to drop np.vectorize and apply it along the second axis of X, i.e. every row of X -

np.apply_along_axis(lambda row: 1.0 / (1.0 + np.exp(-coeffs.dot(row))), 1,X)

1 Comment

This is right, and I actually ended up doing something very similar rather than messing with vectorize. I guess I'm more interested in why the earlier code didn't work. I'm going to edit my question

The v1.12 vectorize docs say:

By default, pyfunc is assumed to take scalars as input and output.

In your attempt:

np.apply_along_axis(
    np.vectorize(lambda row: 1.0 / (1.0 + np.exp(-coeffs.dot(row)))),
    0, X
)

With axis 0, apply_along_axis iterates on all axes except 0, and feeds each resulting 1d slice to its function. So for your 2d X it iterates on axis 1 and feeds the columns (length 3), not the rows. Divakar's version passes axis 1 instead, so it iterates on the 0 axis and feeds rows. That makes it basically the same as the list comprehension with an array wrapper.
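A quick way to see which slices apply_along_axis passes is to record their shapes. This is only a sketch; record_shapes and probe are hypothetical helper names, and probe returns a dummy scalar so the call itself succeeds.

```python
import numpy as np

X = np.array([[0, 1], [2, 2], [3, 0]], dtype=float)

def record_shapes(axis):
    """Return the shapes of the 1-d slices apply_along_axis passes for `axis`."""
    shapes = []

    def probe(v):
        shapes.append(v.shape)
        return 0.0  # dummy scalar result

    np.apply_along_axis(probe, axis, X)
    return shapes

shapes_axis0 = record_shapes(0)  # slices taken along axis 0: the columns
shapes_axis1 = record_shapes(1)  # slices taken along axis 1: the rows
print(shapes_axis0, shapes_axis1)
```

With this X, axis 0 yields slices of length 3 and axis 1 yields slices of length 2, so only axis 1 lines up with the length-2 coeffs.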

apply_along_axis makes more sense with 3d or higher inputs, where it's more fiddly to iterate on 2 axes and feed the third to your function.
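As an illustration of the higher-dimensional case, the sketch below applies the same function along the last axis of a made-up 3d array and compares it with explicit loops. X3 and manual are hypothetical names and the data is arbitrary.

```python
import numpy as np

coeffs = np.array([1, 1], dtype=float)

def foo(row):
    return 1.0 / (1.0 + np.exp(-coeffs.dot(row)))

# Hypothetical 3d input: a 4x3 grid of length-2 rows.
X3 = np.arange(24, dtype=float).reshape(4, 3, 2)

# Feed the last axis to foo; apply_along_axis iterates over the first two.
out = np.apply_along_axis(foo, 2, X3)

# The same computation with explicit loops over the first two axes.
manual = np.empty(X3.shape[:2])
for i in range(X3.shape[0]):
    for j in range(X3.shape[1]):
        manual[i, j] = foo(X3[i, j])

print(out.shape)
```

The single call replaces the nested loops and the result allocation, which is where apply_along_axis saves the most typing.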

Writing your lambda as a function:

def foo(row):
    return 1.0/(1.0+np.exp(-coeffs.dot(row)))

Given an array (row) it returns a scalar:

In [768]: foo(X[0,:])
Out[768]: 0.7310585786300049

But given a scalar, it returns an array:

In [769]: foo(X[0,0])
Out[769]: array([ 0.5,  0.5])

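That explains the sequence error message. By default vectorize passes each element to your function as a scalar and expects a scalar back, but it got a length-2 array, which cannot be stored in a single element of the result.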
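The following sketch puts those pieces together. probe is a hypothetical helper that records the dimensionality of each argument it receives, and the try/except reproduces the call from the question. The exact exception text may vary between NumPy versions, so the code catches it loosely and prints whatever was raised.

```python
import numpy as np

X = np.array([[0, 1], [2, 2], [3, 0]], dtype=float)
coeffs = np.array([1, 1], dtype=float)

def foo(row):
    return 1.0 / (1.0 + np.exp(-coeffs.dot(row)))

# Without a signature, np.vectorize calls its function once per element,
# so every call sees a 0-d (scalar) argument.
seen_ndims = []

def probe(x):
    seen_ndims.append(np.ndim(x))
    return 0.0

np.vectorize(probe)(X[:, 0])

# foo on a scalar multiplies coeffs by that scalar and returns a length-2 array.
scalar_result_shape = np.shape(foo(X[0, 0]))

# Reproduce the failing call from the question.
try:
    np.apply_along_axis(np.vectorize(foo), 0, X)
    raised = False
except (ValueError, TypeError) as exc:
    raised = True
    print(type(exc).__name__, exc)

print(set(seen_ndims), scalar_result_shape, raised)
```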

signature

v1.12 added a signature parameter to vectorize, which lets us feed something bigger than a scalar to the function. I explored it in:

https://stackoverflow.com/a/44752552/901925

Using the signature I get vectorize to work with:

In [784]: f = np.vectorize(foo, signature='(n)->()')
In [785]: f(X)
Out[785]: array([ 0.73105858,  0.98201379,  0.95257413])

This gives the same result as:

In [787]: np.apply_along_axis(foo,1,X)
Out[787]: array([ 0.73105858,  0.98201379,  0.95257413])
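To show what the '(n)->()' signature means, here is a small sketch: the last axis is treated as the core dimension consumed by foo, and any leading axes are looped over. X3 is a made-up stack of two copies of X (one scaled), not something from the original question.

```python
import numpy as np

X = np.array([[0, 1], [2, 2], [3, 0]], dtype=float)
coeffs = np.array([1, 1], dtype=float)

def foo(row):
    return 1.0 / (1.0 + np.exp(-coeffs.dot(row)))

# '(n)->()' says: foo takes a 1-d array of length n and returns a scalar.
f = np.vectorize(foo, signature='(n)->()')

out_2d = f(X)                # loops over axis 0, shape (3,)

X3 = np.stack([X, 2 * X])    # hypothetical 3d input, shape (2, 3, 2)
out_3d = f(X3)               # loops over the first two axes, shape (2, 3)

# Whole-array expression for comparison; dot reduces the last axis of X3.
direct_3d = 1.0 / (1.0 + np.exp(-X3.dot(coeffs)))

print(out_2d.shape, out_3d.shape)
```

The signature only changes what gets passed to foo. The iteration over the remaining axes is still done in Python.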

timings

In [788]: timeit np.apply_along_axis(foo,1,X)
10000 loops, best of 3: 80.8 µs per loop
In [789]: timeit f(X)
1000 loops, best of 3: 181 µs per loop
In [790]: np.array([foo(x) for x in X])
Out[790]: array([ 0.73105858,  0.98201379,  0.95257413])
In [791]: timeit np.array([foo(x) for x in X])
10000 loops, best of 3: 22.1 µs per loop

For this small array the list comprehension is fastest and vectorize with a signature is slowest.
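The comparison can be rerun outside IPython with the standard timeit module. The sketch below is one way to do it. The candidates dict and its labels are illustrative, and it also times the whole-array expression from the other answer, which was not part of the timings above. Absolute numbers depend on the machine, the NumPy version and the array size, so no expected output is given.

```python
import timeit

import numpy as np

X = np.array([[0, 1], [2, 2], [3, 0]], dtype=float)
coeffs = np.array([1, 1], dtype=float)

def foo(row):
    return 1.0 / (1.0 + np.exp(-coeffs.dot(row)))

f = np.vectorize(foo, signature='(n)->()')

candidates = {
    'apply_along_axis': lambda: np.apply_along_axis(foo, 1, X),
    'vectorize with signature': lambda: f(X),
    'list comprehension': lambda: np.array([foo(x) for x in X]),
    'whole-array expression': lambda: 1.0 / (1.0 + np.exp(-X.dot(coeffs))),
}

timings = {}
for name, fn in candidates.items():
    # Best of 3 repeats of 1000 calls each, converted to microseconds per call.
    best = min(timeit.repeat(fn, number=1000, repeat=3))
    timings[name] = best / 1000 * 1e6

for name, usec in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f'{name}: {usec:.1f} µs per call')
```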
