5

I am having problems to vectorize a multidimensional function.
Consider the following example:

def _cost(u):
    return u[0] - u[1]

cost = np.vectorize(_cost)

>>> x = np.random.normal(0, 1,(10, 2))
>>> cost(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2218, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2281, in _vectorize_call
    ufunc, otypes = self._get_ufunc_and_otypes(func=func, args=args)
  File "/Users/lucapuggini/MyApps/scientific_python_3_5/lib/python3.5/site-packages/numpy/lib/function_base.py", line 2243, in _get_ufunc_and_otypes
    outputs = func(*inputs)
TypeError: _cost() missing 1 required positional argument: 'v'

Background information: I encountered the problem while trying to generalize the following code (Particle Swarm Optimization Algorithm) to multivariate data:

import numpy as np
import matplotlib.pyplot as plt


def pso(cost, sim, space_dimension, n_particles, left_lim, right_lim, f1=1, f2=1, verbose=False):

    best_scores = np.array([np.inf]*n_particles)
    best_positions = np.zeros(shape=(n_particles, space_dimension))
    particles = np.random.uniform(left_lim, right_lim, (n_particles, space_dimension))
    velocities = np.zeros(shape=(n_particles, space_dimension))

    for i in range(sim):
        particles = particles + velocities
        print(particles)
        scores = cost(particles).ravel()
        better_positions = np.argwhere(scores < best_scores).ravel()
        best_scores[better_positions] = scores[better_positions]
        best_positions[better_positions, :] = particles[better_positions, :]
        g = best_positions[np.argmin(best_scores), :]

        u1 = np.random.uniform(0, f1, (n_particles, 1))
        u2 = np.random.uniform(0, f2, (n_particles, 1))
        velocities = velocities + u1 * (best_positions - particles) + u2 * (g - particles)

        if verbose and i % 50 == 0:
            print('it=', i, ' score=', cost(g))


            x = np.linspace(-5, 20, 1000)
            y = cost(x)

            plt.plot(x, y)
            plt.plot(particles, cost(particles), 'o')
            plt.vlines(g, y.min()-2, y.max())
            plt.show()


    return g, cost(g)




def test_pso_1_dim():

    def _cost(x):
        if 0 < x < 15:
            return np.sin(x)*x 
        else:
            return 15 + np.min([np.abs(x-0), np.abs(x-15)])

    cost = np.vectorize(_cost)

    sim = 100
    space_dimension = 1
    n_particles = 5
    left_lim, right_lim = 0, 15
    f1, f2  = 1, 1

    x, cost_x = pso(cost, sim, space_dimension, n_particles,
                    left_lim, right_lim, f1, f2, verbose=False)

    x0 = 11.0841839
    assert np.abs(x - x0) < 0.01

    return 

Please advise me if vectorization is not a good idea in this case.

2 Answers 2

6

As mentioned in the notes for vectorize:

The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop.

So while vectorizing your code may be a good idea via numpy types and functions, you probably shouldn't do this using numpy.vectorize.

For the example you gave, your cost might be simply and efficiently calculated as a function operating on a numpy array:

def cost(x):
    # Create the empty output 
    output = np.empty(x.shape)
    # Select the first group using a boolean array
    group1 = (0 < x) & (x < 15) 
    output[group1] = np.sin(x[group1])*x[group1]
    # Select second group as inverse (logical not) of group1 
    output[~group1] = 15 + np.min(
        [np.abs(x[~group1]-0), np.abs(x[~group1]-15)],
        axis=0)
    return output
Sign up to request clarification or add additional context in comments.

Comments

3

np.vectorize feeds scalars to your function. For example:

In [1090]: def _cost(u):
      ...:     return u*2

In [1092]: cost=np.vectorize(_cost)
In [1093]: cost(np.arange(10)
      ...: )
Out[1093]: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
In [1094]: cost(np.ones((3,4)))
Out[1094]: 
array([[ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.],
       [ 2.,  2.,  2.,  2.]])

But your function acts as though it is getting a list or array with 2 values. What were you intending?

A function with 2 scalars:

In [1095]: def _cost(u,v):
      ...:     return u+v
      ...:     
      ...: 
In [1096]: cost=np.vectorize(_cost)

In [1098]: cost(np.arange(3),np.arange(3,6))
Out[1098]: array([3, 5, 7])

In [1099]: cost([[1],[2]],np.arange(3,6))
Out[1099]: 
array([[4, 5, 6],
       [5, 6, 7]])

Or with your 2 column x:

In [1103]: cost(x[:,0],x[:,1])
Out[1103]: 
array([-1.7291913 , -0.46343403,  0.61574928,  0.9864683 , -1.22373097,
        1.01970917,  0.22862683, -0.11653917, -1.18319723, -3.39580376])

which is the same as doing an array sum on axis 1

In [1104]: x.sum(axis=1)
Out[1104]: 
array([-1.7291913 , -0.46343403,  0.61574928,  0.9864683 , -1.22373097,
        1.01970917,  0.22862683, -0.11653917, -1.18319723, -3.39580376])

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.