Calculate the softmax of an array column-wise using numpy

Question

Following https://classroom.udacity.com/courses/ud730/lessons/6370362152/concepts/63815621490923, I'm trying to write a "softmax" function which, when given a 2-dimensional array as input, calculates the softmax of each column. I wrote the following script to test it:

import numpy as np

#scores=np.array([1.0,2.0,3.0])

scores=np.array([[1,2,3,6],
                [2,4,5,6],
                [3,8,7,6]])

def softmax(x):
    if x.ndim==1:
        S=np.sum(np.exp(x))
        return np.exp(x)/S
    elif x.ndim==2:
        result=np.zeros_like(x)
        M,N=x.shape
        for n in range(N):
            S=np.sum(np.exp(x[:,n]))
            result[:,n]=np.exp(x[:,n])/S
        return result
    else:
        print("The input array is not 1- or 2-dimensional.")

s=softmax(scores)
print(s)

However, the result "s" turns out to be an array of zeros:

[[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]

If I remove the "/S" in the for-loop, the 'un-normalized' result is as I would expect; somehow the "/S" division appears to make all the elements zero instead of dividing each element by S. What is wrong with the code?

The problem is in result=np.zeros_like(x), which creates an array of integers if x is an array of integers. When you assign the normalized results (all in the interval 0<n<1) to an array of integers, term by term, they are converted to integers and hence forced to zero. The non-normalized exponentials, by contrast, are larger than 1, so they are merely truncated toward zero and stay nonzero. If you want to keep the loop, you can instantiate result as np.zeros(x.shape); if you'd rather 1) not use an auxiliary array and 2) remove the loop, please have a look at my answer below. ciao
– gboffi, Commented Apr 20, 2016 at 10:01

Kurt Peek · Accepted Answer · 2016-04-20 09:10:38Z

6

The reason for the "zeros" lies in the data type of the inputs, which are of the "int" type. Converting the input to "float" solved the problem:

import numpy as np

#scores=np.array([1.0,2.0,3.0])

scores=np.array([[1,2,3,6],
                [2,4,5,6],
                [3,8,7,6]])

def softmax(x):
    x=x.astype(float)
    if x.ndim==1:
        S=np.sum(np.exp(x))
        return np.exp(x)/S
    elif x.ndim==2:
        result=np.zeros_like(x)
        M,N=x.shape
        for n in range(N):
            S=np.sum(np.exp(x[:,n]))
            result[:,n]=np.exp(x[:,n])/S
        return result
    else:
        print("The input array is not 1- or 2-dimensional.")

s=softmax(scores)
print(s)

Note that I've added "x=x.astype(float)" to the first line of the function definition. This yields the expected output:

[[ 0.09003057  0.00242826  0.01587624  0.33333333]
 [ 0.24472847  0.01794253  0.11731043  0.33333333]
 [ 0.66524096  0.97962921  0.86681333  0.33333333]]
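
A quick way to see why the cast helps (a minimal sketch, not part of the original script): np.zeros_like inherits the dtype of its argument, so casting x to float first makes result a float array as well. The names int_result and float_result are illustrative only.

```python
import numpy as np

scores = np.array([[1, 2, 3, 6],
                   [2, 4, 5, 6],
                   [3, 8, 7, 6]])

# zeros_like copies the dtype of its argument.
int_result = np.zeros_like(scores)                  # integer dtype, like scores
float_result = np.zeros_like(scores.astype(float))  # float64 after the cast

print(int_result.dtype.kind)   # 'i' (signed integer)
print(float_result.dtype)      # float64
```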

answered Apr 20, 2016 at 9:10

Kurt Peek


1 Comment

gboffi Over a year ago

The reason for the "zeros" does NOT lie in the data type of the inputs; it depends on how you instantiate the matrix result, using np.zeros_like(x) instead of np.zeros(x.shape).

gboffi · Accepted Answer · 2017-11-15 10:42:22Z

3

The problem in your code is how you instantiate the placeholder for the results that you're about to compute, that is

    result=np.zeros_like(x)

because if x is an array of integers, result is also an array of integers, and when you assign to it,

        result[:,n]=np.exp(x[:,n])/S

a conversion to integer is enforced. After you normalize by dividing by S, all the numbers being converted lie in the interval (0, 1); the conversion truncates toward zero, so you end up with an array of zeros.

You said that, if you don't normalize, result is different from zero... that's because in that case the numbers being converted to integers are LARGER than 1.
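
The truncation can be reproduced in isolation. The following is a minimal sketch using the first column of the question's array; the names col, unnormalized, and normalized are illustrative only.

```python
import numpy as np

col = np.array([1, 2, 3])            # first column of scores (integers)
e = np.exp(col)                      # floats: about 2.718, 7.389, 20.086

unnormalized = np.zeros_like(col)    # integer array, like col
unnormalized[:] = e                  # fractional parts are dropped on assignment
print(unnormalized)                  # [ 2  7 20]

normalized = np.zeros_like(col)
normalized[:] = e / e.sum()          # every value lies strictly between 0 and 1
print(normalized)                    # [0 0 0]
```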

A possible solution, which you can use in your code as is, consists of instantiating an array of floats, irrespective of the type of x

    result=np.zeros(x.shape)

but I have to say that your code computes the exponential twice and uses loops where you could use vectorized operations.

Here is a different implementation that (a) avoids loops and (b) avoids unnecessary evaluations of the exponential,

def sm(a):
    s = np.exp(a)
    if a.ndim == 1:
        return s/s.sum()
    elif a.ndim == 2:
        return s/s.sum(0) 
    else:
        return

A small test,

In [32]: sm(np.array([[1,2,3,6],
                [2,4,5,6],
                [3,8,7,6]]))
Out[32]: 
array([[ 0.09003057,  0.00242826,  0.01587624,  0.33333333],
       [ 0.24472847,  0.01794253,  0.11731043,  0.33333333],
       [ 0.66524096,  0.97962921,  0.86681333,  0.33333333]])


Note that it also works with an integer array as input.

Addendum

Following the suggestion from n13 the function can be rewritten as

def sm(a):
    s = np.exp(a)
    if a.ndim <3: return s/s.sum(0)

Thank you n13.
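
A short check of the simplified version, as a sketch: for a 1-D input s.sum(0) is the total, and for a 2-D input it is the vector of column sums, which broadcasts across the rows. The function is restated here (with the one-line if expanded) so the snippet runs on its own; v and m are illustrative names.

```python
import numpy as np

def sm(a):
    s = np.exp(a)
    if a.ndim < 3:
        return s / s.sum(0)

v = sm(np.array([1, 2, 3]))
m = sm(np.array([[1, 2, 3, 6],
                 [2, 4, 5, 6],
                 [3, 8, 7, 6]]))

print(np.isclose(v.sum(), 1.0))          # True: a 1-D result sums to 1
print(np.allclose(m.sum(axis=0), 1.0))   # True: every column sums to 1
```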

PS: when I wrote the addendum I had not realized that n13 had posted an answer of their own...

edited Nov 15, 2017 at 10:42

answered Apr 20, 2016 at 9:33

gboffi


2 Comments

n13 Over a year ago

Note: the if-else is not needed here; s.sum(0) (the elif branch here) handles all input dimensions.

gboffi Over a year ago

@n13 I hadn't noticed the possible nice generalization — answer edited accordingly. Thanks a lot.

n13 · Accepted Answer · 2017-11-15 03:39:51Z

2

NumPy has some nifty matrix operations that make this problem a lot simpler to solve.

Calculating the exponential works on a matrix of any dimension.

The sum() method takes an argument, axis, which restricts the sum to a given axis; in our case, summing each column means summing along axis 0.

def softmax(x):
    exp = np.exp(x) # exp just calculates exp for all elements in the matrix
    return exp / exp.sum(0) # sum axis = 0 argument sums over axis representing columns
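
To make the role of the axis argument concrete, here is a hedged sketch. The helper softmax_along is hypothetical and not part of the answer above; it passes keepdims=True so that the division broadcasts correctly for any chosen axis, not only axis 0.

```python
import numpy as np

def softmax_along(x, axis=0):
    # Hypothetical helper, not from the answer: softmax along a chosen axis.
    e = np.exp(x)
    # keepdims=True keeps the summed axis with length 1 so the division broadcasts.
    return e / e.sum(axis=axis, keepdims=True)

x = np.array([[1, 2, 3, 6],
              [2, 4, 5, 6],
              [3, 8, 7, 6]])

print(np.exp(x).sum(0).shape)                              # (4,): one sum per column
print(np.allclose(softmax_along(x, 0).sum(axis=0), 1.0))   # True: columns sum to 1
print(np.allclose(softmax_along(x, 1).sum(axis=1), 1.0))   # True: rows sum to 1
```

With axis=0 this gives the same result as the softmax function above; the axis=1 call is only there to show why keepdims matters once the summed axis is not the first one.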

answered Nov 15, 2017 at 3:39

n13

