
I am learning about neural networks and implementing one in Python. I first defined a softmax function, following the solution given in this question: Softmax function - python. Here is my code:

import numpy as np

def softmax(A):
    """
    Computes a softmax function.
    Input: A (N, k) ndarray.
    Returns: (N, k) ndarray.
    """
    e = np.exp(A)
    s = e / np.sum(e, axis=0)
    return s

I was given some test code to check whether the softmax function is correct. test_array is the test data and test_output is the expected output of softmax(test_array). Here is the test code:

# Test if your function works correctly.
test_array = np.array([[0.101,0.202,0.303],
                       [0.404,0.505,0.606]]) 
test_output = [[ 0.30028906,  0.33220277,  0.36750817],
               [ 0.30028906,  0.33220277,  0.36750817]]
print(np.allclose(softmax(test_array),test_output))

However, the softmax function I defined returns something different on this data:

print(softmax(test_array))

[[ 0.42482427  0.42482427  0.42482427]
 [ 0.57517573  0.57517573  0.57517573]]

Could anyone point out what is wrong with the softmax function I defined?

4 Answers


The problem is in your sum. You are summing over axis 0, when axis 0 is the one you should leave untouched.

To sum over all the entries in the same example, i.e., in the same row, you have to use axis 1 instead.

def softmax(A):
    """
    Computes a softmax function. 
    Input: A (N, k) ndarray.
    Returns: (N, k) ndarray.
    """
    e = np.exp(A)
    return e / np.sum(e, axis=1, keepdims=True)

Use keepdims=True to preserve the shape, so that e can be divided by the sum row by row.
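
To see what keepdims changes, here is a minimal sketch (using the test_array from the question) comparing the shapes of the two sums:

import numpy as np

e = np.exp(np.array([[0.101, 0.202, 0.303],
                     [0.404, 0.505, 0.606]]))   # shape (2, 3)

print(np.sum(e, axis=1).shape)                  # (2,)   -- dividing e by this raises a broadcasting error
print(np.sum(e, axis=1, keepdims=True).shape)   # (2, 1) -- broadcasts against (2, 3) row by row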

In your example, e evaluates to:

[[ 1.10627664  1.22384801  1.35391446]
 [ 1.49780395  1.65698552  1.83308438]]

then the sum for each example (denominator in the return line) is:

[[ 3.68403911]
 [ 4.98787384]]

The function then divides each row by its sum and gives the result you have in test_output.

As MaxU pointed out, it is good practice to subtract the max before exponentiating, in order to avoid overflow:

e = np.exp(A - np.max(A, axis=1, keepdims=True))
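
A quick sketch of why this is safe: subtracting a row-wise constant leaves the softmax unchanged, because the factor it introduces cancels between numerator and denominator, while keeping np.exp within range:

A = np.array([[1000.0, 1001.0, 1002.0]])

# np.exp(1000.0) alone overflows to inf, which would make the unshifted version return nan.
shifted = A - np.max(A, axis=1, keepdims=True)   # [[-2. -1.  0.]]
e = np.exp(shifted)
print(e / np.sum(e, axis=1, keepdims=True))      # [[ 0.09003057  0.24472847  0.66524096]]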



Try this:

In [327]: def softmax(A):
     ...:     e = np.exp(A)
     ...:     return e / e.sum(axis=1).reshape((-1, 1))

In [328]: softmax(test_array)
Out[328]:
array([[ 0.30028906,  0.33220277,  0.36750817],
       [ 0.30028906,  0.33220277,  0.36750817]])

or better, this version, which prevents overflow when large values are exponentiated:

def softmax(A):
    e = np.exp(A - np.max(A, axis=1).reshape((-1, 1)))
    return e / e.sum(axis=1).reshape((-1, 1))
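
For instance (a quick sketch continuing the session), inputs that overflow the first version come out fine here:

In [329]: softmax(np.array([[1000., 1001., 1002.]]))
Out[329]: array([[ 0.09003057,  0.24472847,  0.66524096]])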



You can print np.sum(e, axis=0) yourself. You will see it is an array with 3 elements: [ 2.60408059  2.88083353  3.18699884]. Then e / np.sum(e, axis=0) divides each row of e by this 3-element array, i.e., it normalizes the columns of e. Apparently that is not what you want.

You should change np.sum(e, axis=0) to np.sum(e, axis=1, keepdims=True), so that you get

[[ 3.68403911]
 [ 4.98787384]]

instead, which is what you actually want. Then you will get the right result.

I also recommend reading the rules of broadcasting in numpy. They describe how addition/subtraction/multiplication/division work on two arrays with different shapes.
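
As a rough illustration of those rules, note how the divisor's shape decides what gets normalized (a sketch, with e as above):

e / np.sum(e, axis=0)                  # (2, 3) / (3,)   -> column-normalized (the bug)
e / np.sum(e, axis=1, keepdims=True)   # (2, 3) / (2, 1) -> row-normalized (what you want)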



Perhaps this may be enlightening:

>>> np.sum(test_output, axis=1)
array([ 1.,  1.])

Notice that each row is normalized. In other words, they want you to compute softmax for each row independently.
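
As a quick check (assuming the corrected axis=1 version of softmax from the answers above), the rows of the output now sum to one:

>>> np.sum(softmax(test_array), axis=1)
array([ 1.,  1.])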

