I am creating a neural network from scratch for the MNIST data, so I have 10 classes in the output layer. I need to perform backpropagation, and for that I need to calculate dA*dZ for the last layer, where dA is the derivative of the loss function L with respect to the softmax activation A, and dZ is the derivative of the softmax activation A with respect to z, where z = wx + b. The size obtained for dA is 10x1, whereas the size obtained for dZ is 10x10.

Is this correct? If yes, how do I multiply dA*dZ, given that they have different dimensions?
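For reference, the 10x10 shape comes from the softmax Jacobian: every output A_i depends on every input z_j, so the derivative is a full matrix rather than a 10x1 vector:

    dA_i/dz_j = A_i * (delta_ij - A_j)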

1 Answer

You are almost there. However, you need to transpose dA, e.g. with numpy.transpose(dA) or dA.T. Then the shapes line up for matrix multiplication: a 1x10 row vector times the 10x10 Jacobian gives the 1x10 gradient of the loss with respect to z.
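
A minimal NumPy sketch of the idea (the variable names and random placeholder inputs are illustrative, not from the original post):

    import numpy as np

    def softmax(z):
        # Numerically stable softmax: shift by the max before exponentiating.
        e = np.exp(z - np.max(z))
        return e / np.sum(e)

    z = np.random.randn(10, 1)   # last-layer pre-activation, shape (10, 1)
    A = softmax(z)               # softmax output, shape (10, 1)

    # Jacobian of softmax w.r.t. z: dA_i/dz_j = A_i * (delta_ij - A_j)
    dZ = np.diagflat(A) - A @ A.T    # shape (10, 10)

    dA = np.random.randn(10, 1)  # stand-in for the upstream gradient dL/dA, shape (10, 1)

    dL_dz = dA.T @ dZ            # (1, 10) @ (10, 10) -> (1, 10)
    print(dL_dz.shape)           # (1, 10)

Because this Jacobian is symmetric, dA.T @ dZ and (dZ @ dA).T give the same values, so either ordering works once the dimensions agree.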

2 Comments

Thanks! By the way, is this the correct way to apply the softmax derivative?
Yes, it's right, but don't forget to multiply by the derivative of z with respect to w. The following link might be helpful for you: stats.stackexchange.com/questions/235528/…
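
To make the comment's point concrete, here is a continuation of the sketch above; the input x, its size n, and the weight shapes are assumptions for illustration, not details from the original post:

    # Continuing the sketch above: the last layer computes z = W @ x + b.
    n = 784                      # e.g. a flattened 28x28 MNIST image (assumption)
    x = np.random.randn(n, 1)    # input to the last layer, shape (n, 1)

    dL_dz = (dA.T @ dZ).T        # reshape the gradient into a (10, 1) column
    dL_dW = dL_dz @ x.T          # shape (10, n): chain rule through z = W @ x + b
    dL_db = dL_dz                # shape (10, 1): dz/db is the identity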
