Could you please check the backpropagation code highlighted in the screenshot? The question statement says the following:
1. Apply the sigmoid activation function to the net hidden layer outputs respectively.
2. Apply the softmax activation function to the net output of the final layer.
In this snippet, the slope (i.e., the derivative) of the softmax function is not taken into account when calculating DW3 and db3. Similarly, when calculating dh2 (the dot product at hidden layer 2), only l3_error is used, so the derivative of the softmax function is missing there as well.
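One way to check this numerically is sketched below. It assumes the loss is cross-entropy with one-hot labels, which the question statement does not say explicitly. The names z3, y, softmax, cross_entropy and numerical_grad are hypothetical and are not taken from the course code. Under that assumption, the gradient of the loss with respect to the final-layer net output simplifies to softmax(z3) - y, so the softmax derivative would already be folded into l3_error if it is defined that way:

```python
import numpy as np

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(z, y):
    # Mean cross-entropy of softmax(z) against one-hot labels y (assumed loss).
    return -np.sum(y * np.log(softmax(z))) / z.shape[0]

def numerical_grad(z, y, eps=1e-6):
    # Central finite differences of the loss with respect to each logit.
    grad = np.zeros_like(z)
    for idx in np.ndindex(*z.shape):
        zp, zm = z.copy(), z.copy()
        zp[idx] += eps
        zm[idx] -= eps
        grad[idx] = (cross_entropy(zp, y) - cross_entropy(zm, y)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
z3 = rng.normal(size=(4, 3))             # hypothetical net output: 4 samples, 3 classes
y = np.eye(3)[rng.integers(0, 3, 4)]     # hypothetical one-hot labels

# Candidate gradient with no separate softmax-derivative term.
analytic = (softmax(z3) - y) / z3.shape[0]
numeric = numerical_grad(z3, y)

max_diff = np.abs(analytic - numeric).max()
print("max |analytic - numeric| =", max_diff)
```

If the two gradients agree, no extra softmax term is needed at the output layer under this loss. With a different loss, the softmax Jacobian would have to be applied explicitly. In either case, the sigmoid slope a2 * (1 - a2) would still be expected when going from dh2 to the layer-2 error.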
Could you please also check the rest of the code, down to the first layer, for the same issue? @Javeria_Tariq please assist. Thanks