Please check the following statements.
Backpropagation from the output layer to the hidden layer basically involves:
- Slope of the loss function, i.e., the error, which is equal to `(out_y - target_y)`.
- Slope of the activation function, i.e., the sigmoid derivative of the node the error is coming from (the output layer, in this case), i.e., `out_y * (1 - out_y)`.
- The value of the node that feeds into the weight (i.e., the hidden-layer unit, `out_h`).
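Putting those three pieces together, the full chain-rule gradient I would expect (a sketch, assuming a squared-error loss and a sigmoid output unit, as the statements above suggest) is:

$$
\frac{\partial E}{\partial w} = \underbrace{(out_y - target_y)}_{\text{loss slope}} \cdot \underbrace{out_y\,(1 - out_y)}_{\text{sigmoid slope}} \cdot \underbrace{out_h}_{\text{input to the weight}}
$$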
Based on the above statements, the following pseudocode is given on the page:
```
l2_error = out_y - target_y    # slope of the loss (output minus target)
dw = l2_error . out_h          # the '.' denotes the product with the hidden-layer output
db = l2_error                  # bias gradient
```
My doubt: the 2nd statement above and the 2nd line of the pseudocode do not match. When the weight gradient is calculated, the sigmoid derivative is not taken into account; `dw` is just the product of the error (actual minus target) and the output of the hidden layer, which confuses me. And this is not the case in backpropagation from the hidden to the input layer, where the sigmoid derivative does appear.
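To make the mismatch concrete, here is a minimal NumPy sketch of the two versions side by side. The variable names `out_y`, `target_y`, and `out_h` follow the pseudocode; the shapes and values are made up purely for illustration:

```python
import numpy as np

# Toy values just for illustration: one output unit, three hidden units.
out_y = np.array([0.7])            # sigmoid output of the network
target_y = np.array([1.0])         # desired output
out_h = np.array([0.2, 0.5, 0.9])  # hidden-layer activations feeding the output weights

l2_error = out_y - target_y

# Version written on the page: no sigmoid derivative.
dw_page = np.outer(l2_error, out_h)

# Version I expected from the three statements: includes out_y * (1 - out_y).
l2_delta = l2_error * out_y * (1 - out_y)
dw_expected = np.outer(l2_delta, out_h)

print(dw_page)
print(dw_expected)  # differs from dw_page by the factor out_y * (1 - out_y)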
@Javeria_Tariq, could you please explain this?