Please check the pseudocode in the image snippet.
When calculating dW1, you simply multiply l1_error by the transposed input, i.e. X.T,
where dh is the dot product at the hidden layer, which was calculated earlier.
That dh is the dot product of the second-layer weights, i.e. W2, and l2_error.
Now check your formula for dE/dW1. In this formula you take two derivatives into account:
1. the slope at the final output, multiplied by the l2 error and by the weights of the 2nd layer, which is again multiplied by
2. the slope at the hidden layer, multiplied by the input
Why is the slope calculated at the output layer not taken into account in your pseudocode snippet? In dh you only calculate the dot product of the weights and the error, but not the slope.
Am I missing something here?
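To make the question concrete, here is a minimal NumPy sketch of what I would expect from the chain rule. It is not the lesson's code: I am assuming sigmoid activations at both layers, a squared-error loss, and l2_error defined as output minus target. The helper names (forward, loss, grad_W1, numerical_grad_W1) and the use_output_slope flag are my own, added only for illustration. It compares dW1 computed with and without the output-layer slope against a numerical gradient.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W1, W2):
    h = sigmoid(X @ W1)    # hidden-layer activations
    out = sigmoid(h @ W2)  # final output
    return h, out

def loss(X, y, W1, W2):
    # Assumed loss: squared error, E = 0.5 * sum((out - y)^2)
    _, out = forward(X, W1, W2)
    return 0.5 * np.sum((out - y) ** 2)

def grad_W1(X, y, W1, W2, use_output_slope=True):
    h, out = forward(X, W1, W2)
    l2_error = out - y  # assumed sign convention
    if use_output_slope:
        # error scaled by the sigmoid slope at the output layer
        l2_delta = l2_error * out * (1 - out)
    else:
        # what I read in the pseudocode: the raw error, no slope
        l2_delta = l2_error
    dh = l2_delta @ W2.T         # error propagated back through W2
    l1_delta = dh * h * (1 - h)  # scaled by the sigmoid slope at the hidden layer
    return X.T @ l1_delta        # dE/dW1

def numerical_grad_W1(X, y, W1, W2, eps=1e-6):
    # Central finite differences, one weight at a time
    grad = np.zeros_like(W1)
    for idx in np.ndindex(*W1.shape):
        W_plus = W1.copy()
        W_plus[idx] += eps
        W_minus = W1.copy()
        W_minus[idx] -= eps
        grad[idx] = (loss(X, y, W_plus, W2) - loss(X, y, W_minus, W2)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # 4 samples, 3 input features
y = rng.uniform(size=(4, 1))  # targets in (0, 1)
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 1))

numeric = numerical_grad_W1(X, y, W1, W2)
with_slope = grad_W1(X, y, W1, W2, use_output_slope=True)
without_slope = grad_W1(X, y, W1, W2, use_output_slope=False)

print("with output slope matches numerical gradient:",
      np.allclose(with_slope, numeric, atol=1e-6))
print("without output slope matches numerical gradient:",
      np.allclose(without_slope, numeric, atol=1e-6))
```

Under these assumptions, only the version that includes the output-layer slope should agree with the numerical gradient, which is why I expected the slope to appear somewhere before dh is computed.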
Course: https://www.educative.io/courses/beginners-guide-to-deep-learning
Lesson: Back Propagation - A Beginner's Guide to Deep Learning