Back Propagation: https://www.educative.io/courses/beginners-guide-to-deep-learning/q2615KlEy67

Vikrant · August 9, 2023, 4:52am

def backward_propagation(X, Y, out_h, out_y):
    l2_error = out_y - Y # actual - target
    slope_output = out_y * (1 - out_y) # derivative of the sigmoid activation function at the output layer
    dw = l2_error * slope_output * out_h # gradient of the weights connecting the output and hidden layers
    db = l2_error * slope_output # gradient of the bias at the output layer

    dh = np.dot(W2.T, l2_error)
    l1_error = np.multiply(dh, out_h * (1 - out_h))
    dW1 = np.dot(l1_error, X.T)
    db1 = l1_error
    return dw, db, dW1, db1

@Javeria_Tariq there are 3 issues in the above code snippet.

First, this line:
dw = l2_error * slope_output * out_h # gradient of the weights connecting the output and hidden layers
will give an error when the weights are updated, because out_h has not been transposed. I believe it should be:
dW2 = np.dot((l2_error * slope_output), out_h.T) # gradient of the weights connecting the output and hidden layers
Correct me in case I’m wrong.

Second, this line:
db1 = l1_error
Earlier in the code it was:
db1 = np.sum(l1_error, axis=1, keepdims=True) # derivative of layer 1 bias is simply the error at layer 1
Though I am not sure if it will make any difference.
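
The two shape questions above can be checked directly in NumPy. The snippet below is only a sketch: it uses random stand-in values with the shapes from the XOR setup in this thread (2 inputs, 2 hidden units, 1 output, 4 samples), and the seed and the 0.01 step size are arbitrary assumptions. Only the shapes matter here, not the numbers.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only

# Stand-in arrays with the shapes used in the XOR example
out_h = rng.random((2, 4))          # hidden activations, one column per sample
out_y = rng.random((1, 4))          # network outputs
Y = np.array([[0, 1, 1, 0]])        # targets
W2 = rng.standard_normal((1, 2))    # hidden-to-output weights
b1 = np.zeros((2, 1))               # hidden-layer bias

l2_error = out_y - Y
slope_output = out_y * (1 - out_y)

# Element-wise product broadcasts (1, 4) against (2, 4): result is (2, 4), not W2's (1, 2)
dw_elementwise = l2_error * slope_output * out_h
# Dot product with the transpose sums over the 4 samples and matches W2's shape
dW2 = np.dot(l2_error * slope_output, out_h.T)

# The element-wise version cannot be subtracted from W2
try:
    W2 - 0.01 * dw_elementwise
    update_failed = False
except ValueError:
    update_failed = True

# Bias: without the sum, b1 silently becomes (2, 4) through broadcasting
l1_error = rng.random((2, 4))       # stand-in values; only the shape matters
b1_unsummed = b1 - 0.01 * l1_error
b1_summed = b1 - 0.01 * np.sum(l1_error, axis=1, keepdims=True)

print(dw_elementwise.shape, dW2.shape, update_failed)
print(b1_unsummed.shape, b1_summed.shape)
```

So the missing sum on db1 raises no error, but it does change the result: after one update the bias holds one column per sample instead of a single column.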

Finally, you forgot to apply these changes in the code section further down the lesson, i.e., "Backpropagation for the XOR operator". You still have the old code there:

import numpy as np

def sigmoid(z):
    """sigmoid activation function on input z"""
    return 1 / (1 + np.exp(-z)) # defines the sigmoid activation function

def forward_propagation(X, Y, W1, b1, W2, b2):
    """
    Computes the forward propagation operation of a neural network and
    returns the output after applying the sigmoid activation function
    """
    net_h = np.dot(W1, X) + b1 # net output at the hidden layer
    out_h = sigmoid(net_h) # actual after applying sigmoid
    net_y = np.dot(W2, out_h) + b2 # net output at the output layer
    out_y = sigmoid(net_y) # actual output at the output layer

    return out_h, out_y

def backward_propagation(X, Y, out_h, out_y, W2):
    """
    Computes the backpropagation operation of a neural network and
    returns the derivative of weights and biases
    """
    l2_error = out_y - Y # actual - target
    dW2 = np.dot(l2_error, out_h.T) # derivative of layer 2 weights is the dot product of error at layer 2 and hidden layer output
    db2 = np.sum((l2_error * slope_output), axis = 1, keepdims=True) # derivative of layer 2 bias is simply the error at layer 2

    dh = np.dot(W2.T, l2_error) # compute dot product of weights in layer 2 with error at layer 2
    l1_error = np.multiply(dh, out_h * (1 - out_h)) # compute layer 1 error
    dW1 = np.dot(l1_error, X.T) # derivative of layer 1 weights is the dot product of error at layer 1 and input
    db1 = np.sum(l1_error, axis=1, keepdims=True) # derivative of layer 1 bias is simply the error at layer 1

    return dW1, db1, dW2, db2 # return the derivatives of parameters

def update_parameters(W1, b1, W2, b2, dW1, db1, dW2, db2, learning_rate):
    """Updates weights and biases and returns their values"""
    W1 = W1 - learning_rate * dW1 # update weights in layer 1
    W2 = W2 - learning_rate * dW2 # update weights in layer 2
    b1 = b1 - learning_rate * db1 # update bias in layer 1
    b2 = b2 - learning_rate * db2 # update bias in layer 2
    return W1, b1, W2, b2 # return updated parameters

# Initializing parameters

np.random.seed(42) # initializing with the same random number

X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]]) # input array
Y = np.array([[0, 1, 1, 0]]) # output label
n_h = 2 # number of neurons in the hidden layer
n_x = X.shape[0] # number of neurons in the input layer
n_y = Y.shape[0] # number of neurons in the output layer
W1 = np.random.randn(n_h, n_x) # weights from the input layer
b1 = np.zeros((n_h, 1)) # bias in the hidden layer
W2 = np.random.randn(n_y, n_h) # weights from the hidden layer
b2 = np.zeros((n_y, 1)) # bias in the output layer
num_iterations = 100000
learning_rate = 0.01

# forward propagation pass
A1, A2 = forward_propagation(X, Y, W1, b1, W2, b2)

# backpropagation pass
dW1, db1, dW2, db2 = backward_propagation(X, Y, A1, A2, W2)

# update the parameters
W1, b1, W2, b2 = update_parameters(W1, b1, W2, b2, dW1, db1, dW2, db2, learning_rate)

# forward propagation pass
A1, A2 = forward_propagation(X, Y, W1, b1, W2, b2)

# calculate prediction
pred = (A2 > 0.5) * 1
print("Predicted label:", pred) # the predicted value
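
The script above sets num_iterations but performs only a single forward/backward/update pass. As a rough sketch of how that pass could be repeated, the version below wraps it in a loop. Several things here are assumptions rather than the lesson's code: the train and cross_entropy helpers are hypothetical, the gradients follow the form without slope_output, np.random.default_rng replaces np.random.seed, and the iteration count is reduced to 10000 to keep the run short. Whether the network actually learns XOR depends on the initial weights and the number of iterations, so no particular prediction is claimed.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward_propagation(X, W1, b1, W2, b2):
    out_h = sigmoid(np.dot(W1, X) + b1)       # hidden layer output
    out_y = sigmoid(np.dot(W2, out_h) + b2)   # network output
    return out_h, out_y

def backward_propagation(X, Y, out_h, out_y, W2):
    # Gradient form without the slope_output term (an assumption, see lead-in)
    l2_error = out_y - Y
    dW2 = np.dot(l2_error, out_h.T)
    db2 = np.sum(l2_error, axis=1, keepdims=True)
    l1_error = np.dot(W2.T, l2_error) * out_h * (1 - out_h)
    dW1 = np.dot(l1_error, X.T)
    db1 = np.sum(l1_error, axis=1, keepdims=True)
    return dW1, db1, dW2, db2

def cross_entropy(out_y, Y):
    # Hypothetical helper, used only to monitor training progress
    return float(-np.sum(Y * np.log(out_y) + (1 - Y) * np.log(1 - out_y)))

def train(X, Y, n_h=2, num_iterations=10000, learning_rate=0.01, seed=42):
    # Hypothetical wrapper around the forward/backward/update steps
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((n_h, X.shape[0]))
    b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((Y.shape[0], n_h))
    b2 = np.zeros((Y.shape[0], 1))
    losses = []
    for _ in range(num_iterations):
        out_h, out_y = forward_propagation(X, W1, b1, W2, b2)
        losses.append(cross_entropy(out_y, Y))
        dW1, db1, dW2, db2 = backward_propagation(X, Y, out_h, out_y, W2)
        W1 = W1 - learning_rate * dW1
        b1 = b1 - learning_rate * db1
        W2 = W2 - learning_rate * dW2
        b2 = b2 - learning_rate * db2
    return (W1, b1, W2, b2), losses

X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]])
Y = np.array([[0, 1, 1, 0]])

params, losses = train(X, Y)
_, out_y = forward_propagation(X, *params)
pred = (out_y > 0.5) * 1
print("first loss:", losses[0], "last loss:", losses[-1])
print("Predicted label:", pred)
```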

Course: https://www.educative.io/courses/beginners-guide-to-deep-learning
Lesson: https://www.educative.io/courses/beginners-guide-to-deep-learning/q2615KlEy67

Javeria_Tariq · August 9, 2023, 5:36am

Hi @Vikrant !!
Thanks for pointing this out. It has been updated.
Happy Learning

Vikrant · August 9, 2023, 8:20am

Hi @Javeria_Tariq, thank you for your commitment. However, I am not sure how this code is executing. Check out this particular place: you are calculating the slope in the 4th statement of the code below, after dW2 and db2 have already been computed, and you have not updated dW2 and db2?

Backpropagation for the XOR operator

Code the backpropagation for the XOR operator in the code below:

l2_error = out_y - Y # actual - target
dW2 = np.dot(l2_error, out_h.T)  # derivative of layer 2 weights is the dot product of error at layer 2 and hidden layer output
db2 = np.sum(l2_error, axis=1, keepdims=True)  # derivative of layer 2 bias is simply the error at layer 2
slope_output = out_y * (1 - out_y)  # derivative of the sigmoid activation function at the output layer
dh = np.dot(W2.T, l2_error)  # compute dot product of weights in layer 2 with error at layer 2
l1_error = np.multiply(dh, out_h * (1 - out_h))  # compute layer 1 error
dW1 = np.dot(l1_error, X.T)  # derivative of layer 1 weights is the dot product of error at layer 1 and input
db1 = np.sum(l1_error, axis=1, keepdims=True)  # derivative of layer 1 bias is simply the error at layer 1
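
Whether slope_output belongs in dW2 and db2 depends on the loss function, which this thread does not state. One way to see it is a finite-difference check, sketched below under stated assumptions: mse_loss, bce_loss and numeric_dW2 are hypothetical helpers, and the seed is arbitrary. The version with slope_output matches the gradient of a squared-error loss, while the version without it matches the gradient of a binary cross-entropy loss, because there the sigmoid derivative cancels.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X, W1, b1, W2, b2):
    out_h = sigmoid(W1 @ X + b1)
    out_y = sigmoid(W2 @ out_h + b2)
    return out_h, out_y

def mse_loss(out_y, Y):
    # Squared error with the conventional 1/2 factor (assumed form)
    return 0.5 * np.sum((out_y - Y) ** 2)

def bce_loss(out_y, Y):
    # Binary cross-entropy summed over samples (assumed form)
    return -np.sum(Y * np.log(out_y) + (1 - Y) * np.log(1 - out_y))

def numeric_dW2(loss_fn, X, Y, W1, b1, W2, b2, eps=1e-6):
    # Central finite differences, one entry of W2 at a time
    grad = np.zeros_like(W2)
    for idx in np.ndindex(W2.shape):
        W_plus, W_minus = W2.copy(), W2.copy()
        W_plus[idx] += eps
        W_minus[idx] -= eps
        loss_plus = loss_fn(forward(X, W1, b1, W_plus, b2)[1], Y)
        loss_minus = loss_fn(forward(X, W1, b1, W_minus, b2)[1], Y)
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)  # arbitrary seed
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]], dtype=float)
Y = np.array([[0, 1, 1, 0]], dtype=float)
W1, b1 = rng.standard_normal((2, 2)), np.zeros((2, 1))
W2, b2 = rng.standard_normal((1, 2)), np.zeros((1, 1))

out_h, out_y = forward(X, W1, b1, W2, b2)
l2_error = out_y - Y
slope_output = out_y * (1 - out_y)

dW2_with_slope = np.dot(l2_error * slope_output, out_h.T)
dW2_without_slope = np.dot(l2_error, out_h.T)

mse_matches = np.allclose(
    dW2_with_slope, numeric_dW2(mse_loss, X, Y, W1, b1, W2, b2), atol=1e-6)
bce_matches = np.allclose(
    dW2_without_slope, numeric_dW2(bce_loss, X, Y, W1, b1, W2, b2), atol=1e-6)
print("with slope matches squared error:", mse_matches)
print("without slope matches cross-entropy:", bce_matches)
```

Under that reading, the snippet above is consistent with a cross-entropy loss as long as slope_output is simply unused. Mixing the two forms, with one gradient including the slope and the other not, would match neither loss.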
