educative.io

🍀 Challenge: Backpropagation - 3 Layered Neural Network - A Beginner's Guide

Hi Team, could you provide some generic guidelines so that I can avoid dimension errors while multiplying and transposing matrices?

My code below is based on "Train the XOR Multilayer Perceptron" from the previous chapter, to maintain consistency.
However, the solution you've provided for this challenge is a little different. My solution is below; please also see the comments, which note how this code differs from your solution.

def backpropagation(y, out_y, out_h2, out_h1, w3, w2, x):
    """
    Computes the backpropagation operation for the
    3-layered neural network and returns the gradients
    of weights and biases
    """
    l3_error = out_y - y

    dw3 = np.dot(l3_error, out_h2.T)  # your solution: dw3 = np.dot(out_h2.T, l3_error)
    db3 = np.sum(l3_error, axis=1, keepdims=True)  # your solution uses axis=0

    dh_2 = np.dot(w3.T, l3_error)  # your solution: dh2 = np.dot(w3, l3_error)

    l2_error = np.multiply(dh_2, out_h2 * (1 - out_h2))  # your solution: l2_error = np.multiply(dh2.T, out_h2 * (1 - out_h2))
    dw2 = np.dot(l2_error, out_h1.T)  # your solution: dW2 = np.dot(out_h1.T, l2_error)
    db2 = np.sum(l2_error, axis=1, keepdims=True)

    dh_1 = np.dot(w2.T, l2_error)  # your solution: dh1 = np.dot(w2, l2_error.T)

    l1_error = np.multiply(dh_1, out_h1 * (1 - out_h1))  # your solution: l1_error = np.multiply(dh1.T, out_h1 * (1 - out_h1))
    dw1 = np.dot(l1_error, x.T)  # your solution: np.dot(x.T, l1_error)
    db1 = np.sum(l1_error, axis=1, keepdims=True)
    return dw3, db3, dw2, db2, dw1, db1

My solution above follows the example from the previous chapter (copied at the bottom of this post), in which the dot products and matrix transposes for backward propagation were performed in a different order than in your challenge solution.

It is very tedious to visualize the shapes or remember the order while writing the code.

If you look at the conceptual logic, you will find my logic was correct; however, the order of the transposes and dot products was different, which is where I got stuck.

Question 1: I need inputs so that I can figure out these errors while writing the logic; otherwise it becomes trial and error. Could you assist?
Question 2: Why was the pattern used in the code below not followed?
Question 3: While solving the challenge, how did you determine the dot products and matrix transposes so swiftly?
Question 4: Are there any guidelines or best practices to figure this out?
Ideally, the code above should be an extension of the code from the previous chapter, which is not the case.
Finally, I'm copying the code from the previous chapter below:

Train the XOR Multilayer Perceptron
def backward_propagation(X, Y, out_h, out_y, W2):
    """
    Computes the backpropagation operation of a neural network and
    returns the derivative of weights and biases
    """
    l2_error = out_y - Y  # actual - target
    dW2 = np.dot(l2_error, out_h.T)  # derivative of layer 2 weights: dot product of layer 2 error and hidden layer output
    db2 = np.sum(l2_error, axis=1, keepdims=True)  # derivative of layer 2 bias is simply the error at layer 2

    dh = np.dot(W2.T, l2_error)  # dot product of layer 2 weights with the error at layer 2
    l1_error = np.multiply(dh, out_h * (1 - out_h))  # compute layer 1 error
    dW1 = np.dot(l1_error, X.T)  # derivative of layer 1 weights: dot product of layer 1 error and input
    db1 = np.sum(l1_error, axis=1, keepdims=True)  # derivative of layer 1 bias is simply the error at layer 1

    return dW1, db1, dW2, db2  # return the derivatives of parameters
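A quick way to sanity-check this column-vector convention is to run the same backward pass on the XOR data and assert that every gradient has the same shape as its parameter. A minimal sketch (the hidden-layer size of 3 is made up for illustration):

```python
import numpy as np

# XOR data in the column-vector convention: each column is one sample.
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]])          # shape (2, 4): 2 inputs, 4 samples
Y = np.array([[0, 1, 1, 0]])          # shape (1, 4)

rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 2))      # hidden layer: 3 neurons
b1 = np.zeros((3, 1))
W2 = rng.standard_normal((1, 3))      # output layer: 1 neuron
b2 = np.zeros((1, 1))

sigmoid = lambda z: 1 / (1 + np.exp(-z))
out_h = sigmoid(W1 @ X + b1)          # (3, 4)
out_y = sigmoid(W2 @ out_h + b2)      # (1, 4)

# Backward pass, same pattern as backward_propagation above:
l2_error = out_y - Y                           # (1, 4)
dW2 = np.dot(l2_error, out_h.T)                # (1, 3)
db2 = np.sum(l2_error, axis=1, keepdims=True)  # (1, 1)
dh = np.dot(W2.T, l2_error)                    # (3, 4)
l1_error = np.multiply(dh, out_h * (1 - out_h))
dW1 = np.dot(l1_error, X.T)                    # (3, 2)
db1 = np.sum(l1_error, axis=1, keepdims=True)  # (3, 1)

# Each gradient must have the same shape as its parameter:
assert dW2.shape == W2.shape and db2.shape == b2.shape
assert dW1.shape == W1.shape and db1.shape == b1.shape
```

If any transpose is placed on the wrong operand, one of the dot products raises a shape error or an assert fails immediately, which is exactly the feedback you want while writing the logic.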

Hi @Vikrant!
Getting inputs to figure out errors while writing logic
Getting the matrix operations right can be challenging when developing neural network models, especially during backpropagation. Here are some strategies to help you avoid errors and better understand the process:

  1. Understand the Shapes: Make sure you have a good understanding of the shapes of your weight matrices, input data, and output data. Check the dimensions at each step of the backpropagation process.

  2. Debugging with Print Statements: Insert print statements at different stages of your code to check the shapes of matrices and intermediate results. This can help you identify any inconsistencies or unexpected shapes.

  3. Use Dummy Data: When developing your code, start with small, dummy datasets that are easy to visualize. Follow the data flow step-by-step to validate the shapes of intermediate results.

  4. Gradually Increase Complexity: Begin with a simple network and gradually increase its complexity. This allows you to identify issues early and understand how each layer’s dimensions should align.

  5. Visualize the Matrices: Plotting or visualizing the matrices can provide insights into their shapes and help identify issues.
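For example, strategies 1 and 2 can be as simple as printing and asserting the intermediate shapes as you go — a minimal sketch with made-up layer sizes:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal((2, 4))        # 2 features, 4 samples (columns)
w1 = rng.standard_normal((4, 2))       # hidden layer: 4 neurons
out_h1 = 1 / (1 + np.exp(-(w1 @ x)))   # (4, 2) @ (2, 4) -> (4, 4)

l1_error = rng.standard_normal(out_h1.shape)  # stand-in for the real error term
print("l1_error:", l1_error.shape, " x.T:", x.T.shape)

dw1 = np.dot(l1_error, x.T)            # (4, 4) @ (4, 2) -> (4, 2)
assert dw1.shape == w1.shape, f"dw1 {dw1.shape} != w1 {w1.shape}"
```

A mismatched transpose shows up right away in the printed shapes or as a failed assert, instead of silently producing a wrong-shaped gradient several steps later.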

The pattern in the code
The pattern in the previous chapter's code is tied to its data layout: samples are stored as columns, so each error term has shape (neurons, samples) and weight gradients are computed as np.dot(error, activation.T). The challenge solution arranges some operands the other way, which flips the order of the dot products and the transposes. Neither ordering is wrong by itself, but mixing the two conventions inside one function is exactly what produces dimension errors and incorrect gradients. Pick one layout and apply it consistently to every dot product and transpose; that consistency is also what keeps the code readable and maintainable.
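To make the convention point concrete: the same weight gradient can be written under either data layout, and the two results are just transposes of each other. A minimal sketch (layer sizes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 5                                  # number of samples

# Convention A: samples as columns (previous chapter's code).
err_col = rng.standard_normal((3, m))  # (neurons_out, samples)
act_col = rng.standard_normal((4, m))  # (neurons_in, samples)
dW_col = err_col @ act_col.T           # (3, 4), matching a W of shape (3, 4)

# Convention B: samples as rows.
err_row = err_col.T                    # (samples, neurons_out)
act_row = act_col.T                    # (samples, neurons_in)
dW_row = act_row.T @ err_row           # (4, 3), matching a W of shape (4, 3)

# Same numbers, different layout: the gradients are transposes of each other.
assert np.allclose(dW_col, dW_row.T)
```

So the question is never "which ordering is correct?" in isolation, but "which layout does this function's data use?" — and then every dot product must follow that one layout.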

Determining dot products and matrix transposes swiftly
Determining the correct dot products and transpose of matrices swiftly requires a good understanding of linear algebra and neural network principles. Here are some tips:

  1. Understand the Neural Network Architecture: Understand the architecture of the neural network, including the number of layers, the number of neurons in each layer, and the connections between layers.

  2. Learn the Backpropagation Algorithm: Study and understand the backpropagation algorithm step-by-step, including how errors flow backward through the network and how gradients are computed for weights and biases.

  3. Practice and Debug: As with any skill, practice is essential. Work on various examples and use debugging techniques like print statements to understand intermediate results.

  4. Visualize Matrix Operations: Visualizing matrix operations can help you understand how data is flowing through the network and how matrices’ shapes change at each step.
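A practical shortcut behind all of these tips: the gradient of a weight matrix always has the same shape as the weight itself, and in np.dot(A, B) the inner dimensions must match. Working backward from the target shape usually settles which operand to transpose — a small illustration with made-up sizes:

```python
import numpy as np

W = np.zeros((3, 4))        # target: dW must also be (3, 4)
error = np.ones((3, 5))     # (neurons_out, samples)
activation = np.ones((4, 5))  # (neurons_in, samples)

# Need (3, ?) @ (?, 4) -> (3, 4): transpose the activation so the
# inner dimensions (5 and 5) line up.
dW = np.dot(error, activation.T)
assert dW.shape == W.shape
```

With this rule there is nothing to memorize: you read off the required output shape and insert the one transpose that makes the inner dimensions agree.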

Guidelines and best practices to figure this out
Here are some guidelines and best practices to avoid errors in matrix operations during backpropagation:

  1. Be Consistent: Always follow a consistent pattern for matrix operations and transposing. Stick to a standard notation to avoid confusion.

  2. Understand Matrix Shapes: Carefully understand the shapes of your weight matrices, input data, and output data at each step. Ensure that matrix dimensions align correctly for dot products.

  3. Use Clear Variable Names: Use meaningful variable names that reflect the purpose of the matrix or operation. This can make your code more readable and understandable.

  4. Step-by-Step Debugging: When working on complex problems, perform step-by-step debugging with smaller datasets to validate the correctness of your code.

  5. Validate Against Known Solutions: Compare your results with known solutions or previously implemented models to ensure correctness.

  6. Document Your Code: Add comments to your code explaining each step of the backpropagation process. This documentation will help you and others understand the logic behind the operations.

  7. Learn from Tutorials and Examples: Study well-documented tutorials and examples to understand how matrix operations are used in neural network backpropagation.

By following these guidelines, you can improve your understanding of matrix operations in backpropagation and minimize errors while developing neural network models.
I hope it helps. Happy Learning :blush: