https://www.educative.io/projects/diabetes-prediction-using-keras/q2wy4ylAMPk

Vikrant · September 21, 2023, 1:47am

Hi Team, please refer to this project. Because it is a paid course, I have removed the code in the 9th row; the output is still valid. The feature ‘Glucose’ has only 5 null values, while the other features (BloodPressure, SkinThickness, Insulin, and BMI) have more null values, yet your title says:

Impute feature medians in columns that do not have the largest number of missing values

This title is wrong and misleading. Please validate your code and check what it actually does.
Similarly, the next title says:

Use linear regression to predict the missing values in the columns that have the largest number of missing values

Only 5 values in Glucose are zero, and those are the values the code imputes via linear regression (LR), yet your title says something else.

Task statement: Impute feature medians for values that do not have the maximum number of missing values.

I spent almost an hour or two trying to figure out the difference between the actual code and the task statement. It is very misleading.
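
The check behind this complaint (which column actually has the most missing values) can be sketched as follows. This is only a minimal illustration, not the project's code: the rows are made-up toy values, `count_missing` is a hypothetical helper, and it assumes that zeros mark missing entries, as described above for Glucose.

```python
# Toy, made-up rows (NOT the project's data). Zeros are assumed to mark missing values.
rows = [
    {"Glucose": 148, "BloodPressure": 72, "Insulin": 0},
    {"Glucose": 0, "BloodPressure": 66, "Insulin": 0},
    {"Glucose": 183, "BloodPressure": 0, "Insulin": 0},
    {"Glucose": 89, "BloodPressure": 66, "Insulin": 94},
]


def count_missing(rows, missing_marker=0):
    """Return {column: number of entries equal to missing_marker}."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            # A bool adds as 0 or 1, so this increments only on a missing entry.
            counts[column] = counts.get(column, 0) + (value == missing_marker)
    return counts


missing_counts = count_missing(rows)

# The column with the largest number of missing values.
most_missing = max(missing_counts, key=missing_counts.get)
print(missing_counts, most_missing)
```

Comparing `most_missing` with the column the code actually imputes via linear regression shows directly whether a title such as "the columns that have the largest number of missing values" matches the code.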

Please note that this is a project, not a course. I am not able to remove the links below, which relate to an AWS course.

Course: Learn the A to Z of Amazon Web Services (AWS) - Learn Interactively
Lesson: An Overview of AWS - Learn the A to Z of Amazon Web Services (AWS)

Vikrant · September 20, 2023, 11:15pm

@Javeria_Tariq @Saif_Ali @Muhammad_Ali_Shahid Could you please check? Thanks in advance.


Omar_Farooq · September 22, 2023, 4:17pm

Hello Vikrant,

Hope you are fine.
We have two options for choosing the variable for linear regression during missing value handling:

Replace the most relevant feature, i.e., Glucose (the one that has the highest correlation with the target).
Replace the feature with the highest number of missing values.

There is no hard and fast rule when choosing from the above. The only thing that we need to keep in mind is the performance of the resulting classifier. The best option is the one that produces the better results.
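
Either option follows the same pattern: fill the other columns with their medians, then predict the chosen column's missing values with a linear regression. The following is a rough sketch of that pattern under stated assumptions, not the project's actual code: the data is made up, `impute_median` and `impute_linear` are hypothetical helper names, zeros are assumed to mark missing values, and a single predictor is used to keep the example short.

```python
from statistics import median


def impute_median(values, missing=0):
    """Replace missing entries with the median of the observed entries."""
    observed = [v for v in values if v != missing]
    fill = median(observed)
    return [fill if v == missing else v for v in values]


def impute_linear(x, y, missing=0):
    """Fill missing entries of y using a one-predictor least-squares fit on x."""
    # Fit only on rows where y is observed.
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi != missing]
    n = len(pairs)
    mean_x = sum(xi for xi, _ in pairs) / n
    mean_y = sum(yi for _, yi in pairs) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Keep observed values; predict only the missing ones.
    return [intercept + slope * xi if yi == missing else yi for xi, yi in zip(x, y)]


# Toy, made-up columns (NOT the project's data); 0 marks a missing value.
bmi = [20, 0, 30, 40, 25]
glucose = [80, 110, 120, 160, 0]

# Step 1: median-impute the column that is not chosen for regression.
bmi_filled = impute_median(bmi)

# Step 2: predict the chosen column's missing values from the filled predictor.
glucose_filled = impute_linear(bmi_filled, glucose)
print(bmi_filled, glucose_filled)
```

Switching between the two options only changes which column is passed as `y`. The resulting datasets can then be compared by the performance of the classifier trained on each.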

We at Educative are very thankful to you for pointing out this issue. The task description has been updated and will be visible to you once you reset the project. Please do give us your feedback once you have completed the project.
All the best,
Omar

Vikrant · September 23, 2023, 6:24pm

Hi @Omar_Farooq, thank you for your input and for updating the task statement.

Course: Introduction to Deep Learning & Neural Networks - Learn Interactively
Lesson: The Principles of the Convolution - Introduction to Deep Learning & Neural Networks