
What are cost functions in SVM cost formula

In the SVM cost function, i.e., the first formula after the logistic cost function in the lesson:

I see terms like cost1(w^T x^i) and cost0(w^T x^i). What are these cost functions? My assumption is that SVM looks at the raw score w^T x (the dot product of the weights with the input feature vector) to decide the class label. If not, could you please provide the formulas for cost functions cost0 and cost1?

Are they the same as the logistic regression cost functions? (I don't think this is the case.) Please confirm.


Course: https://www.educative.io/collection/5959747388833792/4596525314342912
Lesson: https://www.educative.io/collection/page/5959747388833792/4596525314342912/6353520329490432

Hi @Amit_Adiraju_Narasim !!
In the SVM cost function you mentioned, the terms “cost1(w^T x^i)” and “cost0(w^T x^i)” represent the cost functions associated with positive (y^i = 1) and negative (y^i = 0) examples, respectively. These cost functions are not the same as the logistic cost function used in logistic regression.

In SVM (Support Vector Machine), the decision boundary is determined by the raw scores of the dot product between the weight vector w and the input features x^i (w^T x^i). However, SVM aims to find a decision boundary that maximizes the margin between different classes, rather than directly predicting class probabilities like logistic regression.

The specific form of the cost functions cost1(w^T x^i) and cost0(w^T x^i) depends on the formulation of the SVM model. In the standard SVM formulation, the cost functions are defined as follows:

  • cost1(w^T x^i) = max(0, 1 - w^T x^i) - This cost function applies to positive examples (y^i = 1). It penalizes a positive example whose raw score w^T x^i falls short of the margin at 1: if the raw score is less than 1 the cost is positive, and if it is greater than or equal to 1 the cost is 0.

  • cost0(w^T x^i) = max(0, 1 + w^T x^i) - This cost function applies to negative examples (y^i = 0). It penalizes a negative example whose raw score rises above the margin at -1: if the raw score is greater than -1 the cost is positive, and if it is less than or equal to -1 the cost is 0.
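The two cost functions above can be sketched in a few lines of Python. This is only an illustration of the piecewise-linear (hinge-style) shape; the function names mirror the lesson's notation, and the example scores are made up:

```python
def cost1(z):
    """Cost for a positive example (y = 1): zero once the raw score z >= 1."""
    return max(0.0, 1.0 - z)

def cost0(z):
    """Cost for a negative example (y = 0): zero once the raw score z <= -1."""
    return max(0.0, 1.0 + z)

# A positive example with score 2.0 is beyond the margin, so no cost.
print(cost1(2.0))   # 0.0
# A positive example with score 0.5 sits inside the margin: cost 0.5.
print(cost1(0.5))   # 0.5
# A negative example with score -0.2 is inside the margin: cost 0.8.
print(cost0(-0.2))  # 0.8
```

Note how each function is flat at zero on the "safe" side of its margin and grows linearly on the other side, which is exactly what makes the SVM push examples outside the margin region.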

These cost functions introduce a hinge loss penalty, where the cost is incurred only if an example is misclassified and lies within the margin region. The aim of SVM is to minimize the overall cost, which includes both the hinge loss term and the regularization term (the second sum in the equation you provided).
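Putting the hinge terms and the regularization term together, the overall objective can be sketched as below. This assumes the Andrew-Ng-style formulation with a trade-off weight C on the data term; the function and variable names here are my own, not from the lesson:

```python
import numpy as np

def svm_cost(w, X, y, C=1.0):
    """Sketch of the SVM objective: C times the summed hinge costs,
    plus an L2 regularization term on the weights w.
    X has one example per row, y holds labels in {0, 1}."""
    scores = X @ w                             # raw scores w^T x^i for all examples
    cost_pos = np.maximum(0.0, 1.0 - scores)   # cost1, used where y^i = 1
    cost_neg = np.maximum(0.0, 1.0 + scores)   # cost0, used where y^i = 0
    hinge_total = np.where(y == 1, cost_pos, cost_neg).sum()
    return C * hinge_total + 0.5 * np.dot(w, w)

# Both examples land outside the margin, so only the L2 term contributes.
w = np.array([1.0])
X = np.array([[2.0], [-1.5]])
y = np.array([1, 0])
print(svm_cost(w, X, y))  # 0.5
```

A large C makes margin violations expensive (closer to a hard margin), while a small C lets the regularization term dominate and tolerates more violations.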

To summarize, the cost functions used in SVM are different from the logistic cost function used in logistic regression, and they are designed to encourage a large margin between classes rather than directly estimating class probabilities.
I hope it helps. Happy Learning :blush:

Definitely helps ! Super clear explanation. Thanks much ! :slight_smile:

