# How can I calculate the gradient of hinge loss?

With reference to the research paper *Sentiment Embeddings with Applications to Sentiment Analysis*, I am trying to implement its sentiment ranking model in Python, for which I need to optimise the following hinge loss function:

$$loss_{Rank} = \sum_{t} \max\left(0,\; 1 - \delta_{s}(t)\, f_{0}^{rank}(t) + \delta_{s}(t)\, f_{1}^{rank}(t)\right)$$

Unlike the usual mean squared error, I cannot work out its gradient to perform backpropagation. How do I calculate the gradient of this loss function?

Answered by Amit raj

The hinge loss is awkward when a derivative is needed, because the derivative is a piecewise function: max has one non-differentiable point (the kink where its argument equals zero), and the derivative inherits it. This was a prominent issue with non-separable cases of the SVM (and one reason to use a smooth alternative such as ridge regression). Here the hinge loss is defined as max(0, 1 − v), where v is the output of the SVM decision function. Away from the kink, the derivative with respect to v is −1 when v < 1 and 0 when v > 1; at v = 1 you may take any subgradient in [−1, 0]. More can be found on the Hinge loss Wikipedia page. As for your equation: you can easily pick out the v, but without more context on what those f^rank functions are it's hard to say how to differentiate through them. Unfortunately I don't have access to the paper and cannot guide you any further…
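That said, the piecewise gradient itself is easy to compute once you spot the margin term. Below is a minimal NumPy sketch, assuming the ranking loss in the question with per-example margin m(t) = 1 − δ_s(t)·f₀(t) + δ_s(t)·f₁(t); the names `delta`, `f0`, `f1` and the convention of taking subgradient 0 at the kink are my assumptions, not the paper's:

```python
import numpy as np

def hinge_subgradient(v):
    """Subgradient of max(0, 1 - v) w.r.t. v.

    Takes the conventional value 0 at the kink v = 1
    (any value in [-1, 0] would be a valid subgradient there).
    """
    return np.where(v < 1.0, -1.0, 0.0)

def rank_loss_grads(delta, f0, f1):
    """Per-example (sub)gradients of the ranking hinge loss
    w.r.t. the two scores f0 and f1 (names assumed).

    margin m = 1 - delta*f0 + delta*f1; when m > 0 the hinge
    is active, so d/d f0 = -delta and d/d f1 = +delta, else 0.
    """
    m = 1.0 - delta * f0 + delta * f1
    active = (m > 0.0).astype(float)  # indicator of the active hinge
    return -delta * active, delta * active

# Example: first sample has a large margin (inactive hinge, zero
# gradient), second sample violates the margin (active hinge).
delta = np.array([1.0, 1.0])
f0 = np.array([2.5, 0.2])
f1 = np.array([0.0, 0.1])
g0, g1 = rank_loss_grads(delta, f0, f1)
# g0 -> [ 0., -1.],  g1 -> [ 0.,  1.]
```

These per-score gradients would then be propagated to the model parameters via the chain rule through whatever f₀ and f₁ are in the paper.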