Additive Angular Margin Loss

ArcFace, also known as Additive Angular Margin Loss, is a loss function used in face recognition. It aims to improve deep face recognition under large intra-class appearance variations by explicitly optimizing feature embeddings for higher similarity among intra-class samples and greater diversity among inter-class samples. The softmax loss traditionally used for these tasks does not explicitly optimize the embedding for either property.

How ArcFace Works

ArcFace starts from the softmax logit $W^{T}\_{j}x\_{i} = || W\_{j} || \text{ } || x\_{i} || \cos\theta\_{j}$, where $\theta\_{j}$ is the angle between the class weight $W\_{j}$ and the feature $x\_{i}$. Each weight is fixed to $ || W\_{j} || = 1$ by $l\_{2}$ normalization, and the embedding feature $x\_{i}$ is likewise $l\_{2}$-normalized and then re-scaled so that $ ||x\_{i} || = s$. With both norms fixed, each logit reduces to $s\cos\theta\_{j}$, so predictions depend only on the angle between the feature and the weight, and the learned embedding features are distributed on a hypersphere of radius $s$. Finally, an additive angular margin penalty $m$ is added to the angle between $x\_{i}$ and its ground-truth class weight $W\_{y\_{i}}$, which simultaneously enhances intra-class compactness and inter-class discrepancy.

Since the proposed additive angular margin penalty is equal to the geodesic distance margin penalty on the normalized hypersphere, the method is named ArcFace. For a batch of $N$ samples and $n$ classes, where $y\_{i}$ is the ground-truth class of sample $i$, the loss is:

$$ L\_{3} = -\frac{1}{N}\sum^{N}\_{i=1}\log\frac{e^{s\left(\cos\left(\theta\_{y\_{i}} + m\right)\right)}}{e^{s\left(\cos\left(\theta\_{y\_{i}} + m\right)\right)} + \sum^{n}\_{j=1, j \neq y\_{i}}e^{s\cos\theta\_{j}}} $$
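
The formula above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the authors' implementation: the function names `l2_normalize` and `arcface_loss` are hypothetical, and the default values `s=64.0` and `m=0.5` are illustrative assumptions that do not come from the text above.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def arcface_loss(features, weights, labels, s=64.0, m=0.5):
    """Mean ArcFace loss over a batch (illustrative sketch).

    features: list of N embedding vectors x_i
    weights:  list of n class weight vectors W_j
    labels:   list of N ground-truth class indices y_i
    s, m:     feature scale and angular margin (assumed defaults)
    """
    unit_weights = [l2_normalize(w) for w in weights]
    total = 0.0
    for x, y in zip(features, labels):
        x_hat = l2_normalize(x)
        # cos(theta_j) for every class j, clamped so acos never sees
        # a value slightly outside [-1, 1] due to rounding.
        cosines = [
            max(-1.0, min(1.0, sum(a * b for a, b in zip(w, x_hat))))
            for w in unit_weights
        ]
        # Non-target logits are s * cos(theta_j).
        logits = [s * c for c in cosines]
        # The target logit gets the additive angular margin: s * cos(theta_y + m).
        logits[y] = s * math.cos(math.acos(cosines[y]) + m)
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[y]
    return total / len(features)
```

With `m=0.0` the function reduces to a softmax loss over normalized features and weights, which makes the effect of the margin easy to compare. The sketch does not handle the case where $\theta\_{y\_{i}} + m$ exceeds $\pi$, where the cosine stops decreasing; a practical implementation would need to treat that case separately.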

Benefits of ArcFace

By optimizing the embedding directly, the ArcFace loss improves the performance of deep face recognition under large intra-class appearance variations. As the authors of the original research show, the softmax loss provides roughly separable feature embeddings but produces noticeable ambiguity in decision boundaries. In contrast, the ArcFace loss can enforce a more evident gap between the nearest classes, resulting in more accurate predictions. ArcFace also reduces the discrepancy between embeddings of images from the same class, resulting in better intra-class similarity.

ArcFace Alternatives

While ArcFace has proven to be effective, other options can also enforce intra-class compactness and inter-class distance. One alternative is Supervised Contrastive Learning. This approach was proposed in a 2020 paper titled "Supervised Contrastive Learning" by Prannay Khosla et al. It pulls the representations of data points from the same class (positive pairs) together while pushing them away from the representations of data points from other classes.
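
For comparison, the pull-together, push-apart idea can be sketched as follows. This is a simplified, dependency-free illustration rather than the reference implementation: the function name `supcon_loss` and the `temperature` parameter with its default of `0.1` are assumptions, and averaging over anchors that have at least one positive is one of several possible normalization choices.

```python
import math


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch)."""
    # L2-normalize every embedding so dot products are cosine similarities.
    unit = []
    for v in embeddings:
        norm = math.sqrt(sum(c * c for c in v))
        unit.append([c / norm for c in v])

    n = len(unit)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: other samples that share the anchor's label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(unit[i], unit[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Stable log of the softmax denominator over all non-anchor samples.
        peak = max(sims.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in sims.values()))
        # Average negative log-probability of the positives.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

The loss is low when same-label embeddings point in similar directions and high when they are scattered among embeddings of other classes. Unlike the ArcFace sketch, it needs no per-class weight vectors, only labels for the samples in the batch.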

Both ArcFace and Supervised Contrastive Learning are effective at improving the performance of face recognition tasks under large intra-class appearance variations. The choice between them depends on the requirements of the task at hand.
