Spatially Separable Self-Attention: A Method to Reduce Complexity in Vision Transformers

As computer vision tasks become more complex and require higher-resolution inputs, the computational cost of vision transformers grows quickly, since standard self-attention scales quadratically with the number of tokens. Spatially Separable Self-Attention (SSSA) is an attention module used in the Twins-SVT architecture that aims to reduce the computational complexity of vision transformers for dense prediction tasks.

SSSA is composed of locally-grouped self-attention (LSA) and global sub-sampled attention (GSA). LSA is a self-attention mechanism that operates within a sub-window of the input feature map. This sub-windowing makes computation more efficient by reducing the number of tokens that interact in each self-attention operation. GSA, by contrast, operates globally across the entire feature map; to keep its cost low, each query attends only to representative keys produced by a sub-sampling function applied to each sub-window.
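The savings can be made concrete with a rough operation count. The sketch below tallies only the multiply-accumulates of the $QK^\top$ and attention-times-values products; the feature-map size, embedding dimension, and window size are illustrative assumptions, not values from the paper.

```python
# Rough attention cost (multiply-accumulates in the QK^T and AV products)
# for an H x W feature map with embedding dimension d.

def global_cost(H, W, d):
    N = H * W
    return 2 * N * N * d  # every token attends to every token

def sssa_cost(H, W, d, k):
    N = H * W
    lsa = 2 * N * (k * k) * d          # each token attends within its k x k window
    gsa = 2 * N * (N // (k * k)) * d   # each token attends to one key per window
    return lsa + gsa

# Illustrative example: a 56 x 56 feature map, d = 64, 7 x 7 windows.
print(global_cost(56, 56, 64))    # full self-attention
print(sssa_cost(56, 56, 64, 7))   # LSA + GSA
```

With these illustrative numbers, SSSA needs roughly 28x fewer operations than full self-attention over the same feature map.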

Formally, SSSA can be written as:

$$\hat{\mathbf{z}}_{ij}^{l}=\mathrm{LSA}\left(\mathrm{LayerNorm}\left(\mathbf{z}_{ij}^{l-1}\right)\right)+\mathbf{z}_{ij}^{l-1}$$

$$\mathbf{z}_{ij}^{l}=\mathrm{FFN}\left(\mathrm{LayerNorm}\left(\hat{\mathbf{z}}_{ij}^{l}\right)\right)+\hat{\mathbf{z}}_{ij}^{l}$$

$$\hat{\mathbf{z}}^{l+1}=\mathrm{GSA}\left(\mathrm{LayerNorm}\left(\mathbf{z}^{l}\right)\right)+\mathbf{z}^{l}$$

$$\mathbf{z}^{l+1}=\mathrm{FFN}\left(\mathrm{LayerNorm}\left(\hat{\mathbf{z}}^{l+1}\right)\right)+\hat{\mathbf{z}}^{l+1}$$

$$i \in\{1,2,\ldots,m\}, \quad j \in\{1,2,\ldots,n\}$$

where $\mathbf{z}_{ij}^{l}$ denotes the features of the sub-window in row $i$ and column $j$ at layer $l$, and the feature map is partitioned into $m \times n$ sub-windows.
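A minimal NumPy sketch of one SSSA block following the equations above. This is an illustration of the structure, not the real model: learned projection matrices and multi-head attention are omitted, mean pooling stands in for the sub-sampling function (Twins-SVT uses a strided convolution), and `tanh` stands in for the learned FFN.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize over the channel dimension (no learned scale/shift here).
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention (no learned projections).
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def ffn(x):
    # Shape-preserving stand-in for the two-layer MLP.
    return np.tanh(x)

def lsa(x, win):
    # Locally-grouped self-attention: attend within win x win sub-windows.
    H, W, C = x.shape
    m, n = H // win, W // win
    w = x.reshape(m, win, n, win, C).transpose(0, 2, 1, 3, 4)
    w = w.reshape(m * n, win * win, C)
    out = attention(w, w, w)
    out = out.reshape(m, n, win, win, C).transpose(0, 2, 1, 3, 4)
    return out.reshape(H, W, C)

def gsa(x, win):
    # Global sub-sampled attention: each sub-window contributes one
    # representative key/value (mean-pooled here for illustration).
    H, W, C = x.shape
    m, n = H // win, W // win
    reps = x.reshape(m, win, n, win, C).mean(axis=(1, 3)).reshape(m * n, C)
    return attention(x.reshape(H * W, C), reps, reps).reshape(H, W, C)

def sssa_block(x, win):
    # Equation order: LSA sub-block, then GSA sub-block, each with
    # pre-LayerNorm, a residual connection, and a following FFN.
    x = x + lsa(layer_norm(x), win)
    x = x + ffn(layer_norm(x))
    x = x + gsa(layer_norm(x), win)
    x = x + ffn(layer_norm(x))
    return x
```

The block preserves the feature-map shape, so SSSA blocks can be stacked like standard transformer layers.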

SSSA has been shown to achieve strong performance on various dense prediction tasks while maintaining computational efficiency, making it a promising technique for computer vision applications that operate on high-resolution inputs.
