Selective Kernel

What is Selective Kernel?

A Selective Kernel (SK) unit is a type of bottleneck block used in Convolutional Neural Network (CNN) architectures. It consists of a 1x1 convolution, an SK convolution, and a second 1x1 convolution, applied in that order. The SK unit was introduced in the SKNet architecture, where SK convolutions replace the large-kernel convolutions (those larger than 1x1) in the original bottleneck blocks of ResNeXt. Its main purpose is to let the network adjust its receptive field size dynamically, depending on the input.

How does a Selective Kernel unit work?

An SK convolution runs several convolution paths with different kernel sizes in parallel and then aggregates their outputs using weights computed from the combined features, so the paths most relevant to the current input contribute the most. Three hyper-parameters determine the final settings of an SK convolution: the number of paths, the group number, and the reduction ratio. The number of paths sets how many kernel choices can be aggregated, the group number controls the cardinality of each path, and the reduction ratio controls the number of parameters in the fuse operator.
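The aggregation step can be illustrated with a small sketch. It assumes the commonly described formulation in which each path receives one score per channel and the scores are normalised with a softmax across paths. To stay short, each path's feature map is reduced to a single number per channel, and the scores are passed in directly rather than computed by learned layers. The function names are hypothetical and not taken from any library.

```python
import math


def softmax_across_paths(logits):
    """Turn logits[m][c] into weights[m][c] that sum to 1 over paths m for each channel c."""
    num_paths = len(logits)
    num_channels = len(logits[0])
    weights = [[0.0] * num_channels for _ in range(num_paths)]
    for c in range(num_channels):
        column = [logits[m][c] for m in range(num_paths)]
        peak = max(column)  # subtract the max for numerical stability
        exps = [math.exp(value - peak) for value in column]
        total = sum(exps)
        for m in range(num_paths):
            weights[m][c] = exps[m] / total
    return weights


def select(path_features, logits):
    """Combine per-path channel features: out[c] = sum over m of weights[m][c] * path_features[m][c]."""
    weights = softmax_across_paths(logits)
    num_paths = len(path_features)
    num_channels = len(path_features[0])
    return [
        sum(weights[m][c] * path_features[m][c] for m in range(num_paths))
        for c in range(num_channels)
    ]
```

With equal scores for every path, `select` reduces to a plain average of the paths. Raising one path's score for a channel shifts that channel's output toward that path, which is the sense in which the unit "selects" a kernel.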

Why are Selective Kernel units important?

The SK unit matters because it can adaptively select its receptive field size, which helps improve the performance of CNNs. The bottleneck blocks in ResNeXt use a kernel size that is fixed when the network is designed, so every input is processed with the same receptive field, which can lead to suboptimal results. SK units make the receptive field size flexible: the network weights the available kernels differently depending on what is relevant for the input at hand.

Typical settings of SK convolutions

An SK convolution is written as SK[M, G, r], where M is the number of paths, G is the group number, and r is the reduction ratio. Together these three values determine the final configuration of the SK unit. One typical setting is SK[2, 32, 16]: two paths, 32 groups in each path, and a reduction ratio of 16. Other values of M, G, and r can be used depending on the application or task.
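The following sketch shows how such a setting could be read and what it implies for a given channel count. It rests on several assumptions that are not stated in this article: the fuse step compresses C channels to d = max(C / r, L) with a lower bound L = 32, the division is rounded down, and the weight count covers only the dense layers (one C-to-d layer plus one d-to-C layer per path), ignoring biases and normalisation. The class and method names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SKConfig:
    paths: int      # M: number of kernel choices
    groups: int     # G: cardinality of each path
    reduction: int  # r: reduction ratio of the fuse operator

    @classmethod
    def from_string(cls, text):
        """Parse a setting written like 'SK[2, 32, 16]' or 'SK [2, 32, 16]'."""
        pattern = r"SK\s*\[\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\]"
        match = re.fullmatch(pattern, text.strip())
        if match is None:
            raise ValueError(f"not an SK setting: {text!r}")
        m, g, r = (int(part) for part in match.groups())
        return cls(paths=m, groups=g, reduction=r)

    def bottleneck_dim(self, channels, min_dim=32):
        """Assumed rule: d = max(C // r, L), with L = min_dim."""
        return max(channels // self.reduction, min_dim)

    def channels_per_group(self, channels):
        """Channels handled by each group of a grouped convolution path."""
        if channels % self.groups != 0:
            raise ValueError("channels must be divisible by the group number")
        return channels // self.groups

    def fuse_select_weights(self, channels, min_dim=32):
        """Dense weights only: C*d for the squeeze plus M*d*C for the per-path scores."""
        d = self.bottleneck_dim(channels, min_dim)
        return channels * d + self.paths * d * channels
```

Under these assumptions, `SKConfig.from_string("SK[2, 32, 16]")` applied to 512 channels gives a compressed dimension of 32 and 16 channels per group. A larger r shrinks d, and with it the weight count, until the lower bound L takes over.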

Applications of Selective Kernel units

Selective Kernel units have been reported to improve results on several computer vision tasks, including object recognition, object detection, and semantic segmentation. In object recognition, SKNet outperformed both ResNet and Inception-v4 models on the ImageNet dataset. The use of SK units also improved the mAP (mean Average Precision) of the RetinaNet object detection model. SKNet models have also been used for semantic segmentation of medical images, with better accuracy than traditional CNN models.

In summary, Selective Kernel units let a CNN select among kernel sizes dynamically for each input rather than committing to a single fixed kernel. This flexibility in receptive field size is what drives the improvements reported for object recognition, object detection, and semantic segmentation.