A Ghost Module is a building block for convolutional neural networks whose purpose is to generate more feature maps while using fewer parameters. To achieve this, a regular convolutional layer is split into two parts. The first part applies a small, controlled number of ordinary convolutions to produce a set of intrinsic feature maps. The second part applies a series of simple linear operations to those intrinsic feature maps to create additional feature maps.

Why do we need Ghost Modules?

One of the main motivations for Ghost Modules is the redundancy in the intermediate feature maps produced by conventional convolutional neural networks (CNNs): many of these feature maps are highly similar to one another. Rather than computing every such map with a full convolution, Ghost Modules generate the redundant ones with cheap operations, thereby reducing the overall number of parameters and computations required.

In a regular convolutional layer, the number of FLOPs (floating-point operations) required is roughly n × h' × w' × c × k × k, where n is the number of filters (output channels), c is the number of input channels, h' and w' are the height and width of the output feature maps, and k is the kernel size of the convolutional filters. This value can be very large, especially for CNNs with many filters and input channels.
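To get a feel for the scale, here is a quick back-of-the-envelope calculation of this formula in Python; the layer sizes are hypothetical, chosen only for illustration:

```python
# Approximate FLOPs of a regular convolutional layer: n * h' * w' * c * k * k
n, c = 256, 256        # number of filters and input channels (hypothetical)
h_out, w_out = 56, 56  # output height and width (hypothetical)
k = 3                  # kernel size

flops = n * h_out * w_out * c * k * k
print(f"{flops:,}")  # 1,849,688,064 -> roughly 1.8 GFLOPs for a single layer
```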

How do Ghost Modules work?

In a Ghost Module, the output feature maps are considered to be "ghosts" of a smaller set of intrinsic feature maps generated by a primary, ordinary convolution. The number of intrinsic feature maps is typically much smaller than the number of output feature maps the module ultimately produces, which is what keeps the primary convolution cheap.

To obtain the final output feature maps, a series of cheap linear operations is applied to each intrinsic feature map. These operations, typically implemented as channel-wise (depthwise) convolutions, are much less computationally expensive than regular convolutions and operate on each channel independently. The "ghost" feature maps they produce are concatenated with the intrinsic feature maps to form the final output.
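To see why a channel-wise operation counts as cheap, compare the parameter count of a regular convolution with that of a depthwise convolution over the same channels. The sketch below uses PyTorch, and the channel count and kernel size are arbitrary assumptions:

```python
import torch.nn as nn

channels, k = 64, 3  # hypothetical number of intrinsic feature maps and kernel size

# Regular convolution mixing all channels: channels * channels * k * k weights
regular = nn.Conv2d(channels, channels, kernel_size=k, padding=k // 2, bias=False)

# Depthwise convolution (groups=channels) acting on each channel independently:
# only channels * k * k weights -- a typical "cheap linear operation"
cheap = nn.Conv2d(channels, channels, kernel_size=k, padding=k // 2,
                  groups=channels, bias=False)

count = lambda m: sum(p.numel() for p in m.parameters())
print(count(regular), count(cheap))  # 36864 vs. 576
```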

The number of ghost feature maps generated from each intrinsic feature map can vary, depending on the number of linear operations used. In practice, several different linear kernels can be used, such as 3 × 3 and 5 × 5 kernels. The final output of a Ghost Module typically contains m × s feature maps, where m is the number of intrinsic feature maps and s is the number of feature maps derived from each intrinsic one (counting the intrinsic map itself).
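Putting the pieces together, a minimal PyTorch sketch of a Ghost Module might look as follows. The class name, layer choices, and default sizes are illustrative assumptions rather than a reference implementation:

```python
import math
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Primary convolution + cheap depthwise operations, concatenated."""

    def __init__(self, in_channels, out_channels, kernel_size=1,
                 ratio=2, cheap_kernel_size=3):
        super().__init__()
        # m intrinsic feature maps, so that out_channels is roughly m * ratio
        intrinsic = math.ceil(out_channels / ratio)
        ghost = intrinsic * (ratio - 1)

        # Part 1: ordinary convolution producing the intrinsic feature maps
        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, intrinsic, kernel_size,
                      padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(intrinsic),
            nn.ReLU(inplace=True),
        )
        # Part 2: cheap channel-wise (depthwise) operations producing ghosts
        self.cheap = nn.Sequential(
            nn.Conv2d(intrinsic, ghost, cheap_kernel_size,
                      padding=cheap_kernel_size // 2,
                      groups=intrinsic, bias=False),
            nn.BatchNorm2d(ghost),
            nn.ReLU(inplace=True),
        )
        self.out_channels = out_channels

    def forward(self, x):
        intrinsic_maps = self.primary(x)
        ghost_maps = self.cheap(intrinsic_maps)
        # Concatenate intrinsic and ghost maps, then trim to out_channels
        out = torch.cat([intrinsic_maps, ghost_maps], dim=1)
        return out[:, :self.out_channels]
```

For example, GhostModule(16, 64) applied to a 16-channel input produces 64 output channels at the same spatial resolution, with half of them coming from the cheap depthwise path rather than the full convolution.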

Advantages of Ghost Modules

Ghost Modules offer several advantages over conventional convolutional layers. First, they reduce the number of parameters and computations required, which improves the efficiency of the network. Second, because fewer feature maps are produced by expensive convolutions, the resulting models are smaller, which can help reduce overfitting and improve generalization. Finally, the mix of intrinsic and ghost feature maps can yield diverse features, which may also make the network's behaviour easier to interpret.

Ghost Modules are a powerful tool for improving the efficiency of convolutional neural networks. By generating redundant intermediate feature maps with cheap operations rather than full convolutions, they can maintain the overall performance of the network while reducing its computational cost. Although Ghost Modules are still a relatively new tool in deep learning, they are already showing promise for a wide range of applications, from image processing and computer vision to natural language processing and robotics.
