Differentiable Neural Architecture Search (DNAS)

Are you tired of manually designing neural network architectures? Looking for a more efficient way to optimize ConvNet architectures? Look no further than Differentiable Neural Architecture Search (DNAS). DNAS uses gradient-based methods to explore a layer-wise search space, selecting a different building block for each layer of the ConvNet. The search space is represented by a stochastic super net: a single over-parameterized network in which each layer's candidate operators execute stochastically.
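To make the idea concrete, here is a minimal sketch of one super-net layer. The candidate "blocks" are stand-in scalar functions rather than real conv blocks, and the names (`candidate_ops`, `supernet_layer`) are hypothetical, not from any DNAS implementation; the point is that at each forward pass the layer stochastically executes one of its candidates according to its architecture distribution.

```python
import numpy as np

# Hypothetical stand-ins for a layer's candidate building blocks
# (in a real super net these would be conv blocks of varying cost).
candidate_ops = [
    lambda x: x * 1.0,   # e.g. skip / identity
    lambda x: x * 0.5,   # e.g. a cheap block
    lambda x: x * 2.0,   # e.g. an expensive block
]

def supernet_layer(x, probs, rng):
    """Stochastically execute one candidate operator, sampled from
    this layer's architecture distribution `probs`."""
    idx = rng.choice(len(candidate_ops), p=probs)
    return candidate_ops[idx](x), idx

rng = np.random.default_rng(0)
out, chosen = supernet_layer(np.array(4.0), [0.2, 0.5, 0.3], rng)
```

Stacking such layers gives a super net whose every forward pass traces out one concrete architecture from the layer-wise search space.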

Optimizing ConvNet Architectures

Previous methods optimized ConvNet architectures by enumerating candidate architectures and training each one separately. DNAS avoids this lengthy process by instead learning a distribution over architectures from which the optimal architecture can be drawn. Sampling from a discrete distribution is not differentiable, so DNAS relaxes the choice using the Gumbel Softmax technique. The architecture distribution can then be trained directly with gradient-based optimization such as SGD, making DNAS a far more efficient way to optimize ConvNet architectures.
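A minimal sketch of the Gumbel Softmax relaxation: adding Gumbel noise to the logits and applying a temperature-scaled softmax yields a sample that behaves like a draw from the categorical distribution but is differentiable with respect to the logits. This is a generic illustration of the trick, not code from a specific DNAS codebase.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Differentiable, near-one-hot sample from a categorical
    distribution parameterized by `logits` (Gumbel Softmax trick)."""
    rng = rng or np.random.default_rng()
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max())  # numerically stable softmax
    return y / y.sum()

rng = np.random.default_rng(0)
sample = gumbel_softmax(np.array([1.0, 2.0, 0.5]), tau=0.5, rng=rng)
# `sample` sums to 1; as tau -> 0 it approaches a one-hot block choice
```

As the temperature `tau` anneals toward zero during the search, the soft block weights sharpen toward a discrete architecture choice.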

Training the Stochastic Super Net

The loss used to train the stochastic super net combines a cross-entropy term, which drives accuracy, with a latency term that penalizes the network's latency on a target device. This dual objective ensures that the selected architecture is not only accurate but also fast on its intended hardware. Estimating latency is a challenge, however, because the search space is far too large to benchmark every candidate architecture. Instead, the latency of each operator in the search space is measured once, and a lookup table model computes an architecture's overall latency by summing the latencies of its operators. This formulation also makes latency differentiable with respect to the layer-wise block choices.
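The lookup-table model can be sketched as follows. The latency numbers and names (`latency_lut`, `expected_latency`) are illustrative assumptions, not measurements from a real device; the point is that the expected latency is a probability-weighted sum over the table, and therefore linear (hence differentiable) in the layer-wise block probabilities.

```python
import numpy as np

# Hypothetical per-operator latencies (ms), measured once on the
# target device: one row per layer, one column per candidate block.
latency_lut = np.array([
    [1.0, 2.5, 4.0],   # layer 1
    [0.8, 2.0, 3.5],   # layer 2
])

def expected_latency(block_probs):
    """Expected overall latency: for each layer, weight each candidate
    block's table latency by its selection probability, then sum over
    all layers. Linear in `block_probs`, so gradients flow through."""
    return float((block_probs * latency_lut).sum())

probs = np.array([
    [0.7, 0.2, 0.1],   # layer 1 leans toward the cheap block
    [0.1, 0.3, 0.6],   # layer 2 leans toward the expensive block
])
lat = expected_latency(probs)
# layer 1: 0.7*1.0 + 0.2*2.5 + 0.1*4.0 = 1.6 ms
# layer 2: 0.1*0.8 + 0.3*2.0 + 0.6*3.5 = 2.78 ms
```

This expected-latency term is what gets added to the cross-entropy loss, steering the architecture distribution toward blocks that are both accurate and fast.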

The Benefits of DNAS

DNAS offers several benefits to the field of machine learning. First and foremost, it saves time by avoiding the lengthy process of manually designing and individually training neural network architectures. By learning a distribution over architectures, it finds models with higher accuracy and faster execution on target devices. And its lookup-table latency model makes latency estimation tractable even in an enormous search space, making DNAS a more efficient way to optimize ConvNet architectures.

So why manually design neural network architectures when DNAS offers an easier and more efficient way to optimize them? Give it a try and see the benefits firsthand.
