HITNet is a neural network framework for depth estimation from stereo image pairs.
Overcoming Computational Disadvantages
Many neural network approaches to depth estimation operate on a full 3D cost volume, which is computationally expensive. HITNet avoids building such a volume explicitly: it integrates image warping, spatial propagation, and a high-resolution initialization step into the network architecture to overcome this disadvantage.
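To give a rough sense of why a full volume is costly, the sketch below counts how many matching costs are stored for a full volume versus a narrow band of candidates. The image size, the disparity range, and the band width of 3 are illustrative assumptions, not figures from HITNet, and the function names are hypothetical.

```python
def cost_volume_entries(height, width, num_disparities):
    """Matching costs stored by a full height x width x disparity volume."""
    return height * width * num_disparities


def narrow_band_entries(height, width, band=3):
    """Matching costs stored when only `band` candidates per location are kept."""
    return height * width * band


# Illustrative sizes only (assumed, not taken from the source).
h, w, d = 540, 960, 192
full = cost_volume_entries(h, w, d)
band = narrow_band_entries(h, w)
ratio = full / band
print(full, band, ratio)
```

Under these assumed sizes the narrow band stores a small fraction of the entries of the full volume, and the gap grows linearly with the disparity range.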
The Basic Principle
HITNet represents image tiles as planar patches, each with a learned feature descriptor attached. Information from the high-resolution initialization and the current hypotheses is then fused using spatial propagation. A convolutional neural network module updates the estimate of the planar patches and their attached features to increase the accuracy of the disparity predictions.
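A planar patch can be pictured as a disparity value plus two slopes, from which a disparity for every pixel of the tile follows. The sketch below is a minimal illustration under that assumption; the parameter names `d`, `dx`, `dy`, the 4 x 4 tile size, and the function name are hypothetical choices for this example, and the learned feature descriptor is omitted.

```python
import numpy as np


def tile_to_disparity(d, dx, dy, tile_size=4):
    """Expand one planar tile hypothesis into per-pixel disparities.

    d is the disparity at the tile centre; dx and dy are the slopes of the
    plane along x and y, in disparity units per pixel (assumed convention).
    """
    # Pixel offsets from the tile centre, e.g. [-1.5, -0.5, 0.5, 1.5].
    offsets = np.arange(tile_size) - (tile_size - 1) / 2.0
    # ox varies along columns (x), oy varies along rows (y).
    ox, oy = np.meshgrid(offsets, offsets)
    return d + dx * ox + dy * oy


# A tile centred at disparity 10 that tilts along x only.
patch = tile_to_disparity(d=10.0, dx=0.5, dy=0.0)
print(patch)
```

Storing three numbers per tile instead of one disparity per pixel lets a slanted surface be described compactly, which is the appeal of the planar representation.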
Iterative Improvement
The network iteratively increases the accuracy of the disparity predictions by using a local cost volume in a narrow band around the planar patch. In-network image warping allows the network to minimize image dissimilarity. Predictions are hierarchically upsampled to capture both fine details and large texture-free areas.
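The combination of warping and a narrow-band cost can be sketched with NumPy. This is a simplified illustration, not the network's implementation: it compares raw intensities with an absolute difference instead of learned features, works per pixel rather than per tile, and assumes the convention that a left-image pixel at x matches the right image at x minus the disparity. All function names are hypothetical.

```python
import numpy as np


def warp_right_to_left(right, disparity):
    """Resample the right image at x - d so it aligns with the left view."""
    h, w = disparity.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity  # source x per pixel
    xs = np.clip(xs, 0.0, w - 1.0)  # clamp samples that fall outside the image
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    # Linear interpolation along each scanline.
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]


def narrow_band_costs(left, right, disparity, offsets=(-1.0, 0.0, 1.0)):
    """Absolute-difference cost at a few candidates around the current estimate."""
    return np.stack([
        np.abs(left - warp_right_to_left(right, disparity + o))
        for o in offsets
    ])


# Toy pair: each right-image row is the left row shifted by 2 pixels.
left = np.tile(np.arange(16, dtype=np.float64) ** 2, (4, 1))
right = np.roll(left, -2, axis=1)

estimate = np.full((4, 16), 2.0)  # current disparity hypothesis
costs = narrow_band_costs(left, right, estimate)
# Index of the cheapest candidate, away from the image borders.
best = costs[:, :, 3:14].argmin(axis=0)
```

In this toy example the middle candidate (offset 0) wins everywhere away from the borders, because the hypothesis already equals the true shift. With a wrong hypothesis, the cheapest neighbour indicates the direction in which the estimate should move.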
Critical Features
A critical feature of the HITNet architecture is that matches from the initialization module are provided at each resolution. This facilitates the recovery of thin structures that may not be represented at low resolution.
Overall, HITNet provides an efficient way to estimate depth in images, with the ability to capture both fine details and large texture-free areas.