Generative Adversarial Networks (GANs) are deep learning models that learn to generate realistic images from random noise. A variation called the Laplacian Generative Adversarial Network (LAPGAN) adds a new idea to image generation: refinement through successive stages, from coarse to fine.

The LAPGAN Architecture

The LAPGAN architecture is composed of a set of generative convolutional neural network (convnet) models, one per level of a Laplacian pyramid. Each model is trained to capture the distribution of the pyramid coefficients of natural images at its level. The Laplacian pyramid is a multiscale image representation: a sequence of band-pass images at progressively lower resolutions, plus a low-frequency residual. Each band-pass level holds the difference between the image at that scale and an upsampled copy of the next coarser scale. The final, coarsest level is not a difference image but a small low-pass version of the original.
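
The pyramid itself can be sketched in a few lines of NumPy. This is a minimal illustration, not the exact filters used by LAPGAN: the function names are hypothetical, downsampling is assumed to be 2x2 block averaging, and upsampling is assumed to be nearest-neighbour repetition. Any fixed pair of downsampling and upsampling operators gives the same exact-reconstruction property.

```python
import numpy as np

def downsample(img):
    """Halve height and width by averaging each 2x2 block (assumed stand-in for blur-and-decimate)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double height and width by nearest-neighbour repetition (assumed stand-in for smooth-and-expand)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(img, levels):
    """Return band-pass coefficients from fine to coarse, followed by the low-pass residual."""
    pyramid = []
    current = img
    for _ in range(levels):
        coarse = downsample(current)
        # Band-pass level: what is lost when going to the coarser scale and back.
        pyramid.append(current - upsample(coarse))
        current = coarse
    pyramid.append(current)  # coarsest level: a small low-pass image, not a difference
    return pyramid

def reconstruct(pyramid):
    """Invert the pyramid: start at the coarsest level, then upsample and add coefficients."""
    current = pyramid[-1]
    for coeffs in reversed(pyramid[:-1]):
        current = upsample(current) + coeffs
    return current

rng = np.random.default_rng(0)
image = rng.random((32, 32))          # random array standing in for a grayscale image
pyr = build_laplacian_pyramid(image, levels=2)
restored = reconstruct(pyr)
```

Reconstruction recovers the input up to floating-point rounding, which is why a model that generates plausible coefficients at every level is enough to generate a full image.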

After training, sampling proceeds from coarse to fine. It starts at the coarsest level of the pyramid, where the corresponding generative model produces a low-resolution residual image from a noise vector alone. At each subsequent level, a conditional generative model takes an upsampled version of the current image and a noise vector as inputs and generates the band-pass coefficients for that level. Adding these coefficients to the upsampled image yields the image at the next resolution, and repeating this step up to the finest level gives the full-resolution sample.
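
The coarse-to-fine sampling loop might look like the following sketch. The two generator functions are hypothetical, untrained stand-ins (simple functions of noise) used only to show the data flow; in LAPGAN they would be trained convnets. The image sizes, the number of levels, and nearest-neighbour upsampling are likewise assumptions.

```python
import numpy as np

def upsample(img):
    """Double height and width by nearest-neighbour repetition (assumed upsampling operator)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def toy_unconditional_generator(noise):
    """Hypothetical stand-in for the coarsest-level model: noise in, small image out."""
    return np.tanh(noise)

def toy_conditional_generator(low_pass, noise):
    """Hypothetical stand-in for a conditional model: returns coefficients shaped like low_pass."""
    return 0.1 * np.tanh(noise) * (1.0 + np.abs(low_pass))

def sample_image(rng, coarse_size=8, levels=2):
    # Coarsest level: generated from noise alone.
    image = toy_unconditional_generator(rng.standard_normal((coarse_size, coarse_size)))
    for _ in range(levels):
        low_pass = upsample(image)                      # conditioning input
        noise = rng.standard_normal(low_pass.shape)     # fresh noise at every level
        coeffs = toy_conditional_generator(low_pass, noise)
        image = low_pass + coeffs                       # refine: add generated detail
    return image

sample = sample_image(np.random.default_rng(0))
```

Each pass through the loop doubles the resolution, so an 8x8 start with two refinement levels ends at 32x32.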

Training

The generative models are trained with the conditional generative adversarial network (CGAN) approach, separately at each level of the pyramid. Each training image is first transformed into a Laplacian pyramid. At each level, a stochastic choice is then made: the coefficients for that level are either taken from the image's own pyramid, computed by the standard Laplacian pyramid procedure, or generated by the corresponding generative model.

The generative model at each level takes as inputs the low-pass (coarse-scale) version of the image and a noise vector. The discriminator takes the same low-pass image together with either the real or the generated coefficients for the current level, and is trained to tell which of the two it was given.
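
How one discriminator training example is assembled at a single level can be sketched as follows. This covers only the data preparation, not the networks or the adversarial update. The names are hypothetical, the generator is an untrained placeholder, the stochastic choice is assumed to be a fair coin flip, and the 2x2 averaging and nearest-neighbour operators are assumptions.

```python
import numpy as np

def downsample(img):
    """Halve height and width by averaging each 2x2 block (assumed operator)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Double height and width by nearest-neighbour repetition (assumed operator)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def toy_generator(low_pass, noise):
    """Hypothetical placeholder for the level's generative model."""
    return 0.1 * noise

def make_discriminator_example(image, rng, generator=toy_generator):
    """Return (low_pass, coeffs, label) for one level; label 1 = real, 0 = generated."""
    low_pass = upsample(downsample(image))   # coarse version, brought back to full size
    if rng.random() < 0.5:                   # assumed: both branches equally likely
        coeffs = image - low_pass            # real coefficients from the Laplacian pyramid
        label = 1
    else:
        noise = rng.standard_normal(image.shape)
        coeffs = generator(low_pass, noise)  # generated coefficients
        label = 0
    return low_pass, coeffs, label

rng = np.random.default_rng(0)
image = rng.random((16, 16))                 # random array standing in for a training image
examples = [make_discriminator_example(image, rng) for _ in range(100)]
```

The discriminator would receive the pair (low_pass, coeffs) and be trained to predict the label, while the generator would be trained to make its coefficients pass as real.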

The Key Idea

The key idea behind LAPGAN is to break image generation into successive refinements. This approach gives up any global notion of fidelity: no network is ever trained to tell the output of the whole cascade apart from a real image. Instead, each refinement step only has to be plausible given the coarser image it starts from. This makes it possible to generate higher-resolution images while still preserving the characteristics of natural images.

LAPGAN is an elegant and effective approach to image generation. Its multiscale design can produce high-quality, diverse images, with the ability to generate images of variable size and resolution.
