Review: RED-Net — Residual Encoder-Decoder Network (Denoising / Super Resolution)
Image Restoration including Image Denoising, Super Resolution, JPEG Deblocking, Image Deblurring and Image Inpainting.
In this story, RED-Net (Residual Encoder-Decoder Network) for image restoration is reviewed. Suppose we have a corrupted image y:
y = H(x) + n
where x is the clean version of y, H is the degradation function, and n is the additive noise. With the same network architecture trained on different datasets, i.e. different sets of x and y, RED-Net can be applied to Image Denoising, Super Resolution, JPEG Deblocking, Image Deblurring and Image Inpainting.
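To make the degradation model concrete, here is a minimal sketch assuming the additive form y = H(x) + n implied by the definitions above. It works on a toy 1-D signal; the helper names (`degrade`, `identity`, `box_blur`) and the choice of Gaussian noise are illustrative assumptions, not taken from the paper.

```python
import random

def degrade(clean, H, sigma, rng):
    """Return y = H(x) + n for a 1-D signal given as a list of floats."""
    transformed = H(clean)
    # n: additive zero-mean Gaussian noise with standard deviation sigma (assumed)
    return [v + rng.gauss(0.0, sigma) for v in transformed]

def identity(x):
    # Denoising setting: H leaves the signal untouched, only noise is added
    return list(x)

def box_blur(x):
    # Deblurring setting (illustrative): 3-tap moving average with edge clamping
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0 for i in range(n)]

rng = random.Random(0)
x = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
y_noisy = degrade(x, identity, sigma=0.1, rng=rng)    # pair (x, y) for denoising
y_blurred = degrade(x, box_blur, sigma=0.0, rng=rng)  # pair (x, y) for deblurring
```

Swapping `H` while keeping the rest fixed mirrors the idea that only the training pairs change from task to task, not the network.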
It was published in NIPS 2016 and has over 200 citations. A more detailed technical report was also posted on arXiv in 2016. (Sik-Ho Tsang @ Medium)
What Is Covered
- Network Architecture
- Ablation Study
- Results on Image Denoising, Super Resolution, JPEG Deblocking, Image Deblurring and Image Inpainting
1. Network Architecture
The network contains layers of symmetric convolution (encoder) and deconvolution (decoder).
Convolution
The convolutional layers act as a feature extractor: they capture an abstraction of the image content while eliminating noise and corruption.
Deconvolution
The deconvolutional layers are then combined to recover the details of the image content. A deconvolutional layer associates a single input activation with multiple outputs. Deconvolution is commonly used as a learnable up-sampling layer.
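The contrast between the two layer types can be sketched in 1-D. This is a toy illustration in pure Python, not the network's actual layers: the function names, the kernel values and the absence of padding are assumptions made for the example.

```python
def conv1d(signal, kernel, stride=1):
    """Valid (no padding) 1-D convolution in the cross-correlation form used by CNNs."""
    k = len(kernel)
    return [
        sum(signal[start + j] * kernel[j] for j in range(k))
        for start in range(0, len(signal) - k + 1, stride)
    ]

def deconv1d(signal, kernel, stride=1):
    """1-D transposed convolution: each input value is scattered over len(kernel) outputs."""
    k = len(kernel)
    out = [0.0] * ((len(signal) - 1) * stride + k)
    for i, value in enumerate(signal):
        for j in range(k):
            out[i * stride + j] += value * kernel[j]
    return out

kernel = [1.0, 2.0, 1.0]
features = conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, -1.0])  # many inputs -> one output
single = deconv1d([0.0, 1.0, 0.0], kernel)          # one active input touches three outputs
upsampled = deconv1d([1.0, 1.0], kernel, stride=2)  # stride 2 lengthens the signal
```

With a stride above 1 the output is longer than the input, which is why deconvolution is often used for up-sampling; with learned kernel weights that up-sampling becomes trainable.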
Skip/Shortcut Connections
Skip/Shortcut connections link convolutional feature maps to their mirrored deconvolutional feature maps every few (in this case, two) layers. Thus, the response of a convolutional layer is propagated directly to the corresponding mirrored deconvolutional layer, in both the forward and the backward pass. The passed convolutional feature maps are summed element-wise with the deconvolutional feature maps, and the result is passed to the next layer after rectification.
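The merge step described above (element-wise sum, then rectification) can be sketched as follows. This is a simplified illustration on flat lists rather than real feature maps, and `merge_skip` is a hypothetical helper name.

```python
def relu(values):
    # Rectification: negative responses are clipped to zero
    return [max(0.0, v) for v in values]

def merge_skip(deconv_features, conv_features):
    """Element-wise sum of mirrored feature maps, followed by rectification."""
    if len(deconv_features) != len(conv_features):
        raise ValueError("mirrored feature maps must have the same shape")
    return relu([d + c for d, c in zip(deconv_features, conv_features)])

# Toy values: the conv features carry detail that the decoder path has lost
decoder_out = [0.5, -2.0, 1.0]
encoder_out = [0.25, 0.5, -1.0]
merged = merge_skip(decoder_out, encoder_out)
```

Because the merge is a plain addition, the mirrored feature maps must have identical shapes, and the gradient reaching the sum flows back unchanged along both the decoder path and the shortcut.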
2. Ablation Study
2.1. Different Combinations of Convolution and Deconvolution
- With only 5 or 10 deconvolutional layers (i.e. learnable up-sampling), the PSNR is poor.
- With only 5 or 10 convolutional layers, the PSNR is better.
- With 5 convolutional and 5 deconvolutional layers, the PSNR is much better.
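All comparisons in this review are reported in PSNR (peak signal-to-noise ratio), computed from the mean squared error between the restored and the clean image. Below is a minimal sketch of the standard definition; the function name and the default peak value of 255 (8-bit images) are assumptions for the example.

```python
import math

def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equally sized images given as flat lists of pixel values."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

clean = [0.0, 255.0]
score_bad = psnr(clean, [0.0, 0.0])       # one of two pixels is maximally wrong
score_perfect = psnr(clean, [0.0, 255.0])
```

Higher is better: PSNR grows as the mean squared error shrinks, so even a gain of a fraction of a dB means a measurably lower reconstruction error.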
2.2. Effectiveness of Skip/Shortcut Connections
- With skip connections, the PSNR is much better.
- One likely reason is that deeper networks can destroy image details, which is undesirable for pixel-wise dense regression. Skip connections carry important image details forward, which helps to reconstruct the clean image.
- Very deep networks can easily suffer from training issues such as vanishing gradients. Skip connections help to address this problem.
- Without skip connections, a network with more layers actually ends up with a higher training loss than one with fewer layers.
- With skip connections, the 30-layer network reaches a lower training loss than the 20-layer network.
- RED-Net, which uses long and short symmetric skip connections, performs better than a network built from the ResNet building block.
3. Results on Image Denoising, Super Resolution, JPEG Deblocking, Image Deblurring and Image Inpainting
3.1. Image Denoising
- Goal: reduce the noise in noisy images.
- Datasets: 14 common benchmark images, and the BSD dataset.
3.1.1. One Model for One Noise Level
- RED2n: n convolutional and n deconvolutional layers with symmetric skip connections.
- RED10 already obtains better results than other state-of-the-art approaches.
- RED30 obtains even better results.
3.1.2. One Model for All Noise Levels
- PSNR is lower than with separate per-level models, but it still beats the existing methods.
3.2. Super Resolution
- Goal: enlarge an image to a higher resolution.
- Datasets: Set5, Set14, and BSD100
3.2.1. One Model for One Scaling Factor
- RED30 again obtains the highest PSNR, better than SRCNN.
- VDSR and DRCN, concurrent works on super resolution, were developed at around the same time as RED-Net.
- RED30 is the best or close to the best across all datasets and scaling factors.
3.2.2. One Model for All Scaling Factors
- RED30 still performs quite well.
3.3. JPEG Deblocking
- Lossy compression, such as JPEG, introduces complex compression artifacts, particularly the blocking artifacts, ringing effects and blurring.
- Reduce the JPEG compression artifacts.
- Datasets: LIVE1
3.4. Image Deblurring
- Goal: reduce the blur in an image.
- RED30 performs best, with the highest PSNR.
3.5. Image Inpainting
- Goal: fill in the holes or corrupted parts of an image.
- RED30 obtains better results than FoE.