Multi-focus image fusion

Multi-focus image fusion is a multiple-image compression technique that combines input images captured at different focus depths into one output image that preserves the in-focus information from each input.

Overview

The main idea of image fusion is to gather the important and essential information from the input images into a single image that ideally contains all of the information in the input images.[1][2][3][4] Research on image fusion spans more than 30 years and many scientific papers.[5][6] Image fusion generally has two aspects: image fusion methods and objective evaluation metrics.[6]

In visual sensor networks (VSN), the sensors are cameras that record images and video sequences. In many VSN applications, a single camera cannot capture the scene with every detail in focus, because the optical lens of a camera has a limited depth of field. Therefore, only objects within the depth of field around the focus distance are sharp, and other parts of the image are blurred.

A VSN can capture images with different depths of focus using several cameras. Because cameras generate a large amount of data compared with other sensors, such as pressure and temperature sensors, and because bandwidth, energy consumption and processing time are limited, it is essential to process the input images locally to decrease the amount of transmitted data.[5]

Multi-focus image fusion in the spatial domain

Huang and Jing reviewed and applied several focus measures in the spatial domain for multi-focus image fusion that are suitable for real-time applications. The focus measures they considered include variance, energy of image gradient (EOG), Tenenbaum's algorithm (Tenengrad), energy of Laplacian (EOL), sum-modified-Laplacian (SML), and spatial frequency (SF). Their experiments showed that EOL gave better results than other measures such as variance and spatial frequency.[7][4]
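As an illustration, four of the focus measures named above can be sketched with NumPy. This is a minimal sketch and not the implementation evaluated by Huang and Jing: the function names are hypothetical, the Laplacian uses a simple 4-neighbour mask (mask conventions vary between papers), and the spatial frequency is normalised by the number of differences rather than the number of pixels.

```python
import numpy as np

def variance(img):
    """Variance of the grey levels; larger for sharper regions."""
    img = np.asarray(img, dtype=np.float64)
    return float(np.mean((img - img.mean()) ** 2))

def energy_of_gradient(img):
    """EOG: sum of squared horizontal and vertical first differences."""
    img = np.asarray(img, dtype=np.float64)
    gx = img[:, 1:] - img[:, :-1]
    gy = img[1:, :] - img[:-1, :]
    return float(np.sum(gx ** 2) + np.sum(gy ** 2))

def energy_of_laplacian(img):
    """EOL: sum of squared Laplacian responses on interior pixels.
    Assumption: 4-neighbour Laplacian mask."""
    img = np.asarray(img, dtype=np.float64)
    lap = (img[:-2, 1:-1] + img[2:, 1:-1] +
           img[1:-1, :-2] + img[1:-1, 2:] - 4.0 * img[1:-1, 1:-1])
    return float(np.sum(lap ** 2))

def spatial_frequency(img):
    """SF: square root of the summed mean squared row and column differences."""
    img = np.asarray(img, dtype=np.float64)
    row_freq_sq = np.mean((img[:, 1:] - img[:, :-1]) ** 2)
    col_freq_sq = np.mean((img[1:, :] - img[:-1, :]) ** 2)
    return float(np.sqrt(row_freq_sq + col_freq_sq))
```

In a fusion scheme, such a measure would typically be evaluated on corresponding blocks or windows of the input images, and the block with the larger value would be treated as the better-focused one.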

Multi-focus image fusion in the multi-scale transform and DCT domains

Image fusion based on multi-scale transforms is the most commonly used and promising technique. Examples include the Laplacian pyramid transform, the gradient pyramid-based transform, the morphological pyramid transform and, most prominently, the discrete wavelet transform (DWT), the shift-invariant discrete wavelet transform (SIDWT), and the discrete cosine harmonic wavelet transform (DCHWT).[5][4][8] These methods are complex and have limitations in processing time and energy consumption. For example, multi-focus image fusion methods based on the DWT require many convolution operations, so they take more time and energy to process. Therefore, most multi-scale transform methods are not suitable for real-time applications.[8][4] Moreover, these methods perform poorly along edges, because the wavelet transform does not represent image edges well. They create ringing artefacts in the output image and reduce its quality.
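To make the transform-domain idea concrete, the following is a minimal sketch of wavelet-based fusion using a single-level Haar transform written directly in NumPy. It is an illustration under stated assumptions, not any of the cited methods: the function names are hypothetical, only one decomposition level is used, image dimensions are assumed to be even, and the fusion rule (average the approximation band, keep the detail coefficient with the larger magnitude) is one common choice among many.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar2d(x):
    """Single-level orthonormal 2-D Haar transform (even dimensions assumed)."""
    x = np.asarray(x, dtype=np.float64)
    a = (x[0::2, :] + x[1::2, :]) / SQRT2   # low-pass over row pairs
    d = (x[0::2, :] - x[1::2, :]) / SQRT2   # high-pass over row pairs
    ll = (a[:, 0::2] + a[:, 1::2]) / SQRT2
    lh = (a[:, 0::2] - a[:, 1::2]) / SQRT2
    hl = (d[:, 0::2] + d[:, 1::2]) / SQRT2
    hh = (d[:, 0::2] - d[:, 1::2]) / SQRT2
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    h, w = ll.shape
    a = np.empty((h, 2 * w))
    d = np.empty((h, 2 * w))
    a[:, 0::2] = (ll + lh) / SQRT2
    a[:, 1::2] = (ll - lh) / SQRT2
    d[:, 0::2] = (hl + hh) / SQRT2
    d[:, 1::2] = (hl - hh) / SQRT2
    x = np.empty((2 * h, 2 * w))
    x[0::2, :] = (a + d) / SQRT2
    x[1::2, :] = (a - d) / SQRT2
    return x

def fuse_haar(img_a, img_b):
    """Fuse two registered images of equal size in the Haar domain."""
    ca = haar2d(img_a)
    cb = haar2d(img_b)
    ll = (ca[0] + cb[0]) / 2.0               # average the approximation band
    details = [np.where(np.abs(da) >= np.abs(db), da, db)
               for da, db in zip(ca[1:], cb[1:])]  # keep the stronger detail
    return ihaar2d(ll, *details)
```

Even this small example performs a forward transform per input and one inverse transform, which hints at why deeper decompositions with longer filters cost more time and energy.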

Because of these problems with multi-scale transform methods, researchers have turned to multi-focus image fusion in the discrete cosine transform (DCT) domain. DCT-based methods are more efficient for transmitting and archiving images coded in the Joint Photographic Experts Group (JPEG) standard to the upper node in the VSN agent. A JPEG system consists of an encoder and a decoder. In the encoder, images are divided into non-overlapping 8×8 blocks, and the DCT coefficients are calculated for each block. Because quantization of the DCT coefficients is lossy, many of the small-valued coefficients, which mostly correspond to high frequencies, are quantized to zero. DCT-based image fusion algorithms are most effective when the fusion is applied directly in the compressed domain.[8][4]
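The block-wise DCT and quantization steps described above can be sketched as follows. This is a simplified illustration, not a JPEG encoder: the function names are hypothetical, the image dimensions are assumed to be multiples of the block size, the level shift and entropy coding are omitted, and a single uniform quantization step stands in for the standard's quantization tables.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def block_dct(img, n=8):
    """2-D DCT of every non-overlapping n x n block.
    Returns an array of shape (rows // n, cols // n, n, n)."""
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    c = dct_matrix(n)
    blocks = img.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3)
    return c @ blocks @ c.T

def quantize(coeffs, step=16.0):
    """Uniform quantization (assumption: one step size for all frequencies)."""
    return np.round(coeffs / step).astype(int)
```

For a smooth block, almost all of the energy sits in the low-frequency coefficients, so after quantization most of the remaining entries are zero, which is what makes the compressed representation compact.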

In addition, spatial-domain methods require the input images to be decoded and transferred to the spatial domain. After the image fusion operations, the fused output image must be encoded again. DCT domain-based methods do not require these complex and time-consuming consecutive decoding and encoding operations. Therefore, image fusion methods based on the DCT domain operate with much less energy and processing time.[8][4] Much recent research has been carried out in the DCT domain. DCT+Variance, DCT+Corr_Eng, DCT+EOL, and DCT+VOL are some prominent examples of DCT-based methods.[4][8]
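A variance-based selection rule in the DCT domain can be sketched as below. The sketch assumes orthonormal 2-D DCT coefficients stored as arrays of shape (block rows, block columns, n, n); under that assumption Parseval's relation gives the variance of a block as the sum of its squared AC coefficients divided by n², so no inverse transform is needed. The function names are hypothetical, and any post-processing of the decision map used in published methods is omitted.

```python
import numpy as np

def block_variance_from_dct(coeffs):
    """Per-block variance computed from orthonormal DCT coefficients.
    coeffs has shape (..., n, n); the DC term is at index [0, 0]."""
    n = coeffs.shape[-1]
    total_energy = np.sum(coeffs ** 2, axis=(-2, -1))
    dc_energy = coeffs[..., 0, 0] ** 2
    return (total_energy - dc_energy) / (n * n)

def fuse_dct_variance(coeffs_a, coeffs_b):
    """For each block, keep the coefficients of the input with larger variance.
    Returns the fused coefficients and the boolean decision map (True = input A)."""
    var_a = block_variance_from_dct(coeffs_a)
    var_b = block_variance_from_dct(coeffs_b)
    pick_a = var_a >= var_b
    fused = np.where(pick_a[..., None, None], coeffs_a, coeffs_b)
    return fused, pick_a
```

Because the selected blocks are already DCT coefficients, they can be passed on to the remaining encoding stages without a decode and re-encode cycle, which is the source of the savings described above.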

References

[9][10][11][12]

  1. Amin-Naji, Mostafa; Aghagolzadeh, Ali; Ezoji, Mehdi (2019). "Ensemble of CNN for multi-focus image fusion". Information Fusion. 51: 201–214. doi:10.1016/j.inffus.2019.02.003. ISSN 1566-2535. S2CID 150059597.
  2. Amin-Naji, Mostafa; Aghagolzadeh, Ali; Ezoji, Mehdi (2019). "CNNs hard voting for multi-focus image fusion". Journal of Ambient Intelligence and Humanized Computing. 11 (4): 1749–1769. doi:10.1007/s12652-019-01199-0. ISSN 1868-5145. S2CID 86563059.
  3. Liu, Yu; Chen, Xun; Peng, Hu; Wang, Zengfu (2017-07-01). "Multi-focus image fusion with a deep convolutional neural network". Information Fusion. 36: 191–207. doi:10.1016/j.inffus.2016.12.001. ISSN 1566-2535. S2CID 11925688.
  4. Amin-Naji, Mostafa; Aghagolzadeh, Ali (2018). "Multi-Focus Image Fusion in DCT Domain using Variance and Energy of Laplacian and Correlation Coefficient for Visual Sensor Networks". Journal of AI and Data Mining. 6 (2): 233–250. doi:10.22044/jadm.2017.5169.1624. ISSN 2322-5211.
  5. Li, Shutao; Kang, Xudong; Fang, Leyuan; Hu, Jianwen; Yin, Haitao (2017-01-01). "Pixel-level image fusion: A survey of the state of the art". Information Fusion. 33: 100–112. doi:10.1016/j.inffus.2016.05.004. ISSN 1566-2535. S2CID 9263669.
  6. Liu, Yu; Chen, Xun; Wang, Zengfu; Wang, Z. Jane; Ward, Rabab K.; Wang, Xuesong (2018-07-01). "Deep learning for pixel-level image fusion: Recent advances and future prospects". Information Fusion. 42: 158–173. doi:10.1016/j.inffus.2017.10.007. ISSN 1566-2535. S2CID 46849537.
  7. Huang, Wei; Jing, Zhongliang (2007-03-01). "Evaluation of focus measures in multi-focus image fusion". Pattern Recognition Letters. 28 (4): 493–500. Bibcode:2007PaReL..28..493H. doi:10.1016/j.patrec.2006.09.005. ISSN 0167-8655.
  8. Haghighat, Mohammad Bagher Akbari; Aghagolzadeh, Ali; Seyedarabi, Hadi (2011-09-01). "Multi-focus image fusion for visual sensor networks in DCT domain". Computers & Electrical Engineering. Special Issue on Image Processing. 37 (5): 789–797. doi:10.1016/j.compeleceng.2011.04.016. ISSN 0045-7906. S2CID 38131177.
  9. Amin-Naji, Mostafa; Aghagolzadeh, Ali; Ezoji, Mehdi (2018). "Fully Convolutional Networks for Multi-Focus Image Fusion". 2018 9th International Symposium on Telecommunications (IST). pp. 553–558. doi:10.1109/ISTEL.2018.8660989. ISBN 978-1-5386-8274-6. S2CID 71150698.
  10. Du, C.; Gao, S. (2017). "Image Segmentation-Based Multi-Focus Image Fusion Through Multi-Scale Convolutional Neural Network". IEEE Access. 5: 15750–15761. Bibcode:2017IEEEA...515750D. doi:10.1109/ACCESS.2017.2735019. S2CID 9466474.
  11. Tang, Han; Xiao, Bin; Li, Weisheng; Wang, Guoyin (2018-04-01). "Pixel convolutional neural network for multi-focus image fusion". Information Sciences. 433–434: 125–141. doi:10.1016/j.ins.2017.12.043. ISSN 0020-0255.
  12. Guo, Xiaopeng; Nie, Rencan; Cao, Jinde; Zhou, Dongming; Qian, Wenhua (2018-06-12). "Fully Convolutional Network-Based Multifocus Image Fusion". Neural Computation. 30 (7): 1775–1800. doi:10.1162/neco_a_01098. ISSN 0899-7667. PMID 29894654. S2CID 48358558.