Parallel channel shuffling and Transformer-based denoising for hyperspectral images

Hu Shuai; Gao Feng; Gong Zhuoran; Tao Shengen; ShangGuan Xinyu; Dong Junyu

doi:10.11834/jig.230381

Remote sensing image processing


2024, Vol. 29, No. 7, pp. 2063-2074
Print publication date: 2024-07-16

Hu Shuai, Gao Feng, Gong Zhuoran, Tao Shengen, ShangGuan Xinyu, Dong Junyu. 2024. Parallel channel shuffling and Transformer-based denoising for hyperspectral images. Journal of Image and Graphics, 29(07): 2063-2074. DOI: 10.11834/jig.230381.

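Abstract

Objective

Hyperspectral images are easily contaminated by noise from equipment and environmental factors, which reduces image visibility and analysis accuracy. Hyperspectral image denoising has therefore become a research focus in remote sensing image processing, both in China and abroad. Current hyperspectral image denoising methods face two main difficulties: 1) insufficient use of the global information in features: methods based on convolutional neural networks are limited by kernel size and struggle to capture global feature information; 2) convolutional neural networks and Transformers differ structurally, which makes them hard to fuse, so a suitable feature interaction scheme is needed to balance local and global feature extraction.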

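Method

To address these problems, this paper proposes a hyperspectral image denoising model based on Transformer and channel-mixing parallel convolution. The model comprises three modules: a channel-mixing feature extraction module, a block-downsampling-based global enhancement module, and an adaptive bidirectional feature fusion module. Through the interaction of these three modules, the model can fully combine global and local feature information, handle the differences in noise and texture across regions, and effectively improve its ability to recover spatial detail.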

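Result

Experiments compare the proposed method with five mainstream methods on two datasets. On the Pavia dataset under different Gaussian noise intensities, the peak signal-to-noise ratio (PSNR) improves by up to 0.4 dB over the second-best model. On the ICVL dataset under various mixed-noise settings, PSNR improves by up to 2.18 dB over the second-best model. The visualized denoising results also reflect the strong performance of the proposed model.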
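The PSNR comparison under Gaussian noise can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' protocol: the cube layout (H, W, bands), data scaled to [0, 1], averaging PSNR over bands (a common convention for hyperspectral data), the noise levels, and the helper names `add_gaussian_noise` and `mpsnr`.

```python
import numpy as np

def add_gaussian_noise(cube, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` (hypothetical helper)."""
    return cube + rng.normal(0.0, sigma, size=cube.shape)

def mpsnr(clean, estimate, peak=1.0):
    """Mean PSNR in dB over bands for cubes shaped (H, W, bands)."""
    mse = np.mean((clean - estimate) ** 2, axis=(0, 1))  # one MSE value per band
    mse = np.maximum(mse, 1e-12)                         # avoid log of zero
    return float(np.mean(10.0 * np.log10(peak ** 2 / mse)))

rng = np.random.default_rng(0)
clean = rng.random((32, 32, 8))                  # synthetic stand-in for an HSI patch
noisy_low = add_gaussian_noise(clean, 0.05, rng)  # assumed noise levels
noisy_high = add_gaussian_noise(clean, 0.20, rng)

print("PSNR, sigma=0.05:", mpsnr(clean, noisy_low))
print("PSNR, sigma=0.20:", mpsnr(clean, noisy_high))
```

A denoiser would be scored by calling `mpsnr(clean, denoised)` in the same way; a higher value means the estimate is closer to the clean cube.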

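Conclusion

The proposed method denoises well under various noise conditions and significantly outperforms current mainstream methods. It effectively removes noise from hyperspectral images while preserving their rich texture information.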

Abstract

Objective

With the increasing availability and advancement of hyperspectral imaging technology, hyperspectral images have become an invaluable resource in various fields, including agriculture, environmental monitoring, and remote sensing. However, these images are often prone to noise contamination, which can significantly degrade their quality and hinder accurate analysis and interpretation. As a result, denoising hyperspectral images has become a crucial task in the field of remote sensing image processing, attracting significant attention from researchers worldwide. The challenges associated with denoising hyperspectral images are multifaceted. First, the inherent characteristics of hyperspectral data, such as high dimensionality and complex spectral information, pose significant difficulties for traditional denoising approaches. The presence of noise in hyperspectral images can obscure valuable information embedded within the spectral bands, making it essential to develop advanced denoising techniques that can effectively restore the original signal while preserving the rich texture and spatial details. Furthermore, the development of deep learning techniques, particularly convolutional neural networks (CNNs), has revolutionized the field of image processing, including denoising tasks. CNN-based approaches have shown promising results in denoising various types of images. However, when it comes to hyperspectral data, traditional CNN architectures face limitations in capturing the global contextual information necessary for accurate denoising. The fixed-size receptive fields of CNNs restrict their ability to exploit the spatial and spectral correlations present in hyperspectral images, thereby reducing their overall denoising performance.
To overcome these limitations, recent research has explored the integration of Transformers, which were originally designed for natural language processing tasks, into the field of computer vision, including hyperspectral image denoising. Transformers are capable of capturing long-range dependencies and global contextual information, making them an attractive alternative to CNNs for denoising tasks. However, directly applying Transformer-based models to hyperspectral data requires careful consideration of the specific challenges posed by the unique characteristics of hyperspectral images.

Method

In this study, we propose a novel denoising model for hyperspectral images that combines the strengths of Transformers and parallel convolution operations. Our model comprises three key modules: a channel shuffling module, a block downsampling global enhancement module, and an adaptive bidirectional feature fusion module. These modules work together to address the challenges encountered in denoising hyperspectral images. The channel shuffling module exploits the inter-channel relationships within hyperspectral data by incorporating channel-mixing operations. By fusing information across different spectral channels, the module enhances the representation power of the network and enables more comprehensive feature extraction. This approach addresses the limitation of traditional CNN-based methods in fully utilizing the global information available in hyperspectral images, ultimately improving the model’s denoising performance. In the block downsampling global enhancement module, we use a block downsampling strategy to capture global contextual information. By reducing the spatial resolution of the input hyperspectral image, the module enlarges the receptive fields, allowing the model to incorporate larger-scale information during the denoising process. This mechanism enhances the model’s understanding of the overall structure of the image, facilitating more effective noise suppression and accurate restoration of spatial details. The adaptive bidirectional feature fusion module is designed to strike a balance between local and global feature extraction, leveraging the complementary strengths of CNNs and Transformers. This module introduces a mechanism for adaptively fusing features from local and global contexts, enabling the model to combine local details with global information.
By considering the intricate relationship between spatial and spectral features, our proposed approach improves the denoising performance and preserves the rich texture information inherent in hyperspectral images.
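The abstract does not specify how channels are mixed. As one plausible reading, the sketch below shows a ShuffleNet-style channel shuffle, which interleaves channels from different groups so that a following grouped convolution sees information from every group. The function name `channel_shuffle`, the (C, H, W) layout, and the choice of two groups are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def channel_shuffle(x, groups):
    """Interleave the channels of a (C, H, W) feature map across `groups` groups."""
    c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    x = x.reshape(groups, c // groups, h, w)  # split channels into groups
    x = x.transpose(1, 0, 2, 3)               # swap group and within-group axes
    return x.reshape(c, h, w)                 # flatten back to C channels

# Six channels labelled 0..5, split into two groups: [0, 1, 2] and [3, 4, 5].
features = np.arange(6).reshape(6, 1, 1)
order = channel_shuffle(features, groups=2)[:, 0, 0].tolist()
print(order)  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, neighbouring channels come from different groups, which is the property a grouped or parallel convolution branch would rely on to exchange information across spectral channels.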

Result

To evaluate the effectiveness of our proposed model, extensive experiments were conducted on publicly available hyperspectral image datasets, including ICVL and Pavia. Experimental results demonstrated the superior denoising performance of our approach compared with that of current state-of-the-art methods. Our model consistently outperformed existing techniques in various noise scenarios, removing noise while preserving the fine spatial details and rich texture information of hyperspectral images. The experimental evaluation involved quantitative metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and spectral angle mapper (SAM). Our proposed model achieved significantly higher PSNR values and SSIM scores compared with the baseline methods, indicating improved denoising accuracy and visual quality of the restored images. In addition, the SAM values obtained using our model were consistently lower, indicating higher spectral similarity. Moreover, we conducted a comprehensive analysis of the computational efficiency of our model. With the increasing volume and complexity of hyperspectral data, developing denoising methods that are computationally efficient without sacrificing performance is crucial. Our proposed model demonstrated competitive computational efficiency, making it practical for real-world applications that involve large-scale hyperspectral image processing.
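To make the SAM criterion concrete, the following sketch computes the mean spectral angle between two cubes. It assumes cubes shaped (H, W, bands) and reports the angle in degrees; the function name `spectral_angle` and the small `eps` guard are illustrative choices, and the paper's exact evaluation code may differ.

```python
import numpy as np

def spectral_angle(clean, estimate, eps=1e-12):
    """Mean angle in degrees between per-pixel spectra of two (H, W, bands) cubes."""
    dot = np.sum(clean * estimate, axis=-1)
    norms = np.linalg.norm(clean, axis=-1) * np.linalg.norm(estimate, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)  # guard against rounding outside [-1, 1]
    return float(np.degrees(np.mean(np.arccos(cos))))

rng = np.random.default_rng(0)
cube = rng.random((16, 16, 8)) + 0.1  # strictly positive synthetic spectra

# Scaling every spectrum leaves its direction unchanged, so the angle stays near zero.
angle_scaled = spectral_angle(cube, 2.0 * cube)

# Two orthogonal single-pixel spectra are 90 degrees apart.
a = np.array([[[1.0, 0.0]]])
b = np.array([[[0.0, 1.0]]])
angle_orthogonal = spectral_angle(a, b)

print(angle_scaled, angle_orthogonal)
```

Because the angle ignores per-pixel brightness scaling, a lower SAM indicates that the shape of each restored spectrum is closer to the reference, which complements the intensity-based PSNR.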

Conclusion

The success of our denoising model can be attributed to the synergistic combination of the Transformer-based architecture and the channel-mixing parallel convolution operations. The Transformer module enables effective capture of global contextual information, facilitating better understanding of the relationships between spectral bands and spatial features. By incorporating channel-mixing operations, our model exploits the inter-channel correlations and enhances the discriminative power of feature extraction, resulting in improved denoising performance. Furthermore, our model’s ability to handle diverse noise scenarios and maintain image quality can be attributed to the adaptive bidirectional feature fusion module. This module intelligently combines local and global features, enabling effective noise suppression while preserving the fine details and texture information specific to different regions of the hyperspectral images. The adaptability of the feature fusion mechanism ensures robust denoising performance across various noise levels and image characteristics. In conclusion, this study presents a novel denoising model for hyperspectral images based on the integration of Transformers and channel-mixing parallel convolution. The proposed model effectively addresses the limitations of traditional approaches in utilizing global information and captures the complex spatial-spectral correlations inherent in hyperspectral data. Experimental results demonstrate its superior denoising performance compared with that of state-of-the-art methods, with improved accuracy and preservation of fine details and texture information. The model’s computational efficiency further enhances its practicality for real-world applications. Future research directions may include exploring additional mechanisms for adaptive feature fusion and investigating the model’s performance on other hyperspectral image processing tasks such as classification and segmentation.

Keywords

hyperspectral image denoising; channel shuffling; Transformer; feature fusion; global attention

