结合频率和ViT的工业产品表面相似特征缺陷检测方法
Defect detection method for industrial product surfaces with similar features by combining frequency and ViT
- 2024年29卷第10期 页码:3074-3089
纸质出版日期: 2024-10-16
DOI: 10.11834/jig.230532
王素琴, 程成, 石敏, 朱登明. 2024. 结合频率和ViT的工业产品表面相似特征缺陷检测方法. 中国图象图形学报, 29(10):3074-3089
Wang Suqin, Cheng Cheng, Shi Min, Zhu Dengming. 2024. Defect detection method for industrial product surfaces with similar features by combining frequency and ViT. Journal of Image and Graphics, 29(10):3074-3089
目的
工业产品表面的缺陷检测是保证其质量的重要环节。针对工业产品表面缺陷与背景相似度高、表面缺陷特征相似的问题,提出了一种差异化检测网络YOLO-Differ(you only look once-difference)。
方法
该网络以YOLOv5(you only look once version 5)为基础,利用离散余弦变换算法和自注意力机制提取和增强频率特征,并通过融合频率特征,增大缺陷与背景特征之间的区分度;同时考虑到融合中存在的错位问题,设计自适应特征融合模块对齐并融合RGB特征和频率特征。其次,在网络的检测模块后新增细粒度分类分支,将视觉变换器(vision Transformer,ViT)作为该分支中的校正分类器,专注于提取和识别缺陷的微小特征差异,以应对不同缺陷特征细微差异的挑战。
结果
实验在3个数据集上与7种目标检测模型进行了对比,YOLO-Differ模型均取得了最优结果,与其他模型相比,平均准确率均值(mean average precision,mAP)分别提升了3.6%、2.4%和0.4%以上。
结论
YOLO-Differ模型与同类模型相比,具有更高的检测精度和更强的通用性。
Objective
In industrial production, surface defects are difficult to avoid because of the complex manufacturing environment. These defects not only compromise the integrity of products but also degrade their quality, posing potential threats to health and safety. Defect detection on the surface of industrial products is therefore an indispensable part of production. In defect detection tasks, targets must be classified accurately to determine whether they require recycling treatment. At the same time, the detection results must be presented as bounding boxes to help enterprises analyze the causes of defects and improve the production process. The traditional approach to surface defect detection is manual inspection, which in practice has substantial limitations. In recent years, computing performance has improved dramatically, and traditional machine vision techniques have been widely tested in various production fields. These methods rely on image processing and feature engineering and, in specific scenarios, can approach the accuracy of manual inspection, allowing machines to replace some manual labor. Their shortcoming is the difficulty of extracting features from complex backgrounds, which often leads to inaccurate detection; as a result, such methods are rarely reusable for other types of workpiece inspection. Deep learning has played an increasingly important role in computer vision in recent years. Deep learning-based defect detection methods learn from numerous defect samples and use the learned features to achieve classification and localization.
With high detection accuracy and broad applicability, these methods avoid the complexity and uncertainty of manual feature extraction in traditional image processing and have achieved remarkable results in industrial product surface defect detection. However, given the complex backgrounds of some industrial product surfaces, the high similarity between some defects and the background, and the small differences between defect types, existing methods struggle to detect surface defects accurately. In this study, we propose a differential detection network, YOLO-Differ, based on YOLOv5.
Method
First, for cases where some defects resemble background features on the product surface: studies in biology and psychology show that, during predation, predators use perceptual filters tuned to specific features to separate target animals from the background; in other words, they capture camouflaged targets by exploiting frequency-domain features. The frequency signal strength of the target is lower than that of the background, and this difference helps locate targets that resemble the background. Therefore, a novel method is proposed to integrate frequency cues into the object detection network for the first time, addressing the inaccurate localization caused by defects that resemble the background and enhancing the distinguishability between defects and the background. Second, a fine-grained classification branch is added after the detection module of the network to address the small differences in defect features among different types. A vision Transformer (ViT) classification network serves as the corrective classifier in this branch to extract subtle distinguishing features of defects. Specifically, it divides the defect image into N patches small enough for its self-attention mechanism to capture important regions in the image. At the same time, the Transformer models global relationships among patches and assigns each patch an importance with respect to the classification result. This long-range relationship modeling and importance weighting enable it to locate subtle feature differences and focus on the important features of defects. Accordingly, YOLO-Differ is divided into five parts: RGB feature extraction, frequency feature extraction, feature fusion, detection head, and fine-grained classification.
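The patch splitting and attention-based importance weighting that the ViT branch relies on can be illustrated with a minimal numpy sketch. This is a toy reconstruction, not the paper's implementation: the real branch uses a full learned ViT, whereas here the tokens themselves stand in for the learned query/key projections.

```python
import numpy as np

def split_into_patches(image, patch_size):
    """Split an (H, W, C) image into N flattened patch tokens, as a ViT does.

    N = (H // patch_size) * (W // patch_size); each patch becomes a
    1-D token of length patch_size * patch_size * C.
    """
    h, w, _ = image.shape
    patches = []
    for i in range(0, h - h % patch_size, patch_size):
        for j in range(0, w - w % patch_size, patch_size):
            patches.append(image[i:i + patch_size, j:j + patch_size].reshape(-1))
    return np.stack(patches)

def attention_weights(tokens):
    """Toy single-head self-attention scores: softmax(Q K^T / sqrt(d)).

    Uses the raw tokens as both Q and K (no learned projections) purely to
    show how each patch receives an importance over all other patches.
    """
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=1, keepdims=True)

img = np.random.rand(32, 32, 3)      # stand-in for a cropped defect image
tokens = split_into_patches(img, 8)  # 16 patches, each of length 8*8*3 = 192
attn = attention_weights(tokens)     # (16, 16) patch-to-patch importance
```

In the actual branch, these per-patch importances are what let the classifier concentrate on the few patches carrying the subtle defect differences.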
First, the RGB feature extraction part, which consists of the backbone network and the neck, extracts basic RGB feature information and fuses RGB features of different scales to improve detection. Next, the RGB image is converted to the YCbCr color space, and the result is processed through a discrete cosine transform (DCT) and frequency enhancement to obtain its frequency features. The feature fusion module aligns and fuses the RGB features with the frequency features. The fused features are then fed into the detection head to obtain defect localization information and preliminary classification results. Finally, the defect images are cropped according to the location information and fed into the fine-grained classifier for secondary classification to obtain the final classification results of the defects.
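The frequency branch described above (YCbCr conversion followed by a DCT) can be sketched roughly as follows. This is an illustrative numpy reconstruction, not the authors' code: the BT.601 conversion coefficients and the 8x8 block size are assumptions borrowed from JPEG-style processing.

```python
import numpy as np

def rgb_to_ycbcr(img):
    """Convert an (H, W, 3) RGB image to YCbCr (ITU-R BT.601, full range)."""
    m = np.array([[ 0.299,     0.587,     0.114   ],
                  [-0.168736, -0.331264,  0.5     ],
                  [ 0.5,      -0.418688, -0.081312]])
    ycbcr = img @ m.T
    ycbcr[..., 1:] += 128.0  # offset the two chroma channels
    return ycbcr

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def block_dct(channel, block=8):
    """Apply a 2-D DCT to each non-overlapping block x block tile of a channel
    (JPEG-style), yielding a per-tile frequency-coefficient map."""
    h, w = channel.shape
    m = dct_matrix(block)
    out = np.empty((h, w), dtype=float)
    for i in range(0, h, block):
        for j in range(0, w, block):
            tile = channel[i:i + block, j:j + block]
            out[i:i + block, j:j + block] = m @ tile @ m.T
    return out

img = np.full((16, 16, 3), 255.0)  # a plain white test image
ycbcr = rgb_to_ycbcr(img)
freq = block_dct(ycbcr[..., 0])    # frequency map of the luma (Y) channel
```

In the paper's pipeline, maps like `freq` would then pass through the frequency-enhancement step and the adaptive fusion module that aligns them with the RGB features.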
Result
In the experiment, the YOLO-Differ model was compared with seven object detection models on three datasets and consistently achieved the best results. Compared with the current state-of-the-art models, its mean average precision (mAP) improved by 3.6%, 2.4%, and 0.4% on the respective datasets.
Conclusion
Compared with similar models, the YOLO-Differ model exhibits higher detection accuracy and stronger generality.
表面缺陷检测;相似性;频率特征;细粒度分类;通用性
surface defect detection; similarity; frequency features; fine-grained classification; generality