Multi-modal hierarchical classification for power equipment defect detection

Bai Yanfeng; Wang Libiao; Gao Weidong; Ma Yinglong

doi:10.11834/jig.230269

Image Analysis and Recognition

2024, Vol. 29, No. 7, pp. 2011-2023
Print publication date: 2024-07-16

Bai Yanfeng, Wang Libiao, Gao Weidong, Ma Yinglong. 2024. Multi-modal hierarchical classification for power equipment defect detection. Journal of Image and Graphics, 29(07): 2011-2023. DOI: 10.11834/jig.230269

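Abstract (translated from the Chinese)

Objective

Condition monitoring and fault maintenance of power equipment are an important foundation for the normal operation of power systems. In most substations, equipment defect types are complex, and existing single-label defect detection methods cannot meet the need for multi-label defect classification. To address this, the paper proposes a multi-modal hierarchical classification method for power equipment defect detection.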

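Method

First, defect images of power equipment are collected from multiple substations and preprocessed with manual annotation, data augmentation, and normalization, yielding a power equipment defect image dataset with a hierarchical label structure. Next, a hierarchical classification model based on multi-modal feature fusion is proposed: a ResNet50 network extracts image features, and a region proposal network localizes objects and predicts foreground versus background. To avoid the error introduced by quantizing the position coordinates produced by the region proposal network, ROI Align (region of interest align) is applied, which computes position coordinates with continuous operations. Finally, classification proceeds hierarchically: the parent category label is embedded into the object feature representation of the current layer, defects are classified layer by layer, and the last layer yields the final detection result.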
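The quantization issue that ROI Align avoids can be illustrated with a minimal sketch. It assumes a single-channel feature map stored as a nested list; the helper names `bilinear` and `roi_align`, the 2x2 output grid, and the 2x2 sample points per bin are illustrative choices, not settings reported in the paper.

```python
import math


def bilinear(feature, y, x):
    """Sample a 2-D feature map (list of rows) at a continuous (y, x) location."""
    h, w = len(feature), len(feature[0])
    # Clamp to the valid range so the four neighbours always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, box, out_size=2, samples=2):
    """Pool a box (y1, x1, y2, x2), given in continuous feature-map
    coordinates, into an out_size x out_size grid by averaging bilinear
    samples. No coordinate is rounded at any point."""
    y1, x1, y2, x2 = box
    bin_h = (y2 - y1) / out_size
    bin_w = (x2 - x1) / out_size
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            total = 0.0
            for sy in range(samples):
                for sx in range(samples):
                    # Sample points are evenly spaced inside each bin.
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    total += bilinear(feature, y, x)
            row.append(total / (samples * samples))
        pooled.append(row)
    return pooled


# Toy feature map whose value equals its column index,
# so sampling at a continuous x simply returns x.
feature_map = [[float(c) for c in range(8)] for _ in range(8)]
aligned = roi_align(feature_map, (1.3, 2.6, 5.3, 6.6))
```

On this toy map each pooled value equals the horizontal centre of its bin (about 3.6 and 5.6). Rounding the box to integer coordinates before pooling, as quantizing ROI pooling does, would shift those centres and lose that sub-pixel accuracy.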

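Results

Comparative experiments were run on the power equipment defect dataset and on a benchmark dataset, against multi-label classification methods for power equipment defect detection and widely used object detection algorithms. The model achieves the highest detection accuracy on the great majority of defect categories, with an average detection accuracy of 86.4%, which is 5.1% higher than the second-best model. On the benchmark dataset, average detection accuracy also improves by 1.1% to 3%.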

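Conclusion

The proposed method makes full use of the semantic information and hierarchical structure of defect labels together with the image features of the defect data. Through the multi-modal hierarchical classification model, it improves the accuracy of power equipment defect detection.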

Abstract

Objective

Detecting the state of power equipment and maintaining it after faults are basic prerequisites for the safe, normal operation of power systems. As defects in substations grow more diverse and complex, defect recognition increasingly has to handle multi-label classification over a large number of closely related defect labels. Most existing approaches to power equipment defect detection handle this poorly, because defect category labels differ in semantic granularity and are often closely related to one another. As a result, existing methods struggle to meet the requirements of multi-label classification-based defect detection for power equipment. To address these problems, this paper proposes a multi-modal hierarchical classification method for power equipment defect detection that is suited to complex power equipment environments.

Method

We propose a multi-modal hierarchical classification method that fuses the feature information of defect images, hierarchical structure information, and the semantic information of category labels. First, defect images of power equipment are collected from multiple substations and preprocessed with manual annotation, data augmentation, and normalization to construct a power equipment defect image dataset with a hierarchical label structure. Then, a hierarchical classification model based on multi-modal feature fusion and hierarchical fine-tuning is proposed. It uses a ResNet50 network to extract image features and a region proposal network to locate objects and predict foreground and background. The region of interest align (ROI Align) method then computes position coordinates with continuous operations, which avoids the error introduced by quantizing the coordinates generated by the region proposal network. Finally, the hierarchical structure of the power equipment to be detected is used to embed the parent category labels into the current layer's object feature representation for layer-by-layer defect classification. The final defect detection result is obtained in the last layer.
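The layer-by-layer step can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' model: the two-level hierarchy, the label names, the weights, and the embedding vectors are all made up, and fusing the parent label embedding with the region feature by concatenation is an assumption, since the abstract does not say which fusion operator is used.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Apply a linear layer; weights holds one row per output class."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def classify_hierarchically(feature, levels):
    """Classify layer by layer. Every layer after the first sees the region
    feature concatenated with the embedding of the predicted parent label."""
    path = []
    parent_embedding = []  # the top layer has no parent
    for level in levels:
        x = feature + parent_embedding  # fusion by concatenation (assumption)
        probs = softmax(linear(level["weights"], level["bias"], x))
        best = max(range(len(probs)), key=probs.__getitem__)
        path.append(level["labels"][best])
        parent_embedding = level["embeddings"][best]
    return path


# Hypothetical two-level hierarchy: equipment type, then defect type.
levels = [
    {
        "labels": ["insulator", "meter"],
        "weights": [[2.0, -1.0], [-1.0, 2.0]],
        "bias": [0.0, 0.0],
        "embeddings": [[1.0, 0.0], [0.0, 1.0]],
    },
    {
        "labels": ["insulator_crack", "meter_blurred_dial"],
        # Input is the 2-d feature plus the 2-d parent embedding.
        "weights": [[0.5, 0.0, 2.0, -2.0], [0.0, 0.5, -2.0, 2.0]],
        "bias": [0.0, 0.0],
        "embeddings": [[1.0, 0.0], [0.0, 1.0]],  # unused after the last layer
    },
]

prediction = classify_hierarchically([1.0, 0.0], levels)
```

In this toy setup the second-layer weights on the embedding dimensions steer the decision toward children of the predicted parent, which is the intuition behind feeding parent label information into the next layer.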

Result

Comparative experiments are conducted on the real-world power equipment defect dataset and the PASCAL VOC2012 benchmark dataset against current multi-label classification-based power equipment defect detection methods and widely used object detection algorithms. The proposed method achieves the best detection accuracy for most equipment defect categories, with a mean average precision of 86.4%. Compared with the second-best model, accuracy improves by 5.1%, and the mean average precision on the benchmark dataset increases by 1.1% to 3%. The proposed method also runs in relatively less time than the compared methods.
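Mean average precision averages the per-category average precision (AP). As a rough illustration, the sketch below computes AP as the area under the precision-recall curve with all-point interpolation, which is one common convention. The abstract does not state which AP variant or overlap threshold the authors used, and the detections here are made-up toy data.

```python
def average_precision(detections, num_ground_truth):
    """detections: list of (confidence, is_true_positive) for one category.
    Returns the area under the interpolated precision-recall curve."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        tp += 1 if is_tp else 0
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)
    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_category):
    """per_category: dict of category -> (detections, num_ground_truth)."""
    aps = [average_precision(d, n) for d, n in per_category.values()]
    return sum(aps) / len(aps)


# Toy data: two hypothetical defect categories.
toy = {
    "category_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "category_b": ([(0.6, True)], 1),
}
ap_a = average_precision(*toy["category_a"])
toy_map = mean_average_precision(toy)
```

Because the score is averaged over categories rather than over detections, a rare defect category counts as much as a common one.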

Conclusion

Our method achieves higher detection accuracy than the compared methods while keeping computational cost lower. By fully exploiting the semantic relationships between equipment defect labels, the hierarchical classification model based on multi-modal feature fusion improves the accuracy of power equipment defect detection.

Keywords

defect detection; image recognition; hierarchical classification; multi-modal feature fusion; label embedding; regional feature aggregation
