面向RGB-D图像的多层特征提取算法综述
Survey of multilevel feature extraction methods for RGB-D images
2024, Vol. 29, No. 5, Pages 1346-1363
Print publication date: 2024-05-16
DOI: 10.11834/jig.230351
Li Yang, Wu Xiaoqun. 2024. Survey of multilevel feature extraction methods for RGB-D images. Journal of Image and Graphics, 29(05):1346-1363
RGB-D images contain rich multilevel features, such as low-level line and planar features and high-level semantic features. Multilevel features extracted from RGB-D images can serve as prior knowledge to improve the output quality of tasks such as indoor scene reconstruction and SLAM (simultaneous localization and mapping), making this one of the hot research topics in computer graphics. Traditional multilevel feature extraction algorithms generally exploit the rich color and texture information of the RGB image and the geometric information of the depth image. Such algorithms depend on the quality of the input RGB-D image, yet environmental and human factors during acquisition make high-quality RGB-D images difficult to obtain. With the rapid development of deep learning, learning-based multilevel feature extraction algorithms have overcome this limitation, and a number of high-quality research results have emerged. This paper surveys multilevel feature extraction algorithms for RGB-D images. First, the RGB-D datasets commonly used for multilevel feature extraction and the quality evaluation metrics of the related algorithms are summarized. Then, algorithms for line, planar, and semantic features are reviewed in turn, according to the level at which each feature resides. In addition, the advantages and disadvantages of the algorithms are compared, and a quantitative analysis is given using the common evaluation criteria. Finally, open problems of current multilevel feature extraction algorithms are discussed, and future development trends are outlined.
RGB-D images contain rich multilevel features, such as low-level line and planar features and high-level semantic features. These different levels of features provide valuable information for various computer vision tasks. By leveraging them, algorithms can extract meaningful information from RGB-D images and improve the performance of tasks such as object detection, tracking, and indoor scene reconstruction. When describing the line features of a single RGB-D image, the terms feature line and contour line are commonly used. Line features capture the spatial relationships and boundaries in the input image, aiding the understanding and interpretation of the input data. The terms plane and surface describe planar features, which refer to flat or nearly flat regions in the RGB-D image. When describing an object, the terms instance label and semantic label are used: an instance label is a unique identifier assigned to an individual occurrence of an object in an image, whereas a semantic label represents the broad class or category to which the object belongs. Semantic labels thus provide a high-level understanding of the objects in the image, grouping them into meaningful categories. Traditional methods for extracting line features typically combine the color and texture information of the RGB image with the geometric information of the depth image to extract feature and contour lines. Planar feature extraction usually clusters points with similar properties into sets, from which planes are then fitted. Semantic feature extraction aims to assign a semantic category to each pixel of the RGB-D input, and most methods for this task are based on deep learning.
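The clustering-and-fitting style of planar extraction described above can be illustrated with a minimal RANSAC plane-fitting sketch on synthetic data. This is a generic illustration, not any specific published method; the function name, parameters, and data are hypothetical:

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.01, seed=0):
    """Fit a dominant plane n·x + d = 0 to an (N, 3) point cloud with RANSAC."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(n_iters):
        # Hypothesize a plane from 3 randomly sampled points.
        p0, p1, p2 = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(normal)
        if norm < 1e-12:  # degenerate (collinear) sample, skip
            continue
        normal /= norm
        d = -normal @ p0
        # Score the hypothesis: points within `threshold` of the plane.
        inliers = np.abs(points @ normal + d) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (normal, d)
    return best_model, best_inliers

# Synthetic cloud: 500 noisy points on the z = 0 plane plus 100 outliers.
rng = np.random.default_rng(1)
plane_pts = np.column_stack([rng.uniform(-1, 1, (500, 2)),
                             rng.normal(0.0, 0.002, 500)])
outliers = rng.uniform(-1, 1, (100, 3))
points = np.vstack([plane_pts, outliers])
(normal, d), inliers = ransac_plane(points)
```

Real pipelines typically repeat this step, removing each plane's inliers before fitting the next, so that several planar regions are extracted from one depth frame.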
The multilevel features extracted from RGB-D images can serve as prior knowledge for tasks such as indoor scene reconstruction, scene understanding, and object recognition, improving the quality of their outputs. Multilevel feature extraction for RGB-D images is also a popular topic in computer graphics. With the development and popularization of commercial depth cameras, acquiring RGB-D data has become increasingly convenient. However, the quality of the captured data is often compromised by environmental and human factors during acquisition, which introduces problems such as noise and missing depth values and, in turn, degrades the quality of the extracted features. These problems are detrimental to traditional methods, but deep learning approaches have overcome them to a certain extent; with the rapid development of deep learning technology, numerous high-quality research results have emerged for learning-based multilevel feature extraction. This paper first summarizes the RGB-D datasets commonly used for multilevel feature extraction, such as NYU v2 and SUN RGB-D. These datasets cover diverse scenes, providing RGB images paired with corresponding depth images. Taking NYU v2 as an example, the dataset includes 1 449 labeled RGB-D images drawn from 464 distinct indoor scenes across 26 scene classes. After introducing the datasets, the paper summarizes the evaluation criteria commonly used to assess the quality of line, planar, and semantic features, and explains how each criterion is computed. In reviewing line feature extraction methods, a comprehensive summary of both traditional and deep learning approaches is presented, with detailed explanations of the principles, advantages, and limitations of the different methods.
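As background for the geometric information carried by the depth images in these datasets, a depth map is typically back-projected into a 3D point cloud with the pinhole camera model before lines or planes are fitted. A minimal sketch follows; the intrinsic values `fx`, `fy`, `cx`, `cy` are made-up illustrative numbers, not those of any particular sensor or dataset:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map (meters) into an (H*W, 3) point cloud using
    the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.dstack([x, y, depth]).reshape(-1, 3)

# Toy 2x2 depth map with every pixel 1 m away; intrinsics are illustrative.
depth = np.ones((2, 2))
pts = depth_to_points(depth, fx=500.0, fy=500.0, cx=0.5, cy=0.5)
```

In practice the intrinsics come with the dataset's calibration files, and invalid (zero) depth pixels are masked out before back-projection.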
Furthermore, the extraction results of several methods are compared quantitatively. Planar feature extraction methods are likewise surveyed from two perspectives, traditional and deep learning-based: relevant papers are gathered, the quality of the methods is compared, and the advantages and limitations of each method are explained in detail. Deep learning-based semantic feature extraction methods are reviewed from two aspects, fully supervised and semi-supervised, again with a summary of the relevant papers. To compare semantic feature extraction methods, this paper uses evaluation metrics such as pixel accuracy (PA), mean pixel accuracy (MPA), and mean intersection over union (mIoU). The quantitative comparisons reveal that methods designed for RGB-D data achieve better extraction quality than methods that use RGB data alone: the depth information in RGB-D data enables more accurate and robust extraction of semantic features, improving performance in tasks such as scene understanding and object recognition. Data annotation remains a challenge for deep learning-based feature extraction, because annotating large-scale datasets requires considerable time and human resources. Researchers have therefore actively sought ways to reduce the annotation workload or to make the most of existing annotated data.
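The three metrics named above (PA, MPA, mIoU) can all be derived from a single class confusion matrix. The sketch below shows one standard formulation; the toy label maps are hypothetical:

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes):
    """Compute PA, MPA, and mIoU from flattened predicted/ground-truth labels."""
    # Confusion matrix: rows = ground-truth class, columns = predicted class.
    mask = (gt >= 0) & (gt < num_classes)
    cm = np.bincount(num_classes * gt[mask] + pred[mask],
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(cm).astype(float)
    pa = tp.sum() / cm.sum()                    # pixel accuracy
    with np.errstate(divide='ignore', invalid='ignore'):
        per_class_acc = tp / cm.sum(axis=1)     # per-class accuracy (recall)
        iou = tp / (cm.sum(axis=1) + cm.sum(axis=0) - tp)
    mpa = np.nanmean(per_class_acc)             # mean pixel accuracy
    miou = np.nanmean(iou)                      # mean intersection over union
    return pa, mpa, miou

gt = np.array([0, 0, 1, 1, 2, 2])     # toy ground-truth labels
pred = np.array([0, 1, 1, 1, 2, 2])   # toy predictions (one pixel wrong)
pa, mpa, miou = segmentation_metrics(pred, gt, num_classes=3)
# pa ≈ 0.833, mpa ≈ 0.833, miou ≈ 0.722
```

`np.nanmean` skips classes absent from the ground truth, which would otherwise produce a zero-division NaN in the per-class terms.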
Consequently, unsupervised, semi-supervised, and transfer learning are widely investigated as ways to leverage unlabeled or sparsely labeled data for feature extraction. Finally, the open problems of current multilevel feature extraction algorithms are discussed, and future development trends are outlined.
Keywords: RGB-D images; multilevel features; line features; planar features; semantic features; feature extraction