轻量化视觉定位技术综述
Lightweight visual-based localization technology
2024, Vol. 29, No. 10, pp. 2880-2911
Print publication date: 2024-10-16
DOI: 10.11834/jig.230744
叶翰樵, 刘养东, 申抒含. 2024. 轻量化视觉定位技术综述. 中国图象图形学报, 29(10):2880-2911
Ye Hanqiao, Liu Yangdong, Shen Shuhan. 2024. Lightweight visual-based localization technology. Journal of Image and Graphics, 29(10):2880-2911
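Visual localization aims to recover the camera pose of a query image with respect to a known 3D scene. Being low-cost, accurate, and easy to integrate, it is one of the key technologies that enable computing devices to interact intelligently with the real world, and it has attracted wide attention in application areas such as mixed reality and autonomous driving. As a fundamental and long-studied task in computer vision, visual localization has made remarkable progress. However, existing methods generally suffer from excessive computational overhead and storage footprint. These problems make efficient deployment on mobile devices and the updating and maintenance of scene models difficult, and thus largely limit the practical application of visual localization. To address this, some research has begun to focus on making visual localization lightweight. Lightweight visual localization studies more efficient scene representations and the associated localization methods, and is becoming an important research direction in the field. This paper first reviews early visual localization frameworks and then classifies existing lightweight visual localization work from the perspective of scene representation. For each category, it analyzes and summarizes the characteristics and advantages, application scenarios, and technical difficulties, and introduces representative results. It further compares the performance of several representative lightweight methods on commonly used indoor and outdoor datasets; the evaluation covers three dimensions: offline mapping time, storage footprint of the scene map, and localization accuracy. Existing lightweight visual localization techniques still face many difficulties and challenges, and there remains considerable room for improvement in the expressive power of scene models and in the generalization and robustness of localization methods. Finally, this paper analyzes and looks ahead to future trends in lightweight visual localization.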
Visual-based localization determines the camera translation and orientation of an image observation with respect to a prebuilt 3D representation of the environment. It is an essential technology that enables intelligent interaction between computing devices and the real world. Compared with alternative positioning systems, its ability to estimate an accurate 6DOF camera pose, together with its flexible and inexpensive deployment, makes visual-based localization a cornerstone of many applications, ranging from autonomous vehicles to augmented and mixed reality. As a long-standing problem in computer vision, visual localization has made remarkable progress over the past decades. A primary branch of prior work relies on a preconstructed 3D map obtained by structure-from-motion (SfM) techniques. Such 3D maps, also known as SfM point clouds, store 3D points and per-point visual features. To estimate the camera pose, these methods typically establish correspondences between 2D keypoints detected in the query image and 3D points of the SfM point cloud through descriptor matching. The 6DOF camera pose of the query image is then recovered from these 2D-3D matches by applying geometric principles from photogrammetry. Despite delivering sound and reliable performance, such a scheme often consumes several gigabytes of storage for a single scene, which results in heavy computational overhead and a prohibitive memory footprint for large-scale applications and resource-constrained platforms. Furthermore, it suffers from other drawbacks, such as costly map maintenance and privacy vulnerability. These issues pose a major bottleneck in real-world applications and have prompted researchers to shift their focus toward leaner solutions.
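The descriptor-matching step described above can be sketched as a nearest-neighbor search with a ratio test. This is only an illustration under stated assumptions: the function name `match_2d_3d`, the ratio of 0.8, and the toy 4-D descriptors are hypothetical, and real systems would use high-dimensional descriptors with an approximate nearest-neighbor index rather than the brute-force search shown here.

```python
import math

def match_2d_3d(query_desc, map_desc, ratio=0.8):
    """Return (query_index, map_point_index) pairs that pass the ratio test."""
    matches = []
    for qi, q in enumerate(query_desc):
        # Brute-force Euclidean distance to every map point descriptor.
        dists = sorted((math.dist(q, m), mi) for mi, m in enumerate(map_desc))
        if len(dists) < 2:
            continue
        (d1, best), (d2, _) = dists[0], dists[1]
        # Keep the match only if the best candidate is clearly better
        # than the runner-up; ambiguous matches are discarded.
        if d1 < ratio * d2:
            matches.append((qi, best))
    return matches

# Toy descriptors (hypothetical values, one per 3D map point).
map_desc = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
query_desc = [
    [0.9, 0.1, 0.0, 0.0],  # close to map point 0
    [0.5, 0.5, 0.0, 0.0],  # equally close to map points 0 and 1
]
matches = match_2d_3d(query_desc, map_desc)
```

The surviving 2D-3D pairs would then typically be passed to a perspective-n-point solver inside a RANSAC loop to recover the 6DOF pose; that stage is not shown here.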
Lightweight visual-based localization seeks to improve scene representations and the associated localization methods, making the resulting framework computationally tractable and memory-efficient without a notable loss in performance. As background, this literature review first introduces several flagship frameworks for the visual-based localization task. These frameworks can be broadly classified into three categories: image-retrieval-based methods, structure-based methods, and hierarchical methods. The 3D scene representations adopted in these conventional frameworks, such as reference image databases and SfM point clouds, generally exhibit a high degree of redundancy, which causes excessive memory usage and makes it harder to distinguish scene features during descriptor matching. Next, this review provides a guided tour of recent advances that improve the compactness of 3D scene representations and the efficiency of the corresponding visual localization methods. From the perspective of scene representations, existing research efforts in lightweight visual localization can be classified into six categories. Within each category, this review analyzes its characteristics, application scenarios, and technical limitations while also surveying representative works. First, several methods have been proposed to enhance memory efficiency by compressing the SfM point clouds. These methods reduce the size of SfM point clouds through a combination of techniques including feature quantization, keypoint subset sampling, and feature-free matching. Extreme compression rates, such as 1% and below, can be achieved with barely noticeable accuracy degradation. Second, employing line maps as scene representations has become a focus of research in lightweight visual localization.
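As an illustration of keypoint subset sampling, the sketch below uses a greedy K-cover heuristic: it keeps adding the 3D point seen by the most under-covered mapping images until every image observes at least `k` retained points. The abstract does not say which selection rule the surveyed methods use, so this formulation, the function name `greedy_k_cover`, and the toy visibility data are all assumptions made for illustration.

```python
def greedy_k_cover(visibility, k):
    """Select 3D points so each image sees at least k of them when possible.

    visibility: dict mapping point id -> set of image ids observing it.
    Returns the selected point ids in the order they were chosen.
    """
    images = set().union(*visibility.values())
    need = {img: k for img in images}   # observations still required per image
    remaining = dict(visibility)
    selected = []

    def gain(pid):
        # Number of still under-covered images this point would help.
        return sum(1 for img in remaining[pid] if need[img] > 0)

    while remaining and any(n > 0 for n in need.values()):
        # Sorting first makes tie-breaking deterministic (lowest id wins).
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # no remaining point can improve coverage
        selected.append(best)
        for img in remaining.pop(best):
            if need[img] > 0:
                need[img] -= 1
    return selected

# Hypothetical visibility of five 3D points across three mapping images.
visibility = {
    "p0": {"A", "B", "C"},
    "p1": {"A", "B"},
    "p2": {"C"},
    "p3": {"A"},
    "p4": {"B", "C"},
}
kept_k1 = greedy_k_cover(visibility, k=1)
kept_k2 = greedy_k_cover(visibility, k=2)
```

A larger `k` keeps more points and so trades memory for robustness, since each image retains more candidates for 2D-3D matching.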
In human-made scenes characterized by salient structural features, substituting line maps for point clouds offers two major merits: 1) the abundance and rich geometric properties of line segments make line maps a concise option for depicting the environment; 2) line features are more robust in weakly textured areas and under temporally varying lighting conditions. However, the lack of a unified line descriptor and the difficulty of establishing 2D-3D correspondences between 3D line segments and image observations remain the main challenges. In the field of autonomous driving, high-definition maps constructed from vectorized semantic features have unlocked a new wave of cost-effective and lightweight solutions to visual localization for self-driving vehicles. Recent trends involve the use of data-driven techniques that learn to localize. This end-to-end philosophy has given rise to two families of regression-based methods. Scene coordinate regression (SCR) methods eschew the explicit processes of feature extraction and matching. Instead, they establish a direct mapping between image observations and scene coordinates through regression. While a grounding in geometry remains essential for camera pose estimation in SCR methods, pose regression methods employ deep neural networks to map image observations to camera poses without any explicit geometric reasoning. Absolute pose regression techniques are akin to image retrieval approaches, with limited accuracy and generalization capability, while relative pose regression techniques typically serve as a postprocessing step following the coarse localization stage. Neural radiance fields and related volumetric approaches have emerged as a novel form of neural implicit scene representation.
While visual localization based solely on a learned volumetric implicit map is still in an exploratory phase, the progress made over the past year or two has already yielded impressive performance in terms of scene representation capability and localization precision. Furthermore, this study quantitatively evaluates the performance of several representative lightweight visual localization methods on well-known indoor and outdoor datasets. The evaluation metrics considered are offline mapping time, storage demand, and localization accuracy. Results reveal that SCR methods generally stand out among existing work, offering remarkably compact scene maps and high localization success rates. Existing lightweight visual localization methods have dramatically pushed the performance boundary. However, challenges remain in scalability and robustness when the scene scale grows and when the query and mapping images differ considerably in appearance. Therefore, extensive efforts are still required to promote the compactness of scene representations and improve the robustness of localization methods. Finally, this review provides an outlook on developing trends in the hope of facilitating future research.
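Localization accuracy and success rate are commonly reported as position and rotation errors against ground-truth poses, with a query counted as successful when both errors fall under fixed thresholds. The sketch below shows one way to compute them; the abstract does not give the exact protocol, so the function names, the 0.05 m and 5 degree thresholds, and the sample error values are assumptions made for illustration.

```python
import math

def rotation_error_deg(R_est, R_gt):
    """Angle in degrees of the relative rotation between two 3x3 rotation matrices."""
    # trace(R_est^T R_gt) equals the sum of element-wise products.
    trace = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def position_error(c_est, c_gt):
    """Euclidean distance between estimated and ground-truth camera centers."""
    return math.dist(c_est, c_gt)

def success_rate(errors, max_pos, max_rot_deg):
    """Fraction of queries whose position and rotation errors are both within bounds."""
    hits = sum(1 for pos, rot in errors if pos <= max_pos and rot <= max_rot_deg)
    return hits / len(errors)

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
quarter_turn_z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
angle = rotation_error_deg(quarter_turn_z, identity)
offset = position_error([0.0, 0.0, 0.0], [0.03, 0.04, 0.0])

# Hypothetical per-query (position error in meters, rotation error in degrees).
errors = [(0.02, 1.0), (0.10, 2.0), (0.04, 7.0), (0.03, 4.5)]
rate = success_rate(errors, max_pos=0.05, max_rot_deg=5.0)
```

Here the second query fails the position bound and the third fails the rotation bound, so two of the four queries count as localized.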
Keywords: visual localization; camera pose estimation; 3D scene representation; lightweight map; feature matching; scene coordinate regression; pose regression
Arandjelovic R, Gronat P, Torii A, Pajdla T and Sivic J. 2016. NetVLAD: CNN architecture for weakly supervised place recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 5297-5307 [DOI: 10.1109/CVPR.2016.572http://dx.doi.org/10.1109/CVPR.2016.572]
Arandjelovic R, Gronat P, Torii A, Pajdla T and Sivic J. 2018. NetVLAD: CNN architecture for weakly supervised place recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(6): 1437-1451 [DOI: 10.1109/TPAMI.2017.2711011http://dx.doi.org/10.1109/TPAMI.2017.2711011]
Arnold E, Wynn J, Vicente S, Garcia-Hernando G, Monszpart Á, Prisacariu V, Turmukhambetov D and Brachmann E. 2022. Map-free visual relocalization: metric pose relative to a single image//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 690-708 [DOI: 10.1007/978-3-031-19769-7_40http://dx.doi.org/10.1007/978-3-031-19769-7_40]
Balntas V, Li S D and Prisacariu V. 2018. RelocNet: continuous metric learning relocalisation using neural nets//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 782-799 [DOI: 10.1007/978-3-030-01264-9_46http://dx.doi.org/10.1007/978-3-030-01264-9_46]
Blanton H, Greenwell C, Workman S and Jacobs N. 2020. Extending absolute pose regression to multiple scenes//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Seattle, USA: IEEE: 170-178 [DOI: 10.1109/CVPRW50498.2020.00027http://dx.doi.org/10.1109/CVPRW50498.2020.00027]
Brachmann E, Cavallari T and Prisacariu V A. 2023. Accelerated coordinate encoding: learning to relocalize in minutes using RGB and poses//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 5044-5053 [DOI: 10.1109/CVPR52729.2023.00488http://dx.doi.org/10.1109/CVPR52729.2023.00488]
Brachmann E, Krull A, Nowozin S, Shotton J, Michel F, Gumhold S and Rother C. 2017. DSAC — differentiable RANSAC for camera localization//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2492-2500 [DOI: 10.1109/CVPR.2017.267http://dx.doi.org/10.1109/CVPR.2017.267]
Brachmann E, Michel F, Krull A, Yang M Y, Gumhold S and Rother C. 2016. Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 3364-3372 [DOI: 10.1109/CVPR.2016.366http://dx.doi.org/10.1109/CVPR.2016.366]
Brachmann E and Rother C. 2018. Learning less is more - 6D camera localization via 3D surface regression//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4654-4662 [DOI: 10.1109/CVPR.2018.00489http://dx.doi.org/10.1109/CVPR.2018.00489]
Brachmann E and Rother C. 2019a. Neural-guided RANSAC: learning where to sample model hypotheses//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 4321-4330 [DOI: 10.1109/ICCV.2019.00442http://dx.doi.org/10.1109/ICCV.2019.00442]
Brachmann E and Rother C. 2019b. Expert sample consensus applied to camera re-localization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 7524-7533 [DOI: 10.1109/ICCV.2019.00762http://dx.doi.org/10.1109/ICCV.2019.00762]
Brachmann E and Rother C. 2022. Visual camera re-localization from RGB and RGB-D images using DSAC. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(9): 5847-5865 [DOI: 10.1109/TPAMI.2021.3070754http://dx.doi.org/10.1109/TPAMI.2021.3070754]
Brahmbhatt S, Gu J W, Kim K, Hays J and Kautz J. 2018. Geometry-aware learning of maps for camera localization//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 2616-2625 [DOI: 10.1109/CVPR.2018.00277http://dx.doi.org/10.1109/CVPR.2018.00277]
Brown M, Windridge D and Guillemaut J Y. 2015. Globally optimal 2D-3D registration from points or lines without correspondences//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2111-2119 [DOI: 10.1109/ICCV.2015.244http://dx.doi.org/10.1109/ICCV.2015.244]
Bui M, Baur C, Navab N, Ilic S and Albarqouni S. 2019. Adversarial networks for camera pose regression and refinement//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision Workshop. Seoul, Korea (South): IEEE: 3778-3787 [DOI: 10.1109/ICCVW.2019.00470http://dx.doi.org/10.1109/ICCVW.2019.00470]
Cai R J, Hariharan B, Snavely N and Averbuch-Elor H. 2021. Extreme rotation estimation using dense correlation volumes//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 14561-14570 [DOI: 10.1109/CVPR46437.2021.01433http://dx.doi.org/10.1109/CVPR46437.2021.01433]
Campbell D, Liu L and Gould S. 2020a. Solving the blind perspective-n-point problem end-to-end with robust differentiable geometric optimization//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 244-261 [DOI: 10.1007/978-3-030-58536-5_15http://dx.doi.org/10.1007/978-3-030-58536-5_15]
Campbell D, Petersson L, Kneip L and Li H D. 2020b. Globally-optimal inlier set maximisation for camera pose and correspondence estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(2): 328-342 [DOI: 10.1109/TPAMI.2018.2848650http://dx.doi.org/10.1109/TPAMI.2018.2848650]
Campbell D, Petersson L, Kneip L, Li H D and Gould S. 2019. The alignment of the spheres: globally-optimal spherical mixture alignment for camera pose estimation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 11788-11798 [DOI: 10.1109/CVPR.2019.01207http://dx.doi.org/10.1109/CVPR.2019.01207]
Camposeco F, Cohen A, Pollefeys M and Sattler T. 2019. Hybrid scene compression for visual localization//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 7645-7654 [DOI: 10.1109/CVPR.2019.00784http://dx.doi.org/10.1109/CVPR.2019.00784]
Camposeco F, Sattler T, Cohen A, Geiger A and Pollefeys M. 2017. Toroidal constraints for two-point localization under high outlier ratios//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6700-6708 [DOI: 10.1109/CVPR.2017.709http://dx.doi.org/10.1109/CVPR.2017.709]
Cao S and Snavely N. 2014. Minimal scene descriptions from structure from motion models//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 461-468 [DOI: 10.1109/CVPR.2014.66http://dx.doi.org/10.1109/CVPR.2014.66]
Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A and Zagoruyko S. 2020. End-to-end object detection with Transformers//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 213-229 [DOI: 10.1007/978-3-030-58452-8_13http://dx.doi.org/10.1007/978-3-030-58452-8_13]
Cavallari T, Bertinetto L, Mukhoti J, Torr P and Golodetz S. 2019. Let’s take this online: adapting scene coordinate regression network predictions for online RGB-D camera relocalisation//Proceedings of 2019 International Conference on 3D Vision. Québec City, Canada: IEEE: 564-573 [DOI: 10.1109/3DV.2019.00068http://dx.doi.org/10.1109/3DV.2019.00068]
Cavallari T, Golodetz S, Lord N A, Valentin J, di Stefano L and Torr P H S. 2017. On-the-fly adaptation of regression forests for online camera relocalisation//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 218-227 [DOI: 10.1109/CVPR.2017.31http://dx.doi.org/10.1109/CVPR.2017.31]
Cavallari T, Golodetz S, Lord N A, Valentin J, Prisacariu V A, Stefano L D and Torr P H S. 2020. Real-time RGB-D camera pose estimation in novel scenes using a relocalisation cascade. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10): 2465-2477 [DOI: 10.1109/TPAMI.2019.2915068http://dx.doi.org/10.1109/TPAMI.2019.2915068]
Chen K F, Snavely N and Makadia A. 2021. Wide-baseline relative camera pose estimation with directional learning//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 3257-3267 [DOI: 10.1109/CVPR46437.2021.00327http://dx.doi.org/10.1109/CVPR46437.2021.00327]
Chen L C, Zhu Y K, Papandreou G, Schroff F and Adam H. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 833-851 [DOI: 10.1007/978-3-030-01234-2_49http://dx.doi.org/10.1007/978-3-030-01234-2_49]
Chen S, Li X H, Wang Z R and Prisacariu V A. 2022. DFNet: enhance absolute pose regression with direct feature matching//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 1-17 [DOI: 10.1007/978-3-031-20080-9_1http://dx.doi.org/10.1007/978-3-031-20080-9_1]
Chen Y, Chen X Y, Wang X, Zhang Q, Guo Y, Shan Y and Wang F. 2023. Local-to-global registration for bundle-adjusting neural radiance fields//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 8264-8273 [DOI: 10.1109/CVPR52729.2023.00799http://dx.doi.org/10.1109/CVPR52729.2023.00799]
Chen Z H, Pei H Y, Wang J K and Dai D Y. 2021. Survey of monocular camera-based visual relocalization. Robot, 43(3): 373-384
陈宗海, 裴浩渊, 王纪凯, 戴德云. 2021. 基于单目相机的视觉重定位方法综述. 机器人, 43(3): 373-384 [DOI: 10.13973/j.cnki.robot.200350http://dx.doi.org/10.13973/j.cnki.robot.200350]
Cheng W T, Lin W S, Chen K and Zhang X F. 2019. Cascaded parallel filtering for memory-efficient image-based localization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 1032-1041 [DOI: 10.1109/ICCV.2019.00112http://dx.doi.org/10.1109/ICCV.2019.00112]
Chidlovskii B and Sadek A. 2020. Adversarial transfer of pose estimation regression//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 646-661 [DOI: 10.1007/978-3-030-66415-2_43http://dx.doi.org/10.1007/978-3-030-66415-2_43]
Cipolla R, Gal Y and Kendall A. 2018. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7482-7491 [DOI: 10.1109/CVPR.2018.00781http://dx.doi.org/10.1109/CVPR.2018.00781]
Clark R, Wang S, Markham A, Trigoni N and Wen H K. 2017. VidLoc: a deep spatio-temporal model for 6-DoF video-clip relocalization//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 2652-2660 [DOI: 10.1109/CVPR.2017.284http://dx.doi.org/10.1109/CVPR.2017.284]
Dai A, Nießner M, Zollhöfer M, Izadi S and Theobalt C. 2017. BundleFusion: real-time globally consistent 3D reconstruction using on-the-fly surface reintegration. ACM Transactions on Graphics, 36(3): #24 [DOI: 10.1145/3054739http://dx.doi.org/10.1145/3054739]
David P, DeMenthon D, Duraiswami R and Samet H. 2004. SoftPOSIT: simultaneous pose and correspondence determination. International Journal of Computer Vision, 59(3): 259-284 [DOI: 10.1023/B:VISI.0000025800.10423.1fhttp://dx.doi.org/10.1023/B:VISI.0000025800.10423.1f]
DeTone D, Malisiewicz T and Rabinovich A. 2018. SuperPoint: self-supervised interest point detection and description//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. Salt Lake City, USA: IEEE: 337-349 [DOI: 10.1109/CVPRW.2018.00060http://dx.doi.org/10.1109/CVPRW.2018.00060]
Ding M Y, Wang Z, Sun J K, Shi J P and Luo P. 2019. CamNet: coarse-to-fine retrieval for camera re-localization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2871-2880 [DOI: 10.1109/ICCV.2019.00296http://dx.doi.org/10.1109/ICCV.2019.00296]
Dong S Y, Wang S Z, Zhuang Y X, Kannala J, Pollefeys M and Chen B Q. 2022. Visual localization via few-shot scene region classification//Proceedings of 2022 International Conference on 3D Vision. Prague, Czech Republic: IEEE: 393-402 [DOI: 10.1109/3DV57658.2022.00051http://dx.doi.org/10.1109/3DV57658.2022.00051]
Dong S Y, Liu H, Guo H K, Chen B Q and Pollefeys M. 2023. Lazy visual localization via motion averaging [EB/OL]. [2023-08-19]. http://arxiv.org/pdf/2307.09981.pdfhttp://arxiv.org/pdf/2307.09981.pdf
Donoser M and Schmalstieg D. 2014. Discriminative feature-to-point matching in image-based localization//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 516-523 [DOI: 10.1109/CVPR.2014.73http://dx.doi.org/10.1109/CVPR.2014.73]
Fang Q H, Yin Y D, Fan Q N, Xia F, Dong S Y, Wang S, Wang J, Guibas L J and Chen B Q. 2022. Towards accurate active camera localization//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 122-139 [DOI: 10.1007/978-3-031-20080-9_8http://dx.doi.org/10.1007/978-3-031-20080-9_8]
Fischler M A and Bolles R C. 1981. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6): 381-395 [DOI: 10.1145/358669.358692http://dx.doi.org/10.1145/358669.358692]
Gao S, Wan J X, Ping Y S, Zhang X D, Dong S Z, Yang Y C, Ning H K, Li J J N and Guo Y D. 2022. Pose refinement with joint optimization of visual points and lines//Proceedings of 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems. Kyoto, Japan: IEEE: 2888-2894 [DOI: 10.1109/IROS47612.2022.9981420http://dx.doi.org/10.1109/IROS47612.2022.9981420]
Gong R H, Liu X L, Jiang S H, Li T X, Hu P, Lin J Z, Yu F W and Yan J J. 2019. Differentiable soft quantization: bridging full-precision and low-bit neural networks//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 4851-4860 [DOI: 10.1109/ICCV.2019.00495http://dx.doi.org/10.1109/ICCV.2019.00495]
Goto T, Pathak S, Ji Y, Fujii H, Yamashita A and Asama H. 2018. Line-based global localization of a spherical camera in manhattan worlds//Proceedings of 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE: 2296-2303 [DOI: 10.1109/ICRA.2018.8460920http://dx.doi.org/10.1109/ICRA.2018.8460920]
Guzman-Rivera A, Kohli P, Glocker B, Shotton J, Sharp T, Fitzgibbon A and Izadi S. 2014. Multi-output learning for camera relocalization//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE: 1114-1121 [DOI: 10.1109/CVPR.2014.146http://dx.doi.org/10.1109/CVPR.2014.146]
Hofer M, Maurer M and Bischof H. 2017. Efficient 3D scene abstraction using line segments. Computer Vision and Image Understanding, 157: 167-178 [DOI: 10.1016/j.cviu.2016.03.017http://dx.doi.org/10.1016/j.cviu.2016.03.017]
Huang Z Y, Zhou H, Li Y J, Yang B B, Xu Y, Zhou X W, Bao H J, Zhang G F and Li H S. 2021. VS-net: voting with segmentation for visual localization//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 6097-6107 [DOI: 10.1109/CVPR46437.2021.00604http://dx.doi.org/10.1109/CVPR46437.2021.00604]
Irschara A, Zach C, Frahm J M and Bischof H. 2009. From structure-from-motion point clouds to fast location recognition//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 2599-2606 [DOI: 10.1109/CVPR.2009.5206587http://dx.doi.org/10.1109/CVPR.2009.5206587]
Izadi S, Kim D, Hilliges O, Molyneaux D, Newcombe R, Kohli P, Shotton J, Hodges S, Freeman D, Davison A and Fitzgibbon A. 2011. KinectFusion: real-time 3D reconstruction and interaction using a moving depth camera//Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology. Santa Barbara, USA: ACM: 559-568 [DOI: 10.1145/2047196.2047270http://dx.doi.org/10.1145/2047196.2047270]
Jégou H, Douze M, Schmid C and Pérez P. 2010. Aggregating local descriptors into a compact image representation//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE: 3304-3311 [DOI: 10.1109/CVPR.2010.5540039http://dx.doi.org/10.1109/CVPR.2010.5540039]
Jeong J, Cho Y and Kim A. 2020. HDMI-Loc: exploiting high definition map image for precise localization via bitwise particle filter. IEEE Robotics and Automation Letters, 5(4): 6310-6317 [DOI: 10.1109/LRA.2020.3013881http://dx.doi.org/10.1109/LRA.2020.3013881]
Kabsch W. 1976. A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A, 32(5): 922-923 [DOI: 10.1107/S0567739476001873http://dx.doi.org/10.1107/S0567739476001873]
Kendall A and Cipolla R. 2016. Modelling uncertainty in deep learning for camera relocalization//Proceedings of 2016 IEEE International Conference on Robotics and Automation. Stockholm, Sweden: IEEE: 4762-4769 [DOI: 10.1109/ICRA.2016.7487679http://dx.doi.org/10.1109/ICRA.2016.7487679]
Kendall A and Cipolla R. 2017. Geometric loss functions for camera pose regression with deep learning//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6555-6564 [DOI: 10.1109/CVPR.2017.694http://dx.doi.org/10.1109/CVPR.2017.694]
Kendall A, Grimes M and Cipolla R. 2015. PoseNet: a convolutional network for real-time 6-DOF camera relocalization//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2938-2946 [DOI: 10.1109/ICCV.2015.336http://dx.doi.org/10.1109/ICCV.2015.336]
Kim J, Choi C, Jang H and Kim Y. 2023. LDL: line distance functions for panoramic localization//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 17836-17846 [DOI: 10.1109/ICCV51070.2023.01639http://dx.doi.org/10.1109/ICCV51070.2023.01639]
Krull A, Brachmann E, Michel F, Yang M Y, Gumhold S and Rother C. 2015. Learning analysis-by-synthesis for 6D pose estimation in RGB-D images//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 954-962 [DOI: 10.1109/ICCV.2015.115http://dx.doi.org/10.1109/ICCV.2015.115]
Laskar Z, Melekhov I, Kalia S and Kannala J. 2017. Camera relocalization by computing pairwise relative poses using convolutional neural network//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. Venice, Italy: IEEE: 920-929 [DOI: 10.1109/ICCVW.2017.113http://dx.doi.org/10.1109/ICCVW.2017.113]
Lepetit V, Moreno-Noguer F and Fua P. 2009. EPnP: an accurate O(n) solution to the PnP problem. International Journal of Computer Vision, 81(2): 155-166 [DOI: 10.1007/s11263-008-0152-6http://dx.doi.org/10.1007/s11263-008-0152-6]
Li X T, Wang S Z, Zhao Y, Verbeek J and Kannala J. 2020. Hierarchical scene coordinate classification and regression for visual localization//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 11980-11989 [DOI: 10.1109/cvpr42600.2020.01200http://dx.doi.org/10.1109/cvpr42600.2020.01200]
Li X T, Ylioinas J, Verbeek J and Kannala J. 2019. Scene coordinate regression with angle-based reprojection loss for camera relocalization//Proceedings of the 15th European Conference on Computer Vision Workshops. Munich, Germany: Springer: 229-245 [DOI: 10.1007/978-3-030-11015-4_19http://dx.doi.org/10.1007/978-3-030-11015-4_19]
Li Y P, Snavely N, Huttenlocher D and Fua P. 2012. Worldwide pose estimation using 3D point clouds//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer: 15-29 [DOI: 10.1007/978-3-642-33718-5_2http://dx.doi.org/10.1007/978-3-642-33718-5_2]
Li Y P, Snavely N and Huttenlocher D P. 2010. Location recognition using prioritized feature matching//Proceedings of the 11th European Conference on Computer Vision. Heraklion, Greece: Springer: 791-804 [DOI: 10.1007/978-3-642-15552-9_57http://dx.doi.org/10.1007/978-3-642-15552-9_57]
Liao W L, Zhao H Q and Yan J C. 2021. Online extrinsic camera calibration based on high-definition map matching on public roadway. Journal of Image and Graphics, 26(1): 208-217
廖文龙, 赵华卿, 严骏驰. 2021. 开放道路中匹配高精度地图的在线相机外参标定. 中国图象图形学报, 26(1): 208-217 [DOI: 10.11834/jig.200432http://dx.doi.org/10.11834/jig.200432]
Lin C H, Ma W C, Torralba A and Lucey S. 2021. BARF: bundle-adjusting neural radiance fields//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 5721-5731 [DOI: 10.1109/ICCV48922.2021.00569http://dx.doi.org/10.1109/ICCV48922.2021.00569]
Lin Y Z, Müller T, Tremblay J, Wen B W, Tyree S, Evans A, Vela P A and Birchfield S. 2023. Parallel inversion of neural radiance fields for robust pose estimation//Proceedings of 2023 IEEE International Conference on Robotics and Automation. London, United Kingdom: IEEE: 9377-9384 [DOI: 10.1109/ICRA48891.2023.10161117http://dx.doi.org/10.1109/ICRA48891.2023.10161117]
Lindeberg T. 1998. Edge detection and ridge detection with automatic scale selection. International Journal of Computer Vision, 30(2): 117-156 [DOI: 10.1023/A:1008097225773http://dx.doi.org/10.1023/A:1008097225773]
Lindenberger P, Sarlin P-E and Pollefeys M. 2023. LightGlue: local feature matching at light speed//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 17581–17592 [DOI: 10.1109/ICCV51070.2023.01616http://dx.doi.org/10.1109/ICCV51070.2023.01616]
Liu J L, Nie Q, Liu Y and Wang C J. 2023a. NeRF-loc: visual localization with conditional neural radiance field//Proceedings of 2023 IEEE International Conference on Robotics and Automation. London, United Kingdom: IEEE: 9385-9392 [DOI: 10.1109/ICRA48891.2023.10161420http://dx.doi.org/10.1109/ICRA48891.2023.10161420]
Liu L, Li H D and Dai Y C. 2017. Efficient global 2D-3D matching for camera localization in a large-scale 3D map//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 2391-2400 [DOI: 10.1109/ICCV.2017.260http://dx.doi.org/10.1109/ICCV.2017.260]
Liu S, Zhang Y X, Xu J T, Zou D F, Chen S Y and Wang Z H. 2020. Visual prior-information-based map recovery slam in complex scenes. Journal of Image and Graphics, 25(1): 158-170
刘盛, 张宇翔, 徐婧婷, 邹大方, 陈胜勇, 王振华. 2020. 复杂场景下视觉先验信息的地图恢复SLAM. 中国图象图形学报, 25(1): 158-170 [DOI: 10.11834/jig.190041http://dx.doi.org/10.11834/jig.190041]
Liu S H, Yu Y F, Pautrat R, Pollefeys M and Larsson V. 2023b. 3D line mapping revisited//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 21445-21455 [DOI: 10.1109/CVPR52729.2023.02054]
Lowe D G. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2): 91-110 [DOI: 10.1023/B:VISI.0000029664.99615.94]
Lowry S, Sünderhauf N, Newman P, Leonard J J, Cox D, Corke P and Milford M J. 2016. Visual place recognition: a survey. IEEE Transactions on Robotics, 32(1): 1-19 [DOI: 10.1109/TRO.2015.2496823]
Lu Y, Huang J W, Chen Y T and Heisele B. 2017. Monocular localization in urban environments using road markings//Proceedings of 2017 IEEE Intelligent Vehicles Symposium. Los Angeles, USA: IEEE: 468-474 [DOI: 10.1109/IVS.2017.7995762]
Maggio D, Abate M, Shi J N, Mario C and Carlone L. 2023. Loc-NeRF: Monte Carlo localization using neural radiance fields//Proceedings of 2023 IEEE International Conference on Robotics and Automation. London, United Kingdom: IEEE: 4018-4025 [DOI: 10.1109/ICRA48891.2023.10160782]
Melekhov I, Ylioinas J, Kannala J and Rahtu E. 2017. Image-based localization using hourglass networks//Proceedings of 2017 IEEE International Conference on Computer Vision Workshops. Venice, Italy: IEEE: 870-877 [DOI: 10.1109/ICCVW.2017.107]
Meng Q, Chen A P, Luo H M, Wu M Y, Su H, Xu L, He X M and Yu J Y. 2021. GNeRF: GAN-based neural radiance field without posed camera//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 6331-6341 [DOI: 10.1109/ICCV48922.2021.00629]
Mera-Trujillo M, Smith B and Fragoso V. 2020. Efficient scene compression for visual-based localization//Proceedings of 2020 International Conference on 3D Vision. Fukuoka, Japan: IEEE: 1-10 [DOI: 10.1109/3DV50981.2020.00111]
Micusík B and Wildenauer H. 2015. Descriptor free visual indoor localization with line segments//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3165-3173 [DOI: 10.1109/CVPR.2015.7298936]
Micusik B and Wildenauer H. 2017. Structure from motion with line segments under relaxed endpoint constraints. International Journal of Computer Vision, 124(1): 65-79 [DOI: 10.1007/s11263-016-0971-9]
Middelberg S, Sattler T, Untzelmann O and Kobbelt L. 2014. Scalable 6-DOF localization on mobile devices//Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer: 268-283 [DOI: 10.1007/978-3-319-10605-2_18]
Mildenhall B, Srinivasan P P, Tancik M, Barron J T, Ramamoorthi R and Ng R. 2020. NeRF: representing scenes as neural radiance fields for view synthesis//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer International Publishing: 405-421 [DOI: 10.1007/978-3-030-58452-8_24]
Moreau A, Piasco N, Bennehar M, Tsishkou D, Stanciulescu B and de La Fortelle A. 2023. CROSSFIRE: camera relocalization on self-supervised features from an implicit representation//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision. Paris, France: IEEE: 252-262 [DOI: 10.1109/ICCV51070.2023.00030]
Moreau A, Piasco N, Tsishkou D, Stanciulescu B and de La Fortelle A. 2022. LENS: localization enhanced by NeRF synthesis//Proceedings of the 5th Conference on Robot Learning. Auckland, New Zealand: PMLR: 1347-1356
Moreno-Noguer F, Lepetit V and Fua P. 2008. Pose priors for simultaneously solving alignment and correspondence//Proceedings of the 10th European Conference on Computer Vision. Marseille, France: Springer: 405-418 [DOI: 10.1007/978-3-540-88688-4_30]
Mur-Artal R, Montiel J M M and Tardós J D. 2015. ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5): 1147-1163 [DOI: 10.1109/TRO.2015.2463671]
Naseer T and Burgard W. 2017. Deep regression for monocular camera-based 6-DoF global localization in outdoor environments//Proceedings of 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems. Vancouver, Canada: IEEE: 1525-1530 [DOI: 10.1109/IROS.2017.8205957]
Newcombe R A, Fitzgibbon A, Izadi S, Hilliges O, Molyneaux D, Kim D, Davison A J, Kohli P, Shotton J and Hodges S. 2011. KinectFusion: real-time dense surface mapping and tracking//Proceedings of the 10th IEEE International Symposium on Mixed and Augmented Reality. Basel, Switzerland: IEEE: 127-136 [DOI: 10.1109/ISMAR.2011.6092378]
Nichol A, Achiam J and Schulman J. 2018. On first-order meta-learning algorithms [EB/OL]. [2023-10-23]. https://arxiv.org/pdf/1803.02999v3.pdf
Nistér D. 2003. Preemptive RANSAC for live structure and motion estimation//Proceedings of the 9th IEEE International Conference on Computer Vision. Nice, France: IEEE: 199-206 [DOI: 10.1109/ICCV.2003.1238341]
Ozuysal M, Calonder M, Lepetit V and Fua P. 2010. Fast keypoint recognition using random ferns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3): 448-461 [DOI: 10.1109/TPAMI.2009.23]
Pan X K, Liu H M, Fang M, Wang Z, Zhang Y and Zhang G F. 2023. Dynamic 3D scenario-oriented monocular SLAM based on semantic probability prediction. Journal of Image and Graphics, 28(7): 2151-2166 [DOI: 10.11834/jig.210632]
Panek V, Kukelova Z and Sattler T. 2023. Visual localization using imperfect 3D models from the internet//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 13175-13186 [DOI: 10.1109/CVPR52729.2023.01266]
Park H S, Wang Y, Nurvitadhi E, Hoe J C, Sheikh Y and Chen M. 2013. 3D point cloud reduction using mixed-integer quadratic programming//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Portland, USA: IEEE: 229-236 [DOI: 10.1109/CVPRW.2013.41]
Perez E, Strub F, De Vries H, Dumoulin V and Courville A. 2018. FiLM: visual reasoning with a general conditioning layer//Proceedings of the 32nd AAAI Conference on Artificial Intelligence. New Orleans, USA: AAAI Press: 3942-3951 [DOI: 10.1609/aaai.v32i1.11671]
Poggenhans F, Salscheider N O and Stiller C. 2018. Precise localization in high-definition road maps for urban regions//Proceedings of 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems. Madrid, Spain: IEEE: 2167-2174 [DOI: 10.1109/IROS.2018.8594414]
Radwan N, Valada A and Burgard W. 2018. VLocNet++: deep multitask learning for semantic visual localization and odometry. IEEE Robotics and Automation Letters, 3(4): 4407-4414 [DOI: 10.1109/LRA.2018.2869640]
Ranganathan A, Ilstrup D and Wu T. 2013. Light-weight localization for vehicles using road markings//Proceedings of 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan: IEEE: 921-927 [DOI: 10.1109/IROS.2013.6696460]
Revaud J, Weinzaepfel P, De Souza C and Humenberger M. 2019. R2D2: repeatable and reliable detector and descriptor//Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc.: 12414-12424
Rublee E, Rabaud V, Konolige K and Bradski G. 2011. ORB: an efficient alternative to SIFT or SURF//Proceedings of 2011 IEEE International Conference on Computer Vision. Barcelona, Spain: IEEE: 2564-2571 [DOI: 10.1109/ICCV.2011.6126544]
Sandler M, Howard A, Zhu M L, Zhmoginov A and Chen L C. 2018. MobileNetV2: inverted residuals and linear bottlenecks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4510-4520 [DOI: 10.1109/CVPR.2018.00474]
Sarlin P E, Cadena C, Siegwart R and Dymczyk M. 2019. From coarse to fine: robust hierarchical localization at large scale//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 12708-12717 [DOI: 10.1109/CVPR.2019.01300]
Sarlin P E, Debraine F, Dymczyk M, Siegwart R and Cadena C. 2018. Leveraging deep visual descriptors for hierarchical efficient localization//Proceedings of the 2nd Conference on Robot Learning. Zürich, Switzerland: PMLR: 456-465 [DOI: 10.3929/ETHZ-B-000318818]
Sarlin P E, DeTone D, Malisiewicz T and Rabinovich A. 2020. SuperGlue: learning feature matching with graph neural networks//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 4937-4946 [DOI: 10.1109/CVPR42600.2020.00499]
Sarlin P E, Dusmanu M, Schönberger J L, Speciale P, Gruber L, Larsson V, Miksik O and Pollefeys M. 2022. LaMAR: benchmarking localization and mapping for augmented reality//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 686-704 [DOI: 10.1007/978-3-031-20071-7_40]
Sattler T, Havlena M, Radenovic F, Schindler K and Pollefeys M. 2015. Hyperpoints and fine vocabularies for large-scale location recognition//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE: 2102-2110 [DOI: 10.1109/ICCV.2015.243]
Sattler T, Leibe B and Kobbelt L. 2011. Fast image-based localization using direct 2D-to-3D matching//Proceedings of 2011 IEEE International Conference on Computer Vision. Barcelona, Spain: IEEE: 667-674 [DOI: 10.1109/ICCV.2011.6126302]
Sattler T, Leibe B and Kobbelt L. 2012. Improving image-based localization by active correspondence search//Proceedings of the 12th European Conference on Computer Vision. Florence, Italy: Springer: 752-765 [DOI: 10.1007/978-3-642-33718-5_54]
Sattler T, Leibe B and Kobbelt L. 2017. Efficient and effective prioritized matching for large-scale image-based localization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(9): 1744-1756 [DOI: 10.1109/TPAMI.2016.2611662]
Sattler T, Maddern W, Toft C, Torii A, Hammarstrand L, Stenborg E, Safari D, Okutomi M, Pollefeys M, Sivic J, Kahl F and Pajdla T. 2018. Benchmarking 6DOF outdoor visual localization in changing conditions//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8601-8610 [DOI: 10.1109/CVPR.2018.00897]
Sattler T, Zhou Q J, Pollefeys M and Leal-Taixé L. 2019. Understanding the limitations of CNN-based absolute camera pose regression//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 3297-3307 [DOI: 10.1109/CVPR.2019.00342]
Schönberger J L and Frahm J M. 2016. Structure-from-motion revisited//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 4104-4113 [DOI: 10.1109/CVPR.2016.445]
Schreiber M, Knöppel C and Franke U. 2013. LaneLoc: lane marking based localization using highly accurate maps//Proceedings of 2013 IEEE Intelligent Vehicles Symposium. Gold Coast, Australia: IEEE: 449-454 [DOI: 10.1109/IVS.2013.6629509]
Shavit Y, Ferens R and Keller Y. 2021. Learning multi-scene absolute pose regression with Transformers//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 2713-2722 [DOI: 10.1109/ICCV48922.2021.00273]
Shi T X, Shen S H, Gao X and Zhu L J. 2019. Visual localization using sparse semantic 3D map//Proceedings of 2019 IEEE International Conference on Image Processing. Taipei, China: IEEE: 315-319 [DOI: 10.1109/ICIP.2019.8802957]
Shi Y, Cai J X, Shavit Y, Mu T J, Feng W S and Zhang K. 2022. ClusterGNN: cluster-based coarse-to-fine graph neural network for efficient feature matching//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 12507-12516 [DOI: 10.1109/CVPR52688.2022.01219]
Shotton J, Glocker B, Zach C, Izadi S, Criminisi A and Fitzgibbon A. 2013. Scene coordinate regression forests for camera relocalization in RGB-D images//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE: 2930-2937 [DOI: 10.1109/CVPR.2013.377]
Speciale P, Schönberger J L, Kang S B, Sinha S N and Pollefeys M. 2019. Privacy preserving image-based localization//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5488-5498 [DOI: 10.1109/CVPR.2019.00564]
Stewénius H, Engels C and Nistér D. 2006. Recent developments on direct relative orientation. ISPRS Journal of Photogrammetry and Remote Sensing, 60(4): 284-294 [DOI: 10.1016/j.isprsjprs.2006.03.005]
Sucar E, Liu S K, Ortiz J and Davison A J. 2021. IMAP: implicit mapping and positioning in real-time//Proceedings of 2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE: 6209-6218 [DOI: 10.1109/ICCV48922.2021.00617]
Tang S T, Tang C Z, Huang R, Zhu S Y and Tan P. 2021. Learning camera localization via dense scene matching//Proceedings of 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE: 1831-1841 [DOI: 10.1109/CVPR46437.2021.00187]
Tang S T, Tang S C, Tagliasacchi A, Tan P and Furukawa Y. 2023. NeuMap: neural coordinate mapping by auto-transdecoder for camera localization//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 929-939 [DOI: 10.1109/CVPR52729.2023.00096]
Toft C, Maddern W, Torii A, Hammarstrand L, Stenborg E, Safari D, Okutomi M, Pollefeys M, Sivic J, Pajdla T, Kahl F and Sattler T. 2022. Long-term visual localization revisited. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(4): 2074-2088 [DOI: 10.1109/TPAMI.2020.3032010]
Torii A, Arandjelović R, Sivic J, Okutomi M and Pajdla T. 2015. 24/7 place recognition by view synthesis//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 1808-1817 [DOI: 10.1109/CVPR.2015.7298790]
Truong P, Rakotosaona M J, Manhardt F and Tombari F. 2023. SPARF: neural radiance fields from sparse and noisy poses//Proceedings of 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE: 4190-4200 [DOI: 10.1109/CVPR52729.2023.00408]
Valada A, Radwan N and Burgard W. 2018. Deep auxiliary learning for visual localization and odometry//Proceedings of 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE: 6939-6946 [DOI: 10.1109/ICRA.2018.8462979]
Valentin J, Dai A, Niessner M, Kohli P, Torr P, Izadi S and Keskin C. 2016. Learning to navigate the energy landscape//Proceedings of the 4th International Conference on 3D Vision (3DV). Stanford, USA: IEEE: 323-332 [DOI: 10.1109/3DV.2016.41]
Valentin J, Niessner M, Shotton J, Fitzgibbon A, Izadi S and Torr P. 2015. Exploiting uncertainty in regression forests for accurate camera relocalization//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 4400-4408 [DOI: 10.1109/CVPR.2015.7299069]
Walch F, Hazirbas C, Leal-Taixé L, Sattler T, Hilsenbeck S and Cremers D. 2017. Image-based localization using LSTMs for structured feature correlation//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 627-637 [DOI: 10.1109/ICCV.2017.75]
Wang B, Chen C H, Lu C X, Zhao P J, Trigoni N and Markham A. 2020. AtLoc: attention guided camera localization//Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York, USA: AAAI Press: 10393-10401 [DOI: 10.1609/aaai.v34i06.6608]
Wang Z R, Wu S Z, Xie W D, Chen M and Prisacariu V. 2021. NeRF--: neural radiance fields without known camera parameters [EB/OL]. [2023-09-09]. https://arxiv.org/pdf/2102.07064.pdf
Wei D, Wan Y, Zhang Y J, Liu X Y, Zhang B and Wang X Q. 2022. ELSR: efficient line segment reconstruction with planes and points guidance//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 15786-15794 [DOI: 10.1109/CVPR52688.2022.01535]
Wen T P, Jiang K, Wijaya B, Li H Y, Yang M M and Yang D G. 2022. TM3Loc: tightly-coupled monocular map matching for high precision vehicle localization. IEEE Transactions on Intelligent Transportation Systems, 23(11): 20268-20281 [DOI: 10.1109/TITS.2022.3176914]
Wu J, Ma L W and Hu X L. 2017. Delving deeper into convolutional neural networks for camera relocalization//Proceedings of 2017 IEEE International Conference on Robotics and Automation. Singapore, Singapore: IEEE: 5644-5651 [DOI: 10.1109/ICRA.2017.7989663]
Wu T and Ranganathan A. 2013. Vehicle localization using road markings//Proceedings of 2013 IEEE Intelligent Vehicles Symposium. Gold Coast, Australia: IEEE: 1185-1190 [DOI: 10.1109/IVS.2013.6629627]
Xu C, Zhang L L, Cheng L and Koch R. 2017. Pose estimation from line correspondences: a complete analysis and a series of solutions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6): 1209-1222 [DOI: 10.1109/TPAMI.2016.2582162]
Xue F, Wang X, Yan Z K, Wang Q Y, Wang J Q and Zha H B. 2019. Local supports global: deep camera relocalization with sequence enhancement//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 2841-2850 [DOI: 10.1109/ICCV.2019.00293]
Yang L W, Bai Z Q, Tang C Z, Li H H, Furukawa Y and Tan P. 2019. SANet: scene agnostic network for camera localization//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea (South): IEEE: 42-51 [DOI: 10.1109/ICCV.2019.00013]
Yang L W, Shrestha R, Li W B, Liu S C, Zhang G F, Cui Z P and Tan P. 2022. SceneSqueezer: learning to compress scene for camera relocalization//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 8249-8258 [DOI: 10.1109/CVPR52688.2022.00808]
Yen-Chen L, Florence P, Barron J T, Rodriguez A, Isola P and Lin T Y. 2021. INeRF: inverting neural radiance fields for pose estimation//Proceedings of 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems. Prague, Czech Republic: IEEE: 1323-1330 [DOI: 10.1109/IROS51168.2021.9636708]
Yoon S and Kim A. 2021. Line as a visual sentence: context-aware line descriptor for visual localization. IEEE Robotics and Automation Letters, 6(4): 8726-8733 [DOI: 10.1109/LRA.2021.3111760]
Yu H, Zhen W K, Yang W, Zhang J and Scherer S. 2020. Monocular camera localization in prior LiDAR maps with 2D-3D line correspondences//Proceedings of 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. Las Vegas, USA: IEEE: 4588-4594 [DOI: 10.1109/IROS45743.2020.9341690]
Zhang C, Liu H, Xie Z J, Yang K Y, Guo K, Cai R and Li Z W. 2021. AVP-Loc: surround view localization and relocalization based on HD vector map for automated valet parking//Proceedings of 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems. Prague, Czech Republic: IEEE: 5552-5559 [DOI: 10.1109/IROS51168.2021.9636746]
Zhou L P, Ye J M and Kaess M. 2019a. A stable algebraic camera pose estimation for minimal configurations of 2D/3D point and line correspondences//Proceedings of the 14th Asian Conference on Computer Vision. Perth, Australia: Springer: 273-288 [DOI: 10.1007/978-3-030-20870-7_17]
Zhou Q J, Agostinho S, Ošep A and Leal-Taixé L. 2022. Is geometry enough for matching in visual localization?//Proceedings of the 17th European Conference on Computer Vision. Tel Aviv, Israel: Springer: 407-425 [DOI: 10.1007/978-3-031-20080-9_24]
Zhou Q J, Sattler T, Pollefeys M and Leal-Taixé L. 2020. To learn or not to learn: visual localization from essential matrices//Proceedings of 2020 IEEE International Conference on Robotics and Automation. Paris, France: IEEE: 3319-3326 [DOI: 10.1109/ICRA40945.2020.9196607]
Zhou T H, Brown M, Snavely N and Lowe D G. 2017. Unsupervised learning of depth and ego-motion from video//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE: 6612-6619 [DOI: 10.1109/CVPR.2017.700]
Zhou Y, Barnes C, Lu J W, Yang J M and Li H. 2019b. On the continuity of rotation representations in neural networks//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 5738-5746 [DOI: 10.1109/CVPR.2019.00589]
Zhu Z H, Peng S Y, Larsson V, Xu W W, Bao H J, Cui Z P, Oswald M R and Pollefeys M. 2022. NICE-SLAM: neural implicit scalable encoding for SLAM//Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 12776-12786 [DOI: 10.1109/CVPR52688.2022.01245]
Zhu Z X, Chen Y T, Wu Z R, Hou C, Shi Y L, Li C X, Li P F, Zhao H and Zhou G Y. 2023. LATITUDE: robotic global localization with truncated dynamic low-pass filter in city-scale NeRF//Proceedings of 2023 IEEE International Conference on Robotics and Automation. London, United Kingdom: IEEE: 8326-8332 [DOI: 10.1109/ICRA48891.2023.10161570]