This repository contains the implementation of the paper "Lightweight Structured Line Map Based Visual Localization" by Hongmin Liu, Chengyang Cao, Hanqiao Ye, Hainan Cui, Wei Gao, Xing Wang, and Shuhan Shen.
Visual localization, also known as camera pose estimation, is a crucial component of many applications, such as robotics, autonomous driving, and augmented reality. Traditional visual localization algorithms typically run on point cloud maps generated by algorithms such as Structure-from-Motion (SfM) or Simultaneous Localization and Mapping (SLAM). However, point features are sensitive to weak textures and illumination changes. In addition, the generated 3D point cloud maps often contain millions of points, posing higher demands on device storage and computing resources. To address these challenges, we propose a visual localization algorithm based on lightweight structured line maps. Instead of extracting and matching point features in the images, we select line segments that represent structured scene information as image features. These line segments are then used to construct a lightweight line map containing rich structured scene information. Then, the camera pose is estimated through a series of steps including line extraction, matching, initial pose estimation, and pose refinement. Experimental results on benchmark datasets demonstrate that compared to the current state-of-the-art visual localization methods, our method achieves competitive localization accuracy while significantly reducing the memory footprint of the 3D map.
The pipeline of LSLM_VLoc can be divided into four steps:
- Inputs: Query images and database images.
- Step 1: Using the camera poses of the reference images provided by a standard point-based SfM algorithm as input, construct a 3D scene line map offline. This line map serves as the pre-built map for visual localization.
- Step 2: Establish 2D-3D line correspondences through line segment detection and a coarse-to-fine hierarchical matching strategy.
- Step 3: Estimate the initial camera pose with the designed Group-RANSAC PnL algorithm.
- Step 4: Iteratively refine the initial pose using a reprojection loss function designed specifically for line segments to obtain the final six-degree-of-freedom camera pose.
- Outputs: The six-degree-of-freedom pose of the camera at the time the query image was taken, consisting of a rotation matrix R and a translation vector t, each with three degrees of freedom.
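
To make the idea of a line reprojection error in Step 4 concrete, here is a minimal sketch in plain Python. It is not the loss function from the paper, which is not reproduced in this README. It assumes a pinhole camera with intrinsics `K`, a world-to-camera pose `(R, t)`, and one common formulation of the error: the mean distance from the two projected 3D endpoints to the infinite 2D line through the detected segment. All function names are hypothetical.

```python
import math

def matvec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def project(K, R, t, X):
    """Project a 3D world point X to pixel coordinates (assumed pinhole model)."""
    Xc = [a + b for a, b in zip(matvec(R, X), t)]  # camera-frame point R*X + t
    if Xc[2] <= 0:
        raise ValueError("point is behind the camera")
    u, v, w = matvec(K, Xc)
    return (u / w, v / w)

def line_through(p, q):
    """Homogeneous 2D line (a, b, c) through p and q, scaled so a^2 + b^2 = 1."""
    a = p[1] - q[1]
    b = q[0] - p[0]
    c = p[0] * q[1] - q[0] * p[1]
    n = math.hypot(a, b)
    if n == 0:
        raise ValueError("degenerate segment: endpoints coincide")
    return (a / n, b / n, c / n)

def line_reprojection_error(K, R, t, P3d, Q3d, p2d, q2d):
    """Mean pixel distance from the projected 3D endpoints to the detected 2D line."""
    a, b, c = line_through(p2d, q2d)
    dists = [abs(a * u + b * v + c)
             for (u, v) in (project(K, R, t, P3d), project(K, R, t, Q3d))]
    return 0.5 * (dists[0] + dists[1])

# Illustrative values only (not taken from any dataset).
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
err = line_reprojection_error(K, R, t,
                              [-1.0, 0.0, 5.0], [1.0, 0.0, 5.0],
                              (220.0, 245.0), (420.0, 245.0))
print(f"line reprojection error: {err:.2f} px")
```

A pose refinement step would sum a (possibly robustified) version of this error over all matched 2D-3D line pairs and minimize it over `R` and `t`. Measuring distance to the infinite line rather than to the segment endpoints keeps the error meaningful when the detected segment is only a partial view of the 3D line.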
The following figure shows the pipeline of LSLM_VLoc:
We are actively preparing to release the source code.
If you're interested in our project, keep an eye out as the source code will be available very soon.
A line map constructed in the Old Hospital scenario of the Cambridge Landmarks dataset:
Experimental results of various current state-of-the-art visual localization methods in the Old Hospital scenario.
Method | Map Size | Median Errors (m/°) |
---|---|---|
Active Search | 200MB | 0.52/1.12 |
HLoc | 800MB | 0.15/0.30 |
PixLoc | ~600MB | 0.16/0.30 |
GoMatch | ~12MB | 2.83/8.14 |
BPnPNet+SuperPoint | ~12MB | 24.8/162.99 |
PtLine | >800MB | 0.15/0.31 |
SRC | 40MB | 0.38/0.50 |
DSAC++ | 207MB | 0.20/0.30 |
PoseNet | 50MB | 2.31/5.38 |
MS-Transformer | ~70MB | 1.81/2.39 |
SANet | ~260MB | 0.32/0.50 |
CROSSFIRE | 50MB | 0.43/0.70 |
Ours | ~3MB | 0.34/0.86 |
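
For readers who want to reproduce numbers in the "Median Errors (m/°)" format, the sketch below shows one common convention for computing them. The evaluation protocol is not spelled out in this README, so this is an assumption: position error is taken as the Euclidean distance between estimated and ground-truth camera centers, and rotation error as the angle of the relative rotation. Function names and the pose layout `(R, c)` are hypothetical.

```python
import math
from statistics import median

def position_error(c_est, c_gt):
    """Euclidean distance between two camera centers (in the map's units, e.g. meters)."""
    return math.dist(c_est, c_gt)

def rotation_error_deg(R_est, R_gt):
    """Angle in degrees of the relative rotation between two 3x3 rotation matrices."""
    # trace(R_est^T @ R_gt) equals the sum of element-wise products.
    tr = sum(R_est[i][j] * R_gt[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against values slightly outside [-1, 1] from rounding.
    cos_angle = max(-1.0, min(1.0, (tr - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def median_errors(poses_est, poses_gt):
    """Median position and rotation error over paired lists of (R, c) poses."""
    if len(poses_est) != len(poses_gt) or not poses_est:
        raise ValueError("pose lists must be non-empty and of equal length")
    pos = [position_error(ce, cg) for (_, ce), (_, cg) in zip(poses_est, poses_gt)]
    rot = [rotation_error_deg(Re, Rg) for (Re, _), (Rg, _) in zip(poses_est, poses_gt)]
    return median(pos), median(rot)
```

Calling `median_errors(estimated, ground_truth)` on two aligned lists of `(R, c)` pairs returns a `(meters, degrees)` tuple that can be formatted in the same `m/°` style as the table.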