360DVO: Deep Visual Odometry for Monocular 360-Degree Camera
Abstract
In this paper, we present 360DVO, the first deep learning-based framework for omnidirectional visual odometry (OVO). Our approach introduces a distortion-aware spherical feature extractor (DAS-Feat) that adaptively learns distortion-resistant features from 360-degree images. These sparse feature patches are then used to establish constraints for effective pose estimation within a novel omnidirectional differentiable bundle adjustment (ODBA) module. To facilitate evaluation in realistic settings, we also contribute a new real-world OVO benchmark. Extensive experiments on this benchmark and on public synthetic datasets (TartanAir V2 and 360VO) demonstrate that 360DVO surpasses state-of-the-art baselines, including 360VO and OpenVSLAM, improving robustness by 50% and accuracy by 37.5%.
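As a rough illustration of what such patch-based constraints look like, a differentiable bundle adjustment of this kind typically minimizes a reprojection objective of the form below; the exact residual, robust weighting, and spherical projection model used by ODBA are assumptions here, not taken from the paper.

\[ \min_{\{T_i\},\,\{d_k\}} \; \sum_{(i,j)} \sum_{k} \Big\| \hat{p}_{k}^{\,i \to j} \;-\; \Pi\big( T_j\, T_i^{-1}\, \Pi^{-1}(p_{k}^{\,i},\, d_k) \big) \Big\|^{2} \]

where $T_i \in \mathrm{SE}(3)$ are camera poses, $d_k$ are patch depths, $\Pi$ denotes the spherical (equirectangular) projection, $p_k^{\,i}$ is a patch center in frame $i$, and $\hat{p}_k^{\,i \to j}$ is its location in frame $j$ predicted by the network.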
Pipeline
Our method takes sequential 360-degree RGB frames as input and extracts matching features and context features from each frame using the proposed DAS-Feat module. Within DAS-Feat, the key component, SphereResNet, extracts distortion-resistant features, which allows patches to be cropped without deformation. After patchifying the matching features around their gradient maxima, we compute the correlation between patch features and context features and estimate optical flow with a recurrent network. In the ODBA module, the pose and depth of the current frame are jointly optimized by minimizing the distance between each flow-predicted patch and its reprojection on the adjacent frame.
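The following is a minimal PyTorch sketch of this data flow for one pair of adjacent frames. It is an illustrative approximation, not the released 360DVO code: SphereResNetStub, patchify_at_gradient_maxima, FlowUpdateStub, the patch size, the pooling-based correlation, and the omitted ODBA solver are all stand-ins introduced here.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SphereResNetStub(nn.Module):
    """Stand-in for the distortion-aware SphereResNet backbone. A real version
    would use latitude-adaptive spherical convolutions on the equirectangular
    image; a plain CNN is used here so the sketch runs end to end."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, out_dim, kernel_size=3, stride=2, padding=1))

    def forward(self, img):            # img: (B, 3, H, W) equirectangular frame
        return self.net(img)           # (B, C, H/4, W/4) feature map

def patchify_at_gradient_maxima(fmap, num_patches=64, p=3):
    """Pick patch centers at maxima of a simplified saliency map and crop
    p x p feature patches around them (the paper selects patches around
    gradient maxima; this scoring is a stand-in)."""
    B, C, H, W = fmap.shape
    saliency = F.avg_pool2d(fmap.abs().mean(1, keepdim=True), 3, 1, 1).squeeze(1)
    idx = saliency.flatten(1).topk(num_patches, dim=1).indices     # (B, N)
    ys = torch.div(idx, W, rounding_mode="floor")
    xs = idx % W
    padded = F.pad(fmap, [p // 2] * 4)
    patches = torch.stack([
        torch.stack([padded[b, :, y:y + p, x:x + p]
                     for y, x in zip(ys[b], xs[b])])
        for b in range(B)])                                        # (B, N, C, p, p)
    centers = torch.stack([xs, ys], dim=-1).float()                # (B, N, 2)
    return patches, centers

class FlowUpdateStub(nn.Module):
    """Stand-in for the recurrent update operator: turns patch/context
    correlation features into per-patch optical-flow increments."""
    def __init__(self, dim=128):
        super().__init__()
        self.gru = nn.GRUCell(dim, dim)
        self.to_flow = nn.Linear(dim, 2)

    def forward(self, corr, hidden):
        hidden = self.gru(corr, hidden)
        return self.to_flow(hidden), hidden

# One tracking step between two adjacent frames.
backbone, update = SphereResNetStub(), FlowUpdateStub()
frame_t = torch.rand(1, 3, 256, 512)      # equirectangular frame at time t
frame_t1 = torch.rand(1, 3, 256, 512)     # adjacent frame at time t+1

feat_t = backbone(frame_t)                # matching features of frame t
context_t1 = backbone(frame_t1)           # context features of frame t+1

patches, centers = patchify_at_gradient_maxima(feat_t)
B, N = patches.shape[:2]

# Correlate pooled patch features with pooled context features, then refine
# per-patch flow with a few recurrent updates.
ctx = context_t1.mean(dim=(2, 3)).unsqueeze(1).expand(B, N, -1)
corr = (patches.mean(dim=(3, 4)) * ctx).reshape(B * N, -1)
hidden = torch.zeros_like(corr)
flow = torch.zeros(B * N, 2)
for _ in range(4):
    delta, hidden = update(corr, hidden)
    flow = flow + delta

# ODBA (not implemented here): pose and depth of the current frame would be
# refined so that reprojected patch centers match these flow-predicted targets.
targets = centers.reshape(B * N, 2) + flow   # (B*N, 2) predicted patch positions

In the actual system, correlation would be computed per patch against local feature neighborhoods rather than pooled vectors, and the ODBA step would solve a Gauss-Newton system over pose and depth instead of being left as a comment.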
Experiments
Video Presentation
Demo Videos
BibTeX
@article{YourPaperKey2024,
  title={360DVO: Deep Visual Odometry for Monocular 360-Degree Camera},
  author={First Author and Second Author and Third Author},
  journal={Conference/Journal Name},
  year={2024},
  url={https://your-domain.com/your-project-page}
}