EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching

Anonymous


EDM estimates distortion-aware dense correspondences between pairs of omnidirectional images.

Abstract

We introduce the first learning-based dense matching algorithm, termed Equirectangular Projection-Oriented Dense Kernelized Feature Matching (EDM), specifically designed for omnidirectional images. Equirectangular projection (ERP) images, with their large fields of view, are particularly suited for dense matching techniques that aim to establish comprehensive correspondences across images. However, ERP images are subject to significant distortions, which we address by leveraging the spherical camera model and geodesic flow refinement in the dense matching method. To further mitigate these distortions, we propose spherical positional embeddings based on 3D Cartesian coordinates of the feature grid. Additionally, our method incorporates bidirectional transformations between spherical and Cartesian coordinate systems during refinement, utilizing a unit sphere to improve matching performance. We demonstrate that our proposed method achieves notable performance enhancements, with improvements of +26.72 and +42.62 in AUC@5° on the Matterport3D and Stanford2D3D datasets, respectively.
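As a rough illustration of the 3D Cartesian coordinates that the spherical positional embeddings are built from, the sketch below maps every cell of an ERP feature grid to a point on the unit sphere. The function name `erp_grid_to_sphere`, the axis convention (y up, z forward), and the pixel-center sampling are assumptions made for this example, not details taken from the EDM implementation.

```python
import numpy as np

def erp_grid_to_sphere(height, width):
    """Map ERP grid cells to unit-sphere points, shape (height, width, 3).

    Assumed convention (hypothetical, for illustration only):
    longitude spans [-pi, pi) left to right, latitude spans
    (pi/2, -pi/2) top to bottom, with y up and z forward.
    """
    # Sample at cell centers so no point lands exactly on a pole.
    lon = (np.arange(width) + 0.5) / width * 2.0 * np.pi - np.pi
    lat = np.pi / 2.0 - (np.arange(height) + 0.5) / height * np.pi
    lon, lat = np.meshgrid(lon, lat)  # both (height, width)

    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)

# Coordinates for a small 4 x 8 feature grid.
grid_xyz = erp_grid_to_sphere(4, 8)
```

Unlike raw pixel coordinates, these 3D points stay continuous across the left and right image borders, which is one plausible reason to prefer them as embedding inputs for ERP images.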

Our model architecture is inspired by state-of-the-art (SOTA) dense matching models but is redesigned to be distortion-aware for omnidirectional images. First, we propose a Spherical Spatial Alignment Module that uses Gaussian Process regression and spherical positional embeddings to establish 3D correspondences between omnidirectional images. Second, we use Geodesic Flow Refinement, which converts between spherical and Cartesian coordinates to refine the displacement on the surface of the sphere. Moreover, with azimuth rotation for data augmentation, we achieve state-of-the-art performance in dense matching and relative pose estimation between two omnidirectional images.
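The two geometric ideas in this paragraph can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the EDM code: the helper names (`lonlat_to_xyz`, `xyz_to_lonlat`, `refine_on_sphere`, `rotate_azimuth`) are hypothetical, the refinement is approximated as "add a 3D displacement, then reproject onto the unit sphere", and azimuth rotation is modeled as a circular horizontal shift of the ERP image.

```python
import numpy as np

def lonlat_to_xyz(lon, lat):
    """Spherical (longitude, latitude) to unit vectors; y up, z forward (assumed)."""
    return np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )

def xyz_to_lonlat(xyz):
    """Inverse of lonlat_to_xyz for (approximately) unit vectors."""
    lon = np.arctan2(xyz[..., 0], xyz[..., 2])
    lat = np.arcsin(np.clip(xyz[..., 1], -1.0, 1.0))
    return lon, lat

def refine_on_sphere(lon, lat, delta_xyz):
    """Apply a 3D displacement and project the result back onto the sphere.

    delta_xyz stands in for a predicted refinement offset (hypothetical).
    """
    xyz = lonlat_to_xyz(lon, lat) + delta_xyz
    xyz = xyz / np.linalg.norm(xyz, axis=-1, keepdims=True)
    return xyz_to_lonlat(xyz)

def rotate_azimuth(erp_image, shift_px):
    """Rotate an ERP image about the vertical axis by rolling its columns."""
    return np.roll(erp_image, shift_px, axis=1)

# Refine two matches with a small offset, then augment a toy ERP image.
lon = np.array([0.3, -2.0])
lat = np.array([0.1, -0.7])
new_lon, new_lat = refine_on_sphere(lon, lat, np.full((2, 3), 0.01))
augmented = rotate_azimuth(np.zeros((64, 128, 3)), shift_px=32)
```

Working in Cartesian coordinates keeps the update well defined near the poles and across the longitude wrap-around, where a displacement expressed directly in ERP pixels would be distorted. The column roll is lossless because an ERP image is periodic in longitude.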

Results

Qualitative results. "Warp" shows the warped image multiplied by the predicted certainty map, demonstrating that our method yields accurate dense matches.

Downstream Tasks: Spherical Two-View Geometry

Downstream Tasks: Spherical Structure from Motion