MADPose

arXiv: https://arxiv.org/abs/2501.05446

This repo contains the official implementation of the solvers and estimators proposed in the paper "Relative Pose Estimation through Affine Corrections of Monocular Depth Priors". The solvers and estimators are implemented in C++, and we provide easy-to-use Python bindings.

Note: "MAD" is an acronym for "Monocular Affine Depth".

Overview: Our method takes a pair of images as input, runs off-the-shelf feature matching and monocular depth estimation, then jointly estimates the relative pose, scale and shift parameters of the two depth maps, and optionally the focal lengths.

Evaluation

We develop three solvers for relative pose estimation that explicitly account for independent affine (scale and shift) ambiguities, tailored to three camera setups: calibrated, shared-focal, and unknown focal lengths (two-focal). The solvers are further combined with classic point-based solvers and epipolar constraints in our hybrid RANSAC estimators. Our estimators show consistent improvements over classic point-based methods (e.g. PoseLib) and the recent two-view reconstruction methods DUSt3R and MASt3R, across different datasets, feature matchers, and monocular depth estimation models.

Some highlighted results (probably best viewed in light theme on GitHub!):

Pose Error AUCs on ScanNet-1500 with shared-focal setting

| Matches | Method | MD Model | AUC@5° | AUC@10° | AUC@20° |
| --- | --- | --- | --- | --- | --- |
| SP+SG | PoseLib-6pt | - | 12.84 | 28.13 | 45.64 |
| SP+SG | Ours-sf | DA-met. | 18.35 | 37.54 | 57.58 |
| RoMa | PoseLib-6pt | - | 27.17 | 49.24 | 67.42 |
| RoMa | Ours-sf | DA-met. | 29.81 | 53.11 | 71.15 |
| MASt3R | PoseLib-6pt | - | 30.28 | 54.16 | 72.87 |
| MASt3R | Ours-sf | DA-met. | 31.87 | 56.20 | 74.51 |
| MASt3R | Ours-sf | MASt3R | 32.58 | 56.99 | 74.91 |
| Reference entry | DUSt3R | - | 25.90 | 48.45 | 68.03 |
| Reference entry | MASt3R | - | 23.94 | 46.44 | 66.18 |

Pose Error AUCs on 2D-3D-S sampled image pairs with unknown focal lengths

| Matches | Method | MD Model | AUC@2° | AUC@5° | AUC@10° | AUC@20° |
| --- | --- | --- | --- | --- | --- | --- |
| SP+LG | PoseLib-7pt | - | 5.85 | 13.95 | 21.94 | 30.71 |
| SP+LG | Ours-tf | DAv2-met. | 9.15 | 22.22 | 32.80 | 43.26 |
| RoMa | PoseLib-7pt | - | 8.73 | 20.31 | 30.45 | 41.48 |
| RoMa | Ours-tf | DAv2-met. | 13.50 | 29.19 | 42.18 | 54.42 |
| MASt3R | PoseLib-7pt | - | 12.58 | 30.27 | 45.57 | 59.85 |
| MASt3R | Ours-tf | DAv2-met. | 18.05 | 39.92 | 56.64 | 70.86 |
| MASt3R | Ours-tf | MASt3R | 22.44 | 48.02 | 64.79 | 76.55 |
| Reference entry | DUSt3R | - | 6.43 | 24.47 | 42.39 | 58.36 |
| Reference entry | MASt3R | - | 13.39 | 38.41 | 57.92 | 71.91 |

Please refer to the paper for more results and discussions.

Installation

Install from PyPI

We are working on setting up wheels for easy installation from PyPI. For now, please install from source as described below.

Install from source

Install dependencies

sudo apt-get install libeigen3-dev libceres-dev libopencv-dev

Note: The two-focal estimator currently relies on cv::recoverPose from OpenCV; we plan to remove the OpenCV dependency in future updates.

Clone the repo

git clone --recursive https://github.com/MarkYu98/madpose

Build and install the Python bindings

pip install .

If you would like to see the build process (e.g. CMake logs), add the -v option to the above command.

Note: MADPose also depends on PoseLib. By default, CMake will automatically fetch and build PoseLib using FetchContent. Set the FETCH_POSELIB CMake option to OFF if you prefer to use a self-installed version.

Check the installation

python -c "import madpose"

You should not see any errors if MADPose is successfully installed.

Usage

Estimators

We provide Python bindings for our three hybrid estimators, covering image pairs with calibrated cameras, shared-focal cameras, and cameras with unknown focal lengths (two-focal).

The estimators take HybridLORansacOptions and EstimatorConfig for related settings; some useful settings are:

import madpose

options = madpose.HybridLORansacOptions()
options.min_num_iterations = 1000
options.max_num_iterations = 10000
options.success_probability = 0.9999
options.random_seed = 0 # for reproducibility
options.final_least_squares = True
options.threshold_multiplier = 5.0
options.num_lo_steps = 4
# squared px thresholds for reprojection error and epipolar error
options.squared_inlier_thresholds = [reproj_pix_thres ** 2, epipolar_pix_thres ** 2]
# weight when scoring for the two types of errors
options.data_type_weights = [1.0, epipolar_weight]

est_config = madpose.EstimatorConfig()
# if enabled, the estimated depth offsets (shifts) are constrained such that the
# corrected minimum depth values remain positive; default: True
est_config.min_depth_constraint = True
# if disabled, the depth maps are modeled with a scale only (applicable only to the calibrated case)
est_config.use_shift = True
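
The snippet above references reproj_pix_thres, epipolar_pix_thres, and epipolar_weight, which are free parameters left to the user. Suitable values depend on the dataset and matcher; the numbers below are purely illustrative placeholders, not tuned recommendations:

# Illustrative placeholder values; tune per dataset and matcher.
reproj_pix_thres = 16.0     # pixel threshold on the reprojection error
epipolar_pix_thres = 1.0    # pixel threshold on the epipolar error
epipolar_weight = 0.5       # relative weight of epipolar residuals when scoring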

We provide example image pairs and code snippets in examples/ to test the hybrid estimators. More demos and evaluations will be added in the future.

The examples use numpy and cv2 to handle pre-computed data in Python; you can install both with pip:

pip install numpy opencv-python
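
For reference, here is a minimal sketch of how pre-computed inputs might be loaded and sampled at the keypoints. The file names below are hypothetical and only illustrate the expected shapes; see the scripts in examples/ for the actual data handling:

import numpy as np
import cv2

# Hypothetical file names; (N, 2) arrays of pixel keypoints from a matcher.
mkpts0 = np.load("kpts0.npy")
mkpts1 = np.load("kpts1.npy")

# Monocular depth maps for the two images (single-channel float images).
depth_map0 = cv2.imread("depth0.exr", cv2.IMREAD_UNCHANGED).astype(np.float64)
depth_map1 = cv2.imread("depth1.exr", cv2.IMREAD_UNCHANGED).astype(np.float64)

# Sample the depth prior at each keypoint (nearest-neighbor lookup).
depth0 = depth_map0[mkpts0[:, 1].round().astype(int), mkpts0[:, 0].round().astype(int)]
depth1 = depth_map1[mkpts1[:, 1].round().astype(int), mkpts1[:, 0].round().astype(int)]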

Calibrated estimator

pose, stats = madpose.HybridEstimatePoseScaleOffset(
                  mkpts0, mkpts1, 
                  depth0, depth1,
                  [depth_map0.min(), depth_map1.min()], 
                  K0, K1, options, est_config
              )
# rotation and translation of the estimated pose
R_est, t_est = pose.R(), pose.t()
# scale and offsets of the affine corrected depth maps
s_est, o0_est, o1_est = pose.scale, pose.offset0, pose.offset1

The parameters are: keypoint matches (mkpts0, mkpts1), their corresponding depth prior values (depth0, depth1), the minimum depth values of both views (used when est_config.min_depth_constraint is True), camera intrinsics (K0, K1), options, and est_config.

See examples/calibrated.py for a complete code example, evaluation, and comparison with point-based estimation using PoseLib.

Shared-focal estimator

pose, stats = madpose.HybridEstimatePoseScaleOffsetSharedFocal(
                  mkpts0, mkpts1, 
                  depth0, depth1,
                  [depth_map0.min(), depth_map1.min()], 
                  pp0, pp1, options, est_config
              )
# rotation and translation of the estimated pose
R_est, t_est = pose.R(), pose.t()
# scale and offsets of the affine corrected depth maps
s_est, o0_est, o1_est = pose.scale, pose.offset0, pose.offset1
# estimated shared focal length
f_est = pose.focal

Unlike the calibrated estimator, which takes the intrinsics (K0, K1), the shared-focal estimator takes the principal points (pp0, pp1) as input.

See examples/shared_focal.py for a complete example.

Two-focal estimator

pose, stats = madpose.HybridEstimatePoseScaleOffsetTwoFocal(
                  mkpts0, mkpts1, 
                  depth0, depth1,
                  [depth_map0.min(), depth_map1.min()], 
                  pp0, pp1, options, est_config
              )
# rotation and translation of the estimated pose
R_est, t_est = pose.R(), pose.t()
# scale and offsets of the affine corrected depth maps
s_est, o0_est, o1_est = pose.scale, pose.offset0, pose.offset1
# the estimated two focal lengths
f0_est, f1_est = pose.focal0, pose.focal1

The parameters are the same as for the shared-focal estimator, but this estimator recovers two independent focal lengths.

See examples/two_focal.py for a complete example.

Point-based baseline

You can compare with point-based estimators from PoseLib; you will need to install PoseLib's Python bindings.

The corresponding point-based estimation is included in each of the three example scripts above.

Solvers

Our hybrid estimators combine our newly proposed depth-aware solvers with point-based solvers. Python implementations of the solvers can be found under solver_py/, which also includes calls to the C++ solvers through the Python bindings for comparison.

As described in the paper, the proposed solvers all work in two steps: first solve for the scale and shift (affine) parameters of the two depth maps, then recover the pose by finding the rigid transformation between the back-projected points. We provide Python bindings for the first step (solving scale and shift) as well as combined wrappers that recover both the pose and the affine parameters.
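
To make the second step concrete, here is a minimal sketch of pose recovery from back-projected points, assuming candidate affine parameters (a2, b1, b2) with a1 = 1 have already been obtained from the first step. This illustrates the idea with a standard Kabsch/Procrustes alignment and is not the library's internal implementation:

import numpy as np

def backproject(x_h, depth):
    # x_h: (M, 3) homogeneous normalized coordinates [x, y, 1]; depth: (M,)
    return x_h * depth[:, None]

def rigid_align(P, Q):
    # Kabsch/Procrustes: find R, t such that Q ≈ R @ P + t.
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cQ - R @ cP

# Corrected depths: view 0 uses d1 + b1 (a1 = 1), view 1 uses a2 * d2 + b2, e.g.
#   X0 = backproject(x1, d1 + b1)
#   X1 = backproject(x2, a2 * d2 + b2)
#   R, t = rigid_align(X0, X1)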

The bindings for solving the affine parameters:

# Calibrated case
affine_params = madpose.solve_scale_and_shift(x1.T, x2.T, d1, d2) 
# Shared-focal
affine_params = madpose.solve_scale_and_shift_shared_focal(x1.T, x2.T, d1, d2)
# Two-focal
affine_params = madpose.solve_scale_and_shift_two_focal(x1.T, x2.T, d1, d2)

The inputs x1 and x2 are np.array (or list) of homogeneous normalized image coordinates ([x, y, 1]) for the minimal set of $M$ matched keypoints; d1 and d2 are np.array (or list) of the corresponding depth values. The returned affine_params is a list of candidate solutions (a1, a2, b1, b2); a1 is always equal to 1, so a2, b1, and b2 correspond to the scale and shift values $\alpha, \beta_1, \beta_2$.
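
A minimal sketch of preparing normalized coordinates and calling the calibrated solver follows. The intrinsics, keypoints, and depths are made-up illustrative values, and we pass 3 matches here as the minimal set (the solver expects the minimal set size described in the paper):

import numpy as np
import madpose

# Illustrative intrinsics and a minimal sample of matches (values made up).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
uv1 = np.array([[100.0, 120.0], [200.0, 80.0], [320.0, 240.0]])  # pixels, view 0
uv2 = np.array([[110.0, 118.0], [215.0, 85.0], [330.0, 236.0]])  # pixels, view 1
d1 = np.array([1.8, 2.5, 3.1])   # depth priors at the keypoints, view 0
d2 = np.array([1.9, 2.6, 3.0])   # depth priors at the keypoints, view 1

# Lift pixels to homogeneous normalized coordinates: K^{-1} @ [u, v, 1].
def normalize(uv, K):
    uv_h = np.hstack([uv, np.ones((len(uv), 1))])
    return (np.linalg.inv(K) @ uv_h.T).T

x1, x2 = normalize(uv1, K), normalize(uv2, K)
affine_params = madpose.solve_scale_and_shift(x1.T, x2.T, d1, d2)
for a1, a2, b1, b2 in affine_params:   # each candidate is (a1, a2, b1, b2)
    print(a2, b1, b2)                  # candidate scale/shifts; a1 is fixed to 1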

Alternatively, use the combined wrappers that solve for the affine parameters and also recover the candidate poses:

# Calibrated case
poses = madpose.solve_scale_shift_pose(x1.T, x2.T, d1, d2)
# Shared-focal
poses = madpose.solve_scale_shift_pose_shared_focal(x1.T, x2.T, d1, d2)
# Two-focal
poses = madpose.solve_scale_shift_pose_two_focal(x1.T, x2.T, d1, d2)

Note that for simplicity the returned solutions are all named poses, but the list contains objects of a different class in each of the three cases (madpose.PoseScaleOffset, madpose.PoseScaleOffsetSharedFocal, and madpose.PoseScaleOffsetTwoFocal). Retrieve the pose, focal lengths, and affine parameters as follows:

R_est, t_est = pose.R(), pose.t()
s_est, o0_est, o1_est = pose.scale, pose.offset0, pose.offset1

# for PoseScaleOffsetSharedFocal
f_est = pose.focal

# for PoseScaleOffsetTwoFocal
f0_est, f1_est = pose.focal0, pose.focal1

TODO List

  • Remove dependency on OpenCV.
  • Set up wheels for PyPI.
  • Add experiment scripts on datasets.

Acknowledgement

This codebase is inspired by and built upon many research works and open-source projects; we thank the authors and contributors for their work.

Citation

If you find our work useful in your research, please consider citing our paper:

@misc{yu2025madpose,
      title={Relative Pose Estimation through Affine Corrections of Monocular Depth Priors}, 
      author={Yifan Yu and Shaohui Liu and Rémi Pautrat and Marc Pollefeys and Viktor Larsson},
      year={2025},
      eprint={2501.05446},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2501.05446}, 
}
