spicyguml / loftr-dkm-network-with-taa-for-image-matching

This project forked from wanghewei16/loftr-dkm-network-with-taa-for-image-matching

[CVPR2022 Workshop Competition Top6% Silver Medal🥈] A deep learning model that registers two images taken from different viewpoints using coarse-to-fine attention mechanisms. It adopts LoFTR for local feature matching and DKM for regression with deep kernels.

Jupyter Notebook 100.00%

LoFTR-DKM-Network-with-TAA-for-Image-Matching

1. Background of the problem to be solved

For the majority of us, our best camera is built into our phone. We may take a photograph of a landmark, such as Rome's Trevi Fountain, and share it with friends. That photo is only two-dimensional and includes the perspective of our shooting location. Of course, many people have photographed the fountain. We might be able to create a more complete, three-dimensional view if we work together. What if machine learning could aid in better capturing the richness of the world by utilizing the vast amounts of unstructured image collections that are freely available on the internet?

Structure-from-Motion (SfM) is a method for reconstructing 3D objects and buildings from images. Typically, these images are captured under controlled conditions by skilled operators, ensuring homogeneous, high-quality data. Building 3D models from disparate images is much more difficult, given the wide range of viewpoints, lighting and weather conditions, occlusions from people and vehicles, and even user-applied filters.

The first step is to determine which parts of two images capture the same physical points of a scene, such as window corners. This is typically accomplished through the use of local features (key locations in an image that can be reliably identified across different views). Local features include short description vectors that capture the appearance in the vicinity of the point of interest. By comparing these descriptors, it is possible to establish likely correspondences between the pixel coordinates of image locations in two or more images. This "image registration" allows for the recovery of the point's 3D location via triangulation.
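The descriptor comparison described above can be sketched as a mutual nearest-neighbour search. This is only an illustration of the classical idea, not the matching used in this project (LoFTR and DKM produce matches without a separate descriptor step); the function name and the toy descriptors are hypothetical.

```python
import numpy as np


def mutual_nearest_neighbours(desc0, desc1):
    """Return index pairs (i, j) where desc0[i] and desc1[j] pick each other.

    desc0 has shape (N0, D) and desc1 has shape (N1, D).
    """
    desc0 = np.asarray(desc0, dtype=float)
    desc1 = np.asarray(desc1, dtype=float)
    # Pairwise Euclidean distances, shape (N0, N1).
    dists = np.linalg.norm(desc0[:, None, :] - desc1[None, :, :], axis=2)
    nn01 = dists.argmin(axis=1)  # best candidate in image 1 for each row of desc0
    nn10 = dists.argmin(axis=0)  # best candidate in image 0 for each row of desc1
    idx0 = np.arange(len(desc0))
    keep = nn10[nn01] == idx0    # keep only pairs that agree in both directions
    return np.stack([idx0[keep], nn01[keep]], axis=1)
```

Each returned pair indexes one keypoint in each image; the pixel coordinates of those keypoints are the candidate correspondences that a later geometric verification step filters.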

Our task is to develop a machine learning algorithm for registering two images taken from different perspectives. To train and test our model, we have access to a dataset of thousands of images. The figure below shows a sample image match and a sample from the training set.

2. Testset Sample

3. Kornia tool

We use Kornia [Link] to develop our model. Kornia is a computer vision library built on PyTorch that is well suited to image matching problems.

4. Process of solving this problem

For the robust estimator, we use MAGSAC++ as implemented in OpenCV [Link].

5. Backbone

5.1 LoFTR Architecture

The figure below shows the architecture of LoFTR [Paper Link].

5.2 DKM Architecture

The figure below shows the architecture of Deep Kernelized Dense Geometric Matching (DKM) [Paper Link].

6. Test Time Augmentation (TTA)

We apply Test Time Augmentation (TTA) at the inference stage of LoFTR, using size transformations (rescaling) and flips.
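A rough sketch of how rescaling and horizontal flipping could be combined at inference time, with the matches mapped back to the original pixel coordinates before pooling. Everything here is an assumption for illustration: `match_fn` is a hypothetical stand-in for the LoFTR matcher, `resize_nearest` is a placeholder for a proper resize routine, and the source does not state which scales or flips were actually used.

```python
import numpy as np


def resize_nearest(img, scale):
    """Nearest-neighbour resize of an (H, W) array; a stand-in for a real resize."""
    h, w = img.shape[:2]
    rows = np.minimum((np.arange(round(h * scale)) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(round(w * scale)) / scale).astype(int), w - 1)
    return img[rows][:, cols]


def match_with_tta(match_fn, img0, img1, scales=(1.0,), flip=True):
    """Run match_fn on rescaled and flipped copies, pooling matches in original coordinates.

    match_fn(a, b) is a hypothetical callable returning two (N, 2) arrays of (x, y)
    keypoints, expressed in the coordinates of the images it was given.
    """
    all0, all1 = [], []
    for scale in scales:
        a, b = resize_nearest(img0, scale), resize_nearest(img1, scale)
        for flipped in ((False, True) if flip else (False,)):
            a_in, b_in = (a[:, ::-1], b[:, ::-1]) if flipped else (a, b)
            k0, k1 = match_fn(a_in, b_in)
            k0 = np.array(k0, dtype=float).reshape(-1, 2)
            k1 = np.array(k1, dtype=float).reshape(-1, 2)
            if flipped:
                # Undo the horizontal flip: x -> (W - 1) - x in the resized image.
                k0[:, 0] = (a.shape[1] - 1) - k0[:, 0]
                k1[:, 0] = (b.shape[1] - 1) - k1[:, 0]
            # Undo the resize so all matches share the original pixel grid.
            all0.append(k0 / scale)
            all1.append(k1 / scale)
    return np.concatenate(all0), np.concatenate(all1)
```

The pooled matches from all augmented passes can then be handed to the robust estimator as a single set. Mapping back through a resize is only accurate to about a pixel in this sketch, since it ignores half-pixel offsets.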

7. Image preprocessing techniques
