
Comments (6)

ducha-aiki commented on June 30, 2024

This keeps the top k keypoints with the strongest response. These are the values of the keypoint responses, not their coordinates. The output shape would be k x 1.
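A minimal sketch of this kind of top-k selection, assuming the responses are already flattened into a 1-D tensor. The names resp, k, top_vals and top_idx are hypothetical and are not taken from the affnet code:

```python
import torch

torch.manual_seed(0)

# Hypothetical flattened detector responses, one value per candidate keypoint.
resp = torch.rand(100)
k = 5

# torch.topk returns the k largest values (sorted, descending) and their indices.
top_vals, top_idx = torch.topk(resp, k)

# Reshape the values to k x 1; these are response strengths, not coordinates.
top_vals = top_vals.unsqueeze(1)
print(top_vals.shape)
```

The indices in top_idx are what one would use to look up the matching coordinates in a separate tensor.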

Regarding sc_y_x: it contains the scale, x and y of the keypoints. It is roughly the equivalent of the same step in the SIFT detector: https://github.com/opencv/opencv_contrib/blob/master/modules/xfeatures2d/src/sift.cpp#L569-L572

Why does it look so different? Because in order to make it parallelizable and differentiable, everything should be expressed in terms of conv2d or similar functions. So a lot of classical code looks very strange. But otherwise there would be no point in implementing a Hessian detector in PyTorch.

from affnet.

yunyundong commented on June 30, 2024

Strictly speaking, is the sc_y_x calculated by F.conv2d(resp3d, self.grid, padding = 1) / (F.conv2d(resp3d, self.grid_ones, padding = 1) + 1e-8) equal to the xi, xr, and xc in the OpenCV source code, which interpolate the location of strong response points? In OpenCV, the calculation formula is as shown in the following picture:
[image: OpenCV subpixel interpolation formula]

Besides, I am confused about the variable img_scale in the line const float img_scale = 1.f/(255*SIFT_FIXPT_SCALE); from the OpenCV source code. Is its role to normalize the DoG values to the range [0,1]? Thank you in advance.


ducha-aiki commented on June 30, 2024

Yes, it is similar, but not exactly the same.
In OpenCV, or in this Hessian implementation https://github.com/perdoch/hesaff/blob/master/pyramid.cpp#L158, the exact subpixel point position is found by fitting a 2nd-order curve. It is a better way, but harder to do in parallel.

I am using a simpler "center of mass" approach, which is implemented by the two convolutions you mentioned. It is also used in LIFT
https://arxiv.org/abs/1603.09114
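To illustrate the "center of mass" idea, here is a simplified 2-D sketch, assuming a single-channel response map and a 3x3 window. The kernels grid and grid_ones below are hypothetical stand-ins built for this example and are not the actual self.grid and self.grid_ones from affnet, which also handle the scale dimension:

```python
import torch
import torch.nn.functional as F

# Offsets of each cell in a 3x3 window relative to the window centre.
offs = torch.tensor([-1.0, 0.0, 1.0])
dy = offs.view(3, 1).expand(3, 3)   # row offsets
dx = offs.view(1, 3).expand(3, 3)   # column offsets

# Two output channels (dy, dx), one input channel: shape (2, 1, 3, 3).
grid = torch.stack([dy, dx]).unsqueeze(1)
grid_ones = torch.ones(1, 1, 3, 3)

# Toy response map: two equal responses side by side in the same row.
resp = torch.zeros(1, 1, 5, 5)
resp[0, 0, 2, 2] = 2.0
resp[0, 0, 2, 3] = 2.0

# Numerator: response-weighted sum of offsets. Denominator: sum of responses.
num = F.conv2d(resp, grid, padding=1)
den = F.conv2d(resp, grid_ones, padding=1)
offset = num / (den + 1e-8)

# Sub-pixel (dy, dx) correction for the window centred at pixel (2, 2).
print(offset[0, :, 2, 2])
```

Because the two responses are equal, the weighted mean sits halfway between them, so the x correction at (2, 2) is about 0.5 and the y correction is 0. Every pixel is processed by the same two convolutions, which is what makes the approach easy to run in parallel.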

Regarding img_scale, you are right.


yunyundong commented on June 30, 2024

Thank you very much, I got it. Thank you again.


yunyundong commented on June 30, 2024

Hi @ducha-aiki, I have another question, about calculating the image gradient in the x and y directions. In some cases the kernel is np.array([[[[0.5, 0, -0.5]]]]), as here (gx), and in other cases the kernel is np.array([[[[-1, 0, 1]]]]) (gx). For dx from hesaff, the kernel should be np.array([[[[-1, 0, 1]]]]).
Is there a standard way to calculate dx or dy? Especially for the second derivatives, such as gxx and gyy.

I made a simple example as follows:

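import numpy as np
import torch
import torch.nn as nn

# Random 1x10 signal; cast to float because the Conv2d weights are float32
a = torch.randint(1, 100, (1, 10)).float().view(1, 1, 1, 10)
gx = nn.Conv2d(1, 1, kernel_size=(1, 3), bias=False, padding=(0, 1))
gx.weight.data = torch.from_numpy(np.array([[[[0.5, 0, -0.5]]]], dtype=np.float32))
gxx = nn.Conv2d(1, 1, kernel_size=(1, 3), bias=False, padding=(0, 1))
gxx.weight.data = torch.from_numpy(np.array([[[[1, -2, 1]]]], dtype=np.float32))
dxx1 = gxx(a)     # 3-tap second-difference kernel
dxx2 = gx(gx(a))  # first-derivative kernel applied twice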

However, dxx1 != dxx2. In particular, for the cross second derivative dxy the factor is 0.25, while for dxx and dyy the factor is 1.0. Is there a derivation for the calculation of dx, dy, dxx, dyy and dxy?


ducha-aiki commented on June 30, 2024

@yunyundong
Unfortunately, there is no standard way. There are two approaches: one is to incorporate the normalization coefficient into the weights, and the second is to do it separately.

You can read about this a bit here
https://cw.fel.cvut.cz/b172/courses/ucuss18/labs/00_intro

I also recommend the next lab https://cw.fel.cvut.cz/b172/courses/ucuss18/labs/01_corresp

And the corresponding lecture slides
https://cw.fel.cvut.cz/b172/_media/courses/ucuss18/2018_local-features-orig_new_jc.pdf
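As an illustration of where the differing factors can come from, here is a small NumPy sketch that works on the kernels only. It is not taken from affnet or hesaff; the variable names are hypothetical, and np.convolve is used just to compose 1-D kernels. Note that [-1, 0, 1] and [0.5, 0, -0.5] differ only by a factor of -2, which is the normalization (and sign convention) either folded into the weights or left out.

```python
import numpy as np

gx = np.array([0.5, 0.0, -0.5])    # first-derivative kernel from the question
gxx = np.array([1.0, -2.0, 1.0])   # second-difference kernel from the question

# Applying gx twice is equivalent to one composed 5-tap kernel.
gx_gx = np.convolve(gx, gx)
print(gx_gx)                       # a wider kernel than gxx, so outputs differ in general

# On a pure quadratic both kernels still recover the second derivative, 2.
x = np.arange(10, dtype=np.float64)
quad = x ** 2
d2_a = np.convolve(quad, gxx, mode="valid")
d2_b = np.convolve(quad, gx_gx, mode="valid")

# Cross derivative: outer product of two first-derivative kernels.
# The 0.25 at the corners is simply 0.5 * 0.5.
gxy = np.outer(gx, gx)
print(gxy)
```

Under these assumptions, gx(gx(a)) is a second-derivative estimate taken over a step of two pixels, while [1, -2, 1] uses a step of one pixel. Both are valid approximations, which is why they disagree on general signals but agree on smooth polynomials.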

