
Comments (4)

ken012git commented on July 24, 2024

I followed the iBOT demo code, but I could not get correct matching results from DINOv2 either.

from dinov2.

nicolaihaeni commented on July 24, 2024

I would also be interested in sample code for the feature matching across patches. So far, I could not get it to work. Could you give some additional details on how this was done in the paper? Thank you.


franchesoni commented on July 24, 2024

Here's my code (it's not clean, but it kind of works).
The idea is to compute features for a batch of B images, then match the patch features of the first image against the patch features of each other image in the batch.
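The code in this comment uses `feats` with layout B x Ph x Pw x F but does not show how it was built. As a minimal sketch, assuming the backbone returns flat patch tokens of shape (B, N, F) in row-major patch order and uses a patch size of 14, the grid layout could be recovered as below. The helper name `tokens_to_grid` and its parameters are hypothetical and not from the original post.

```python
import numpy as np


def tokens_to_grid(tokens, image_hw, patch_size=14):
    """Reshape flat patch tokens (B, N, F) into a patch grid (B, Ph, Pw, F).

    Assumes tokens are ordered row by row over the patch grid and that the
    image height and width are multiples of patch_size (an assumption here).
    """
    B, N, F = tokens.shape
    H, W = image_hw
    Ph, Pw = H // patch_size, W // patch_size
    if Ph * Pw != N:
        raise ValueError(f"expected {Ph * Pw} patch tokens, got {N}")
    return tokens.reshape(B, Ph, Pw, F)
```

The function only uses `.shape` and `.reshape`, so it should work on a torch tensor as well as on a NumPy array.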

import numpy as np

def scaled_click_to_flattened_index(sc, Ph, Pw):
  # (row, col) on the Ph x Pw patch grid -> index into the flattened (row-major) grid
  row, col = sc
  return np.ravel_multi_index((int(row), int(col)), (Ph, Pw))

def flattened_index_to_scaled_click(find, Ph, Pw):
  # index into the flattened grid -> (row, col) on the Ph x Pw patch grid
  return np.unravel_index(find, (Ph, Pw))



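import torch
import matplotlib.pyplot as plt
from scipy.optimize import linear_sum_assignment
from sklearn.decomposition import PCA

# feats (B x Ph x Pw x F tensor), Ph, Pw, F, scaled_clicks and norm_fn are assumed to be defined earlier (not shown here)
queryfeats = feats[0]  # Ph x Pw x F: number of patches along the height and width x number of features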
for feat in feats[1:]:  # for all other feature maps, feats[0] is the source or query image
  # compute dissimilarity

  # # ram intensive way (too much for colab)
  # dsim = (queryfeats.reshape(Ph*Pw, 1, F) - feat.reshape(1, Ph*Pw, F)).norm(dim=2)

  # using a for loop + vectorization
  dsim = torch.zeros(Ph*Pw, Ph*Pw)
  vectorfeat = feat.reshape(Ph*Pw, F)
  for qind, queryfeat in enumerate(queryfeats.reshape(Ph*Pw, F)):
    print(f"{qind}/{Ph*Pw}", end='\r')
    dsim[qind] = (queryfeat.reshape(1, F) - vectorfeat).norm(dim=1)

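  # Hungarian matching gives a one-to-one patch assignment; scipy expects a NumPy array
  row_ind, col_ind = linear_sum_assignment(dsim.detach().numpy())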

  plt.figure()

  plt.subplot(2,2,1)  # plot mapping
  plt.scatter(row_ind, col_ind)

  plt.subplot(2,2,2)  # plot query image and clicks
  rgbquery = PCA(3).fit_transform(queryfeats.reshape(Ph*Pw, F)).reshape(Ph, Pw, 3)
  plt.imshow(norm_fn(rgbquery))
  for sc, color in zip(scaled_clicks[:2], ['g', 'r']):  # first click in green, second in red
    plt.scatter(sc[1],  Ph-sc[0], marker='x', s=50, c=color)

  plt.subplot(2,2,3)  # plot key image and mapped clicks
  rgbfeat = PCA(3).fit_transform(vectorfeat).reshape(Ph, Pw, 3)
  plt.imshow(norm_fn(rgbfeat))

  for sc, color in zip(scaled_clicks[:2], ['g', 'r']):  # same colors as in the query plot
    sc_mapping_row = scaled_click_to_flattened_index(sc, Ph, Pw)  # query patch index
    mapped_col = col_ind[sc_mapping_row]  # key patch assigned to that query patch
    mapped_sc = flattened_index_to_scaled_click(mapped_col, Ph, Pw)
    plt.scatter(mapped_sc[1],  Ph-mapped_sc[0], marker='x', s=50, c=color)

The plots show the index assignment (source index vs. target index), the source (query) image with two clicks, and the target image with the two matched clicks. See:

image

or also

image

Don't follow this blindly; understand what you're doing. For instance, I might have a bug, as the red and green points are inverted in the last image.
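As a possible simplification of the matching step above, the per-patch loop can be replaced by a single pairwise-distance call, which also avoids the large (Ph*Pw) x (Ph*Pw) x F intermediate of the RAM-intensive variant. The sketch below assumes the feature maps have been converted to NumPy arrays of shape (Ph, Pw, F). The names `match_patches` and `map_click` are illustrative and not part of the original post or of DINOv2.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def match_patches(query_feats, key_feats):
    """One-to-one patch matching between two (Ph, Pw, F) feature maps.

    Returns (col_ind, cost): col_ind[i] is the flattened key-patch index
    assigned to flattened query patch i, cost is the total matched distance.
    """
    Ph, Pw, F = query_feats.shape
    q = query_feats.reshape(Ph * Pw, F)
    k = key_feats.reshape(Ph * Pw, F)
    dsim = cdist(q, k)  # (Ph*Pw, Ph*Pw) Euclidean distances
    # For a square cost matrix, row_ind is 0..n-1, so col_ind is indexed by query patch.
    row_ind, col_ind = linear_sum_assignment(dsim)
    return col_ind, dsim[row_ind, col_ind].sum()


def map_click(click, col_ind, Ph, Pw):
    """Map a (row, col) click on the query grid to (row, col) on the key grid."""
    flat = np.ravel_multi_index(click, (Ph, Pw))
    return np.unravel_index(col_ind[flat], (Ph, Pw))
```

With torch tensors, convert first with `.detach().cpu().numpy()`; alternatively, `torch.cdist` computes the same distance matrix on the torch side. Note that the Hungarian assignment forces every query patch onto a distinct key patch, which may not be what you want when the two images only partly overlap.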


yousafe007 commented on July 24, 2024

Can someone provide the code? It has been over 3 months now. It would be greatly appreciated.

