Comments (2)

elevenjiang1 commented on August 17, 2024

Here is my rewrite; please check whether it is correct.
It also seems to assume that each object appears only once in an image. If the image contains two instances of the same object, what will the result be?

  def get_mask_from_image(self,color_image,confidence=0.5,num_max_dets=5):
      #1: color image generate detections and dinov2 descriptors
      color_image=cv.cvtColor(color_image,cv.COLOR_BGR2RGB)
      detections=self.segmentor_model.generate_masks(np.array(color_image))
      
      detections=Detections(detections)
      descriptors=self.descriptor_model.forward(np.array(color_image),detections)#140*1024
      
      #2: compare similarity
      metric=Similarity()
      #get scores per proposal
      scores=metric(descriptors[None,:,None,:],self.all_ref_feats[:,None,:,:])#(1*140*1*1024, N*1*42*1024)->(N*140*42)
      
      #take the top-5 template scores of each (object, proposal) pair and use their mean as the final score
      top5_scores,_=torch.topk(scores,k=5,dim=-1)#(N*140*42)->(N*140*5)
      avg_top5_scores=torch.mean(top5_scores,dim=-1)#(N*140*5)->(N*140)
      
      #get the best-scoring proposal and its score for each object
      max_value,best_match_idx=torch.max(avg_top5_scores,dim=-1)#(N*140)->(N),(N)
      # max_value,best_match_idx=torch.topk(avg_top5_scores,k=num_max_dets,dim=-1)#(N*140)->(N*num_max_dets),(N*num_max_dets)
      
      #keep only the objects whose best score exceeds the confidence threshold
      filtered_indices=torch.where(max_value>confidence)[0]#N->m, m<=N
      
      #keep the matched proposals: index by proposal (best_match_idx), not by object index
      detections.filter(best_match_idx[filtered_indices])#m*masks.shape(m*720*1280)
      detections.to_numpy()
      
      #get object_index and class_list
      object_index=filtered_indices.cpu().numpy()
      class_list=[self._object_name_list[index] for index in object_index]
      
      return detections.masks,class_list#(m*720*1280), m may be less than N
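
To make the duplicate-object question concrete, here is a minimal sketch of the selection step only. It assumes the averaged scores are already computed and uses plain Python lists in place of the torch tensors; the helper names `best_proposal_per_object` and `proposals_above_threshold` and the toy scores are hypothetical, not part of cnos. Taking a single maximum per object keeps one proposal per object, so a second instance of the same object is dropped, while ranking each object's proposals (the idea behind the commented-out `torch.topk` line) can keep both.

```python
def best_proposal_per_object(avg_scores, confidence=0.5):
    """Keep only the single best-scoring proposal for each object."""
    kept = []
    for obj_idx, row in enumerate(avg_scores):
        prop_idx = max(range(len(row)), key=row.__getitem__)
        if row[prop_idx] > confidence:
            kept.append((obj_idx, prop_idx))
    return kept


def proposals_above_threshold(avg_scores, confidence=0.5, num_max_dets=5):
    """Keep up to num_max_dets proposals per object that pass the threshold."""
    kept = []
    for obj_idx, row in enumerate(avg_scores):
        # Rank this object's proposals from highest to lowest score.
        ranked = sorted(range(len(row)), key=row.__getitem__, reverse=True)
        for prop_idx in ranked[:num_max_dets]:
            if row[prop_idx] > confidence:
                kept.append((obj_idx, prop_idx))
    return kept


# Toy averaged scores, shape (N objects) x (proposals).
avg_scores = [
    [0.9, 0.1, 0.8, 0.2],  # object 0 appears twice: proposals 0 and 2
    [0.1, 0.7, 0.2, 0.3],  # object 1 appears once: proposal 1
]

single = best_proposal_per_object(avg_scores)
multi = proposals_above_threshold(avg_scores)
print(single)  # (object, proposal) pairs, one per object at most
print(multi)   # (object, proposal) pairs, several per object allowed
```

Note that the second variant does not stop one proposal from being assigned to several objects; a real implementation would probably need an extra step to resolve such conflicts.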


elevenjiang1 commented on August 17, 2024

Solved it with the code below.

  def get_id_from_mask(self,color_image,mask_list,bbox_list):
      """
      Get mesh_id from cnos/data/obj_dta

      Args:
          color_image (np.ndarray): BGR color image (converted to RGB below)
          mask_list (list or np.ndarray): list of np.uint8 masks, or np.ndarray of shape (N*H*W); values are divided by 255
          bbox_list (list): one box per mask, [[x1,y1,x2,y2],[x1,y1,x2,y2],...]

      Returns:
          tuple: (class_list, confidence_list), the best-matching object name and its score for each box
      """
      #1: init color_image and detections
      color_image=cv.cvtColor(color_image,cv.COLOR_BGR2RGB)
      if isinstance(mask_list,(list,np.ndarray)):
          mask_list=np.asarray(mask_list).astype(np.float32)/255.0
      bbox_list=np.array(bbox_list).astype(np.float32)
      masks_tensor=torch.tensor(mask_list).cuda()#num_bbox*H*W
      bbox_tensor=torch.tensor(bbox_list).cuda()#num_bbox*4
      detections={'masks':masks_tensor,'boxes':bbox_tensor}
      
      detections=Detections(detections)
      descriptors=self.descriptor_model.forward(np.array(color_image),detections)#num_bbox*1024
      
      #2: compare similarity
      metric=PairwiseSimilarity()
      scores=metric(descriptors,self.all_ref_feats)#(num_bbox*1024, N*42*1024)->(num_bbox*N*42)

      #take the top-5 template scores of each (box, object) pair and use their mean as the final score
      top5_scores,_=torch.topk(scores,k=5,dim=-1)#(num_bbox*N*42)->(num_bbox*N*5)
      avg_top5_scores=torch.mean(top5_scores,dim=-1)#(num_bbox*N*5)->(num_bbox*N)
      
      #get final score and index
      max_value,best_match_idx=torch.max(avg_top5_scores,dim=1)#(num_bbox*N)->(num_bbox)
      
      confidence_list=max_value.cpu().numpy()
      best_match_idx=best_match_idx.cpu().numpy()
      
      #get class name from self._object_name_list
      class_list=[self._object_name_list[index] for index in best_match_idx]
      
      return class_list,confidence_list
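
The scoring in this second version can be sketched on toy data as follows. This is only an illustration under stated assumptions: plain Python lists stand in for the torch tensors, the function name `classify_boxes` and the numbers are made up, and `k=2` is used instead of 5 because the toy data has only three templates per object. Each box is labeled independently, so two boxes that show the same object can both receive the same label, which appears to be why this version copes with repeated objects.

```python
def classify_boxes(scores, k=5):
    """Label each box with its best-matching object.

    scores[b][n][t] is the similarity between box b and template t of object n.
    Returns (labels, confidences): the best object index and its score per box.
    """
    labels, confidences = [], []
    for per_object in scores:
        # Mean of the k highest template scores for every object.
        avg = []
        for template_scores in per_object:
            top_k = sorted(template_scores, reverse=True)[:k]
            avg.append(sum(top_k) / len(top_k))
        # The object with the highest averaged score wins for this box.
        best = max(range(len(avg)), key=avg.__getitem__)
        labels.append(best)
        confidences.append(avg[best])
    return labels, confidences


# Toy similarities: 3 boxes x 2 objects x 3 templates.
scores = [
    [[0.75, 0.5, 0.25], [0.25, 0.125, 0.0]],  # box 0 looks like object 0
    [[0.125, 0.25, 0.0], [0.5, 1.0, 0.25]],   # box 1 looks like object 1
    [[0.75, 0.5, 0.25], [0.25, 0.125, 0.0]],  # box 2 is a second instance of object 0
]

labels, confidences = classify_boxes(scores, k=2)
print(labels)
print(confidences)
```

Because every box always gets a label, a caller would likely still want to compare `confidences` against a threshold to reject boxes that match no reference object well.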
