zhiwen-xdu / eventsam

Code for CVPR'24 Paper: Segment Any Event Streams via Weighted Adaptation of Pivotal Tokens

License: MIT License

Python 99.62% Shell 0.38%

event-vision knowledge-distillation open-world sam segmentation

eventsam's People

zhu-zhiyu bin1119 dianzizs julicelee jacobgen

eventsam's Issues

matrix_to_weight problem?

Hi authors,
Has the matrix_to_weight function been changed in your project?
As I understand it, the attention matrices should be multiplied from the specific layer through to the last layer, but the project appears to use the attention matrix of only one layer.

Question about event npz file

Hello, I tried to read the raw event npz data you provided with numpy, but some of the npz files raise an error:

    raise ValueError("Cannot load file containing pickled data "
ValueError: Cannot load file containing pickled data when allow_pickle=False

For example:

    Datasets/RGBE-SEG/train/dvSave-2022_02_07_22_44_25/event/0195.npz
    Datasets/RGBE-SEG/train/dvSave-2022_02_07_22_44_25/event/0223.npz
    Datasets/RGBE-SEG/train/dvSave-2022_02_07_22_44_25/event/0194.npz
    Datasets/RGBE-SEG/train/dvSave-2022_02_07_22_44_25/event/0227.npz
    .........

Even with allow_pickle=True, these files still fail to load. Did you use numpy to save the npz files? Can you read these files normally?
Thanks
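
For anyone hitting the same error: np.load raises this particular ValueError when a file is recognized neither as a zip archive (.npz) nor as a .npy array, so one possible cause is a truncated or corrupted download. The sketch below is only a diagnostic under that assumption; check_npz is a hypothetical helper, not part of the eventsam code.

```python
import zipfile

import numpy as np


def check_npz(path):
    """Return (ok, detail) for one .npz file. Hypothetical helper."""
    # A valid .npz is a zip archive; anything else makes np.load
    # fall back to pickle, which produces the error quoted above.
    if not zipfile.is_zipfile(path):
        return False, "not a zip archive (possibly truncated or corrupted)"
    try:
        with np.load(path, allow_pickle=False) as data:
            # Reading each array forces the archive members to be decoded.
            shapes = {key: data[key].shape for key in data.files}
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, shapes
```

Running check_npz over the failing paths would show whether the files are damaged archives or whether they load and the problem lies elsewhere.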

Missing eventsam_list.txt

Sorry, I could not find eventsam_list.txt in the RGBE-SEG dataset you provided, so the dataloader fails to run. Is this file missing from the dataset? Could you help with that? Thank you very much.

import os
import torch
from torch.utils.data import Dataset

class RGBEData(Dataset):
    def __init__(self, root):
        self.root = root
        # The loader expects eventsam_list.txt directly under the dataset root.
        with open(os.path.join(self.root, 'eventsam_list.txt')) as f:
            self.data_paths = [line.rstrip() for line in f]
        print('The size of data is %d' % (len(self.data_paths)))
        # ImageNet mean/std, used for both RGB images and event images.
        self.image_pixel_mean = torch.Tensor([0.485, 0.456, 0.406]).view(-1, 1, 1)
        self.image_pixel_std = torch.Tensor([0.229, 0.224, 0.225]).view(-1, 1, 1)
        self.evimg_pixel_mean = torch.Tensor([0.485, 0.456, 0.406]).view(-1, 1, 1)
        self.evimg_pixel_std = torch.Tensor([0.229, 0.224, 0.225]).view(-1, 1, 1)
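
Until the file is provided, a possible workaround is to generate the list locally. The sketch below is an assumption-heavy illustration: the format the authors' loader expects is not documented in this thread, so it guesses one relative path per line and the <split>/<sequence>/event/<id>.npz layout seen in the npz paths quoted in the issue above. build_sample_list is a hypothetical helper.

```python
import os


def build_sample_list(root, list_name="eventsam_list.txt"):
    """Write one relative path per event file under root and return the paths."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # Assumption: samples are the .npz files inside folders named "event".
        if os.path.basename(dirpath) != "event":
            continue
        for name in filenames:
            if name.endswith(".npz"):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                paths.append(rel.replace(os.sep, "/"))
    paths.sort()
    with open(os.path.join(root, list_name), "w") as f:
        f.write("\n".join(paths) + ("\n" if paths else ""))
    return paths
```

If the loader actually expects sample identifiers rather than file paths, the line format would need to be adapted accordingly.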

About DSEC results

Hi dear authors,

May I ask where the results for the DSEC-SEG dataset are? I didn't see them in the paper.

class Mix_RGBE_Encoder(nn.Module) BUG

Hello, thank you for open-sourcing your code.
class Mix_RGBE_Encoder(nn.Module) seems to have two bugs in the two lines below. First, self.image_encoder(input_images) returns only two values, but three are unpacked here; image_tokens is not among the returned values. Second, masks is never declared, so the code fails to run. Can you fix this?

image_tokens, image_embeddings_dict, token_weights_dict = self.image_encoder(input_images)
evimg_embeddings_dict = self.evimg_encoder(input_evimgs,image_tokens,masks)

MVSEC GT

Hey,
Where can I find the semantic segmentation ground truth (GT) for MVSEC?

Request about how to publish datasets

I read your paper and am very interested in this research.

I want to run the code from this GitHub repository, but I'm having trouble downloading the dataset.
It looks like a Chinese phone number is required to download it.
Could you share the datasets through another service, such as Google Drive?

I look forward to your reply.

class Mix_ImageEncoderViT BUG

Hello, thank you very much for open-sourcing your code. While running it, I found what looks like a bug in class Mix_ImageEncoderViT: with the code below, med_attn_matrix only contains blocks 3, 6, and 9, but not blocks 4, 5, 7, 8, 10, and 11, which the later matrix products read. As a result, the code cannot train properly. Can you fix it?

        # med_feature_indexes=[2,5,8,11],
        # med_attn_matrix_indexes=[2,5,8,11],
        # attn_weight_indexes=[3,6,9],

        for i,blk in enumerate(self.blocks):
            x,attn_matrix = blk(x)
            if i in self.med_feature_indexes:
                # [B,C,H,W]
                self.med_features["block_"+str(i)] = x.permute(0, 3, 1, 2)
            if i in self.attn_weight_indexes:
                # [B*N,K,K]
                self.med_attn_matrix["block_"+str(i)] = attn_matrix

        # ====Multi Matrix to Token Weight====
        device = x.device
        batch_size = x.shape[0]
        attn_weight_11= torch.ones((batch_size,1,32,32),device=device)                 # [B,1,32,32]
        self.token_weight_dict["block_11"] = attn_weight_11
        attn_matrix_8 = (self.med_attn_matrix["block_11"].to(device) @ self.med_attn_matrix["block_10"].to(device)) @ self.med_attn_matrix["block_9"].to(device)
        self.token_weight_dict["block_8"] = self.matirx_to_weight(attn_matrix_8)     # [B,1,32,32]
        attn_matrix_5 = (self.med_attn_matrix["block_8"].to(device) @ self.med_attn_matrix["block_7"].to(device)) @ self.med_attn_matrix["block_6"].to(device)
        self.token_weight_dict["block_5"] = self.matirx_to_weight(attn_matrix_5)     # [B,1,32,32]
        attn_matrix_2 = (self.med_attn_matrix["block_5"].to(device) @ self.med_attn_matrix["block_4"].to(device)) @ self.med_attn_matrix["block_3"].to(device)
        self.token_weight_dict["block_2"] = self.matirx_to_weight(attn_matrix_2)     # [B,1,32,32]

        return self.med_features, self.token_weight_dict
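
One way to read the problem: the loop stores an attention matrix only when i is in attn_weight_indexes, while the products afterwards need every block from 3 to 11. The sketch below illustrates a possible fix under that reading, namely storing the matrix of every block and chaining the products in the same order as the quoted code. Whether this matches the authors' intent is an assumption. chain_attention and attn are hypothetical names, and NumPy with random stand-in matrices is used only to keep the example self-contained; the same @ chain applies to torch tensors.

```python
import numpy as np


def chain_attention(attn_by_block, first, last):
    """Multiply attention matrices from block `last` down to block `first`."""
    product = attn_by_block[last]
    for i in range(last - 1, first - 1, -1):
        product = product @ attn_by_block[i]
    return product


rng = np.random.default_rng(0)
# Stand-in attention matrices of shape [B*N, K, K], stored for EVERY block
# rather than only for the blocks listed in attn_weight_indexes.
attn = {i: rng.random((2, 4, 4)) for i in range(12)}

attn_matrix_8 = chain_attention(attn, 9, 11)  # block_11 @ block_10 @ block_9
attn_matrix_5 = chain_attention(attn, 6, 8)   # block_8 @ block_7 @ block_6
attn_matrix_2 = chain_attention(attn, 3, 5)   # block_5 @ block_4 @ block_3
```

In the original loop, this would correspond to saving attn_matrix for all blocks from 3 upward instead of guarding the assignment with attn_weight_indexes.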

Semantic Mask

Hi @happychenpipi @ZHU-Zhiyu, thanks for sharing your work, it looks very interesting!

Regarding the collected RGB-Event dataset:

Is it annotated with semantic masks, or only with class-agnostic instance masks?
Have you tried EventSAM on the ESS datasets, such as DDD17-Seg and DSEC-Semantic?

Feature Request: Availability of Pre-trained Encoder for sam_vit_h Scale

I'm working with your model and was wondering if there are plans to release a pre-trained encoder module for the sam_vit_h scale?

Would you happen to have any plans to offer this in the future, or is it something that might be considered? Any information you could provide on this matter would be greatly appreciated.

Data link has expired

Thank you very much for your work, but the data link has expired. Could you please update it?

