
Sparse4D's Issues

About mAOE in r50

Thank you for your outstanding work and valuable open-source contributions.
I have observed that Sparse4D's mAOE is slightly worse in the r50 configuration than recently released methods (StreamPETR, SparseBEV, and DynamicBEV), but it holds a clear advantage in the other configurations (r101, v2-99).
Do you have an explanation for this phenomenon? Looking forward to your reply.

How to generate nuscenes_infos_train.pkl and nuscenes_kmeans900.npy

Hi, I'm new to nuScenes. I'm curious how to generate nuscenes_infos_train.pkl, nuscenes_infos_val.pkl, etc. Is there any difference between them and the nuscenes2d_temporal_infos_train.pkl used in StreamPETR?

Btw, could you provide the script for generating nuscenes_kmeans900.npy? Thanks so much!
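For what it's worth, one plausible way such a file could be produced (an assumption about the intent, not the authors' actual script) is to run k-means with k = 900 over the ground-truth box centers from the training annotations and save the centroids:

```python
import numpy as np

# Hypothetical sketch (an assumption, not the authors' script): cluster
# GT box centers with k-means and save the k centroids as anchors.
def kmeans_anchors(centers, k=900, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centroids = centers[rng.choice(len(centers), k, replace=False)].copy()
    for _ in range(iters):
        # assign every GT center to its nearest centroid ...
        dists = np.linalg.norm(centers[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # ... then move each centroid to the mean of its members
        for j in range(k):
            members = centers[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

# np.save("nuscenes_kmeans900.npy", kmeans_anchors(gt_centers))
```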

About depth

Hello, thank you for your excellent work. I'd like to ask whether you have looked at the depth error of the boxes predicted by this algorithm. Also, now that v2 adds depth supervision, is the predicted depth more accurate?

What is `cls_threshold_to_reg` used for in Sparse4DHead?

Using it during training seems to drop all GTs and predictions, since the mask becomes essentially all zeros.

This parameter is set to 0.05 in the configs:

cls_threshold_to_reg=0.05,

But while running the training, this line causes problem:

mask = torch.logical_and(
    mask, cls.max(dim=-1).values.sigmoid() > threshold
)

Basically, all of the cls scores fall below the threshold, so this torch.logical_and gate sets every element of mask to 0.
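To illustrate with made-up numbers (the logits below are hypothetical): early in training the classification logits are typically strongly negative, so every sigmoid score falls under 0.05 and the gate zeroes the whole mask:

```python
import torch

# Hypothetical toy example: 2 predictions, 3 classes. Early-training
# logits are strongly negative, so every max sigmoid score is below the
# cls_threshold_to_reg of 0.05 and the mask is fully zeroed.
threshold = 0.05
cls = torch.tensor([[-4.0, -5.0, -6.0],
                    [-3.5, -4.5, -5.5]])
mask = torch.ones(2, dtype=torch.bool)

scores = cls.max(dim=-1).values.sigmoid()   # ~[0.018, 0.029]
mask = torch.logical_and(mask, scores > threshold)
print(mask)  # tensor([False, False])
```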

Initial anchors

What is the benefit of initializing the anchors from clustering? Could they be randomly initialized instead?

Tracking Pipeline of Sparse4Dv3

In Sparse4Dv3, is the tracking pipeline trained jointly with detection, or is it only used in inference to extend to tracking with Algorithm 1?

Welcome update to OpenMMLab 2.0

I am Vansin, a technical operator at OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 with MMEngine, which can be used for both research and commercial purposes. If you have any questions, feel free to join us on the OpenMMLab Discord at https://discord.gg/A9dCpjHPfE, or add me on WeChat (ID: van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

                   OpenMMLab 1.0 branch    OpenMMLab 2.0 branch
MMEngine           -                       0.x
MMCV               1.x                     2.x
MMDetection        0.x, 1.x, 2.x           3.x
MMAction2          0.x                     1.x
MMClassification   0.x                     1.x
MMSegmentation     0.x                     1.x
MMDetection3D      0.x                     1.x
MMEditing          0.x                     1.x
MMPose             0.x                     1.x
MMDeploy           0.x                     1.x
MMTracking         0.x                     1.x
MMOCR              0.x                     1.x
MMRazor            0.x                     1.x
MMSelfSup          0.x                     1.x
MMRotate           0.x                     1.x
MMYOLO             -                       0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Inquiry about the performance result of Sparse4D v1 on ResNet-50.

Dear authors,

Thanks for the excellent work at first.
I'd like to know the nuScenes val-set performance of Sparse4D v1 with ResNet-50, since many methods report ResNet-50 val-set results, but the paper only includes ResNet-101.

I'd greatly appreciate your reply.

How can we prepare nuscenes_infos_train.pkl ?

nuscenes_cam
├── nuscenes_infos_test.pkl
├── nuscenes_infos_train.pkl
├── nuscenes_infos_val.pkl
└── nuscenes_infos_trainval_with_inds.pkl

nuscenes_infos_trainval_with_inds.pkl is available at: https://github.com/linxuewu/Sparse4D/releases/download/v0.0/nuscenes_infos_trainval_with_inds.pkl
nuscenes_infos_train.pkl is available at: https://github.com/linxuewu/Sparse4D/releases/download/v0.0/nuscenes_infos_train.pkl. However, loading this pkl with mmcv raises an error:

File "/opt/miniconda3/envs/py38k/lib/python3.8/site-packages/mmcv/fileio/io.py", line 57, in load
    raise TypeError(f'Unsupported format: {file_format}')

The val pkl is not available at the corresponding URL.

Furthermore, can we prepare the pkl files ourselves?
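On the "Unsupported format" error: mmcv.load infers the format from the file extension, so one possible cause (an assumption; a corrupted download would raise differently) is a saved path without a recognized ".pkl" suffix. The stdlib pickle module ignores the filename entirely, and mmcv.load also accepts an explicit file_format="pkl":

```python
import os
import pickle
import tempfile

# Minimal stdlib sketch: round-trip a pickle through a path with no
# ".pkl" suffix, which extension-based dispatch would reject but plain
# pickle handles fine. The dict contents here are made up.
infos = {"infos": [], "metadata": {"version": "v1.0-trainval"}}

path = os.path.join(tempfile.mkdtemp(), "nuscenes_infos_demo")  # no ".pkl"
with open(path, "wb") as f:
    pickle.dump(infos, f)

with open(path, "rb") as f:     # extension-agnostic, unlike mmcv.load
    loaded = pickle.load(f)
os.remove(path)
```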

Temporal inference

Can we further improve the mAP or NDS on the val and test sets if we use temporal information during inference? Thanks!

Regarding ablation study for motion compensation

Hello, thank you for your great work.

I am writing this issue to ask about the ablation study conducted on ego motion and object motion.
I believe the ego motion compensation follows a process similar to BEVDet and BEVDepth, aligning the past frame t to the current frame t_0.

For object motion compensation, I think it follows Equation 4 in the paper.

And my question is: how is it possible to apply both ego and object motion compensation at the same time, as in the last row of Table 2?
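For concreteness, a minimal sketch of applying both compensations in sequence (my reading of the setup, not the authors' code):

```python
import numpy as np

# Sketch (my reading, not the authors' code): first warp past-frame
# centers into the current ego frame with the relative ego pose, then
# advance each center by its predicted velocity over the elapsed time dt
# (constant-velocity model, cf. Eq. 4 of the paper).
def compensate(centers, velocities, T_prev_to_cur, dt):
    homo = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)
    centers = (homo @ T_prev_to_cur.T)[:, :3]   # ego motion compensation
    return centers + velocities * dt            # object motion compensation

# toy numbers: ego moved 1 m forward, object drives 2 m/s along x
T = np.eye(4)
T[0, 3] = -1.0
out = compensate(np.array([[10.0, 0.0, 0.0]]),
                 np.array([[2.0, 0.0, 0.0]]), T, dt=0.5)
print(out)  # object lands at x = 10 - 1 + 1 = 10
```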

Thank you for your reply in advance.

About MOT

Thank you for your great work! I'd like to ask a few questions.

Question 1: For the tracking task, a history query and a det query can detect the same object, right? According to the paper, they would be assigned two different track IDs. How is this resolved?

Question 2: In my experiments, since history queries and det queries can hit the same object, and self-attention suppresses duplicate boxes, the scores of the history queries end up uniformly low, and essentially all detections come from the current frame's queries. Did you run into this problem in your experiments?

About InfiniteGroupEachSampleInBatchSampler

My understanding is that InfiniteGroupEachSampleInBatchSampler runs a separate stream/clip in each batch slot. At the boundary between two clips, how is the first frame handled, given that it has no previous frame to stack with? And where during training is the first frame detected?

NaN loss and grad_norm in training

I tried training with projects/configs/sparse4dv2_r50_HInf_256x704.py and got NaN grad_norm and loss in every attempt. Is this a known issue, and is there a workaround?

Code Organization and Cleanup

The current state of the codebase is not well-organized and requires further cleanup. This issue aims to address the lack of clarity in the code structure and improve its overall cleanliness. While the code may still have some unresolved issues, your feedback is greatly appreciated, and I will make every effort to assist in resolving them. Rest assured, the code's cleanliness will continue to improve over time.

The sparse4dv2 model fails to converge

I trained the model using the unmodified code and data, expecting to reproduce sparse4d. However, sparse4dv2 failed to converge, and the grad_norm encountered NaN or Infinity, resulting in abnormal results. Do you have any suggestions on how to resolve this issue? Additionally, when can we expect the release of the Sparse4D-v3 code? I'm eagerly awaiting your response. Thank you!

Confusion about anchor whl and output whl.

Hello, in the paper and code, anchor[:, [w, h, l]] stores ln w, ln h, and ln l, but in the output part you simply add output[:, [w, h, l]] and anchor[:, [w, h, l]] to obtain the output anchor's ln w, ln h, and ln l. I think a better formula would be output[:, [w, h, l]] = ln(output[:, [w, h, l]].exp() + anchor[:, [w, h, l]].exp()).
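A quick numeric check of the two readings (toy numbers; the variable names are mine):

```python
import math

anchor_logw = math.log(4.0)   # anchor stores ln(width), width = 4 m
delta = 0.2                   # network output for the log-width channel

# Repo's formulation: add in log space, so the head predicts a
# multiplicative scale relative to the anchor, as in the classic
# delta_w = ln(w / w_a) box parameterization.
decoded = math.exp(delta + anchor_logw)   # 4 * e^0.2 ~= 4.886

# Proposed formulation: add the widths themselves before taking the log,
# which changes the prediction from a relative scale to an absolute size
# increment.
proposed = math.exp(delta) + math.exp(anchor_logw)   # ~= 5.221

print(decoded, proposed)
```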

The case where all class scores are lower than self.cls_threshold_to_reg.

Dear author @linxuewu ,
This is really nice work! However, when training "sparse4d_r101_H1" I always hit a bug at these lines of code.

if self.cls_threshold_to_reg > 0:
    threshold = self.cls_threshold_to_reg
    mask = torch.logical_and(
        mask, cls.max(dim=-1).values.sigmoid() > threshold
    )
cls = cls.flatten(end_dim=1)
cls_target = cls_target.flatten(end_dim=1)
cls_loss = self.loss_cls(cls, cls_target, avg_factor=num_pos)
mask = mask.reshape(-1)
reg_weights = reg_weights * reg.new_tensor(self.reg_weights)
reg_target = reg_target.flatten(end_dim=1)[mask]
reg = reg.flatten(end_dim=1)[mask]
reg_weights = reg_weights.flatten(end_dim=1)[mask]
reg_target = torch.where(
    reg_target.isnan(), reg.new_tensor(0.0), reg_target
)
reg_loss = self.loss_reg(
    reg, reg_target, weight=reg_weights, avg_factor=num_pos
)

Here, the resulting mask is all False, which leaves reg_target empty.
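One possible workaround (an assumption on my side, not a confirmed fix): short-circuit the regression branch when the mask selects nothing, so the empty tensors never reach loss_reg:

```python
import torch

# Workaround sketch (my assumption, not a confirmed fix): when no
# prediction passes cls_threshold_to_reg, return a zero loss that stays
# connected to the graph instead of feeding loss_reg empty tensors.
def masked_reg_loss(reg, reg_target, reg_weights, mask, loss_reg, num_pos):
    mask = mask.reshape(-1)
    if not mask.any():
        return reg.sum() * 0.0      # empty mask: zero loss, valid gradient
    reg = reg.flatten(end_dim=1)[mask]
    reg_target = reg_target.flatten(end_dim=1)[mask]
    reg_weights = reg_weights.flatten(end_dim=1)[mask]
    return loss_reg(reg, reg_target, weight=reg_weights, avg_factor=num_pos)
```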

How is the temporal order of inputs controlled during training?

Thank you for your great work. After reading the code, I have a question: in the v2 version, during training, the instance_bank caches the current features and anchors across training iterations. The training set is partitioned by scene, so the data across the whole dataset is not contiguous. How do you guarantee that the cached features and anchors fetched from the instance_bank are consistent with the data in the current batch?

Thanks!!!
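For what it's worth, my reading of the usual pattern (an assumption about the intent, not the repo's exact code) is that the bank keys its cache on scene metadata and resets at clip boundaries:

```python
# Sketch of the usual pattern (an assumption, not the repo's exact code):
# compare the incoming batch's metadata with the cached frame and reset
# the cache whenever the scene changes or time does not advance, so stale
# features never cross a sequence boundary.
class InstanceBankSketch:
    def __init__(self):
        self.cached_feature = None
        self.cached_anchor = None
        self.scene_token = None
        self.timestamp = None

    def get(self, scene_token, timestamp):
        same_scene = scene_token == self.scene_token
        forward = (self.timestamp is not None
                   and 0 < timestamp - self.timestamp < 2.0)
        if not (same_scene and forward):
            self.cached_feature = None   # first frame of a new clip
            self.cached_anchor = None
        self.scene_token, self.timestamp = scene_token, timestamp
        return self.cached_feature, self.cached_anchor
```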

How did you set the score_threshold ?

Thank you for your great work.

How did you set the decoder's score_threshold used to extract outputs for the nuScenes test-set 3D detection table in the Sparse4D v2 paper?

The version of mmdet3d

Hello, I found that the required mmdet3d version is 0.17 (< 1.0), while the latest version is > 1.0 and includes some changes. Is the code compatible with the > 1.0 versions? (Based on my reading of the bbox-coder code, I think yes.) If not, what should I revise, e.g. the dataloader or the box coder?

Problem about tracking

Thank you for your great work!
I wonder whether you have released the code related to tracking, since `tracking` for NuScenes3DDetTrackDataset is False in all configs.

If you have released it, is there an example config?

How to get the annotation files?

Thanks for sharing your work! I don't know how to produce the annotation files nuscenes_infos_train.pkl, nuscenes_infos_val.pkl, and nuscenes_infos_test.pkl. Can you give me some guidance?

Code release!

Hi!
Are you planning to open-source your code?

Thanks

[About DenseDepthNet] How is the dense depth prediction supervised?

Hello, regarding the Sparse4Dv2 code:

  1. Why does the DenseDepthNet module use a regression task for supervision rather than the more common classification formulation?
  2. Why is the regression loss designed this way; why is the denominator len(gt) * len(depth_preds)?
     In my model, error = torch.abs(depth_preds - gt_depths).sum() is 338249.6469, while the denominator len(gt_depths) * len(depth_preds) is 30220 * 30220 = 913248400, so the final loss = 338249.6469 / 913248400 = 0.00037038077143086153. This value is very small, and becomes even smaller after multiplying by the 0.2 loss_weight. Is this depth_loss expected?
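To make the two normalizations concrete (reusing the N = 30220 figure from the question; the depth values themselves are made up):

```python
import torch

# Numeric sketch of the normalization in question: dividing the summed
# absolute error by N gives an ordinary mean L1 loss; dividing by N * N
# (len(gt_depths) * len(depth_preds) when both have length N) shrinks it
# by a further factor of N.
N = 30220
depth_preds = torch.full((N,), 10.5)
gt_depths = torch.full((N,), 10.0)

error = torch.abs(depth_preds - gt_depths).sum()   # 0.5 * N = 15110.0
mean_l1 = error / N                                # 0.5
squared_norm = error / (N * N)                     # ~1.65e-5
print(float(mean_l1), float(squared_norm))
```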

Thanks a lot!
