
Sparse4D's Issues

About mAOE in r50

Thank you for your outstanding work and valuable open-source contributions.
I have observed that Sparse4D's mAOE is slightly worse in the r50 configuration than recently released methods (StreamPETR, SparseBEV, and DynamicBEV), but it holds a clear advantage in the other configurations (r101, v2-99).
Do you have an explanation for this phenomenon? Looking forward to your reply.

How to generate nuscenes_infos_train.pkl and nuscenes_kmeans900.npy

Hi, I'm new to nuScenes. I'm curious how to generate nuscenes_infos_train.pkl, nuscenes_infos_val.pkl, etc. Is there any difference between them and the nuscenes2d_temporal_infos_train.pkl used in StreamPETR?

Btw, could you provide the script for generating nuscenes_kmeans900.npy? Thanks so much!
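For what it's worth, one plausible way such a file could be produced (an assumption about the intent, not the authors' actual script) is to run k-means with k = 900 over the ground-truth box centers from the training annotations and save the centroids:

```python
import numpy as np

# Hypothetical sketch (an assumption, not the authors' script): cluster
# GT box centers with k-means and save the k centroids as anchors.
def kmeans_anchors(centers, k=900, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centroids = centers[rng.choice(len(centers), k, replace=False)].copy()
    for _ in range(iters):
        # assign every GT center to its nearest centroid ...
        dists = np.linalg.norm(centers[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # ... then move each centroid to the mean of its members
        for j in range(k):
            members = centers[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

# np.save("nuscenes_kmeans900.npy", kmeans_anchors(gt_centers))
```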

About depth

Hello, thank you for your excellent work. I'd like to ask whether you have looked at the depth error of the boxes predicted by this algorithm. Also, now that v2 adds depth supervision, is the predicted depth more accurate?

What is `cls_threshold_to_reg` used for in Sparse4DHead?

Using it during training seems to drop all GTs and predictions, since the mask becomes essentially all zeros.

This parameter is set to 0.05 in the configs:

cls_threshold_to_reg=0.05,

But while running the training, this line causes problem:

mask = torch.logical_and(
    mask, cls.max(dim=-1).values.sigmoid() > threshold
)

Basically, all of the cls scores fall below the threshold, so this torch.logical_and gate sets every element of mask to 0.
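To illustrate with made-up numbers (the logits below are hypothetical): early in training the classification logits are typically strongly negative, so every sigmoid score falls under 0.05 and the gate zeroes the whole mask:

```python
import torch

# Hypothetical toy example: 2 predictions, 3 classes. Early-training
# logits are strongly negative, so every max sigmoid score is below the
# cls_threshold_to_reg of 0.05 and the mask is fully zeroed.
threshold = 0.05
cls = torch.tensor([[-4.0, -5.0, -6.0],
                    [-3.5, -4.5, -5.5]])
mask = torch.ones(2, dtype=torch.bool)

scores = cls.max(dim=-1).values.sigmoid()   # ~[0.018, 0.029]
mask = torch.logical_and(mask, scores > threshold)
print(mask)  # tensor([False, False])
```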

Initial anchors

What is the benefit of initializing the anchors from clustering? Could they be randomly initialized instead?

Tracking Pipeline of Sparse4Dv3

In Sparse4Dv3, is the tracking pipeline trained jointly with detection, or is it only used in inference to extend to tracking with Algorithm 1?

Welcome update to OpenMMLab 2.0

I am Vansin, a technical operator at OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 with MMEngine, which can be used for both research and commercial purposes. If you have any questions, feel free to join us on the OpenMMLab Discord at https://discord.gg/A9dCpjHPfE, or add me on WeChat (ID: van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

                   OpenMMLab 1.0 branch    OpenMMLab 2.0 branch
MMEngine           -                       0.x
MMCV               1.x                     2.x
MMDetection        0.x, 1.x, 2.x           3.x
MMAction2          0.x                     1.x
MMClassification   0.x                     1.x
MMSegmentation     0.x                     1.x
MMDetection3D      0.x                     1.x
MMEditing          0.x                     1.x
MMPose             0.x                     1.x
MMDeploy           0.x                     1.x
MMTracking         0.x                     1.x
MMOCR              0.x                     1.x
MMRazor            0.x                     1.x
MMSelfSup          0.x                     1.x
MMRotate           0.x                     1.x
MMYOLO             -                       0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Inquiry about the performance result of Sparse4D v1 on ResNet-50.

Dear authors,

Thanks for the excellent work at first.
I'd like to know the nuScenes val-set performance of Sparse4D v1 with ResNet-50, since many methods report ResNet-50 val-set results, but the paper only includes ResNet-101.

I'd greatly appreciate your reply.

How can we prepare nuscenes_infos_train.pkl ?

nuscenes_cam
├── nuscenes_infos_test.pkl
├── nuscenes_infos_train.pkl
├── nuscenes_infos_val.pkl
└── nuscenes_infos_trainval_with_inds.pkl

nuscenes_infos_trainval_with_inds.pkl is available at: https://github.com/linxuewu/Sparse4D/releases/download/v0.0/nuscenes_infos_trainval_with_inds.pkl
nuscenes_infos_train.pkl is available at: https://github.com/linxuewu/Sparse4D/releases/download/v0.0/nuscenes_infos_train.pkl. However, loading this pkl with mmcv raises an error:

File "/opt/miniconda3/envs/py38k/lib/python3.8/site-packages/mmcv/fileio/io.py", line 57, in load
    raise TypeError(f'Unsupported format: {file_format}')

The val pkl is not available at the corresponding URL.

Furthermore, can we prepare the pkl files ourselves?
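On the "Unsupported format" error: mmcv.load infers the format from the file extension, so one possible cause (an assumption; a corrupted download would raise differently) is a saved path without a recognized ".pkl" suffix. The stdlib pickle module ignores the filename entirely, and mmcv.load also accepts an explicit file_format="pkl":

```python
import os
import pickle
import tempfile

# Minimal stdlib sketch: round-trip a pickle through a path with no
# ".pkl" suffix, which extension-based dispatch would reject but plain
# pickle handles fine. The dict contents here are made up.
infos = {"infos": [], "metadata": {"version": "v1.0-trainval"}}

path = os.path.join(tempfile.mkdtemp(), "nuscenes_infos_demo")  # no ".pkl"
with open(path, "wb") as f:
    pickle.dump(infos, f)

with open(path, "rb") as f:     # extension-agnostic, unlike mmcv.load
    loaded = pickle.load(f)
os.remove(path)
```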

Temporal inference

Can we further improve the mAP or NDS on the val and test sets if we use temporal information during inference? Thanks!

Regarding ablation study for motion compensation

Hello, thank you for your great work.

I am writing this issue to ask about the ablation study conducted on ego motion and object motion.
I believe the ego motion compensation follows a process similar to BEVDet and BEVDepth, aligning the past frame t to the current frame t_0.

For object motion compensation, I think it follows Equation 4 in the paper.

And my question is: how is it possible to apply both ego and object motion compensation at the same time, as in the last row of Table 2?
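For concreteness, a minimal sketch of applying both compensations in sequence (my reading of the setup, not the authors' code):

```python
import numpy as np

# Sketch (my reading, not the authors' code): first warp past-frame
# centers into the current ego frame with the relative ego pose, then
# advance each center by its predicted velocity over the elapsed time dt
# (constant-velocity model, cf. Eq. 4 of the paper).
def compensate(centers, velocities, T_prev_to_cur, dt):
    homo = np.concatenate([centers, np.ones((len(centers), 1))], axis=1)
    centers = (homo @ T_prev_to_cur.T)[:, :3]   # ego motion compensation
    return centers + velocities * dt            # object motion compensation

# toy numbers: ego moved 1 m forward, object drives 2 m/s along x
T = np.eye(4)
T[0, 3] = -1.0
out = compensate(np.array([[10.0, 0.0, 0.0]]),
                 np.array([[2.0, 0.0, 0.0]]), T, dt=0.5)
print(out)  # object lands at x = 10 - 1 + 1 = 10
```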

Thank you for your reply in advance.

About MOT

Thank you for your great work! I'd like to ask a few questions.

Question 1: For the tracking task, a history query and a det query can detect the same object, right? According to the paper, they would be assigned two different track IDs. How is this resolved?

Question 2: In my experiments, since history queries and det queries can hit the same object, and self-attention suppresses duplicate boxes, the scores of the history queries end up uniformly low, and essentially all detections come from the current frame's queries. Did you run into this problem in your experiments?

About InfiniteGroupEachSampleInBatchSampler

My understanding is that InfiniteGroupEachSampleInBatchSampler runs a separate stream/clip in each batch slot. At the boundary between two clips, how is the first frame handled, given that it has no previous frame to stack with? And where during training is the first frame detected?

NaN loss and grad_norm in training

I tried training with projects/configs/sparse4dv2_r50_HInf_256x704.py and got NaN grad_norm and loss in every attempt. Is this a known issue, and is there a workaround?

Code Organization and Cleanup

The current state of the codebase is not well-organized and requires further cleanup. This issue aims to address the lack of clarity in the code structure and improve its overall cleanliness. While the code may still have some unresolved issues, your feedback is greatly appreciated, and I will make every effort to assist in resolving them. Rest assured, the code's cleanliness will continue to improve over time.

The sparse4dv2 model fails to converge

I trained the model using the unmodified code and data, expecting to reproduce sparse4d. However, sparse4dv2 failed to converge, and the grad_norm encountered NaN or Infinity, resulting in abnormal results. Do you have any suggestions on how to resolve this issue? Additionally, when can we expect the release of the Sparse4D-v3 code? I'm eagerly awaiting your response. Thank you!

Confusion about anchor whl and output whl.

Hello, in the paper and code, anchor[:, [w, h, l]] stores ln w, ln h, and ln l, but in the output part you simply add output[:, [w, h, l]] and anchor[:, [w, h, l]] to obtain the output anchor's ln w, ln h, and ln l. I think a better formula would be output[:, [w, h, l]] = ln(output[:, [w, h, l]].exp() + anchor[:, [w, h, l]].exp()).
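A quick numeric check of the two readings (toy numbers; the variable names are mine):

```python
import math

anchor_logw = math.log(4.0)   # anchor stores ln(width), width = 4 m
delta = 0.2                   # network output for the log-width channel

# Repo's formulation: add in log space, so the head predicts a
# multiplicative scale relative to the anchor, as in the classic
# delta_w = ln(w / w_a) box parameterization.
decoded = math.exp(delta + anchor_logw)   # 4 * e^0.2 ~= 4.886

# Proposed formulation: add the widths themselves before taking the log,
# which changes the prediction from a relative scale to an absolute size
# increment.
proposed = math.exp(delta) + math.exp(anchor_logw)   # ~= 5.221

print(decoded, proposed)
```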

The case where all class scores are lower than self.cls_threshold_to_reg.

Dear author @linxuewu ,
This is really nice work! However, when training "sparse4d_r101_H1" I always hit a bug at these lines of code.

if self.cls_threshold_to_reg > 0:
    threshold = self.cls_threshold_to_reg
    mask = torch.logical_and(
        mask, cls.max(dim=-1).values.sigmoid() > threshold
    )
cls = cls.flatten(end_dim=1)
cls_target = cls_target.flatten(end_dim=1)
cls_loss = self.loss_cls(cls, cls_target, avg_factor=num_pos)
mask = mask.reshape(-1)
reg_weights = reg_weights * reg.new_tensor(self.reg_weights)
reg_target = reg_target.flatten(end_dim=1)[mask]
reg = reg.flatten(end_dim=1)[mask]
reg_weights = reg_weights.flatten(end_dim=1)[mask]
reg_target = torch.where(
    reg_target.isnan(), reg.new_tensor(0.0), reg_target
)
reg_loss = self.loss_reg(
    reg, reg_target, weight=reg_weights, avg_factor=num_pos
)

Here, the resulting mask is all False, which leaves reg_target empty.
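One possible workaround (an assumption on my side, not a confirmed fix): short-circuit the regression branch when the mask selects nothing, so the empty tensors never reach loss_reg:

```python
import torch

# Workaround sketch (my assumption, not a confirmed fix): when no
# prediction passes cls_threshold_to_reg, return a zero loss that stays
# connected to the graph instead of feeding loss_reg empty tensors.
def masked_reg_loss(reg, reg_target, reg_weights, mask, loss_reg, num_pos):
    mask = mask.reshape(-1)
    if not mask.any():
        return reg.sum() * 0.0      # empty mask: zero loss, valid gradient
    reg = reg.flatten(end_dim=1)[mask]
    reg_target = reg_target.flatten(end_dim=1)[mask]
    reg_weights = reg_weights.flatten(end_dim=1)[mask]
    return loss_reg(reg, reg_target, weight=reg_weights, avg_factor=num_pos)
```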

How is the temporal order of inputs controlled during training?

Thank you for your great work. After reading the code, I have a question: in the v2 version, during training, the instance_bank caches the current features and anchors across training iterations. The training set is partitioned by scene, so the data across the whole dataset is not contiguous. How do you guarantee that the cached features and anchors fetched from the instance_bank are consistent with the data in the current batch?

Thanks!!!
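For what it's worth, my reading of the usual pattern (an assumption about the intent, not the repo's exact code) is that the bank keys its cache on scene metadata and resets at clip boundaries:

```python
# Sketch of the usual pattern (an assumption, not the repo's exact code):
# compare the incoming batch's metadata with the cached frame and reset
# the cache whenever the scene changes or time does not advance, so stale
# features never cross a sequence boundary.
class InstanceBankSketch:
    def __init__(self):
        self.cached_feature = None
        self.cached_anchor = None
        self.scene_token = None
        self.timestamp = None

    def get(self, scene_token, timestamp):
        same_scene = scene_token == self.scene_token
        forward = (self.timestamp is not None
                   and 0 < timestamp - self.timestamp < 2.0)
        if not (same_scene and forward):
            self.cached_feature = None   # first frame of a new clip
            self.cached_anchor = None
        self.scene_token, self.timestamp = scene_token, timestamp
        return self.cached_feature, self.cached_anchor
```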

How did you set the score_threshold ?

Thank you for your great work.

How did you set the decoder's score_threshold used to extract outputs for the nuScenes test-set 3D detection table in the Sparse4D v2 paper?

The version of mmdet3d

Hello, I found that the required mmdet3d version is 0.17 (< 1.0), while the latest version is > 1.0 and includes some changes. Is the code compatible with the > 1.0 versions? (Based on my reading of the bbox-coder code, I think yes.) If not, what should I revise, e.g. the dataloader or the box coder?

Problem about tracking

Thank you for your great work!
I wonder whether you have released the code related to tracking, since `tracking` for NuScenes3DDetTrackDataset is False in all configs.

If you have released it, is there an example config?

How to get the annotation files?

Thanks for sharing your work! I don't know how to produce the annotation files nuscenes_infos_train.pkl, nuscenes_infos_val.pkl, and nuscenes_infos_test.pkl. Can you give me some guidance?

Code release!

Hi!
Are you planning to open-source your code?

Thanks

[About DenseDepthNet] How is the dense depth prediction supervised?

Hello, regarding the Sparse4Dv2 code:

  1. Why does the DenseDepthNet module use a regression task for supervision rather than the more common classification formulation?
  2. Why is the regression loss designed this way; why is the denominator len(gt) * len(depth_preds)?
     In my model, error = torch.abs(depth_preds - gt_depths).sum() is 338249.6469, while the denominator len(gt_depths) * len(depth_preds) is 30220 * 30220 = 913248400, so the final loss = 338249.6469 / 913248400 = 0.00037038077143086153. This value is very small, and becomes even smaller after multiplying by the 0.2 loss_weight. Is this depth_loss expected?
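To make the two normalizations concrete (reusing the N = 30220 figure from the question; the depth values themselves are made up):

```python
import torch

# Numeric sketch of the normalization in question: dividing the summed
# absolute error by N gives an ordinary mean L1 loss; dividing by N * N
# (len(gt_depths) * len(depth_preds) when both have length N) shrinks it
# by a further factor of N.
N = 30220
depth_preds = torch.full((N,), 10.5)
gt_depths = torch.full((N,), 10.0)

error = torch.abs(depth_preds - gt_depths).sum()   # 0.5 * N = 15110.0
mean_l1 = error / N                                # 0.5
squared_norm = error / (N * N)                     # ~1.65e-5
print(float(mean_l1), float(squared_norm))
```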

Thanks a lot!
