zikangzhou / hivt
[CVPR 2022] HiVT: Hierarchical Vector Transformer for Multi-Agent Motion Prediction
License: Apache License 2.0
Thanks for contributing such amazing work!
Just a question, when we compute the cross-attention for the center agent and its neighbor agents, why do we index the edge_index[1] as rotate_mat for x_j (the neighbor agents) rather than edge_index[0]? As far as I know, the edge_index[0] represents the source, i.e., the center agent, and the edge_index[1] represents the target, i.e., the neighbor agents. Here we want to rotate the neighbor agents according to the center agent angles \theta. Thus, I think rotate_mat[edge_index[0]] is the rotate_mat parametrized by the center agent angle \theta, which is used to rotate neighbor agents.
Line 184 in 6876656
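For reference, PyTorch Geometric's default message flow is `source_to_target`: `edge_index[0]` holds the source nodes (gathered as `x_j`) and `edge_index[1]` holds the target nodes (gathered as `x_i`), and messages are aggregated at the targets. If the layer in question uses that default flow (an assumption here), the aggregating node, i.e. the center agent, is the one indexed by `edge_index[1]`. The pure-Python sketch below mimics that gather/scatter pattern; the `propagate_sum` helper and the toy values are hypothetical and are not part of the HiVT code.

```python
def propagate_sum(x, edge_index):
    """Sum-aggregate messages with source_to_target flow.

    edge_index[0][e] is the source j (sender) of edge e,
    edge_index[1][e] is the target i (receiver, where aggregation happens).
    """
    src, dst = edge_index
    out = [0.0] * len(x)
    for j, i in zip(src, dst):
        out[i] += x[j]  # message x_j flows from node j to node i
    return out


x = [1.0, 10.0, 100.0]            # one scalar feature per node
edge_index = [[1, 2], [0, 0]]     # nodes 1 and 2 both send to node 0
out = propagate_sum(x, edge_index)
print(out)  # [110.0, 0.0, 0.0]
```

Node 0 is the only receiver in this toy graph, so it plays the role of the center agent and appears in `edge_index[1]`.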
Hello Dr. Zhou Zikang, thank you for your outstanding code contribution. Could you tell me why the evaluation metric code keeps raising this error?
Where the error occurs:
class ADE(Metric):
    def __init__(self,
                 compute_on_step: bool = True,
                 dist_sync_on_step: bool = False,
                 process_group: Optional[Any] = None,
                 dist_sync_fn: Callable = None) -> None:
        super(ADE, self).__init__(compute_on_step=compute_on_step, dist_sync_on_step=dist_sync_on_step,
                                  process_group=process_group, dist_sync_fn=dist_sync_fn)
        self.add_state('sum', default=torch.tensor(0.0), dist_reduce_fx='sum')
        self.add_state('count', default=torch.tensor(0), dist_reduce_fx='sum')
Error:
Global seed set to 2022
/home/amax/anaconda3/envs/haha/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:466: LightningDeprecationWarning: Setting Trainer(gpus=1)
is deprecated in v1.7 and will be removed in v2.0. Please use Trainer(accelerator='gpu', devices=1)
instead.
rank_zero_deprecation(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Traceback (most recent call last):
File "/data/project/HiVT-main/train.py", line 44, in <module>
model = HiVT(**vars(args))
File "/data/project/HiVT-main/models/hivt.py", line 86, in __init__
self.minADE = ADE()
File "/data/project/HiVT-main/metrics/ade.py", line 27, in __init__
super(ADE, self).__init__(compute_on_step=compute_on_step, dist_sync_on_step=dist_sync_on_step,
File "/home/amax/anaconda3/envs/haha/lib/python3.9/site-packages/torchmetrics/metric.py", line 145, in __init__
raise ValueError(f"Unexpected keyword arguments: {', '.join(kwargs_)}")
ValueError: Unexpected keyword arguments: compute_on_step
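The ValueError comes from the base `Metric` constructor, which rejects keyword arguments it does not recognize; the installed torchmetrics evidently does not accept `compute_on_step`. One possible workaround is to stop forwarding that argument (another would be to install the torchmetrics version the repository was developed against). The sketch below shows the first option. To stay runnable without torchmetrics it uses a hypothetical stand-in `Metric` class that only mimics the keyword check seen in the traceback; it is not the real library class.

```python
from typing import Any, Callable, Optional


class Metric:
    """Hypothetical stand-in that mimics the kwarg check from the traceback."""
    _ALLOWED = {"dist_sync_on_step", "process_group", "dist_sync_fn"}

    def __init__(self, **kwargs: Any) -> None:
        unexpected = [k for k in kwargs if k not in self._ALLOWED]
        if unexpected:
            raise ValueError(f"Unexpected keyword arguments: {', '.join(unexpected)}")
        self.kwargs = kwargs


class ADE(Metric):
    def __init__(self,
                 dist_sync_on_step: bool = False,
                 process_group: Optional[Any] = None,
                 dist_sync_fn: Optional[Callable] = None) -> None:
        # compute_on_step is no longer accepted or forwarded to the base class.
        super().__init__(dist_sync_on_step=dist_sync_on_step,
                         process_group=process_group,
                         dist_sync_fn=dist_sync_fn)


metric = ADE()  # constructs without raising

# Forwarding the old argument reproduces the error from the traceback.
try:
    Metric(compute_on_step=True)
    old_style_error = None
except ValueError as exc:
    old_style_error = str(exc)
```

The same change would need to be applied to every metric class that forwards `compute_on_step`.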
Hello, thank you for your amazing work.
I have a question after reading your code. The set of points with Euclidean distance less than the radius is contained in the set of points with Huffman distance less than the radius.
If we only calculate the Huffman distance, will it be faster?
Yours sincerely
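A quick check of the containment claim, assuming "Huffman distance" refers to the Manhattan (L1) distance: because max(|dx|, |dy|) <= sqrt(dx^2 + dy^2) <= |dx| + |dy|, the L1 ball lies inside the Euclidean ball, which in turn lies inside the axis-aligned square (the Chebyshev, or L-infinity, ball). An L1 test alone would therefore drop some neighbors that a Euclidean radius keeps, whereas a bounding-box test keeps every Euclidean neighbor and could serve as a cheap prefilter. The points below are made up for illustration.

```python
import math

radius = 1.0
points = [(0.6, 0.6), (0.9, 0.9), (0.3, 0.2)]  # offsets (dx, dy) from the center

# Manhattan (L1), Euclidean (L2) and Chebyshev (L-infinity) neighborhoods.
within_l1 = [p for p in points if abs(p[0]) + abs(p[1]) < radius]
within_l2 = [p for p in points if math.hypot(p[0], p[1]) < radius]
within_linf = [p for p in points if max(abs(p[0]), abs(p[1])) < radius]

print(within_l1)    # [(0.3, 0.2)]
print(within_l2)    # [(0.6, 0.6), (0.3, 0.2)]
print(within_linf)  # [(0.6, 0.6), (0.9, 0.9), (0.3, 0.2)]
```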
Hello, Dr. Zhou! Thank you for sharing your elegant code. But I have a question:
It seems that you center the scene at the autonomous vehicle in the code:
# make the scene centered at AV
origin = torch.tensor([av_df[19]['X'], av_df[19]['Y']], dtype=torch.float)
av_heading_vector = origin - torch.tensor([av_df[18]['X'], av_df[18]['Y']], dtype=torch.float)
theta = torch.atan2(av_heading_vector[1], av_heading_vector[0])
rotate_mat = torch.tensor([[torch.cos(theta), -torch.sin(theta)],
[torch.sin(theta), torch.cos(theta)]])
and express the coordinates of the other vehicles relative to that origin.
But according to your paper, shouldn't the scenes be centered at the individual agents? Where is the code that corresponds to this part?
Thank you in advance!
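As a sanity check on the snippet above: the matrix [[cos θ, -sin θ], [sin θ, cos θ]], when applied by right-multiplying row vectors (an assumption here, since the multiplication itself is not shown in the excerpt), rotates points by -θ and so maps the heading direction onto the +x axis. The same recipe could in principle be repeated with any agent's own position and heading to get an agent-centric frame. The `to_local` helper and the coordinates are hypothetical.

```python
import math


def to_local(point, origin, theta):
    """Translate by -origin, then right-multiply the row vector by
    [[cos, -sin], [sin, cos]], which rotates the point by -theta."""
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    c, s = math.cos(theta), math.sin(theta)
    return (dx * c + dy * s, -dx * s + dy * c)


prev = (1.0, 1.0)     # position at the previous time step
origin = (2.0, 2.0)   # position at the last observed time step
theta = math.atan2(origin[1] - prev[1], origin[0] - prev[0])

# A point one step further along the heading lands on the local +x axis.
local = to_local((3.0, 3.0), origin, theta)
```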
Hi, @ZikangZhou
Thank you for sharing your great work. I just have a question regarding the decoder modeling. In your paper, you model the output as a Laplacian mixture model (LMM) instead of a Gaussian one. However, a GMM seems to be the more common choice. Is there any specific reason for choosing an LMM over a GMM? Is it for training stability?
I am looking forward to your reply. Thank you in advance.
Best,
Hello,
Thank you for the great work!
Since the model is geared towards predicting the future trajectory of multiple objects in the scene, I ran the pre-trained model to measure the minADE and minFDE for all objects, rather than just the agent object type. On the Argoverse 1.1 validation set, I got the following results, when using a batch size of 1:
DATALOADER:0 VALIDATE RESULTS
{'all_actors_minADE': 1.7695937156677246,
'all_actors_minFDE': 3.879427433013916,
'val_minADE': 0.6611008644104004,
'val_minFDE': 0.9691500067710876,
'val_minMR': 0.09206525981426239,
'val_reg_loss': -0.30943363904953003}
As you can see, there's a large discrepancy between the agent-only metrics and the all_actors metrics.
One reason could be that pedestrians and bikes have distinct behavioral patterns that differ from vehicles. Is there a way to mitigate this within the model, or should HiVT be considered a vehicle-centric model?
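For readers comparing the numbers above, here is a minimal sketch of how minADE and minFDE over K predicted modes are commonly computed for a single actor. It takes the two minima independently, which is an assumption; some evaluation code instead reports the ADE of the mode with the best final displacement. The helper names and toy trajectories are hypothetical.

```python
import math


def displacement_errors(pred, gt):
    """Per-step Euclidean distance between one predicted mode and ground truth."""
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]


def min_ade_fde(modes, gt):
    """Return (minADE, minFDE) over all predicted modes for one actor."""
    ades, fdes = [], []
    for pred in modes:
        errs = displacement_errors(pred, gt)
        ades.append(sum(errs) / len(errs))  # average over time steps
        fdes.append(errs[-1])               # error at the final step
    return min(ades), min(fdes)


gt = [(1.0, 0.0), (2.0, 0.0)]
modes = [
    [(1.0, 0.0), (2.0, 2.0)],  # step errors 0 and 2
    [(1.0, 1.0), (2.0, 0.0)],  # step errors 1 and 0
]
min_ade, min_fde = min_ade_fde(modes, gt)
print(min_ade, min_fde)  # 0.5 0.0
```

An all-actors variant would average these per-actor values over every actor with a valid future, rather than over the single labeled agent per scene.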
Very interesting work. When is the code going to be available?
Hi,
Could you tell me why the attention in your code is computed with a Hadamard product followed by a sum over the last dimension, instead of a dot-product operation?
Thank you so much!
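Multiplying element-wise and then summing over the last dimension is itself a dot product, evaluated once per (query, key) pair. In a graph attention layer each edge pairs one query with one key, so only one score per edge is needed rather than a dense all-pairs matrix. The toy values and names below are hypothetical.

```python
queries = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]    # one query vector per edge
keys = [[0.5, 0.5], [2.0, 4.0], [1.0, -1.0]]      # one key vector per edge

# Hadamard (element-wise) product per edge ...
hadamard = [[q_d * k_d for q_d, k_d in zip(q, k)] for q, k in zip(queries, keys)]
# ... followed by a sum over the last (feature) dimension.
scores = [sum(row) for row in hadamard]


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


per_edge_dot = [dot(q, k) for q, k in zip(queries, keys)]
print(scores)  # [1.5, 4.0, 2.0]
```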
I wrote test code to generate test results, but the FDE (K=6) of the pretrained model shows a decrease of about 0.1 compared to HiVT-128 on the leaderboard. Could you share your test.py? Thank you.
Hi,
Thanks a lot for your wonderful code. Would you please release the visualization code for qualitative result analysis?
Thanks again.
Excuse me, when I ran the following command, I got an error.
conda install pytorch==1.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
The error message is as follows.
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
Current channels:
Even after I changed my CUDA version to 11.0, I still couldn't solve the problem.
Thanks!!
Hi! Thanks for sharing your code!
When I tried to retrain the model, I found that the reg loss gradually dropped to negative values, while the cls loss first went down, then rose rapidly, and finally stayed around 1.75. Although the minADE metric keeps declining, does the cls loss indicate a training abnormality?
Thank you !
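On the negative regression loss: assuming the regression term is a Laplace negative log-likelihood (in line with the Laplacian mixture discussed elsewhere on this page), negative values are not abnormal in themselves. The loss is the negative log of a probability density, and a density exceeds 1 when the predicted scale is small, so its negative log drops below zero. A minimal sketch with made-up numbers:

```python
import math


def laplace_nll(target, loc, scale):
    """Negative log-likelihood of a univariate Laplace distribution:
    -log( exp(-|target - loc| / scale) / (2 * scale) )."""
    return math.log(2.0 * scale) + abs(target - loc) / scale


# Small error with a small predicted scale: density > 1, so the NLL is negative.
confident = laplace_nll(target=1.0, loc=1.01, scale=0.1)
# Same error with a large predicted scale: density < 1, so the NLL is positive.
uncertain = laplace_nll(target=1.0, loc=1.01, scale=2.0)

print(confident < 0 < uncertain)  # True
```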
Thanks for your code.
Have you tested your model's performance on the Waymo motion dataset? If so, would you please release the relevant code?
Hi @ZikangZhou
Could you describe the training recipe used for the submission?
I got the test results below with your model, but there seems to be quite a large gap compared to the results you reported.
minFDE (K=6): 1.277136966106996
MR (K=6): 0.14637267573551055
minADE (K=6): 0.8215039475137975
As it is hard for me to reach your result by just increasing the training epochs, it will be very helpful if you can provide some other hints.
I am looking forward to your reply!
Hello, @ZikangZhou. Thanks for your code!
How much is the performance gap between the models using GRUDecoder and MLPDecoder?
Do you have any idea why GRUDecoder leads to lower performance?
I am looking forward to your reply!
Could you provide the leaderboard submission code? I tested the checkpoint and submitted it to the leaderboard, but got some errors.
I masked the other vehicles so that only one moving vehicle was left in each scene, and then tried to predict its trajectory. This left edge_index as an empty list, so the model could not continue. What would be a good solution?
Thanks for your code.
Have you tested your model's performance on the Waymo motion dataset?
I got a really bad result, with a minADE above 4 and an FDE above 10.
I am looking forward to your reply. Thank you in advance.
Thank you so much for sharing the code.
I would like to use the result visualization shown on the GitHub page, but it seems the visualization code has not been shared. Could you share it?
Thank U!
Hi Zikang,
I noticed you used the bit-wise NOT operation "~" in the argoverse v1 dataset to process padding_mask. You set padding_mask to True or False and use "~padding_mask" to calculate the opposite. However, from my understanding, "~" is used for bit-wise NOT, which means ~True=-2 and ~False=-1, both ~True and ~False are True because they are not 0. I'm kind of confused by this, and I guess you meant to use not padding_mask or 1-padding_mask. Could you please clarify it? Thanks!
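The premise holds for Python's built-in `bool`, where `~` acts on the underlying integer. For `torch.bool` tensors, however, `~` is an element-wise logical NOT, so `~padding_mask` flips each entry. Tensor libraries get that behavior by overloading `__invert__`. The `BoolMask` class below is a hypothetical stand-in that mimics this without requiring PyTorch.

```python
class BoolMask:
    """Hypothetical stand-in for a boolean tensor with an overloaded `~`."""

    def __init__(self, values):
        self.values = [bool(v) for v in values]

    def __invert__(self):
        # Element-wise logical NOT, as boolean tensors define it.
        return BoolMask(not v for v in self.values)


# On a plain Python bool, `~` works on the integer value 1 and yields -2.
# (`~True` gives the same value but is deprecated on recent Python versions.)
python_result = ~int(True)

padding_mask = BoolMask([True, False, True])
flipped = (~padding_mask).values

print(python_result)  # -2
print(flipped)        # [False, True, False]
```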
Hi, I am still confused about the environment.
I followed the instructions:
conda create -n HiVT python=3.8
conda activate HiVT
conda install pytorch==1.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install pytorch-geometric==1.7.2 -c rusty1s -c conda-forge
conda install pytorch-lightning==1.5.2 -c conda-forge
and got an error:
ModuleNotFoundError: No module named 'torch_geometric.data.storage'
It seems that the PyG should be greater than 2.0.x (see this link)
However, after upgrading PyG to 2.0.1, a new error appeared:
TypeError: inc() takes 3 positional arguments but 4 were given
Do you have any ideas? Many thanks!
I followed the instructions of this repo:
conda create -n HiVT python=3.8
conda activate HiVT
conda install pytorch==1.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install pytorch-geometric==1.7.2 -c rusty1s -c conda-forge
conda install pytorch-lightning==1.5.2 -c conda-forge
and then I ran:
python eval.py --root ../datasets/ --batch_size 32 --ckpt_path checkpoints/HiVT-64/checkpoints/epoch\=63-step\=411903.ckpt
It always shows:
Original Traceback (most recent call last):
File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 193, in __getitem__
data = self.get(self.indices()[idx])
File "/home/cunjun/wuhr/HiVT/datasets/argoverse_v1_dataset.py", line 87, in get
return torch.load(self.processed_paths[idx])
File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/serialization.py", line 712, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/serialization.py", line 1049, in _load
result = unpickler.load()
File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/serialization.py", line 1042, in find_class
return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'DataEdgeAttr' on <module 'torch_geometric.data.data' from '/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch_geometric/data/data.py'>
I have been stuck here for a while and don't know how to solve it. I hope you can help.
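An AttributeError of the form "Can't get attribute ... on <module ...>" during `torch.load` generally means the file was pickled while a class existed in that module, and is being loaded in an environment where the module lacks it. One plausible reading here (an assumption, not confirmed in the thread) is that the processed dataset files were written under a different PyG version than the one now installed; regenerating them with the current environment would be the first thing to try. The sketch below reproduces the failure mode with the standard `pickle` module and a hypothetical `fake_lib` module.

```python
import pickle
import sys
import types

# Build a throwaway module that, for now, defines the class.
fake_lib = types.ModuleType("fake_lib")
fake_lib.DataEdgeAttr = type("DataEdgeAttr", (), {"__module__": "fake_lib"})
sys.modules["fake_lib"] = fake_lib

# "Processed file" written while the class exists.
blob = pickle.dumps(fake_lib.DataEdgeAttr())

# Simulate loading with a library version that lacks the class.
del fake_lib.DataEdgeAttr

try:
    pickle.loads(blob)
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```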
Hi @ZikangZhou,
I have a question regarding the temporal encoder layer. When attention is computed there, all the agents in the batch appear to be processed together rather than sample by sample. Does this fuse information across the samples in a batch? Thank you in advance.
Best,
Hello, first of all, thank you for your excellent work and dedication in making it openly available for the community to use.
I was wondering how I could adapt your code so that it can be evaluated in a version without a map, as they claim in the following paper. They have compared HiVT with your model in a way that doesn't use a map.
Thanks in advance
Thank you for sharing the excellent code!
How can I train HiVT starting from a saved checkpoint?
For example, I want to fine-tune the model.
Thank you very much for your work. I've been doing related work recently, can you please provide a visualization of the Qualitative Results section? I would be very grateful if you could!
Hi @ZikangZhou,
To convert the output y_hat back to the original coordinate system, we only need the rotation matrix (built from av_theta) and the AV origin, right?
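A minimal sketch of the inverse transform, assuming the forward transform is q = (p - origin) @ R with R = [[cos θ, -sin θ], [sin θ, cos θ]] applied to row vectors. The inverse is then p = q @ R^T + origin. If the predictions are additionally expressed relative to each agent's own heading and last observed position, that per-agent transform would have to be undone first. The `to_global` helper and the numbers are hypothetical.

```python
import math


def to_global(point_local, origin, theta):
    """Invert q = (p - origin) @ R with R = [[cos, -sin], [sin, cos]]:
    p = q @ R^T + origin."""
    qx, qy = point_local
    c, s = math.cos(theta), math.sin(theta)
    return (qx * c - qy * s + origin[0], qx * s + qy * c + origin[1])


origin = (2.0, 2.0)
theta = math.pi / 4

# A point sqrt(2) ahead on the local +x axis maps back to (3, 3) globally.
restored = to_global((math.sqrt(2.0), 0.0), origin, theta)
```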
Hi.
I'm interested in your research, so I have been studying it closely.
When I start the training code, GPU utilization is high at first because data comes in smoothly through torch.load. As the steps progress, though, the loading time of torch.load grows, GPU utilization drops to 0, and a bottleneck occurs. Have you ever had a problem like this?
These are the estimated time required and the GPU utilization before the bottleneck.
Thank U!
Hi, Dr. Zhou,
Thanks for your great work.
What do bos_mask and bos_token mean, and what do they do? The abbreviations are hard to interpret.
HiVT/datasets/argoverse_v1_dataset.py
Line 121 in 6876656
Line 136 in 6876656
Thanks for your code!
I noticed that you do not log the classification loss. After adding logging for it, I found that the classification loss does not converge, which hurts top-1 performance.
Do you have any suggestions about the non-convergence of the classification loss?
Looking forward to your reply~
I would like to express my appreciation for your work on the HiVT project. I have been using HiVT for my research, and it has been a valuable tool for my experiments.
I am currently evaluating HiVT on the Argoverse test set. However, I cannot load the test set data correctly. I always get this error:
HiVT2/models/hivt.py", line 129, in validation_step
l2_norm = (torch.norm(y_hat[:, :, :, : 2] - data.y, p=2, dim=-1) * reg_mask).sum(dim=-1) # [F, N]
TypeError: unsupported operand type(s) for -: 'Tensor' and 'NoneType'
I would greatly appreciate it if you could provide some guidance or assistance.
Hi,
Thanks for your excellent work. Could you kindly tell me how to extract the feature of a single vehicle from the outputs of the local encoder and global interaction modules? I am new to pytorch_lightning, so I am a little confused about it.
Best,
Joe
Hello! How can I train on a GPU, or on multiple GPUs? By default, training runs on the CPU rather than on card 0.
Thank U!
Hi @ZikangZhou,
In your paper, you mention using the semantic attributes of an agent as input to the agent embedding. However, these semantic attributes are not reflected in your code. Could you please share how you obtain them? As far as I know, Argoverse agents do not have such semantic attributes. Are you referring to OBJECT_TYPE in the CSV file?
Thank you in advance.
Best,
Is the number of local regions N the same as the number of agents (target + context) in Argoverse? If so, would you mind pointing to the code where the local data in each region is computed?
There is a single agent to predict in Argoverse. Are you trying to predict all the agents, including the context agents, with your model and compute the joint loss over all agents?
Thanks!
In "global_interactor.py", I cannot find the definition of 'self.propagate', yet it is used in "_mha_block". Where is it defined?
Hello,
Firstly, thanks for your nice work on trajectory prediction. I noticed that in the validation step you choose the best trajectory by comparing against the ground-truth trajectory. In the test step, however, we normally don't know the ground truth. Could you kindly tell me how to choose among the candidate trajectories at inference time in HiVT?
Best,
Joe
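For multi-modal predictors that output a confidence score per mode alongside the K trajectories, a common approach at inference is to keep all K candidates and rank them by those scores; the ground truth is used only when computing metrics. The sketch below assumes the model exposes one raw score per mode (called `logits` here, a hypothetical name) and turns the scores into probabilities with a softmax.

```python
import math


def rank_modes(logits):
    """Return mode indices sorted by descending probability, plus the probabilities."""
    m = max(logits)
    exps = [math.exp(value - m) for value in logits]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda k: probs[k], reverse=True)
    return order, probs


order, probs = rank_modes([0.1, 2.0, -1.0])
best_mode = order[0]
print(order)  # [1, 0, 2]
```

The top-ranked mode would serve as the single (top-1) prediction, and the ordered list as the K-candidate submission.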
Hi @ZikangZhou, when calculating regression loss, you seem to model x and y as independent univariate Laplace distributions if I understand correctly based on your code. I wonder what would be the intuition behind such a choice versus considering x & y jointly as a multivariate Laplace distribution? Thank you very much for your comments in advance!
When will the paper be available to read?
Hello,
First of all, thanks for your excellent work!
I wonder if there is any visualization code or tool for the local regions.
I know that argoverse-api provides a visualization tool for whole scenarios, but I want to view only the local region that the local encoder uses for training or inference.
Thank you :)
Thank you for your excellent work! I am dealing with some constraints on my company's server. It's isolated from the external internet, but we do have a well-maintained internal PyPI site. Hence, I can't use 'conda install' for dependencies. Could you provide a requirements.txt file to facilitate pip installations from our internal PyPI?