zikangzhou / hivt

[CVPR 2022] HiVT: Hierarchical Vector Transformer for Multi-Agent Motion Prediction

Home Page: https://openaccess.thecvf.com/content/CVPR2022/papers/Zhou_HiVT_Hierarchical_Vector_Transformer_for_Multi-Agent_Motion_Prediction_CVPR_2022_paper.pdf

License: Apache License 2.0

Python 100.00%

autonomous-driving motion-prediction transformer cvpr2022

hivt's Issues

A question about AAEncoder

Thanks for contributing such amazing work!
Just a question: when we compute the cross-attention between the center agent and its neighbor agents, why do we index rotate_mat with edge_index[1] for x_j (the neighbor agents) rather than with edge_index[0]? As far as I know, edge_index[0] represents the source, i.e., the center agent, and edge_index[1] represents the target, i.e., the neighbor agents. Here we want to rotate the neighbor agents according to the center agent's angle \theta. Thus, I think rotate_mat[edge_index[0]] is the rotation matrix parametrized by the center agent's angle \theta, which should be used to rotate the neighbor agents.

HiVT/models/local_encoder.py

Line 184 in 6876656

center_rotate_mat = rotate_mat[edge_index[1]]
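
For context, here is a minimal sketch of the indexing convention, assuming the layer uses PyTorch Geometric's default `flow='source_to_target'` (an assumption about the encoder, not confirmed in this issue). Under that default, `edge_index[0]` holds the source nodes j (the neighbors that send messages) and `edge_index[1]` holds the target nodes i (the center agents that aggregate them), so indexing by `edge_index[1]` would select the center agent's matrix. The plain-Python lists and the `gather` helper are illustrative stand-ins for tensors.

```python
# Toy graph: agents 1 and 2 are neighbors of center agent 0.
# Row 0 = source nodes j (neighbors), row 1 = target nodes i (centers),
# following the assumed default flow="source_to_target".
edge_index = [
    [1, 2],  # sources j
    [0, 0],  # targets i
]

# One label per agent, standing in for the per-agent rotation matrices.
rotate_mat = ["R(theta_0)", "R(theta_1)", "R(theta_2)"]


def gather(values, index):
    """Pick values[k] for every k in index (like tensor fancy indexing)."""
    return [values[k] for k in index]


center_rotate_mat = gather(rotate_mat, edge_index[1])
neighbor_rotate_mat = gather(rotate_mat, edge_index[0])

print(center_rotate_mat)    # each edge uses the center agent's frame
print(neighbor_rotate_mat)  # what indexing by the source row would select
```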

Evaluation metric error

Hello, Dr. Zikang Zhou, thank you for your outstanding code contribution. Could you tell me why the evaluation metrics keep raising the error below?
Where the error is raised:

class ADE(Metric):

    def __init__(self,
                 compute_on_step: bool = True,
                 dist_sync_on_step: bool = False,
                 process_group: Optional[Any] = None,
                 dist_sync_fn: Callable = None) -> None:
        super(ADE, self).__init__(compute_on_step=compute_on_step, dist_sync_on_step=dist_sync_on_step,
                                  process_group=process_group, dist_sync_fn=dist_sync_fn)
        self.add_state('sum', default=torch.tensor(0.0), dist_reduce_fx='sum')
        self.add_state('count', default=torch.tensor(0), dist_reduce_fx='sum')

Error:
Global seed set to 2022
/home/amax/anaconda3/envs/haha/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:466: LightningDeprecationWarning: Setting Trainer(gpus=1) is deprecated in v1.7 and will be removed in v2.0. Please use Trainer(accelerator='gpu', devices=1) instead.
rank_zero_deprecation(
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Traceback (most recent call last):
  File "/data/project/HiVT-main/train.py", line 44, in <module>
    model = HiVT(**vars(args))
  File "/data/project/HiVT-main/models/hivt.py", line 86, in __init__
    self.minADE = ADE()
  File "/data/project/HiVT-main/metrics/ade.py", line 27, in __init__
    super(ADE, self).__init__(compute_on_step=compute_on_step, dist_sync_on_step=dist_sync_on_step,
  File "/home/amax/anaconda3/envs/haha/lib/python3.9/site-packages/torchmetrics/metric.py", line 145, in __init__
    raise ValueError(f"Unexpected keyword arguments: {', '.join(kwargs_)}")
ValueError: Unexpected keyword arguments: compute_on_step
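
The traceback shows the parent constructor rejecting `compute_on_step`, which suggests that the installed torchmetrics release no longer accepts that argument. One possible workaround is to stop forwarding it. The sketch below reproduces the pattern with a stand-in `Metric` base class (hypothetical, written only for this illustration; it is not the torchmetrics class) so that it runs without any dependencies.

```python
from typing import Any


class Metric:
    """Stand-in base class that rejects unknown keyword arguments."""

    _allowed = {"dist_sync_on_step", "process_group", "dist_sync_fn"}

    def __init__(self, **kwargs: Any) -> None:
        unexpected = [k for k in kwargs if k not in self._allowed]
        if unexpected:
            raise ValueError(f"Unexpected keyword arguments: {', '.join(unexpected)}")
        self.options = kwargs


class ADE(Metric):
    """Subclass that drops the argument the base class no longer accepts."""

    def __init__(self, **kwargs: Any) -> None:
        # Silently discard compute_on_step if a caller still passes it.
        kwargs.pop("compute_on_step", None)
        super().__init__(**kwargs)
        self.sum = 0.0
        self.count = 0


metric = ADE(compute_on_step=True, dist_sync_on_step=False)
print(metric.options)
```

Installing an older torchmetrics release that still accepts the argument may be another option; the matching version is not stated in this issue.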

Why does the Euclidean distance need to be calculated after the Manhattan distance?

Hello, thank you for your amazing work.
I have a question after reading your code. The set of points whose Euclidean distance is less than the radius is contained in the set of points whose Manhattan distance is less than the radius.
If only the Manhattan distance were calculated, would it be quicker?
Yours sincerely
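
A small sketch of why a second, Euclidean check can still matter. It assumes the first filter is an axis-aligned square query of half-width `radius` around the agent (an assumption about the preprocessing, not confirmed in this issue). Such a square contains the circle of the same radius, so the cheap query alone also admits points near the corners that lie farther away than `radius`. For a true L1 distance the containment runs the other way, since |dx| + |dy| is never smaller than the Euclidean distance. The points below are made up for illustration.

```python
import math

radius = 50.0
center = (0.0, 0.0)
points = [(10.0, 10.0), (45.0, 45.0), (60.0, 0.0), (0.0, -49.0)]


def in_square(p, c, r):
    """Cheap axis-aligned test: |dx| <= r and |dy| <= r."""
    return abs(p[0] - c[0]) <= r and abs(p[1] - c[1]) <= r


def in_circle(p, c, r):
    """Exact Euclidean test."""
    return math.hypot(p[0] - c[0], p[1] - c[1]) < r


candidates = [p for p in points if in_square(p, center, radius)]
kept = [p for p in candidates if in_circle(p, center, radius)]

print(candidates)  # (45, 45) passes the square test
print(kept)        # but it is farther than `radius` from the center
```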

What does 'parallel' in args mean?

Question: where do you make the scene centered at separate agents in code?

Hello, Dr. Zhou! Thank you for sharing your elegant code. I have a question:

It seems that you center the scene at the autonomous vehicle (AV) in the code:

    # make the scene centered at AV
    origin = torch.tensor([av_df[19]['X'], av_df[19]['Y']], dtype=torch.float)
    av_heading_vector = origin - torch.tensor([av_df[18]['X'], av_df[18]['Y']], dtype=torch.float)
    theta = torch.atan2(av_heading_vector[1], av_heading_vector[0])
    rotate_mat = torch.tensor([[torch.cos(theta), -torch.sin(theta)],
                               [torch.sin(theta), torch.cos(theta)]])

and express the coordinates of the other vehicles relative to that origin.

But according to your paper, shouldn't the scenes be centered at the individual agents? Where is the code that corresponds to this part?

Thank you in advance!
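
To make the quoted normalization concrete, here is a minimal sketch of the transform it sets up. It assumes positions are stored as row vectors and multiplied on the right by `rotate_mat` (an assumption; the multiplication itself is not part of the quoted snippet). The helper `to_local` and the sample coordinates are hypothetical.

```python
import math


def to_local(point, origin, theta):
    """Translate by -origin, then multiply the row vector by
    [[cos, -sin], [sin, cos]], which rotates it by -theta."""
    dx, dy = point[0] - origin[0], point[1] - origin[1]
    c, s = math.cos(theta), math.sin(theta)
    return (dx * c + dy * s, -dx * s + dy * c)


# Hypothetical AV positions at steps 18 and 19.
prev_xy, origin = (1.0, 1.0), (2.0, 2.0)
theta = math.atan2(origin[1] - prev_xy[1], origin[0] - prev_xy[0])

# A point one step further along the heading lands on the local +x axis.
ahead = to_local((3.0, 3.0), origin, theta)
print(ahead)
```

The same operation with a different origin and heading gives a frame centered at any other agent.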

Question About Decoder Modeling

Hi, @ZikangZhou

Thank you for sharing your great work. I have a question regarding the decoder modeling. In your paper, you model the output as a Laplacian mixture model (LMM) instead of a Gaussian one. However, a GMM seems to be the more common choice. Is there a specific reason for choosing an LMM over a GMM? Is it for training stability?
I am looking forward to your reply. Thank you in advance.

Best,

Prediction Results for non-agent objects

Hello,

Thank you for the great work!

Since the model is geared towards predicting the future trajectory of multiple objects in the scene, I ran the pre-trained model to measure the minADE and minFDE for all objects, rather than just the agent object type. On the Argoverse 1.1 validation set, I got the following results, when using a batch size of 1:

DATALOADER:0 VALIDATE RESULTS
{'all_actors_minADE': 1.7695937156677246,
 'all_actors_minFDE': 3.879427433013916,
 'val_minADE': 0.6611008644104004,
 'val_minFDE': 0.9691500067710876,
 'val_minMR': 0.09206525981426239,
 'val_reg_loss': -0.30943363904953003}

As you can see, there's a large discrepancy between the agent-only metrics and the all_actors metrics.

One reason could be that pedestrians and bikes have distinct behavioral patterns that differ from vehicles. Is there a way to mitigate this within the model, or should HiVT be considered a vehicle-centric model?

Code availability?

Very interesting work. When is the code going to be available?

Question about the attention calculation code "alpha = (query * key).sum(dim=-1) / scale"

Hi,

Could you tell me why the attention calculation in your code is implemented as a Hadamard product followed by a sum over the last dimension, instead of a dot-product operation?

Thank you so much!
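
A short sketch of the identity behind that line: multiplying element-wise and summing over the last dimension yields one dot product per row. This form is convenient when every edge carries its own query-key pair, which is presumably the situation in a graph attention layer (an assumption about the surrounding code). The helper names and numbers below are illustrative only.

```python
def rowwise_dot(queries, keys):
    """Multiply element-wise, then sum over the last dimension."""
    return [sum(q * k for q, k in zip(q_row, k_row))
            for q_row, k_row in zip(queries, keys)]


def dot(u, v):
    """Ordinary dot product of two vectors."""
    total = 0.0
    for a, b in zip(u, v):
        total += a * b
    return total


# One (query, key) pair per edge, each of dimension 3.
queries = [[1.0, 2.0, 3.0], [0.5, 0.0, -1.0]]
keys = [[4.0, 5.0, 6.0], [2.0, 7.0, 1.0]]

scores = rowwise_dot(queries, keys)
print(scores)
```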

How to generate test results

I wrote test code to generate the test results, but the FDE (K=6) of the pretrained model shows a 0.1 decrease compared to HiVT-128 on the leaderboard. Could you share your test.py? Thank you.

Code for Qualitative Results analysis

Hi,
Thanks a lot for your wonderful code. Would you please release the visualization code for the qualitative result analysis?
Thanks again.

cudatoolkit=11.1 version problem

Excuse me, when I ran the following command, I got an error.
conda install pytorch==1.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
The error message is as follows.
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

cudatoolkit=11.1
__glibc[version='>=2.17,<3.0.a0']

Current channels:

Even after I changed my CUDA version to 11.0, I still couldn't solve the problem.

Thanks!!

About the loss: reg loss and cls loss

Hi! Thanks for sharing your code!
When I tried to retrain the model, I found that the reg loss gradually dropped to negative values, while the cls loss first went down, then rose rapidly, and finally stayed around 1.75. Although the minADE metric is declining, does the cls loss indicate a training abnormality?
Thank you!
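
On the negative values: if the regression loss is a Laplace negative log-likelihood (an assumption here, based on the Laplace modeling mentioned in other issues), it is the negative log of a density rather than of a probability, so it can drop below zero once the predicted scale is small. A minimal sketch with made-up numbers:

```python
import math


def laplace_nll(target, loc, scale):
    """Negative log-likelihood of a Laplace(loc, scale) density."""
    return math.log(2.0 * scale) + abs(target - loc) / scale


# Same 0.05 m error, increasingly confident (smaller) predicted scale.
for scale in (1.0, 0.5, 0.1):
    print(scale, round(laplace_nll(0.05, 0.0, scale), 3))
```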

Results on the Waymo motion dataset

Thanks for your code.
Have you tested your model's performance on the Waymo motion dataset? If so, would you please release the relevant code?

Test set performance

Hi @ZikangZhou
Could you describe the training recipe used for the submission?
I got the test results below with your model, but there seems to be quite a large gap compared to the results you reported.

minFDE (K=6): 1.277136966106996
MR (K=6): 0.14637267573551055
minADE (K=6): 0.8215039475137975

As it is hard for me to reach your result by just increasing the training epochs, it will be very helpful if you can provide some other hints.
I am looking forward to your reply!

Performance of GRUDecoder

Hello, @ZikangZhou. Thanks for your code!
How much is the performance gap between the models using GRUDecoder and MLPDecoder?
Do you have any idea why GRUDecoder leads to lower performance?
I am looking forward to your reply!

Leaderboard Submission Code

Could you provide the leaderboard submission code? I tested the checkpoint and submitted to the leaderboard, but got some errors.

How can I use only one subgraph?

I masked the other vehicles so that only one moving vehicle was left in each scene, and tried to predict its track. As a result, my edge_index is an empty list and I cannot continue. What is a good solution?

Result on Waymo dataset

Thanks for your code.
Have you tested your model's performance on the Waymo motion dataset?
I got a really bad result: a minADE of more than 4 and an FDE of more than 10.
I am looking forward to your reply. Thank you in advance.

Can I use the visualization code for the results?

Thank you so much for sharing the code.
I would like to use the result visualization from HiVT, but it seems the visualization code shown on the GitHub page has not been shared. Can you share it?
Thank you!

Bit-wise NOT operation "~" for padding_mask in data.

Hi Zikang,

I noticed you used the bit-wise NOT operation "~" in the argoverse v1 dataset to process padding_mask. You set padding_mask to True or False and use "~padding_mask" to calculate the opposite. However, from my understanding, "~" is used for bit-wise NOT, which means ~True=-2 and ~False=-1, both ~True and ~False are True because they are not 0. I'm kind of confused by this, and I guess you meant to use not padding_mask or 1-padding_mask. Could you please clarify it? Thanks!
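
A brief sketch of the distinction raised here. On plain Python `bool` values, `~` is the integer bitwise NOT, as described above. Array types can overload `~` through `__invert__`, and `torch.bool` tensors define it as element-wise logical NOT. The `BoolMask` class below is a hypothetical stand-in that mimics that overload so the example runs without PyTorch.

```python
# On plain Python bools, ~ is integer bitwise NOT.
print(~True, ~False)


class BoolMask:
    """Tiny stand-in for a boolean tensor (hypothetical, for illustration)."""

    def __init__(self, values):
        self.values = [bool(v) for v in values]

    def __invert__(self):
        # Element-wise logical NOT, the behavior boolean tensors give to ~.
        return BoolMask(not v for v in self.values)


padding_mask = BoolMask([True, False, False])
valid_mask = ~padding_mask
print(valid_mask.values)
```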

Environment Configuration

Hi, I am still confused about the environment:

I followed the instructions:

conda create -n HiVT python=3.8
conda activate HiVT
conda install pytorch==1.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install pytorch-geometric==1.7.2 -c rusty1s -c conda-forge
conda install pytorch-lightning==1.5.2 -c conda-forge

And get an error:

ModuleNotFoundError: No module named 'torch_geometric.data.storage'

It seems that the PyG version should be at least 2.0.x (see this link)

However, after upgrading PyG to 2.0.1, a new error appeared:

TypeError: inc() takes 3 positional arguments but 4 were given

Do you have any ideas? Many thanks!

Installation Error

After I followed the instructions of this repo:

conda create -n HiVT python=3.8
conda activate HiVT
conda install pytorch==1.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
conda install pytorch-geometric==1.7.2 -c rusty1s -c conda-forge
conda install pytorch-lightning==1.5.2 -c conda-forge

and then ran:

python eval.py --root ../datasets/ --batch_size 32 --ckpt_path checkpoints/HiVT-64/checkpoints/epoch\=63-step\=411903.ckpt

it always shows:

Original Traceback (most recent call last):
  File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch_geometric/data/dataset.py", line 193, in __getitem__
    data = self.get(self.indices()[idx])
  File "/home/cunjun/wuhr/HiVT/datasets/argoverse_v1_dataset.py", line 87, in get
    return torch.load(self.processed_paths[idx])
  File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/serialization.py", line 712, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/serialization.py", line 1049, in _load
    result = unpickler.load()
  File "/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch/serialization.py", line 1042, in find_class
    return super().find_class(mod_name, name)
AttributeError: Can't get attribute 'DataEdgeAttr' on <module 'torch_geometric.data.data' from '/home/cunjun/miniconda3/envs/HiVT/lib/python3.8/site-packages/torch_geometric/data/data.py'>

I have been stuck here several times and don't know how to solve it. I hope to get your help.

Question Regarding the Temporal Encoder

Hi @ZikangZhou,

I have a question regarding the temporal encoder layer. When computing attention in the temporal encoder layer, all the agents within the batch are processed together rather than with batch-wise attention. I was wondering whether this fuses information across the batch? Thank you in advance.

Best,

HiVT map-free customization

Hello, first of all, thank you for your excellent work and dedication in making it openly available for the community to use.

I was wondering how I could adapt your code so that it can be evaluated without a map, as claimed in the following paper. They compared against HiVT in a configuration that doesn't use a map.

Thanks in advance

Training from a saved checkpoint

Thank you for sharing the excellent code!
How can I train HiVT starting from a saved checkpoint?
For example, I want to fine-tune the model.

Qualitative Results Visualization

Thank you very much for your work. I've been doing related work recently, can you please provide a visualization of the Qualitative Results section? I would be very grateful if you could!

Converting back to Original Coordinate System

Hi @ZikangZhou,

To convert back to the original coordinate system, the output y_hat just needs the rotation matrix (built from av_theta) and the AV origin, right?
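
A minimal sketch of the inverse of an AV-centered normalization, assuming the forward step subtracts the AV origin and then rotates by the negative of the AV heading (an assumption about the preprocessing). The helper `to_global` and the sample pose are hypothetical. If predictions are additionally expressed relative to each agent's own position and heading, that transform would need to be undone first.

```python
import math


def to_global(local_xy, origin, theta):
    """Undo the scene normalization: rotate by +theta, then add origin."""
    x, y = local_xy
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s + origin[0], x * s + y * c + origin[1])


# Hypothetical AV pose and one predicted waypoint in the AV frame.
origin, theta = (100.0, 50.0), math.pi / 2
print(to_global((2.0, 0.0), origin, theta))
```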

torch.load bottleneck?

Hi.
I'm interested in your research, so I have been studying it closely.
When I start training, the GPU is highly utilized at first because data comes in quickly through torch.load. As the steps progress, however, the torch.load data loading time grows longer and GPU utilization drops to 0, so a bottleneck occurs. Have you ever had a problem like this?

[image: estimated time required and GPU utilization before the bottleneck]

[image: after the bottleneck]

Thank you!

The meaning of bos_mask

Hi, Dr. Zhou,
Thanks for your great work.
What do bos_mask and bos_token mean, and what do they do? The abbreviations are confusing.

HiVT/datasets/argoverse_v1_dataset.py

Line 121 in 6876656

bos_mask = torch.zeros(num_nodes, 20, dtype=torch.bool)

HiVT/models/local_encoder.py

Line 136 in 6876656

self.bos_token = nn.Parameter(torch.Tensor(historical_steps, embed_dim))

Code Question

At this line,

the points in x seem to be transformed into the "AV" local frame.
However, at

the trajectory (already in the "AV" local frame) of a given agent is rotated by the agent's heading angle at the 20th time frame. Could you tell me why the agent trajectory needs to be rotated?

Classification loss does not converge

Thanks for your code!
I noticed that you did not log the classification loss. After adding a log for it, I found that the classification loss does not converge. This leads to poor top-1 performance.
Do you have any suggestions regarding the non-convergence of the classification loss?
Looking forward to your reply!

How to obtain the ADE/FDE/MR result of test set?

I would like to express my appreciation for your work on the HiVT project. I have been using HiVT for my research, and it has been a valuable tool for my experiments.
I am currently working on evaluating the performance of HiVT on the Argoverse test set. However, I cannot load the test set data correctly. I always get this error:

HiVT2/models/hivt.py", line 129, in validation_step
    l2_norm = (torch.norm(y_hat[:, :, :, : 2] - data.y, p=2, dim=-1) * reg_mask).sum(dim=-1)  # [F, N]
TypeError: unsupported operand type(s) for -: 'Tensor' and 'NoneType'

I would greatly appreciate it if you could provide some guidance or assistance.

How can I extract one vehicle's features from the local encoder and global interaction module?

Hi,
Thanks for your excellent work. Could you kindly tell me how to extract one vehicle's features from the outputs of the local encoder and global interaction modules? I am new to pytorch_lightning, so I am a little confused about it.
Best,
Joe

Friendly Reminder

Use GPU and multi-card for model training

Hello! How can I use a GPU and multiple cards for training? By default, training runs on the CPU.
Thank you!

Question Regarding the Agent Semantic Attribute

Hi @ZikangZhou,

In your paper, you mention using the semantic attribute of an agent as input to the agent embedding. However, this semantic attribute is not reflected in your code. Could you please share how the semantic attributes are obtained? As far as I know, Argoverse agents don't have such a semantic attribute. Do you mean OBJECT_TYPE in the csv file?
Thank you in advance.

Best,

Number of local regions

Is the number of local regions N the same as the number of agents (target + context) in Argoverse? If so, would you mind pointing to the code where the local data in each region is computed?

There is a single agent to predict in Argoverse. Are you trying to predict all the agents, including the context agents, with your model and compute the joint loss over all agents?

Thanks!

self.propagate function

In "global_interactor.py", I cannot find the definition of 'self.propagate', but it is used in "_mha_block".

Issue about inference

Hello,
First of all, thanks for your nice work on trajectory prediction. I noticed that in the validation step you choose the best trajectory by comparing against the ground-truth trajectory. In the test step, however, we normally don't know the ground-truth trajectory. Could you kindly tell me how to choose among the candidate trajectories at inference time in HiVT?
Best,
Joe
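
One common way to order candidates without ground truth is to rank them by the model's own confidence scores. The sketch below assumes the model emits one score (logit) per candidate trajectory; whether and how HiVT exposes such scores is not confirmed in this issue, and `rank_modes` and the sample logits are hypothetical.

```python
import math


def rank_modes(logits):
    """Return (mode_index, probability) pairs, most likely first."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(enumerate(probs), key=lambda item: item[1], reverse=True)


# Hypothetical confidence logits for K = 3 candidate trajectories.
ranking = rank_modes([0.2, 1.5, -0.3])
best_mode = ranking[0][0]
print(best_mode, [round(p, 3) for _, p in ranking])
```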

question regarding Laplace distribution

Hi @ZikangZhou, when calculating regression loss, you seem to model x and y as independent univariate Laplace distributions if I understand correctly based on your code. I wonder what would be the intuition behind such a choice versus considering x & y jointly as a multivariate Laplace distribution? Thank you very much for your comments in advance!
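
For reference, a sketch of what the independence assumption implies, using generic symbols ($\mu$ for the predicted location, $b$ for the predicted scale) that are not taken from the code. With independent coordinates the joint density factorizes, so the negative log-likelihood is a sum of two univariate terms:

```latex
\begin{align}
p(x, y) &= \frac{1}{2 b_x} \exp\!\left(-\frac{|x - \mu_x|}{b_x}\right)
           \cdot \frac{1}{2 b_y} \exp\!\left(-\frac{|y - \mu_y|}{b_y}\right) \\
-\log p(x, y) &= \log(2 b_x) + \frac{|x - \mu_x|}{b_x}
               + \log(2 b_y) + \frac{|y - \mu_y|}{b_y}
\end{align}
```

A jointly modeled multivariate Laplace distribution would not split into per-coordinate terms in this way.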

Hello

When will the paper be available to read?

Visualization tool for local region

Hello,
First of all, thanks for your excellent work!
I wonder if there is any visualization code or tool for local regions.
I know that argoverse-api provides a visualization tool for whole scenarios, but I want to look at just the local regions that the local encoder uses for training or inference.
Thank you :)

"solving environment killed" while "conda install pytorch-lightning==1.5.2 -c conda-forge"

About no requirements.txt

Thank you for your excellent work! I am dealing with some constraints on my company's server. It's isolated from the external internet, but we do have a well-maintained internal PyPI site. Hence, I can't use 'conda install' for dependencies. Could you provide a requirements.txt file to facilitate pip installations from our internal PyPI?