
jperceiver's People

Contributors

sunnyhelen


jperceiver's Issues

The trajectory we obtain from draw_odometry.py is quite different from the one in your paper.


We ran python scripts/draw_odometry.py with the pre-trained model kitti_raw_road.pth on the KITTI Odometry dataset, but the resulting trajectory is different from yours.
Both the evaluated values (RMSE and rotation error) and the estimated trajectory differ from the results in your paper (https://arxiv.org/pdf/2207.07895.pdf).

  • The result in the paper (according to Table 4):
    Average sequence translation RMSE (%): 4.57
    Average sequence rotation error (deg/100m): 2.94
    Trajectory: Fig. 7

  • Our result:
    Average sequence translation RMSE (%): 56.6012
    Average sequence rotation error (deg/m): 0.4300

Runtime environment

azureml 1.44.0

Reproduction procedures

  1. In draw_odometry.py:L28, the KITTIOdomDataset argument "type" is set to "static".
  2. I created test_files_07.txt containing 07/road_dense128/000000.png through 07/road_dense128/001100.png and placed it in JPerceiver/mono/datasets/splits/odometry/ (a sketch for generating this file is shown after this list).
  3. I set model_path to "~/<my_directory>/kitti_odometry_road.pth".
  4. I set sequence_id to 7.
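
For reference, the split file from step 2 can be produced with a few lines of Python (a minimal sketch; the frame range and the 07/road_dense128 prefix follow the description above, while the output location simply assumes the repository layout mentioned in step 2):

    # Hypothetical helper to produce the split file described in step 2.
    out_path = "JPerceiver/mono/datasets/splits/odometry/test_files_07.txt"
    with open(out_path, "w") as f:
        for idx in range(0, 1101):                      # 000000.png .. 001100.png
            f.write("07/road_dense128/{:06d}.png\n".format(idx))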

Our purpose

How can we obtain the results reported in your paper?

About config files

Hi,

I am confused about which config file I should choose to reproduce the pretrained models. Could you please provide scripts for training and evaluating these models?

Besides, I have some questions about the settings in the configuration file:
(1) What is the gt depth file? Does it correspond to the validation set of KITTI odometry?

gt_depth_path = '/CV/users/zhaohaimei3/gt_depths.npz',  # path to gt data
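
For question (1), the file can at least be inspected directly to see what it contains (a quick sketch; only the path above is taken from the config, everything else is generic NumPy usage):

    import numpy as np

    # Load the ground-truth depth archive referenced in the config and list
    # its stored arrays; the key names inside the file are not documented here.
    gt = np.load("/CV/users/zhaohaimei3/gt_depths.npz", allow_pickle=True)
    print(gt.files)                                  # names of the stored arrays
    depths = gt[gt.files[0]]
    print(len(depths), getattr(depths[0], "shape", None))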

(2) Although loss_sum is set to 3, loss_weightS is missing from the configuration file.


elif opt.loss_sum == 3:
    output = loss(generated_top_view, true_top_view) * opt.loss_weightS + \
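
If the weight was simply omitted from the released config, a minimal fix would be to define it there; a hedged sketch (the value 1.0 is an assumption, not the value used in the paper):

    # Hypothetical config entry so that opt.loss_weightS is defined when
    # loss_sum == 3; the correct value is unknown and would need confirmation.
    loss_weightS = 1.0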

Deprecated Argoverse

Hi.
Argoverse 1.0 is deprecated and has been replaced by Argoverse 1.1.
Do you have any plans to update the code accordingly?

inputs['bothD', 0, 0] missing in the training process

After downloading the required datasets, I ran the following training command:
# Training
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 --master_port 25629 train.py --config config/cfg_kitti_baseline_odometry_boundary_ce_iou_1024_20.py --work_dir log/odometry/
However, KeyError: ('bothD', 0, 0) is raised when computing loss_dict for the odometry dataset.
I noticed that when computing the loss for kitti_object, inputs['bothD', 0, 0] can be obtained from the ground-truth vehicle256 data, while inputs['bothS', 0, 0] can be obtained from the ground-truth road_dense128 data.
Since topview_lossB requires inputs['bothD', 0, 0] when I run the odometry dataset, how can I obtain this input?
Conversely, when running the kitti_object dataset, where can I find inputs['bothS', 0, 0]?
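
One workaround I am considering is to compute topview_lossB only when its ground truth is actually in the batch (purely a sketch; the function and output names below are placeholders, not the repository's actual code):

    # Hypothetical guard: skip the vehicle top-view loss for datasets that do
    # not provide the ("bothD", 0, 0) ground truth (e.g. the odometry split).
    if ("bothD", 0, 0) in inputs:
        loss_dict["topview_lossB"] = topview_loss_fn(outputs, inputs[("bothD", 0, 0)])
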
Thanks.

outputs[("cam_T_cam", 0, -1)] is missing

After downloading the dataset and the pretrained model, I ran eval_argo_both_video.py and got this error:

T_ = outputs[("cam_T_cam", 0, -1)].cpu().numpy()[0]
KeyError: ('cam_T_cam', 0, -1)
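
A small check before that line makes the failure easier to diagnose (a sketch only; whether the pose for frame -1 should be predicted by the model or handled differently in the video script is exactly the question):

    # Hypothetical check before using the relative pose between frames 0 and -1.
    key = ("cam_T_cam", 0, -1)
    if key not in outputs:
        raise RuntimeError("pose output for frame -1 is missing; the config's "
                           "frame id list may need to include -1")  # assumption
    T_ = outputs[key].cpu().numpy()[0]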

IMGS_PER_GPU setting

According to the original config, IMGS_PER_GPU is set to 3. However, the GPUs I use are RTX 3080 cards with 10 GB of memory, so I changed IMGS_PER_GPU to 1. After training for up to 120 epochs (kitti odom), the eval results are:
abs_rel: 0.2068, sq_rel: 1.5355, rmse: 4.9815, rmse_log: 0.2898, a1: 0.7079, a2: 0.9595, scale_mean: 1.7448, iou_road: 0.7133, mAP_road: 0.8781
These results were already reached at around 50 epochs and have not improved since.
Does setting IMGS_PER_GPU to 1 have a negative effect on the training/eval procedure?
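
A smaller per-GPU batch does change the effective batch size; a common (though not repository-specific) mitigation is to scale the learning rate linearly with the batch, for example:

    # Hypothetical linear-scaling adjustment; the base learning rate below is an
    # assumed placeholder, not the value from the released config.
    base_lr = 1e-4
    ref_imgs_per_gpu, my_imgs_per_gpu = 3, 1
    lr = base_lr * my_imgs_per_gpu / ref_imgs_per_gpu   # 3x smaller batch -> 3x smaller lr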

Missing key(s) in state_dict

I am trying to evaluate the layout results on the KITTI dataset, but loading the checkpoint fails with the following error:

Missing key(s) in state_dict: "CycledViewProjectionB.transform_module.fc_transform.0.weight", "CycledViewProjectionB.transform_module.fc_transform.0.bias", "CycledViewProjectionB.transform_module.fc_transform.2.weight", "CycledViewProjectionB.transform_module.fc_transform.2.bias", "CycledViewProjectionB.retransform_module.fc_transform.0.weight", "CycledViewProjectionB.retransform_module.fc_transform.0.bias", "CycledViewProjectionB.retransform_module.fc_transform.2.weight", "CycledViewProjectionB.retransform_module.fc_transform.2.bias", "CrossViewTransformerB.query_conv.weight", "CrossViewTransformerB.query_conv.bias", "CrossViewTransformerB.key_conv.weight", "CrossViewTransformerB.key_conv.bias", "CrossViewTransformerB.value_conv.weight", "CrossViewTransformerB.value_conv.bias", "CrossViewTransformerB.f_conv.weight", "CrossViewTransformerB.f_conv.bias", "CrossViewTransformerB.res_conv.weight", "CrossViewTransformerB.res_conv.bias", "CrossViewTransformerB.query_conv_depth.weight", "CrossViewTransformerB.query_conv_depth.bias", "CrossViewTransformerB.key_conv_depth.weight", "CrossViewTransformerB.key_conv_depth.bias", "CrossViewTransformerB.value_conv_depth.weight", "CrossViewTransformerB.value_conv_depth.bias", "CrossViewTransformerB.conv1.conv.weight", "CrossViewTransformerB.conv1.conv.bias", "CrossViewTransformerB.conv2.conv.weight", "CrossViewTransformerB.conv2.conv.bias", "LayoutDecoderB.decoder.0.weight", "LayoutDecoderB.decoder.0.bias", "LayoutDecoderB.decoder.1.weight", "LayoutDecoderB.decoder.1.bias", "LayoutDecoderB.decoder.1.running_mean", "LayoutDecoderB.decoder.1.running_var", "LayoutDecoderB.decoder.3.weight", "LayoutDecoderB.decoder.3.bias", "LayoutDecoderB.decoder.4.weight", "LayoutDecoderB.decoder.4.bias", "LayoutDecoderB.decoder.4.running_mean", "LayoutDecoderB.decoder.4.running_var", "LayoutDecoderB.decoder.5.weight", "LayoutDecoderB.decoder.5.bias", "LayoutDecoderB.decoder.6.weight", "LayoutDecoderB.decoder.6.bias", "LayoutDecoderB.decoder.6.running_mean", "LayoutDecoderB.decoder.6.running_var", "LayoutDecoderB.decoder.8.weight", "LayoutDecoderB.decoder.8.bias", "LayoutDecoderB.decoder.9.weight", "LayoutDecoderB.decoder.9.bias", "LayoutDecoderB.decoder.9.running_mean", "LayoutDecoderB.decoder.9.running_var", "LayoutDecoderB.decoder.10.weight", "LayoutDecoderB.decoder.10.bias", "LayoutDecoderB.decoder.11.weight", "LayoutDecoderB.decoder.11.bias", "LayoutDecoderB.decoder.11.running_mean", "LayoutDecoderB.decoder.11.running_var", "LayoutDecoderB.decoder.13.weight", "LayoutDecoderB.decoder.13.bias", "LayoutDecoderB.decoder.14.weight", "LayoutDecoderB.decoder.14.bias", "LayoutDecoderB.decoder.14.running_mean", "LayoutDecoderB.decoder.14.running_var", "LayoutDecoderB.decoder.15.weight", "LayoutDecoderB.decoder.15.bias", "LayoutDecoderB.decoder.16.weight", "LayoutDecoderB.decoder.16.bias", "LayoutDecoderB.decoder.16.running_mean", "LayoutDecoderB.decoder.16.running_var", "LayoutDecoderB.decoder.18.weight", "LayoutDecoderB.decoder.18.bias", "LayoutDecoderB.decoder.19.weight", "LayoutDecoderB.decoder.19.bias", "LayoutDecoderB.decoder.19.running_mean", "LayoutDecoderB.decoder.19.running_var", "LayoutDecoderB.decoder.20.weight", "LayoutDecoderB.decoder.20.bias", "LayoutDecoderB.decoder.21.weight", "LayoutDecoderB.decoder.21.bias", "LayoutDecoderB.decoder.21.running_mean", "LayoutDecoderB.decoder.21.running_var", "LayoutDecoderB.decoder.23.weight", "LayoutDecoderB.decoder.23.bias", 
"LayoutDecoderB.decoder.24.weight", "LayoutDecoderB.decoder.24.bias", "LayoutDecoderB.decoder.24.running_mean", "LayoutDecoderB.decoder.24.running_var", "LayoutDecoderB.decoder.25.conv.weight", "LayoutDecoderB.decoder.25.conv.bias", "LayoutTransformDecoderB.decoder.0.weight", "LayoutTransformDecoderB.decoder.0.bias", "LayoutTransformDecoderB.decoder.1.weight", "LayoutTransformDecoderB.decoder.1.bias", "LayoutTransformDecoderB.decoder.1.running_mean", "LayoutTransformDecoderB.decoder.1.running_var", "LayoutTransformDecoderB.decoder.3.weight", "LayoutTransformDecoderB.decoder.3.bias", "LayoutTransformDecoderB.decoder.4.weight", "LayoutTransformDecoderB.decoder.4.bias", "LayoutTransformDecoderB.decoder.4.running_mean", "LayoutTransformDecoderB.decoder.4.running_var", "LayoutTransformDecoderB.decoder.5.weight", "LayoutTransformDecoderB.decoder.5.bias", "LayoutTransformDecoderB.decoder.6.weight", "LayoutTransformDecoderB.decoder.6.bias", "LayoutTransformDecoderB.decoder.6.running_mean", "LayoutTransformDecoderB.decoder.6.running_var", "LayoutTransformDecoderB.decoder.8.weight", "LayoutTransformDecoderB.decoder.8.bias", "LayoutTransformDecoderB.decoder.9.weight", "LayoutTransformDecoderB.decoder.9.bias", "LayoutTransformDecoderB.decoder.9.running_mean", "LayoutTransformDecoderB.decoder.9.running_var", "LayoutTransformDecoderB.decoder.10.weight", "LayoutTransformDecoderB.decoder.10.bias", "LayoutTransformDecoderB.decoder.11.weight", "LayoutTransformDecoderB.decoder.11.bias", "LayoutTransformDecoderB.decoder.11.running_mean", "LayoutTransformDecoderB.decoder.11.running_var", "LayoutTransformDecoderB.decoder.13.weight", "LayoutTransformDecoderB.decoder.13.bias", "LayoutTransformDecoderB.decoder.14.weight", "LayoutTransformDecoderB.decoder.14.bias", "LayoutTransformDecoderB.decoder.14.running_mean", "LayoutTransformDecoderB.decoder.14.running_var", "LayoutTransformDecoderB.decoder.15.weight", "LayoutTransformDecoderB.decoder.15.bias", "LayoutTransformDecoderB.decoder.16.weight", "LayoutTransformDecoderB.decoder.16.bias", "LayoutTransformDecoderB.decoder.16.running_mean", "LayoutTransformDecoderB.decoder.16.running_var", "LayoutTransformDecoderB.decoder.18.weight", "LayoutTransformDecoderB.decoder.18.bias", "LayoutTransformDecoderB.decoder.19.weight", "LayoutTransformDecoderB.decoder.19.bias", "LayoutTransformDecoderB.decoder.19.running_mean", "LayoutTransformDecoderB.decoder.19.running_var", "LayoutTransformDecoderB.decoder.20.weight", "LayoutTransformDecoderB.decoder.20.bias", "LayoutTransformDecoderB.decoder.21.weight", "LayoutTransformDecoderB.decoder.21.bias", "LayoutTransformDecoderB.decoder.21.running_mean", "LayoutTransformDecoderB.decoder.21.running_var", "LayoutTransformDecoderB.decoder.23.weight", "LayoutTransformDecoderB.decoder.23.bias", "LayoutTransformDecoderB.decoder.24.weight", "LayoutTransformDecoderB.decoder.24.bias", "LayoutTransformDecoderB.decoder.24.running_mean", "LayoutTransformDecoderB.decoder.24.running_var", "LayoutTransformDecoderB.decoder.25.conv.weight", "LayoutTransformDecoderB.decoder.25.conv.bias".

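If the checkpoint was simply trained without the B (second-branch) modules, one workaround for evaluation is to load it non-strictly and confirm that only those keys are missing (a sketch; the checkpoint path and the model construction are placeholders):

    import torch

    # Hypothetical non-strict load: report which keys are missing/unexpected
    # instead of failing. Only trust the evaluation if the missing keys are
    # exactly the B-branch modules listed above.
    ckpt = torch.load("path/to/checkpoint.pth", map_location="cpu")
    state = ckpt.get("state_dict", ckpt)                  # some checkpoints nest the weights
    result = model.load_state_dict(state, strict=False)   # `model` is built elsewhere
    print("missing:", result.missing_keys)
    print("unexpected:", result.unexpected_keys)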