avod's Introduction

Aggregate View Object Detection

This repository contains the public release of the Python implementation of our Aggregate View Object Detection (AVOD) network for 3D object detection.

Joint 3D Proposal Generation and Object Detection from View Aggregation

Jason Ku, Melissa Mozifian, Jungwook Lee, Ali Harakeh, Steven L. Waslander

If you use this code, please cite our paper:

@article{ku2018joint, 
  title={Joint 3D Proposal Generation and Object Detection from View Aggregation}, 
  author={Ku, Jason and Mozifian, Melissa and Lee, Jungwook and Harakeh, Ali and Waslander, Steven}, 
  journal={IROS}, 
  year={2018}
}

Videos

These videos show detections on several KITTI sequences and our own data in snowy and night driving conditions (with no additional training data).

AVOD Detections: here

AVOD-FPN Detections: here

KITTI Object Detection Results (3D and BEV)

Method       Runtime (s)   AP-3D Easy  Moderate   Hard    AP-BEV Easy  Moderate   Hard

Car
MV3D             0.36         71.09      62.35    55.12      86.02       76.90    68.49
VoxelNet         0.23         77.47      65.11    57.73      89.35       79.26    77.39
F-PointNet       0.17         81.20      70.39    62.19      88.70       84.00    75.33
AVOD             0.08         73.59      65.78    58.38      86.80       85.44    77.73
AVOD-FPN         0.10         81.94      71.88    66.38      88.53       83.79    77.90

Pedestrian
VoxelNet         0.23         39.48      33.69    31.51      46.13       40.74    38.11
F-PointNet       0.17         51.21      44.89    40.23      58.09       50.22    47.20
AVOD             0.08         38.28      31.51    26.98      42.52       35.24    33.97
AVOD-FPN         0.10         50.80      42.81    40.88      58.75       51.05    47.54

Cyclist
VoxelNet         0.23         61.22      48.36    44.37      66.70       54.76    50.55
F-PointNet       0.17         71.96      56.77    50.39      75.38       61.96    54.68
AVOD             0.08         60.11      44.90    38.80      63.66       47.74    46.55
AVOD-FPN         0.10         64.00      52.18    46.61      68.09       57.48    50.77

Table: Comparison of results with other published methods on the KITTI 3D Object and BEV benchmarks (accessed Apr 11, 2018). Runtimes are in seconds; AP values are in percent.

Additional Links

AVOD-SSD

A single-stage version of AVOD is available here.

Average Heading Similarity (AHS) Native Evaluation

See here for more information on the modified KITTI native evaluation script.

Getting Started

Implemented and tested on Ubuntu 16.04 with Python 3.5 and TensorFlow 1.3.0.

  1. Clone this repo:
git clone git@github.com:kujason/avod.git --recurse-submodules

If you forget to clone the wavedata submodule:

git submodule update --init --recursive
  2. Install Python dependencies
cd avod
pip3 install -r requirements.txt
pip3 install tensorflow-gpu==1.3.0
  3. Add avod (top level) and wavedata to your PYTHONPATH
# For virtualenvwrapper users
add2virtualenv .
add2virtualenv wavedata
# For non-virtualenv users
export PYTHONPATH=$PYTHONPATH:'/path/to/avod'
export PYTHONPATH=$PYTHONPATH:'/path/to/avod/wavedata'
  4. Compile the integral image library in wavedata
sh scripts/install/build_integral_image_lib.bash
  5. AVOD uses Protobufs to configure model and training parameters. Before the framework can be used, the protos must be compiled from the top-level avod folder (a quick sanity check for steps 3-5 is sketched after this list):
sh avod/protos/run_protoc.sh

Alternatively, you can run the protoc command directly:

protoc avod/protos/*.proto --python_out=.
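If the PYTHONPATH entries and the compiled protos are in place, the packages and generated modules should import cleanly. Below is a minimal sanity check for steps 3-5; it assumes the generated module name follows the kitti_dataset.proto file in avod/protos (protoc appends _pb2):

# sanity_check.py -- run with python3 from any directory
import avod
import wavedata

# Module generated by protoc from avod/protos/kitti_dataset.proto
from avod.protos import kitti_dataset_pb2

print('avod imported from:', avod.__file__)
print('proto messages:', list(kitti_dataset_pb2.DESCRIPTOR.message_types_by_name))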

Training

Dataset

To train on the Kitti Object Detection Dataset:

  • Download the data and place it in your home folder at ~/Kitti/object
  • Go here and download the train.txt, val.txt and trainval.txt splits into ~/Kitti/object. Also download the planes folder into ~/Kitti/object/training

The folder should look something like the following:

Kitti
    object
        testing
        training
            calib
            image_2
            label_2
            planes
            velodyne
        train.txt
        trainval.txt
        val.txt
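A quick way to confirm the layout matches the instructions above is to check for the expected entries (a minimal sketch; adjust kitti_dir if your dataset lives elsewhere):

import os

kitti_dir = os.path.expanduser('~/Kitti/object')
expected = ['training/calib', 'training/image_2', 'training/label_2',
            'training/planes', 'training/velodyne',
            'train.txt', 'trainval.txt', 'val.txt']

# Report any entries that are missing from the expected layout
missing = [entry for entry in expected
           if not os.path.exists(os.path.join(kitti_dir, entry))]
print('Missing entries:', missing if missing else 'none')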

Mini-batch Generation

The training data needs to be pre-processed to generate mini-batches for the RPN. To configure the mini-batches, you can modify avod/configs/mb_preprocessing/rpn_[class].config. You also need to select the class you want to train on: inside scripts/preprocessing/gen_mini_batches.py, select the classes to process. By default it processes the Car and People classes, where the flag process_[class] is set to True. The People class includes both Pedestrians and Cyclists. You can also generate mini-batches for a single class, such as Pedestrian only (see the sketch after the note below).

Note: This script uses num_[class]_children parallel processes for faster processing. Parallelism can be disabled inside the script by setting in_parallel to False.
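For reference, the selection flags described above live near the top of scripts/preprocessing/gen_mini_batches.py and look roughly like the sketch below (illustrative only; check the script in your checkout for the exact identifiers):

# Illustrative sketch of the class-selection flags in gen_mini_batches.py
process_car = True      # generate mini-batches for the Car class
process_people = True   # People covers both Pedestrian and Cyclist
in_parallel = True      # set to False to disable the num_[class]_children workers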

cd avod
python scripts/preprocessing/gen_mini_batches.py

Once this script is done, you should now have the following folders inside avod/data:

data
    label_clusters
    mini_batches

Training Configuration

There are sample configuration files for training inside avod/configs. You can train on the example configs or modify an existing configuration. To train a new configuration, copy a config, e.g. pyramid_cars_with_aug_example.config, rename the file to a unique experiment name, and make sure the file name matches the checkpoint_name: 'pyramid_cars_with_aug_example' entry inside your config.
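The copy-and-rename step can also be scripted, for example as in the sketch below ('my_pyramid_cars' is a hypothetical experiment name):

import re
import shutil

experiment_name = 'my_pyramid_cars'  # hypothetical experiment name
src = 'avod/configs/pyramid_cars_with_aug_example.config'
dst = 'avod/configs/{}.config'.format(experiment_name)

# Copy the example config under the new experiment name
shutil.copyfile(src, dst)

# Keep the checkpoint_name entry in sync with the new file name
with open(dst) as f:
    config_text = f.read()
config_text = re.sub(r"checkpoint_name: '[^']*'",
                     "checkpoint_name: '{}'".format(experiment_name),
                     config_text)
with open(dst, 'w') as f:
    f.write(config_text)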

Run Trainer

To start training, run the following:

python avod/experiments/run_training.py --pipeline_config=avod/configs/pyramid_cars_with_aug_example.config

(Optional) Training defaults to GPU device 1 and the train data split. You can specify the GPU device and data split as follows:

python avod/experiments/run_training.py --pipeline_config=avod/configs/pyramid_cars_with_aug_example.config  --device='0' --data_split='train'

Depending on your setup, training should take approximately 16 hours on a Titan Xp and 20 hours on a GTX 1080. If the process is interrupted, training (or evaluation) will continue from the last saved checkpoint if it exists.

Run Evaluator

To start evaluation, run the following:

python avod/experiments/run_evaluation.py --pipeline_config=avod/configs/pyramid_cars_with_aug_example.config

(Optional) With additional options:

python avod/experiments/run_evaluation.py --pipeline_config=avod/configs/pyramid_cars_with_aug_example.config --device='0' --data_split='val'

The evaluator has several modes: it can evaluate a single checkpoint, a list of checkpoint indices, or repeatedly evaluate new checkpoints as they are saved. The evaluator is designed to be run in parallel with the trainer on the same GPU to repeatedly evaluate checkpoints. This can be configured inside the same config file (look for the eval_config entry).

To view the TensorBoard summaries:

cd avod/data/outputs/pyramid_cars_with_aug_example
tensorboard --logdir logs

Note: In addition to evaluating the loss, calculating accuracies, etc., the evaluator also runs the KITTI native evaluation code on each checkpoint. Predictions are converted to KITTI format and the AP is calculated for every checkpoint. The results are saved inside scripts/offline_eval/results/pyramid_cars_with_aug_example_results_0.1.txt, where 0.1 is the score threshold. IoU thresholds are set to (0.7, 0.5, 0.5).
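If you prefer to scan these results programmatically instead of reading the text file, a minimal sketch is shown below. It assumes the file contains AP lines of the form '<metric> AP: <easy> <moderate> <hard>', as in the evaluation output quoted in the issues further down:

results_path = ('scripts/offline_eval/results/'
                'pyramid_cars_with_aug_example_results_0.1.txt')

# Print every "<metric> AP: easy moderate hard" line in a readable form
with open(results_path) as f:
    for line in f:
        if ' AP: ' in line:
            metric, values = line.split(' AP: ')
            easy, moderate, hard = (float(v) for v in values.split())
            print('{}: easy {:.2f} / moderate {:.2f} / hard {:.2f}'.format(
                metric.strip(), easy, moderate, hard))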

Run Inference

To run inference on the val split, run the following script:

python avod/experiments/run_inference.py --checkpoint_name='pyramid_cars_with_aug_example' --data_split='val' --ckpt_indices=120 --device='1'

The ckpt_indices argument indicates the indices of the checkpoints in the checkpoint list. If the checkpoint_interval inside your config is 1000, then to evaluate checkpoints 116000 and 120000 the indices should be --ckpt_indices=116 120. You can also set this to -1 to evaluate the last checkpoint.
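The index-to-step mapping is just a multiplication by checkpoint_interval:

checkpoint_interval = 1000   # taken from the training config
ckpt_indices = [116, 120]    # values passed to --ckpt_indices

steps = [idx * checkpoint_interval for idx in ckpt_indices]
print(steps)  # [116000, 120000]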

Viewing Results

All results are saved in avod/data/outputs. Here you should see proposals_and_scores and final_predictions_and_scores results. To visualize these results, you can run demos/show_predictions_2d.py; the script needs to be configured for your specific experiment. scripts/offline_eval/plot_ap.py plots the AP vs. training step and prints the 5 highest-performing checkpoints for each evaluation metric at the moderate difficulty.

LICENSE

Copyright (c) 2018 Jason Ku, Melissa Mozifian, Ali Harakeh, Steven L. Waslander

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

avod's People

Contributors

fmigneault, kujason, melfm, villanuevab

avod's Issues

protoc version

Which version of "protoc" are you using in this project?

I got the error below:

avod/protos/kitti_utils.proto:24:5: Expected "required", "optional", or "repeated".
avod/protos/kitti_utils.proto:24:25: Missing field number.
avod/protos/kitti_dataset.proto: Import "avod/protos/kitti_utils.proto" was not found or had errors.
avod/protos/kitti_dataset.proto:39:14: "KittiUtilsConfig" is not defined.

How to generate testing results?

I can run all of the instructions; however, I'm not sure how to generate results on the testing dataset for the KITTI benchmark submission.
Could you tell me how to do it?

Performance on validation set not aligned to the report in paper

"car_detection_3D AP: 82.047119 67.536583 66.807381" top performance on iteration 39000, this is the top ranked by the given script, there is still a gap on the moderate, I leave the config by default to run "avod_cars_examples.config", is there anything I am missing ?

ValueError: could not convert string to float: "b'0.00'"

(py35) yanchao@yanchao:~/avod$ python scripts/preprocessing/gen_mini_batches.py
Clustering labels 1 / 3712
Traceback (most recent call last):
File "scripts/preprocessing/gen_mini_batches.py", line 199, in
main()
File "scripts/preprocessing/gen_mini_batches.py", line 120, in main
car_dataset_config_path)
File "/home/yanchao/MyProjects/avod/avod/builders/dataset_builder.py", line 154, in load_dataset_from_config
use_defaults=False)
File "/home/yanchao/MyProjects/avod/avod/builders/dataset_builder.py", line 191, in build_kitti_dataset
return KittiDataset(cfg_copy)
File "/home/yanchao/MyProjects/avod/avod/datasets/kitti/kitti_dataset.py", line 131, in init
self.kitti_utils = KittiUtils(self)
File "/home/yanchao/MyProjects/avod/avod/datasets/kitti/kitti_utils.py", line 59, in init
self.label_cluster_utils.get_clusters()
File "/home/yanchao/MyProjects/avod/avod/core/label_cluster_utils.py", line 194, in get_clusters
img_idx)
File "/home/yanchao/MyProjects/avod/wavedata/wavedata/tools/obj_detection/obj_utils.py", line 125, in read_labels
obj.truncation = float(p[1])
ValueError: could not convert string to float: "b'0.00'"

Only recall objects in the right side

Hi, thanks for sharing your work. I have trained and validated your network on the KITTI object train/val split, and it works great.
However, when I test the network on the KITTI raw dataset, it gives the results below: only objects on the right side are detected in the whole sequence.
(attached detection image)
So, any clues about the possible reason?

How to remove undetected bounding boxes

Hi Team,

Thanks to your instruction, I am able to run your code to train, evaluate and inference on the Kitti dataset.

After running the 2D demo generation script demos/show_predictions_2d.py, I see lots of green bounding boxes, as in the image below. Would you mind letting me know the color coding convention you are using? What is the difference between yellow and red?

And, importantly, how could I disable those green boxes?
(attached image 000134)

In addition, I am wondering if you have a script to generate a demo for the 3D point cloud as well. Any hint would be greatly appreciated.

(attached image 000078)

Thank you,

GPU memory usage when training

Hi,
How much GPU memory do we need to train with your code?
When training, the number of anchors differs between images due to the distribution of the lidar points, so does the GPU memory usage change from batch to batch?
Actually, when I train, I find the GPU memory cost is always about 4 GB, which makes me curious.
Thanks very much.

Result on validation set

  1. I checked the training split used for validation; it is the same as the one released with the MV3D paper. Is that the case?
  2. With that split, on the validation set, I ran avod_cars_example.config and got the following results after 120,000 iterations in avod_cars_example_results_05_iou_0.1.txt:
    "car_detection_3D_AP: 89.85737 80.741768 80.542282"
    Are these reasonable results?
  3. Checking the MV3D paper, it seems they report better results on the same validation split?

Validation result doesn't converge for "deep fusion"

Hi, thank you for sharing your code, we are doing relative research and it's very helpful for us!

Now we train the network using pyramid_cars_with_aug_example.config with fusion type "deep". After validation, the best car_detection_3D results appear around the 52500-iteration checkpoint:
car_detection_3D : [52500, 120000, 70000, 100000, 102500]

In 52500 checkpoint, the result is:
car_detection_3D AP: 84.555695 74.843224 68.156281

Moreover, the result fluctuates at later checkpoints, even decreasing to (car_detection_3D AP: 77.260490 68.040474 67.316147) at the 117500 checkpoint, as you can see in the attached figure.

Since this is our first time training such a large network, my question is: is it normal to have the best performance at the 52500 checkpoint? Can we say that the result has already converged? Thanks, and looking forward to your reply!
(attached figure)

Wrong at Evaluator&Inference script

Hi, I successfully trained AVOD to 120000 iterations, but when I ran the evaluator and inference scripts, they both stopped when processing sample 002908, and the error was:

libpng error: Read Error
Traceback (most recent call last):
File "avod/experiments/run_evaluation.py", line 130, in
tf.app.run()
File "/home/prp/anaconda2/envs/py35tf13/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "avod/experiments/run_evaluation.py", line 126, in main
evaluate(model_config, eval_config, dataset_config)
File "avod/experiments/run_evaluation.py", line 83, in evaluate
model_evaluator.repeated_checkpoint_run()
File "/home/prp/chrisli/myavod/avod/avod/core/evaluator.py", line 460, in repeated_checkpoint_run
self.run_checkpoint_once(checkpoint_to_restore)
File "/home/prp/chrisli/myavod/avod/avod/core/evaluator.py", line 199, in run_checkpoint_once
feed_dict = self.model.create_feed_dict()
File "/home/prp/chrisli/myavod/avod/avod/core/models/avod_model.py", line 655, in create_feed_dict
feed_dict = self._rpn_model.create_feed_dict()
File "/home/prp/chrisli/myavod/avod/avod/core/models/rpn_model.py", line 643, in create_feed_dict
shuffle=False)
File "/home/prp/chrisli/myavod/avod/avod/datasets/kitti/kitti_dataset.py", line 424, in next_batch
samples_in_batch.extend(self.load_samples(np.arange(start, end)))
File "/home/prp/chrisli/myavod/avod/avod/datasets/kitti/kitti_dataset.py", line 277, in load_samples
rgb_image = cv_bgr_image[..., :: -1]
TypeError: 'NoneType' object is not subscriptable

The system is Ubuntu 16.04 with python3.5 and tensorflow1.3.0.

Thanks for your help.

Running the code remotely

If I run this code remotely using SSH, do I need to change something in the code or configuration?

Inference Time

Thanks for sharing your code!

I have followed the procedure in the README and trained the model for 53000 steps, so I ran an experiment using the evaluator; below is the output.

Step 53000: 450 / 3769, Inference on sample 001021
Step 53000: Eval RPN Loss: objectness 0.149, regression 0.095, total 0.244
Step 53000: Eval AVOD Loss: classification 0.038, regression 1.809, total 2.091
Step 53000: Eval AVOD Loss: localization 1.310, orientation 0.499
Step 53000: RPN Objectness Accuracy: 0.95703125
Step 53000: AVOD Classification Accuracy: 0.9880478087649402
Step 53000: Total time 0.577916145324707 s
Step 53000: 451 / 3769, Inference on sample 001022
Step 53000: Eval RPN Loss: objectness 0.026, regression 0.094, total 0.119
Step 53000: Eval AVOD Loss: classification 0.019, regression 0.942, total 1.080
Step 53000: Eval AVOD Loss: localization 0.897, orientation 0.045
Step 53000: RPN Objectness Accuracy: 0.9921875
Step 53000: AVOD Classification Accuracy: 0.9970443349753695
Step 53000: Total time 0.24765753746032715 s
Step 53000: 452 / 3769, Inference on sample 001025
Step 53000: Eval RPN Loss: objectness 0.172, regression 0.175, total 0.347
Step 53000: Eval AVOD Loss: classification 0.044, regression 3.892, total 4.282
Step 53000: Eval AVOD Loss: localization 3.537, orientation 0.354
Step 53000: RPN Objectness Accuracy: 0.970703125
Step 53000: AVOD Classification Accuracy: 0.9950884086444007
Step 53000: Total time 0.2989237308502197 s
Step 53000: 453 / 3769, Inference on sample 001026

I think the inference time (around 0.3 s) is slow compared with the 100 ms claimed in the paper; any suggestions? I'm using a 1080 Ti GPU.

Which version for protoc?

I encounter the error "avod/protos/kitti_utils.proto:24:5: Expected "required", "optional", or "repeated"." when I execute sh avod/protos/run_protoc.sh. I'm not familiar with protoc. Is there something wrong with my protoc version? (It is 2.5.0.)

SyntaxError when executing gen_mini_batches.py

Hi,

Thank you for making your code available.

I followed the instructions on the front page but encountered a syntax error while invoking the gen_mini_batches.py script, as shown below.

Would you mind letting me know what I could be doing wrong?

tuan@mypc:~/avod$ python scripts/preprocessing/gen_mini_batches.py
Traceback (most recent call last):
  File "scripts/preprocessing/gen_mini_batches.py", line 6, in <module>
    from avod.builders.dataset_builder import DatasetBuilder
  File "avod/avod/builders/dataset_builder.py", line 169
    new_cfg=None) -> KittiDataset:
                  ^
SyntaxError: invalid syntax

Code retrieved in 5/11/2018.

Python version

Python 2.7.12 (default, Dec  4 2017, 14:50:18) 

Kitti data structure, set up as instructed:

Download the data and place it in your home folder at ~/Kitti/object

tuan@mypc:~/Kitti/object$ tree -L 2
.
├── training
│   ├── calib -> /opt/dataset/KITTI_3D/calib/training/calib
│   ├── image_2 -> /opt/dataset/KITTI_3D/image_2/training/image_2
│   ├── planes
│   └── velodyne -> /opt/dataset/KITTI_3D/velodyne/training/velodyne
├── train.txt
├── trainval.txt
└── val.txt

Ubuntu 16.04 LTS
GPU: nVidia 1080 Ti

Regards,
Tuan

Question, how to run without camera

Hi!

Thank you for sharing your excellent work. In the paper there is a row in Table III where you use BEV-only features (RPN BEV Only). I am interested in using this network; is there a configuration file available for it?

Also, would it be possible for you to share the trained models?

calculate AP

The KITTI test set does not provide labels, so how did you get the AP in your ablation experiments?

How are the planes generated?

Hi

The project provides a planes directory as a model input. I'm a little curious how it was generated; it doesn't seem to be described in the paper. Could you please suggest some related resources? Thanks!

question regarding 2D IoU in BEV

Hello,

Thank you for releasing the code to your paper!

"Background anchors are determined by calculating the 2D IoU in BEV between the anchors
and the ground truth bounding boxes. For the car class, anchors with IoU less than 0.3 are considered background anchors, while ones with IoU greater than 0.5"

I am trying to figure out how you overcame the problem of IoU calculation for non-axis-aligned rectangles to determine negative and positive anchor predictions. The calculation uses two box_list objects.
Could you please point me towards the box_list generation for the ground-truth labels, or help me understand the process with a few words about the content of these box_lists?
Is it an IoU calculation between axis-aligned bounding boxes around the ground-truth box and the anchor prediction boxes?

Regards,
Johannes

training loss

@kujason
Hi, thank you for sharing the code. I followed all the instructions and training for Car has begun. Without modifying any code and using the default config, the training loss seems off: it fluctuates, and some losses look like the following. Do you have any idea why this is happening?

Step 500, Total Loss 2.645, Time Elapsed 6.096 s
Step 550, Total Loss 3.583, Time Elapsed 5.431 s
Step 560, Total Loss 11.722, Time Elapsed 5.333 s
Step 570, Total Loss 3.723, Time Elapsed 5.495 s
Step 580, Total Loss 1.895, Time Elapsed 5.473 s
Step 590, Total Loss 15.548, Time Elapsed 5.860 s
Step 600, Total Loss 2.417, Time Elapsed 5.647 s

How to detect vehicle and people simultaneously

  1. As described in the paper, the network detects vehicles and people separately, so how can both be detected simultaneously?
  2. Does the config file avod/configs/pyramid_cars_with_aug_example.config correspond to the AVOD-FPN entry ranked first on KITTI?
    Thx!

How to generate anchor_info for model's input?

  1. The model takes anchor_info as input, which, from my understanding, relies on the ground-truth bounding boxes.
  2. I see that gen_mini_batches.py generates anchor_info for both the training and validation data using the ground-truth labels. However, how can we generate anchor_info for the testing data, which has no ground-truth labels?

It would be great if you could clarify that procedure.

thanks!

Can we have less than 12000 iterations?

I am training this model and I have a slow GPU. I wonder if I will get the same network performance if I cut the training to 9000 or 8000 iterations with a step of 50? Is there any harm in that?
Training + evaluation takes almost 54 hours for me.

Another question: why is there no max element-wise fusion option for the RPN model, and how do the "late" and "deep" fusion types perform in comparison to "early" fusion?
Thank you

Why VGG 16 and not ResNet ?

Hello, thank you for sharing the code of this great work.
Why did you choose VGG-16 as the encoder instead of ResNet?
Are you using batch normalization?
Another question: how can we calculate the distance to obstacles using the lidar point representation?
Thank you

Protoc compile

Does anyone know how to solve these errors when I compile the protos from the avod folder? Thanks.

avod/protos/kitti_utils.proto:24:5: Expected "required", "optional", or "repeated".
avod/protos/kitti_utils.proto:24:25: Missing field number.
avod/protos/kitti_dataset.proto: Import "avod/protos/kitti_utils.proto" was not found or had errors.
avod/protos/kitti_dataset.proto:39:14: "KittiUtilsConfig" is not defined.

Recursive git clone fails

This command fails with an "unknown publickey" error from the wavedata submodule:

git clone https://github.com/kujason/avod --recurse-submodules

But the wavedata submodule can be cloned correctly inside the just-cloned AVOD repo with:

cd avod
git clone https://github.com/kujason/wavedata

Maybe some setting is incorrect somewhere?

I can't find the results

After training and running the evaluation, I can't find the results file:

FileNotFoundError: [Errno 2] No such file or directory: 'results/avod_cars_example_results_0.1.txt'

So I am wondering what the problem is? Your help would be much appreciated because I am working on my master's thesis.

New dataset

I was wondering if anyone has tried AVOD with another dataset. Also, if anyone can give instructions for setting up AVOD with a new dataset, that would be great. Thank you.

avod-ssd

Where can I find the AVOD-SSD version?
Thanks

Project lidar to camera coordinate?

Hi, kitti_utils.get_point_cloud returns the lidar points projected into the camera coordinate frame. Why not just return the raw lidar data? Thanks~

How to run with our own data

Hi, thanks for sharing your code. I just finished training and am wondering how to run inference on our own data. I am looking into evaluator.py, but I would be grateful if you could give me some hints. Thanks~

Preprocessing time

Read disk (image, calib, ground, pc) time:
Min: 0.01494
Max: 0.03513
Mean: 0.02009
Median: 0.02044


Create bev time:
Min: 0.01963
Max: 0.09319
Mean: 0.0378
Median: 0.034

Load sample time:
Min: 0.05957
Max: 0.18219
Mean: 0.08599
Median: 0.07644


Fill anchor time:
Min: 0.0688
Max: 0.16987
Mean: 0.0908
Median: 0.07938


Feed dict time:
Min: 0.12845
Max: 0.29517
Mean: 0.17686
Median: 0.15515

Inference time:
Min: 0.08431
Max: 2.92182
Mean: 0.16493
Median: 0.09333

The preprocessing time profiled above is much larger than 0.02 s. I don't think my CPU is particularly weak; do you have any suggestions? For example, why is the fill-anchor time so expensive?

Questions about 2D detection performance

First of all, congratulations on your good paper and code release.

According to your paper, many problems that existed in prior 2D box detection have been solved. But I am curious: why is the 2D detection rate not ranked at the top on KITTI (compared to methods that do not use point clouds)?

I am curious whether there is any limit to this lidar-based technique for 2D detection.

Question on card with low VRAM

Hey, my graphics card only has 2 GB of VRAM. Is there any way I can change the batch size so that training works? I always get an error saying TensorFlow ran out of memory. I've changed some settings in the configs but can't seem to get it to work. Or could someone please upload a pretrained model? I just want to test a few things.

Thanks

Which configurations to use to get the results in the paper ?

I am trying to reproduce the results in the paper (the ones in Table 1), but I can't; I keep getting values up to 8% lower than those in the paper. Which configuration files (for preprocessing and training), and which thresholds in the evaluation script, should be used to reproduce the values in the paper? Or are there modifications to the existing configurations needed to do that?

Your help would be much appreciated, I'm working on my masters thesis :)

thanks

thank you for sharing your code

preprocessing time

The preprocessing time is like:
Feed dict time:
Min: 0.12029
Max: 0.24133
Mean: 0.14371
Median: 0.14227
which is much larger than the 0.02 s reported?

Question about AP in Validation set

Hi

Firstly, much thanks for your code release.

I successfully trained the avod-cars network to 120000 iterations; however, when I run the evaluation command:

python avod/experiments/run_evaluation.py --pipeline_config=avod/configs/pymarid_cars_with_aug_example.config --device='0' --data_split='val'

the result on 120000 iteration is as follows:

120000
done.
car_detection AP: 22.552376 24.332737 25.962851
car_detection_BEV AP: 22.371897 23.603966 25.545847
car_heading_BEV AP: 22.340508 23.495865 25.304886
car_detection_3D AP: 21.757717 19.721174 20.458136
car_heading_3D AP: 21.725494 19.660765 20.347580

which is much lower than the results in the paper (more than 70%, basically). I also ran the evaluation on the checkpoint at iteration 110000, which gives (26.233753 23.378477 27.482744) for car_detection_3D AP.

Results on testing split

Hi,
Thanks for your great code.
I ran your code with the pyramid_cars_with_aug_example configuration and finally got an AP of about (84.0, 74.0, 67.7) on the val split, which is comparable with the results in your paper.
But when I used the best model on the test split, I only got 56.4 AP on the moderate setting from the official testing service.
So do you know where the problem is?
Thanks very much.

Questions about the comparison with MV3D

Firstly, thanks for sharing your work and code; it definitely helps me a lot. After carefully comparing your work with MV3D, I have a few questions about the comparison results:

  1. How did you get the 0.7-3D-IoU AP on the validation set (i.e. 83.87% 72.35% 64.56% in Table I) for MV3D? I did not find these results in the original MV3D paper.

  2. MV3D also uses 2x or 4x deconv operations to upsample the last feature map to handle extra-small objects, though it does not reach full resolution as you do. So, aside from the different upsampling methods for the feature map, could you help me point out the major differences between the two works?

Questions about continue training from the last checkpoint

First of all, thank you for sharing your brilliant work.

When setting "overwrite_checkpoints: False" in the config, it means if you stop at any checkpoint and later you want to train with more iterations it will start from the last checkpoint you saved.

Here are my questions,

  1. How do you perform deciding the order of input data?
  2. When continue training from last checkpoint, will the input data order be initialized?

To determine which model is used during testing

During testing, how can we determine the best model to use?
For example, to reproduce the results on the KITTI leaderboard, how do you determine which model to use? Do you just use the last model after 120,000 iterations have finished?
