

oft's Issues

Use for Input Video

How can we use infer.py for an input video? We won't have the calib, objects, and grid information.

Running inference

Hi,

First of all great work,

Do you have any intention of creating a script that can run inference? Or, if you did, how would you go about doing it here? I've been working on it for a bit and I'm getting stuck.

Thanks!

Training not working anymore

Hi Tom,

first of all, thanks for updating the repo and providing the inference script.
However, there seems to be an issue now with the heatmap-based scores during training. I did a clean clone of the repo and launched training as explained in the readme. Looking at the results in Tensorboard after 600 epochs, the confidence maps don't show any local maxima (whereas with the previous version of the repo, the confidence maps correctly showed the network resolving depth uncertainty over the epochs and learning to localize objects). Hyperparameters were left at their defaults (only the batch size was set to 8).

The inference script worked for me with the old model checkpoints after adapting the NMS stage; only one method (bbox_corners) was missing from utils.py.

Do you have any idea how to get the training running again? I would appreciate any help on that - thank you!

Best regards,
Chris

Modifying code for multi class

Would changing the number of classes in the model from 1 automatically retrieve the labels for the other classes, or would the label-preparation code also need changing?

AttributeError: 'Tensor' object has no attribute 'bool'

Hi, sorry to bother you. When I try to run the command "python train.py train --gpu 0", an error occurs: "AttributeError: 'Tensor' object has no attribute 'bool'", located in encoder.py, line 149: "mask = grid.new_zeros((self.nclass, depth-1, width-1)).bool()". I don't know how to solve this. Hope you can help me with the error, thank you.
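This error usually means the installed PyTorch predates the torch.bool dtype, which was introduced in PyTorch 1.2. A minimal sketch of a workaround (the surrounding variables here are illustrative stand-ins, not the repo's actual values):

```python
import torch

grid = torch.zeros(4, 4)        # stand-in for the grid tensor in encoder.py
nclass, depth, width = 1, 5, 5  # illustrative values

# On PyTorch >= 1.2, the dtype can be requested directly instead of .bool():
mask = grid.new_zeros((nclass, depth - 1, width - 1), dtype=torch.bool)

# On versions before 1.2 (where torch.bool does not exist), uint8 masks
# were the convention:
mask_u8 = grid.new_zeros((nclass, depth - 1, width - 1), dtype=torch.uint8)
```

Upgrading to PyTorch 1.2 or later should also resolve it.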

infer.py does not work

Hi,

The training worked well.
But, when I run infer.py, I get the following errors:

RuntimeError: Error(s) in loading state_dict for OftNet:
Missing key(s) in state_dict: "mean", "std", "frontend.conv1.weight", "frontend.bn1.weight", ..., "head.weight", "head.bias" (every parameter of the model, with no prefix).
Unexpected key(s) in state_dict: "module.mean", "module.std", "module.frontend.conv1.weight", "module.frontend.bn1.weight", ..., "module.head.weight", "module.head.bias" (the same parameters, each prefixed with "module.").

Please help.

Best Regards
Sambit
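The pattern of the mismatch (every missing key reappears under a "module." prefix) indicates the checkpoint was saved from a model wrapped in torch.nn.DataParallel, while infer.py loads it into an unwrapped OftNet. A minimal sketch of a common workaround, with the checkpoint path and dict layout assumed for illustration:

```python
import torch

# Load the checkpoint (path and dict layout are assumptions for illustration)
checkpoint = torch.load('checkpoint.pth', map_location='cpu')
state_dict = checkpoint.get('model', checkpoint)

# torch.nn.DataParallel prefixes every parameter name with "module.";
# strip it so the keys match a bare (unwrapped) OftNet:
state_dict = {key[len('module.'):] if key.startswith('module.') else key: value
              for key, value in state_dict.items()}

model.load_state_dict(state_dict)  # `model` is the OftNet instance built by infer.py
```

Alternatively, wrapping the model in torch.nn.DataParallel before calling load_state_dict makes the prefixed keys match directly.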

grid_sample() behavior changed in pytorch

Running OFT gives me the following warning:
torch/nn/functional.py:3828: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.

I use PyTorch 1.4 (as this project does not include a conda environment or pip requirements file, I just tried the current standard setup).

My question: what is the target PyTorch version?
Or: in oft.py, OFT.forward(), does the F.grid_sample(...) call need align_corners set to True or False?
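For what it's worth, the warning itself gives the answer for models trained before PyTorch 1.3: align_corners=True reproduces the old default. A minimal sketch of passing the flag explicitly (the tensor shapes and names here are placeholders, not the repo's):

```python
import torch
import torch.nn.functional as F

features = torch.randn(1, 8, 16, 16)          # placeholder image feature map
sample_grid = torch.rand(1, 4, 4, 2) * 2 - 1  # normalized sample locations in [-1, 1]

# PyTorch < 1.3 grid_sample behaved as align_corners=True, so passing it
# explicitly preserves the behavior older checkpoints were trained with:
sampled = F.grid_sample(features, sample_grid, align_corners=True)
```

For a model trained from scratch on PyTorch >= 1.3, either value should work, as long as training and inference agree.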

RuntimeError: Loss diverged :()

Since there is only one camera, the calib.txt is:
P0: 6.37994953564521e+02 0.000000000000e+00 3.01236221058740e+02 0.000000000000e+02 0.000000000000e+00 6.00262324131012e+02 2.27931532293491e+02 0.000000000000e+02 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+02
Tr: -5.1912446e-03 8.5375016e-03 -4.025496e-04 -1.6381511968e+02 -3.0239484e-03 1.3941179e-03 -9.4293251e-03 -9.1064823408e+02 -7.9941676e-03 -5.0167222e-03 -3.3054171e-03 9.66718621095e+03
If the last column of P0 is all zeros, the loss becomes NaN.
How can this problem be solved?

Loss doesn't decrease

I am trying to train with the KITTI dataset by following the README. Somehow, the total loss doesn't seem to decrease at all, even after training for 23 epochs (~6 hours). After how many epochs should we expect the total loss to start decreasing?

==> Training epoch complete
score : 2.3804e+01
position: 2.0489e+06
dimension: 4.0322e+05
angle : 1.1116e+04
total : 2.4632e+06

=== Beginning epoch 24 of 600 ===

I have PyTorch 1.3.0, if that matters.

position offsets

In the paper, you said "We use the same scale factor σ as described in Section 3.4 to normalize the position offsets within a sensible range". But in your code you used pos_std=[.5, .36, .5] for normalization. Where did this pos_std come from? Thank you.
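For context, normalization by a per-axis standard deviation is just an element-wise division; a hedged sketch follows (the tensor names are assumptions, and whether the quoted values are empirical KITTI offset statistics is not stated in the repo):

```python
import torch

pos_std = torch.tensor([0.5, 0.36, 0.5])  # values quoted from the code
pos_offsets = torch.randn(10, 3)          # placeholder ground-truth offsets (x, y, z)

# Element-wise division rescales each offset component to a comparable range
pos_offsets_normed = pos_offsets / pos_std
```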

No module named 'oft.model.edoft'

When I run train.py, I get the error in the title. Is there a problem with my environment? I modified oft/model/oftnet.py and deleted the line "from .edoft import ExplicitDepthOFT". Is there a problem with this modification?

Binary mask in Encoder is incorrect

In the Encoder class, the encode function returns a binary mask which is then used in some of the loss computations. The current mask is all ones, and I think there is a bug in this line: https://github.com/tom-roddick/oft/blob/master/oft/data/encoder.py#L51

mask = indices >= 0

If I replace this line with mask = indices > 0, the mask contains 1 where an object is present and 0 otherwise. The paper mentions that the position loss, dimension loss, etc. are computed only on grid locations which intersect ground-truth objects, so it is essential that the mask is correct.
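A hedged illustration of the reported behavior (the sentinel convention for non-object cells is assumed here, not verified against encoder.py):

```python
import torch

# Suppose `indices` assigns each grid cell the index of its intersecting
# ground-truth object, with 0 marking cells that contain no object (assumed):
indices = torch.tensor([[0, 0, 1],
                        [0, 2, 0]])

mask_ge = indices >= 0  # all True: loss would be computed on every cell
mask_gt = indices > 0   # True only where an object is present
```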

learning rate

In the paper you said "The model is trained using SGD for 600 epochs with a batch size of 8, momentum of 0.9 and learning rate of 10⁻⁷". Why does the learning rate need to be set so small?
