
tl_ssd's People

Contributors

julimueller


tl_ssd's Issues

Do we need the labelmap.prototxt file when we use the class "112340" as the label, as you mentioned for the training phase?

I want to prepare the DTLD dataset for training the model. I think I can use the class label (e.g. 112340) in the .txt files directly, as you recommended, and then create lmdb data from the lists of .jpg and .txt files, so I wouldn't need the labelmap file.
By the way, do you use the same data augmentation as the original SSD? It would help a lot if you could share the data layer of your train.prototxt.
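On the 112340-style labels: the MultiBoxLoss configuration from the README (quoted in a later issue on this page) sets state_digit: 4, which suggests the state is read from one digit of the DTLD class id. A small, hedged sketch of that interpretation (the digit semantics are my assumption, not confirmed by the repo):

```python
# Hypothetical decoding of a DTLD class id such as "112340".
# Assumption: state_digit: 4 in the MultiBoxLoss config means the digit at
# index 4 (zero-based) of the class id encodes the traffic light state.
def state_of(class_id, state_digit=4):
    """Extract the assumed state digit from a DTLD-style class id."""
    return int(str(class_id)[state_digit])

print(state_of("112340"))  # 4
```

If that holds, the per-box class/state could indeed come straight from the id in the .txt files, without a labelmap.prototxt.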

"caffe.PriorBoxParameter" has no field named "offset_w".`

I used the DTLD dataset for testing and didn't change deploy.prototxt or the caffemodel.
But when ssd_dtld_test.py reaches caffe.Net(), it raises this exception:
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 3668:13: Message type "caffe.PriorBoxParameter" has no field named "offset_w".
Can anyone help me?
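This parse error is what stock SSD Caffe produces when the prototxt uses fields that exist only in the tl_ssd Caffe fork. A quick, hedged way to check which tree your pycaffe was built from is to scan its caffe.proto for the field (the path in the usage comment is an assumption based on the usual Caffe layout):

```python
import re

def proto_declares(proto_text, message, field):
    """Crude text scan: does `message <name> { ... }` mention `field`?"""
    m = re.search(r"message\s+%s\s*\{" % re.escape(message), proto_text)
    if not m:
        return False
    # walk forward to the matching closing brace of the message block
    depth, i = 1, m.end()
    while i < len(proto_text) and depth:
        depth += {"{": 1, "}": -1}.get(proto_text[i], 0)
        i += 1
    return field in proto_text[m.end():i]

# e.g. on a Caffe checkout (path assumed):
# text = open("src/caffe/proto/caffe.proto").read()
# proto_declares(text, "PriorBoxParameter", "offset_w")
```

If this returns False for the tree your pycaffe was built from, rebuilding against this repo's own Caffe should resolve the error.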

Model with offset adaptation

Hi,

Figure 6 in the paper shows that the offset adaptation (Section III-C in the paper) improves performance considerably.
image
I am wondering whether these adaptations are the configuration mentioned in the README (i.e., the offset_w values), which I have pasted below. If so, would it be possible to get the model structure and the weights? Thank you!

layer {
  name: "inception_b4_concat_norm_mbox_priorbox"
  type: "PriorBox"
  bottom: "inception_b4_concat_norm"
  bottom: "data"
  top: "inception_b4_concat_norm_mbox_priorbox"
  prior_box_param {
    min_size: 7
    min_size: 10
    min_size: 15
    min_size: 25
    min_size: 35
    min_size: 50
    min_size: 70
    aspect_ratio: 0.3
    flip: false
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    offset_w: 0.2
    offset_w: 0.4
    offset_w: 0.6
    offset_w: 0.8
    offset_h: 0.5
  }
}

Can't find the inception_c and concatenation part

Hi Julian, in your paper you concatenate the layers inception_a3, inception_b4 and inception_c2, but I can't find these operations in the deploy.prototxt. Are these structures only in train.prototxt? And will you publish your training code later? Thanks for your work.

Number of priors must match number of location predictions

Hello:
I have compiled the code according to the readme successfully.
But when I run ssd_dtld_test.py, I get the error:
F0215 12:27:08.644311 15733 detection_output_layer.cpp:164] Check failed: num_priors_ * num_loc_classes_ * 4 == bottom[0]->channels() (220472 vs. 110236) Number of priors must match number of location predictions.
Can anyone help me solve this problem?
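Not a fix, but a hint at where to look: the two numbers in the failed check have an exact ratio, which usually narrows down the cause.

```python
# detection_output_layer.cpp compares num_priors_ * num_loc_classes_ * 4
# with the channel count of the concatenated mbox_loc blob.
priors_times_4 = 220472   # num_priors_ * num_loc_classes_ * 4 (from the log)
loc_channels = 110236     # bottom[0]->channels() (from the log)
print(priors_times_4 / loc_channels)  # 2.0
```

A clean factor of 2 suggests the PriorBox layers are generating exactly twice as many boxes as the network predicts, e.g. because the Caffe build interprets the prior box fields differently from the deploy.prototxt it was written for. This is a guess, not a confirmed diagnosis.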

Inference on different image sizes

Hello,

I want to run inference on a different dataset, consisting of 256x512 images, with the pretrained model from the repo.
Before that I tried 1024x768 images and found the model very robust, but with 256x512 images the accuracy dropped to nearly zero.

So to run inference on a different image size, I am changing:

  1. ssd_dtld_test.py
    code

  2. deploy.prototxt
    code

So what should I do? Is there anything else that needs to be changed?

Have a great day.
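One hedged observation: the priors in deploy.prototxt are absolute pixel sizes (min_size 7 to 70), so on a 256x512 input the traffic lights may simply fall below the smallest prior. An alternative to reshaping the network is to upscale the image toward the width the priors were designed for and rescale the detections afterwards; the 2048-pixel design width below is an assumption taken from the DTLD training crop mentioned in another issue on this page:

```python
DESIGN_W = 2048.0  # assumed prior design width (crop_w in the training config)

def upscale_factor(img_w, design_w=DESIGN_W):
    """Factor by which to resize an image so its width matches the assumed design width."""
    return design_w / img_w

print(upscale_factor(512))  # 4.0 for a 512-pixel-wide image
```

Whether upscaling small images beats retraining with smaller priors is something only an experiment can settle.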

Detection or classification

Hello Julian Müller,

I was testing the model for my own understanding. I could use it to detect traffic lights, but I had no success with classification. Do I need to change deploy.prototxt to handle classification?

Thanks,
Vishwa

Why do we use raw_scale in preprocessing?

Hello everyone,

Thank you for this repo first of all.

array([[[11730., 11730., 11730., ...,  9435.,  8415.,  8415.],
        [11730., 11730., 11730., ...,  9435.,  8415.,  8415.],
        [11730., 11730., 11475., ...,  9180.,  9945.,  9945.],
        ...,
        [10200., 10200., 10965., ...,  6885.,  6885.,  6885.],
        [10200., 10200., 10200., ...,  6120.,  5865.,  5865.],
        [ 9945.,  9945.,  9690., ...,  5100.,  5100.,  5100.]],

       [[10455., 10455., 10965., ..., 16575., 13770., 13770.],
        [10455., 10455., 10965., ..., 16575., 13770., 13770.],
        [ 9690.,  9690.,  9435., ..., 14280., 15300., 15300.],
        ...,
        [10455., 10455., 10200., ..., 11220., 10455., 10455.],
        [10200., 10200.,  9945., ...,  9435.,  8670.,  8670.],
        [10455., 10455., 10200., ...,  7905.,  8415.,  8415.]],

       [[ 8160.,  8160.,  8160., ..., 13260., 13260., 13260.],
        [ 8160.,  8160.,  8160., ..., 13260., 13260., 13260.],
        [ 7650.,  7650.,  7650., ..., 10200., 13005., 13005.],
        ...,
        [ 8160.,  8160.,  8160., ...,  9435.,  9180.,  9180.],
        [ 8160.,  8160.,  8160., ...,  7905.,  7905.,  7905.],
        [ 8415.,  8415.,  8415., ...,  7140.,  7395.,  7395.]]],
      dtype=float32)

Forgive my curiosity, what am I missing here?

Have a great day.
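A possibly useful observation about the dump above: every value is an exact multiple of 255. caffe.io.load_image returns floats in [0, 1], and the Transformer's set_raw_scale('data', 255) multiplies them back to a 0-255 range; multiples of 255 this large look like raw_scale being applied on top of data that was already in a 0-255-like range, which may or may not be intended for DTLD's high-bit-depth images (my reading, not confirmed):

```python
# All sampled values from the array above divide evenly by 255.
vals = [11730.0, 9435.0, 8415.0, 10200.0, 6885.0, 5100.0]
print(all(v % 255 == 0 for v in vals))  # True
print(11730 / 255)                      # 46.0
```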

PyTorch replication

Hi,

I have replicated this work in PyTorch: py_tl_ssd. However, I cannot verify the correctness of my replication due to compilation issues with this project (CUDA driver problems). Would it be possible to get tl_ssd's inference results on the DTLD dataset, so that I can compare against them and verify my replication? Thanks for your help!

Build error: "'FocalLossParameter' does not name a type"

Hi julimueller,
I have a problem building the code. When I ran the build in step 2, it failed with the following:

root@104b0b60593a:/opt/caffe# make all
CXX src/caffe/solver.cpp
In file included from src/caffe/solver.cpp:9:0:
./include/caffe/util/bbox_util.hpp:311:69: error: 'FocalLossParameter' does not name a type const map<int, vector<NormalizedBBox> >& all_gt_bboxes, const FocalLossParameter focal_param,
                                                                     ^
./include/caffe/util/bbox_util.hpp:326:74: error: 'FocalLossParameter' does not name a type const int background_label_id, const ConfLossType loss_type, const FocalLossParameter focal_param,
                                                                     ^
In file included from src/caffe/solver.cpp:9:0:
./include/caffe/util/bbox_util.hpp:540:69: error: 'FocalLossParameter' does not name a type const map<int, vector<NormalizedBBox> >& all_gt_bboxes, const FocalLossParameter focal_param,
                                                                     ^
Makefile:575: recipe for target '.build_release/src/caffe/solver.o' failed
make: *** [.build_release/src/caffe/solver.o] Error 1

And I haven't found a definition of FocalLossParameter in the code; I think it's a new type introduced in tl_ssd. Looking forward to your reply!

A training issue: "Number of priors must match number of location predictions."

Hi, I want to train the model on the DTLD dataset, so I generated a train.prototxt by adapting deploy.prototxt in the following 3 steps:

  1. Change the input to:

    layer {
      name: "data"
      type: "AnnotatedData"
      top: "data"
      top: "label"
      include { phase: TRAIN }
      transform_param {
        crop_h: 512
        crop_w: 2048
        mirror: true
        mean_value: 60
        mean_value: 60
        mean_value: 60
      }
      data_param {
        source: "/home/sc03/datasets/DLTD/Berlin/VOC0712/lmdb/VOC0712_trainval_lmdb"
        batch_size: 1
        backend: LMDB
      }
    }

  2. Add the MultiBoxLoss layer from your readme:

    layer {
      name: "mbox_loss"
      type: "MultiBoxLoss"
      bottom: "mbox_loc"
      bottom: "mbox_conf"
      bottom: "mbox_priorbox"
      bottom: "label"
      bottom: "mbox_state"
      top: "mbox_loss"
      include { phase: TRAIN }
      propagate_down: true
      propagate_down: true
      propagate_down: false
      propagate_down: false
      propagate_down: true
      loss_param { normalization: VALID }
      multibox_loss_param {
        loc_loss_type: SMOOTH_L1
        conf_loss_type: SOFTMAX
        loc_weight: 1.0
        num_classes: 2
        share_location: true
        match_type: PER_PREDICTION
        overlap_threshold: 0.3
        use_prior_for_matching: true
        background_label_id: 0
        use_difficult_gt: true
        neg_pos_ratio: 3.0
        neg_overlap: 0.5
        code_type: CENTER_SIZE
        ignore_cross_boundary_bbox: false
        mining_type: MAX_NEGATIVE
        state_weight: 1.0
        do_state_prediction: true
        num_states: 6
        background_state_id: 0
        state_digit: 4
        state_loss_type: LOGISTIC
      }
    }

  3. Apply the prior box adaptations as in your readme:

    layer {
      name: "inception_b4_concat_norm_mbox_priorbox"
      type: "PriorBox"
      bottom: "inception_b4_concat_norm"
      bottom: "data"
      top: "inception_b4_concat_norm_mbox_priorbox"
      prior_box_param {
        min_size: 7
        min_size: 10
        min_size: 15
        min_size: 25
        min_size: 35
        min_size: 50
        min_size: 70
        aspect_ratio: 0.3
        flip: false
        clip: false
        variance: 0.1
        variance: 0.1
        variance: 0.2
        variance: 0.2
        offset_w: 0.2
        offset_w: 0.4
        offset_w: 0.6
        offset_w: 0.8
        offset_h: 0.5
      }
    }

But when I train the network, I get this error:

    multibox_loss_layer.cpp:242] Check failed: num_priors_ * loc_classes_ * 4 == bottom[0]->channels() (440944 vs. 110236) Number of priors must match number of location predictions.

Could you tell me what I should adapt to fix this error?
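As in the inference-time report of this check earlier on this page, the ratio of the two numbers is informative:

```python
# multibox_loss_layer.cpp compares num_priors_ * loc_classes_ * 4 with the
# channel count of the concatenated mbox_loc blob.
priors_times_4 = 440944   # num_priors_ * loc_classes_ * 4 (from the log)
loc_channels = 110236     # bottom[0]->channels() (from the log)
print(priors_times_4 / loc_channels)  # 4.0
```

The factor 4 matches the four offset_w values in the prior box adaptation, so one guess is that adding the offsets multiplied the prior count while the mbox_loc convolution layers kept their original num_output; if so, those layers would need proportionally more output channels. This is an assumption to verify against the README, not a confirmed fix.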

Question about the number of states

Hi,

I have a question about the number of states in your work. Based on my understanding of the documentation, I think there are 4 states in total:

Specify the number of states. Please note that an additional background state is predicted as well. In other words, if your dataset contains the states red, yellow, green, you have to set the num_states to 3 + 1 = 4.

However, based on the code, there are 6 states:

  1. In the prototxt of the model structure, it shows that num_states: 4 https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt#L3767
  2. The channel of the inception_b4_concat_norm_mbox_state is 42 (7 * 6), which also suggests that there are 6 states https://github.com/julimueller/tl_ssd/blob/master/prototxt/deploy.prototxt#L3615

If there are 6 states, what are the other two besides the red, yellow, green, and background? Thank you very much!

error: (-215:Assertion failed) (numPriors * _numLocClasses * 4) == total(inputs[0], 1)

I ran your code on my own image. I didn't change anything in your deploy.prototxt and the input shape is correct, but I encountered an error when I ran net.forward():

[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3066)
cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively OPENCV/DNN: [DetectionOutput]:(detection_out): getMemoryShapes() throws exception. inputs=4 outputs=0/0
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[0] = [ 1 110236 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[1] = [ 1 55118 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[2] = [ 1 2 220472 ]
[ERROR:0] global C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp (3069) cv::dnn::dnn4_v20191202::Net::Impl::getLayerShapesRecursively input[3] = [ 1 165354 ]
error: (-215:Assertion failed) (numPriors * _numLocClasses * 4) == total(inputs[0], 1) in function 'cv::dnn::DetectionOutputLayerImpl::getMemoryShapes'

Can you help me? I just want to reuse your code to test images, so I didn't change anything. Where am I wrong, or what should I modify in the code? Sorry for my bad English, and thank you so much.

Layer mbox_loss error when using only the prior box adaptations

When I use the original mbox_loss layer as you recommend, it prints:

I0613 03:33:26.476109 16066 net.cpp:434] mbox_state <- inception_b4_concat_norm_mbox_state_flat
I0613 03:33:26.476140 16066 net.cpp:408] mbox_state -> mbox_state
I0613 03:33:26.476225 16066 net.cpp:150] Setting up mbox_state
I0613 03:33:26.476246 16066 net.cpp:157] Top shape: 8 165354 (1322832)
I0613 03:33:26.476255 16066 net.cpp:165] Memory required for data: 19659154720
F0613 03:33:26.476299 16066 net.cpp:88] Check failed: layer_param.propagate_down_size() == layer_param.bottom_size() (5 vs. 4) propagate_down param must be specified either 0 or bottom_size times

But when I remove all the propagate_down params, it prints the following:

I0613 04:03:53.034931 16106 net.cpp:100] Creating Layer mbox_loss
I0613 04:03:53.034948 16106 net.cpp:434] mbox_loss <- mbox_loc
I0613 04:03:53.034972 16106 net.cpp:434] mbox_loss <- mbox_conf
I0613 04:03:53.035014 16106 net.cpp:434] mbox_loss <- mbox_priorbox
I0613 04:03:53.035034 16106 net.cpp:434] mbox_loss <- label
I0613 04:03:53.035068 16106 net.cpp:408] mbox_loss -> mbox_loss
F0613 04:03:53.035151 16106 layer.hpp:374] Check failed: ExactNumBottomBlobs() == bottom.size() (5 vs. 4) MultiBoxLoss Layer takes 5 bottom blob(s) as input.

This occurs when I use only the prior box adaptations and map all box labels to a single class. It seems the mbox_loss layer must have 5 bottom blobs.
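For reference, the two failures are consistent with each other: this fork's MultiBoxLoss expects five bottoms (including mbox_state) and one propagate_down entry per bottom. A minimal skeleton of the relevant parts, assembled from the MultiBoxLoss configuration in the README (the `...` stands for the remaining loss parameters):

```
layer {
  name: "mbox_loss"
  type: "MultiBoxLoss"
  bottom: "mbox_loc"
  bottom: "mbox_conf"
  bottom: "mbox_priorbox"
  bottom: "label"
  bottom: "mbox_state"    # fifth bottom required by this fork's layer
  top: "mbox_loss"
  propagate_down: true    # mbox_loc
  propagate_down: true    # mbox_conf
  propagate_down: false   # mbox_priorbox
  propagate_down: false   # label
  propagate_down: true    # mbox_state
  ...
}
```

If you genuinely want to train without the state branch, the stock SSD MultiBoxLoss (four bottoms) would be the layer to use instead; whether that is compatible with this fork's other changes is something I cannot confirm.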

About max stride

In the paper, I don't understand this sentence: "In consequence, a maximum stride of 0.34·5 pixels = 1.7 pixels is needed to guarantee a detection of objects with a width of 5 pixels. As seen in Table I, only layer conv 1 - conv 3 can satisfy this condition." Can you explain it? I really want to know the answer. Thanks a lot.
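One plausible reconstruction of the arithmetic (my own reading, not taken verbatim from the paper): take a prior box and a ground-truth box of equal width $w$ and equal height $h$, displaced horizontally by $s$. Then

```latex
\mathrm{IoU}(s) \;=\; \frac{(w-s)\,h}{(w+s)\,h} \;=\; \frac{w-s}{w+s} \;\ge\; 0.5
\quad\Longrightarrow\quad s \;\le\; \frac{w}{3} \;\approx\; 0.34\,w .
```

So to guarantee an IoU of at least 0.5 for a $w = 5$ px object, some prior must lie within $0.34 \cdot 5 = 1.7$ px of it, which bounds the stride of the prior grid; per Table I, only conv 1 - conv 3 are that dense.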

About IoU

Hi Julian,
In Equation 3 of the paper,
image
the calculation of the IoU should be the ratio of intersection to union, but the paper instead uses this:
image
I don't know the reason. Is it because the value is too small, so it can be dropped directly? Could you explain? Thank you very much.

Best,
Dreamay

forward pass got killed

hello Julian Müller,

I'm trying to run the demo on the CPU. Unfortunately, it is interrupted during the forward pass. Do I have to use a GPU for it? Thanks in advance.

Best regards.


A question about batchnorm layer

Hi,

I would like to ask about the motivation for using use_global_stats: false for the BatchNorm layers in deploy.prototxt.
According to the documentation, it should be set to true in the testing phase:

  // If false, normalization is performed over the current mini-batch
  // and global statistics are accumulated (but not yet used) by a moving
  // average.
  // If true, those accumulated mean and variance values are used for the
  // normalization.
  // By default, it is set to false when the network is in the training
  // phase and true when the network is in the testing phase.
  optional bool use_global_stats = 1;

Thank you!
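If the intent is the conventional test-time behavior, each BatchNorm layer in deploy.prototxt could be switched explicitly. This is a sketch of the change only, not a statement that the released weights were exported with usable global statistics:

```
batch_norm_param {
  use_global_stats: true   # use the accumulated mean/variance at test time
}
```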

Index out of bound error

python ssd_dtld_test.py --predictionmap_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/test_on_dtld/prediction_map_ssd_states.json --confidence 0.2 --deploy_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/prototxt/deploy.prototxt --caffemodel_file /home/vishwa/Downloads/project/ssd_try5/tl_ssd/caffemodel/SSD_DTLD_iter_90000.caffemodel --test_file /home/vishwa/Downloads/DTLD/label/Bochum_all.yml

('CHECK: ', 5)
Traceback (most recent call last):
File "ssd_dtld_test.py", line 201, in
main(parse_args())
File "ssd_dtld_test.py", line 156, in main
result = detection.detect(img_color, args.confidence)
File "ssd_dtld_test.py", line 85, in detect
det_xmin = detections[0,0,:,3 + num_states + 1]
IndexError: index 9 is out of bounds for axis 3 with size 7

@julimueller Since the number of states is 5, I am getting an index-out-of-bounds error.
Do you have any idea how to fix it?
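For context, here is where index 9 comes from. Stock SSD's DetectionOutput emits 7 values per detection row: [image_id, label, score, xmin, ymin, xmax, ymax]. The script indexes column 3 + num_states + 1, which only works if the DetectionOutput additionally emits one confidence per state, as this fork's layer presumably does:

```python
num_states = 5              # from prediction_map_ssd_states.json, per the report
col = 3 + num_states + 1    # column ssd_dtld_test.py reads for det_xmin
print(col)                  # 9, out of range for a 7-column stock SSD output
```

So an axis size of 7 means the loaded network is producing stock SSD output without state scores, e.g. because the deploy.prototxt/caffemodel pair or the Caffe build does not include the state prediction. That is a hypothesis to check, not a confirmed diagnosis.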
