
detect-track's Introduction

===============================================================================

Detect to Track and Track to Detect

This repository contains the code for our ICCV 2017 paper:

Christoph Feichtenhofer, Axel Pinz, Andrew Zisserman
"Detect to Track and Track to Detect"
in Proc. ICCV 2017
  • This repository also contains results for ResNeXt-101 and Inception-v4 backbone networks, which perform slightly better (81.6% and 82.1% mAP on ImageNet VID val) than the ResNet-101 backbone (80.0% mAP) used in the conference version of the paper.

  • This code builds on the original Matlab version of R-FCN

  • We are preparing a Python version of D&T that will support end-to-end training and inference of the RPN, Detector & Tracker.

If you find the code useful for your research, please cite our paper:

    @inproceedings{feichtenhofer2017detect,
      title={Detect to Track and Track to Detect},
      author={Feichtenhofer, Christoph and Pinz, Axel and Zisserman, Andrew},
      booktitle={International Conference on Computer Vision (ICCV)},
      year={2017}
    }

Requirements

The code was tested on Ubuntu 14.04, 16.04 and Windows 10 using NVIDIA Titan X or Z GPUs.

If you have questions regarding the implementation please contact:

Christoph Feichtenhofer <feichtenhofer AT tugraz.at>

================================================================================

Setup

  1. Download the code: git clone --recursive https://github.com/feichtenhofer/detect-track
  • This will also download a modified version of the Caffe deep learning framework. In case of any issues, please follow the installation instructions in the corresponding README as well as on the Caffe website.
  2. Compile the code by running rfcn_build.m.

  3. Edit the file get_root_path.m to adjust the models and data paths.

    • Download the ImageNet VID dataset from http://image-net.org/download-images
    • Download the pretrained model files and the RPN proposals linked below, and unpack them into your models/data directories.
    • If the models are not present, the function check_dl_model will attempt to download them to the respective directories.
    • If the RPN proposal files are not present, the function download_proposals will attempt to download & extract them to the respective directories.
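
For orientation, the steps above can be driven from a MATLAB prompt roughly as follows. This is a hedged sketch, not a verified command sequence; the clone location is arbitrary, and rfcn_build.m and get_root_path.m are the scripts named in steps 2 and 3.

    % Hedged sketch of the setup steps above.
    % In a shell first: git clone --recursive https://github.com/feichtenhofer/detect-track
    cd('detect-track');
    rfcn_build();              % step 2: compile the MEX / Caffe bindings
    edit('get_root_path.m');   % step 3: point the model and data paths at your setup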

Training

  • You can train your own models on ImageNet VID as follows:
    • script_Detect_ILSVRC_vid_ResNet_OHEM_rpn(); to train the image-based Detection network.
    • script_DetectTrack_ILSVRC_vid_ResNet_OHEM_rpn(); to train the video-based Detection & Tracking network.

Testing

  • The scripts above have subroutines that test the learned models after training. You can also test our trained, final models, available for download below. We provide three testing functions that work with different numbers of frames at a time (i.e. processed by one GPU during the forward pass):
    1. rfcn_test(); to test the image-based Detection network.
    2. rfcn_test_vid(); to test the video-based Detection & Tracking network with 2 frames at a time.
    3. rfcn_test_vid_multiframe(); to test the video-based Detection & Tracking network with 3 frames at a time.
  • Moreover, we provide multiple testing network definitions that can be used for interesting experiments, for example:
    • test_track.prototxt is the simplest form of D&T testing.
    • test_track_reg.prototxt is a D&T version that additionally regresses the tracking boxes before performing the ROI tracking. Therefore, this procedure produces tracks that tightly encompass the underlying objects, whereas the above function tracks the proposal region (and therefore also the background area).
    • test_track_regcls.prototxt is a D&T version that additionally classifies the tracked region and computes the detection confidence as the mean of the detection score from the current frame and the detection score of the tracked region in the next frame. Therefore, this method produces better results, especially as the temporal distance between the frames becomes larger and more complementary information can be integrated from the tracked region.
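
As a rough illustration of how these functions are invoked (several issues below ask about this): the signature rfcn_test(conf, imdb, roidb, varargin) and the helper imdb_from_ilsvrc15vid are mentioned elsewhere on this page, but the config helper, the varargin keys, and the return value of get_root_path in this sketch are assumptions modeled on the R-FCN Matlab conventions this code builds on.

    % Hedged sketch only; names and keys marked below are assumptions,
    % not verified against the repository.
    root_path = get_root_path();                      % return value assumed
    conf  = rfcn_config_ohem();                       % hypothetical config helper
    imdb  = imdb_from_ilsvrc15vid(root_path, 'val');  % helper named in the issues; arguments assumed
    roidb = imdb.roidb_func(imdb);                    % assumed roidb accessor
    rfcn_test(conf, imdb, roidb, ...
        'net_def_file', 'models/rfcn_prototxts/.../test_track.prototxt', ...  % path abbreviated
        'net_file',     'path/to/final.caffemodel');  % a trained model from the downloads below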

Results on ImageNet VID

  • The networks are trained as described in the paper, i.e. on an intersection of the ImageNet object detection from video (VID) dataset, which contains 30 classes in 3862 training videos, and the ImageNet object detection (DET) dataset (only using the data from the 30 VID classes). Validation results on the 555 videos of the ImageNet VID validation set are shown below.
Method            test structure               ResNet-50   ResNet-101   ResNeXt-101   Inception-v4
Detect            test.prototxt                72.1        74.1         75.9          77.9
Detect & Track    test_track.prototxt          76.5        79.8         81.4          82.0
Detect & Track    test_track_regcls.prototxt   76.7        80.0         81.6          82.1
  • We show different testing network definitions in the rows and backbone networks in columns. The reported performance is mAP (in %), averaged over all videos and classes in the ImageNet VID validation subset.

Trained models

Data

Our models were trained using region proposals extracted using a Region Proposal Network that is trained on the same data as D&T. We use the RPN from craftGBD and provide the extracted proposals for training and testing on ImageNet VID and the DET subsets below.

Pre-computed object proposals for


detect-track's Issues

It'd be nice to have a demo video

Hi!
I think the title is self-explanatory. If you could add a demo that runs directly after compiling, it would be very nice to have.
Nice work so far!

Regarding the testing.

Hello,
Has anyone succeeded in testing the model using the pre-trained model on the CPU? If yes, can you please share the process?
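
For reference, the MatCaffe interface that this code uses does expose a CPU mode, though whether D&T's custom layers have CPU implementations is not confirmed here; see the correlation_layer.cpp crash reported further below.

    % Standard MatCaffe call for CPU mode. Note the "Not Implemented
    % Yet" crash in correlation_layer.cpp reported below, which
    % suggests the custom layers may lack a CPU path.
    caffe.set_mode_cpu();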

Where is meta_vid.mat?

In the imdb function, there is `meta_det = load(fullfile(devkit_path, 'data', 'meta_vid.mat'));`. Where can I find `meta_vid.mat`? Thanks.

Snow

Hi, I found there is no `mean_image` file in the directory models/pre_trained_models/ResNet-101L. Where can I find it? Thanks

File structure and missing directory

Hi,
when we used your code to train, we encountered two problems as follows:

  1. Is it possible for you to share the file structure of the root folder (/data/ILSVRC/), including Annotations, Data, devkit, ImageSets and imdb? For example, since we trained on both VID and DET, in root_path/Data should we put VID_val and VID_train into the VID folder, or just leave DET, VID, VID_train and VID_val in parallel?
  2. We have succeeded in constructing "imdb_ilsvrc15_train_unflip.mat", but every time it is about to construct the corresponding roidb, we get the warning "GT(xml) file empty/broken: ILSVRC2013_train_extra0/ILSVRC2013_train_00000001". But there are no corresponding annotations for the 2013 extra data. We cannot find this directory in any ILSVRC dataset, so could you please share where we can get it?
     Thanks so much in advance!!

Regarding the implementation of correlation_layer.cpp

F0719 13:18:41.276878 6182 correlation_layer.cpp:89] Not Implemented Yet
*** Check failure stack trace: ***


     Illegal instruction detected at Thu Jul 19 13:18:43 2018 +0530

Configuration:
Crash Decoding : Disabled - No sandbox or build area path
Crash Mode : continue (default)
Default Encoding : UTF-8
Deployed : false
Desktop Environment : Unity
GNU C Library : 2.23 stable
Graphics Driver : Unknown hardware
Java Version : Java 1.8.0_144-b01 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
MATLAB Architecture : glnxa64
MATLAB Entitlement ID : 5221413
MATLAB Root : /usr/local/MATLAB/R2018a
MATLAB Version : 9.4.0.813654 (R2018a)
OpenGL : hardware
Operating System : Ubuntu 16.04.4 LTS
Process ID : 6122
Processor ID : x86 Family 6 Model 142 Stepping 9, GenuineIntel
Session Key : 6d05ce2d-fa08-443f-9dd7-94ebfc528a13
Static TLS mitigation : Enabled: Full
Window System : The X.Org Foundation (11906000), display :0

Fault Count: 1

Abnormal termination

Register State (from fault):
RAX = 0000000000000001 RBX = 0000000022e43660
RCX = 00007f7edd85d788 RDX = 00007f7e891fe6a0
RSP = 00007f7edd85d770 RBP = 0000000014161c00
RSI = 0000000000000000 RDI = 0000000000000000

R8 = 0000000000000081 R9 = 0000000000000000
R10 = 00007f7f0035d650 R11 = 00007f7e88fdc15b
R12 = 000000000000017b R13 = 0000000000002388
R14 = 00007f7edd85d800 R15 = 0000000009198b88

RIP = 00007f7effb6dedc EFL = 0000000000010246

CS = 0033 FS = 0000 GS = 0000

Stack Trace (from fault):
[ 0] 0x00007f7effb6dedc /lib/x86_64-linux-gnu/libpthread.so.0+00052956 pthread_rwlock_unlock+00000044
[ 1] 0x00007f7e88fe3789 /usr/lib/x86_64-linux-gnu/libglog.so.0+00075657 ZN24glog_internal_namespace_5Mutex12ReaderUnlockEv+00000025
[ 2] 0x00007f7e88fdc360 /usr/lib/x86_64-linux-gnu/libglog.so.0+00045920 ZN6google10LogMessage5FlushEv+00000704
[ 3] 0x00007f7e88fdee1e /usr/lib/x86_64-linux-gnu/libglog.so.0+00056862 ZN6google15LogMessageFatalD2Ev+00000014
[ 4] 0x00007f7e893bd400 /home/narendrachintala/git/caffe-rfcn/matlab/+caffe/private/caffe
.mexa64+01823744
[ 5] 0x00007f7e893da2b2 /home/narendrachintala/git/caffe-rfcn/matlab/+caffe/private/caffe
.mexa64+01942194
[ 6] 0x00007f7e893da4f6 /home/narendrachintala/git/caffe-rfcn/matlab/+caffe/private/caffe
.mexa64+01942774
[ 7] 0x00007f7e89241c5f /home/narendrachintala/git/caffe-rfcn/matlab/+caffe/private/caffe_.mexa64+00269407
[ 8] 0x00007f7e8924293f /home/narendrachintala/git/caffe-rfcn/matlab/+caffe/private/caffe_.mexa64+00272703 mexFunction+00000163
[ 9] 0x00007f7eeb090080 bin/glnxa64/libmex.so+00413824
[ 10] 0x00007f7eeb090447 bin/glnxa64/libmex.so+00414791
[ 11] 0x00007f7eeb090f2b bin/glnxa64/libmex.so+00417579
[ 12] 0x00007f7eeb07b30c bin/glnxa64/libmex.so+00328460
[ 13] 0x00007f7eece842ad bin/glnxa64/libmwm_dispatcher.so+00979629 ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2+00000829
[ 14] 0x00007f7eece84bae bin/glnxa64/libmwm_dispatcher.so+00981934 ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2+00000030
[ 15] 0x00007f7ee922cda1 bin/glnxa64/libmwm_lxe.so+12619169
[ 16] 0x00007f7ee922d982 bin/glnxa64/libmwm_lxe.so+12622210
[ 17] 0x00007f7ee9315fc9 bin/glnxa64/libmwm_lxe.so+13574089
[ 18] 0x00007f7ee92b7431 bin/glnxa64/libmwm_lxe.so+13186097
[ 19] 0x00007f7ee8abd5a8 bin/glnxa64/libmwm_lxe.so+04822440
[ 20] 0x00007f7ee8abfcbc bin/glnxa64/libmwm_lxe.so+04832444
[ 21] 0x00007f7ee8abc01d bin/glnxa64/libmwm_lxe.so+04816925
[ 22] 0x00007f7ee8ab5ba1 bin/glnxa64/libmwm_lxe.so+04791201
[ 23] 0x00007f7ee8ab5dd9 bin/glnxa64/libmwm_lxe.so+04791769
[ 24] 0x00007f7ee8abb846 bin/glnxa64/libmwm_lxe.so+04814918
[ 25] 0x00007f7ee8abb92f bin/glnxa64/libmwm_lxe.so+04815151
[ 26] 0x00007f7ee8bea503 bin/glnxa64/libmwm_lxe.so+06055171
[ 27] 0x00007f7ee8bedcf3 bin/glnxa64/libmwm_lxe.so+06069491
[ 28] 0x00007f7ee90fdf6d bin/glnxa64/libmwm_lxe.so+11378541
[ 29] 0x00007f7ee9219fa1 bin/glnxa64/libmwm_lxe.so+12541857
[ 30] 0x00007f7eece842ad bin/glnxa64/libmwm_dispatcher.so+00979629 ZN8Mfh_file16dispatch_fh_implEMS_FviPP11mxArray_tagiS2_EiS2_iS2+00000829
[ 31] 0x00007f7eece84bae bin/glnxa64/libmwm_dispatcher.so+00981934 ZN8Mfh_file11dispatch_fhEiPP11mxArray_tagiS2+00000030
[ 32] 0x00007f7ee922cda1 bin/glnxa64/libmwm_lxe.so+12619169
[ 33] 0x00007f7ee922d982 bin/glnxa64/libmwm_lxe.so+12622210
[ 34] 0x00007f7ee9315fc9 bin/glnxa64/libmwm_lxe.so+13574089
[ 35] 0x00007f7ee92b7431 bin/glnxa64/libmwm_lxe.so+13186097
[ 36] 0x00007f7ee8abd5a8 bin/glnxa64/libmwm_lxe.so+04822440
[ 37] 0x00007f7ee8abfcbc bin/glnxa64/libmwm_lxe.so+04832444
[ 38] 0x00007f7ee8abc01d bin/glnxa64/libmwm_lxe.so+04816925
[ 39] 0x00007f7ee8ab5ba1 bin/glnxa64/libmwm_lxe.so+04791201
[ 40] 0x00007f7ee8ab5dd9 bin/glnxa64/libmwm_lxe.so+04791769
[ 41] 0x00007f7ee8abb846 bin/glnxa64/libmwm_lxe.so+04814918
[ 42] 0x00007f7ee8abb92f bin/glnxa64/libmwm_lxe.so+04815151
[ 43] 0x00007f7ee8bea503 bin/glnxa64/libmwm_lxe.so+06055171
[ 44] 0x00007f7ee8bedcf3 bin/glnxa64/libmwm_lxe.so+06069491
[ 45] 0x00007f7ee90fdf6d bin/glnxa64/libmwm_lxe.so+11378541
[ 46] 0x00007f7ee90ab60c bin/glnxa64/libmwm_lxe.so+11040268
[ 47] 0x00007f7ee90b2448 bin/glnxa64/libmwm_lxe.so+11068488
[ 48] 0x00007f7ee90b3e22 bin/glnxa64/libmwm_lxe.so+11075106
[ 49] 0x00007f7ee9141807 bin/glnxa64/libmwm_lxe.so+11655175
[ 50] 0x00007f7ee9141aea bin/glnxa64/libmwm_lxe.so+11655914
[ 51] 0x00007f7eeb2f591a bin/glnxa64/libmwbridge.so+00207130 _Z8mnParserv+00000874
[ 52] 0x00007f7eed36ebb8 bin/glnxa64/libmwmcr.so+00641976
[ 53] 0x00007f7efd570e9f bin/glnxa64/libmwmlutil.so+06524575 _ZNSt13__future_base13_State_baseV29_M_do_setEPSt8functionIFSt10unique_ptrINS_12_Result_baseENS3_8_DeleterEEvEEPb+00000031
[ 54] 0x00007f7effb6fa99 /lib/x86_64-linux-gnu/libpthread.so.0+00060057
[ 55] 0x00007f7efd571126 bin/glnxa64/libmwmlutil.so+06525222 ZSt9call_onceIMNSt13__future_base13_State_baseV2EFvPSt8functionIFSt10unique_ptrINS0_12_Result_baseENS4_8_DeleterEEvEEPbEJPS1_S9_SA_EEvRSt9once_flagOT_DpOT0+00000102
[ 56] 0x00007f7eed36e9d3 bin/glnxa64/libmwmcr.so+00641491
[ 57] 0x00007f7f01cec1a2 bin/glnxa64/libmwmvm.so+03367330 ZN14cmddistributor15PackagedTaskIIP10invokeFuncIN7mwboost8functionIFvvEEEEENS2_10shared_ptrINS2_13unique_futureIDTclfp_EEEEEERKT+00000082
[ 58] 0x00007f7f01cec4e8 bin/glnxa64/libmwmvm.so+03368168 _ZNSt17_Function_handlerIFN7mwboost3anyEvEZN14cmddistributor15PackagedTaskIIP10createFuncINS0_8functionIFvvEEEEESt8functionIS2_ET_EUlvE_E9_M_invokeERKSt9_Any_data+00000024
[ 59] 0x00007f7eed978e6c bin/glnxa64/libmwiqm.so+00867948 _ZN7mwboost6detail8function21function_obj_invoker0ISt8functionIFNS_3anyEvEES4_E6invokeERNS1_15function_bufferE+00000028
[ 60] 0x00007f7eed97897f bin/glnxa64/libmwiqm.so+00866687 _ZN3iqm18PackagedTaskPlugin7executeEP15inWorkSpace_tagRN7mwboost10shared_ptrIN14cmddistributor17IIPCompletedEventEEE+00000447
[ 61] 0x00007f7eed956ab1 bin/glnxa64/libmwiqm.so+00727729
[ 62] 0x00007f7eed939ac8 bin/glnxa64/libmwiqm.so+00608968
[ 63] 0x00007f7eed9348bf bin/glnxa64/libmwiqm.so+00587967
[ 64] 0x00007f7f00e1ea05 bin/glnxa64/libmwservices.so+03262981
[ 65] 0x00007f7f00e1fff2 bin/glnxa64/libmwservices.so+03268594
[ 66] 0x00007f7f00e208fb bin/glnxa64/libmwservices.so+03270907 _Z25svWS_ProcessPendingEventsiib+00000187
[ 67] 0x00007f7eed36ffc3 bin/glnxa64/libmwmcr.so+00647107
[ 68] 0x00007f7eed3706a4 bin/glnxa64/libmwmcr.so+00648868
[ 69] 0x00007f7eed3693f1 bin/glnxa64/libmwmcr.so+00619505
[ 70] 0x00007f7effb686ba /lib/x86_64-linux-gnu/libpthread.so.0+00030394
[ 71] 0x00007f7effe8541d /lib/x86_64-linux-gnu/libc.so.6+01078301 clone+00000109
[ 72] 0x0000000000000000 +00000000

This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.
** This crash report has been saved to disk as /home/narendrachintala/matlab_crash_dump.6122-1 **

Caught MathWorks::System::FatalException

I am getting this exception that correlation_layer.cpp is not implemented yet. Is anyone aware of this issue?

correlation_layer

Can anybody explain the contents of correlation_layer.cu? I cannot work out how the correlation works.
Thanks for your kind help~
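
Not an explanation of the repository's CUDA kernel, but a hedged sketch of the operation itself: for two same-sized feature maps, a correlation layer computes, at every position, the dot product between the frame-t feature and the frame-t+tau features in a (2d+1) x (2d+1) neighborhood. The value of d and the wrap-around shift below are simplifications.

    % Minimal sketch of a correlation layer (not the repo's code).
    % fA, fB: H x W x C feature maps of frames t and t+tau.
    d = 8;                               % max displacement (assumed value)
    [H, W, C] = size(fA);
    corr = zeros(H, W, (2*d+1)^2);
    k = 0;
    for dy = -d:d
        for dx = -d:d
            k = k + 1;
            fBs = circshift(fB, [dy, dx, 0]);        % shifted frame-t+tau map
            corr(:,:,k) = sum(fA .* fBs, 3) / C;     % per-position dot product
        end
    end
    % NB: circshift wraps around at the borders; a real layer would
    % zero-pad instead.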

Results about tau

Hi~ I recently ran your code and have a question about the results. I get 79.8% when tau = 1, but when I change to tau = 10 the result is very bad, only 8.1%. This confuses me; thank you for your reply~~

Missing proposals and wrong .caffemodel?

Hi, I was trying to test this model on ImageNet VID (no modification to the code) and I used the trained models linked on the repo homepage (like this one or this one). The problem is that using those models I get completely random predictions: for each of the 30 classes I get the same ~0.03 score, so the model doesn't detect anything and seems to act as if it had been randomly initialized (I quadruple-checked that Caffe gets the right .caffemodel file as input).

For this reason I tried to train the model, but I constantly get missing-proposal-file errors. I checked, and it seems that, for example, for the DET train set there are only ~53k proposal files, while the ImageNet DET train set has ~456k images. What am I missing here?

How can I run the rfcn_test()?

I need help. I don't know how to get the input parameters of rfcn_test() and so on.
What should I do to set the parameters?

which train prototxt did you use?

There are so many training prototxts in the 'models/rfcn_prototxts/ResNet-101L_ILSVRCvid_corr' directory. What is the difference between them, and which one did you use? Thanks!

meta_vid.mat

Hello~ How can I get the meta_vid.mat that is used by imdb_from_ilsvrc15vid.m?

Clarification of N_tra in the tracking objective

In Sec. 3.3 of the paper, it's mentioned that the tracking loss is active for N_tra ground truth RoIs which have a track correspondence across the two frames (t, t+tau). I'm interpreting this as meaning that only predicted RoIs assigned to ground-truth RoIs (using an IoU>0.5) with a correspondence between the two frames are used in the tracking loss. Is this the idea?

For example, if the RoI batch size is 256, and each of these 256 RoIs are assigned to a ground truth box in frame t having a correspondence in the next frame t+tau, would I use N_tra=256 and all of their RoI-tracking deltas with the ground-truth delta in the regression of Equation (1) of the paper?
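
Under that interpretation (stated here as a sketch, not as the authors' confirmed implementation), the selection could look like this, with all variables hypothetical:

    % Hypothetical sketch of the N_tra selection described above.
    % max_iou(i): best IoU of RoI i with any GT box in frame t;
    % assigned_gt(i): index of that GT box;
    % gt_has_corr(j): true if GT box j has a track correspondence
    % in frame t+tau.
    tracked = (max_iou > 0.5) & gt_has_corr(assigned_gt);
    N_tra   = sum(tracked);
    % smooth_l1 is a hypothetical elementwise smooth-L1 helper.
    L_tra   = sum(sum(smooth_l1(delta_pred(tracked,:) - delta_gt(tracked,:)))) / N_tra;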

Could you provide example testing command?

Could you provide an example command to get the results on the VID validation set? How can I reproduce the 79.8 mAP from the paper using rfcn_test_vid()? I'm not sure what the inputs conf, imdb, roidb, etc. correspond to. Thank you!

Evaluation devkit: Errors during evaluation

@feichtenhofer Great work! Thanks for sharing. I'm attempting to run inference on a single VID video snippet and am running into errors when trying to call functions within the original ImageNet devkit (taken from here). Are you using a custom/modified devkit?

The first error is an IO error from devkit/evaluation/eval_vid_detection at line 117. I noticed the number of columns had changed in the input file predict_file, and I could fix this by changing:

[img_ids obj_labels obj_confs xmin ymin xmax ymax] = ...
        textread(predict_file,'%d %d %f %f %f %f %f');

to

[img_ids obj_labels unk obj_confs xmin ymin xmax ymax] = ...
        textread(predict_file,'%d %d %d %f %f %f %f %f');

After modifying the line above, I ran into another error in the evaluation:

Undefined function or variable 'eval_vid_tracking'.

This function is called within imdb/imdb_eval_ilsvrc14.m here. I noticed the file eval_vid_tracking.m doesn't exist and it doesn't seem to be in this repo. If you are using a modified devkit, can you please push it?

regarding the region proposals.

In the code it is looking for .mat files of the region proposals, but in the repository you have only included the directory. How do I use the proposals in the code if I want to do validation only?

Demo code

Can you please provide demo code which, given video frames, outputs boxes?
There should be no need to run the test script on the ImageNet VID dataset if the code is to be used off the shelf for tracking purposes.

How could the tracking RoI pooling layer work?

Thanks for your excellent work, but I am confused about some details in your paper.
In your paper, the tracking RoI pooling layer operates on the stack of {Xcorr, Xreg-t, Xreg-t+1}.
As far as I can see:
both the Xreg-t and Xreg-t+1 layers have a shape of k*k*4;
Xcorr consists of the correlation outputs of conv3, 4 and 5 respectively, and each correlation output should have a shape like H*W*(2d+1)*(2d+1).
So:
1. How do you concatenate different layers together, like Xcorr and Xreg-t?
2. How do you pool on the stacked feature map?
Thank you.
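
Not an authoritative answer, but a sketch of what the concatenation step implies: the maps must first be brought to a common H x W grid (e.g. by striding or resizing the correlation outputs), after which stacking is a channel-wise concatenation. All variables below are hypothetical.

    % Hedged sketch: stack correlation and regression feature maps
    % along the channel dimension before RoI pooling (all assumed
    % to share the same H x W grid here).
    stacked = cat(3, Xcorr, Xreg_t, Xreg_t1);
    % RoI pooling then uses the frame-t proposal box on "stacked";
    % the position-sensitive variant produces a fixed k x k output
    % per RoI.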

Is the tracking regression output redundant with the future-frame detection output?

Hi, I'm reading the paper and I am confused about the tracking regression output.
The paper says that the tracking regression output is the offset between objects in two frames.
But the objects have actually already been detected by the RFCN network in both frames,
so what does the tracking regression output add?

I would appreciate any answer.

the trained model

I can't open the link to download the trained model. I am in China; does anybody have the same problem?

How to implement RoI Tracking?

In the track regression, RoI pooling operates on the concatenation of the bounding-box regression features and the correlation features. I want to know the details of this RoI pooling, such as which frame's RoIs should be used.

This error was detected while a MEX-file was running. If the MEX-file is not an official MathWorks function, please examine its source code for errors. Please consult the External Interfaces Guide for information on debugging MEX-files

When I do evaluation, there is an error like:
'This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.'
Someone said it was caused by using a different number of GPUs than the setting in the code. I only have 2 GPUs. Does anyone know where I can change the number of GPUs in the code? Or how can I set it to use only the CPU? Thanks
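
Not a confirmed fix, but for reference these are the standard MatCaffe device-selection calls; where this repository reads its GPU count and IDs from is not confirmed on this page.

    % Standard MatCaffe device-selection calls.
    caffe.set_device(0);    % zero-based GPU index
    caffe.set_mode_gpu();   % or caffe.set_mode_cpu(); for CPU-only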

Training end-to-end from scratch

Has anyone had any luck training this model end-to-end without external object proposals or pretraining the RFCN network on ImageNet DET? I've been trying to train the D (& T loss) model in PyTorch and have only reached a frame mean AP of ~64% on the full ImageNet VID validation set. Some implementation notes:

  • I'm only training/testing on Imagenet VID (I am not using anything from Imagenet DET).
  • As in the paper, I'm sampling 10 frames from each video snippet in the training set. These frames are sampled at regular intervals across the duration of the snippet.
  • I'm using resnet-101 with pretrained imagenet weights and am randomly initializing the RPN and RCNN.
  • I'm using correlation features on conv3, conv4, and conv5 and am regressing on the ground truth boxes in frame t --> t+tau.
  • I am using an L1 smooth loss for the tracking loss.
  • I am not linking detections across frames at the moment.
  • I am using a batch size of 2 (2 images per video, 2 videos = 4 frames total)
  • My initial lr is 5e-4

some doubts about the results

I have run the test code successfully, but I still have some doubts.

1. Are the "Detect" results obtained with the "ImageNet CLS models" or the "Detect models"?

2. What is the difference between the D and D&T models? I find their prototxts are the same. Does the D model mean D (& T loss)?

Thank you ~

rfcn_test parameters

Hi, I am wondering how to set conf, imdb, roidb, and varargin in rfcn_test.m. Thank you.

How to run the D&T test code

Hi guys, I'm new here. I've downloaded this code, but I do not know how to run it for testing as the author described. Could anyone explain the testing pipeline explicitly? Thank you very much~

Training break down

When I fine-tuned the RoI tracking part on top of the trained RFCN detector, the training broke down. The RoIs I used belong to frame t, both for the correlation features and for the pair of frames' feature maps. The initial learning rate I set is 0.0001. How can I solve this?
