
LiteFlowNet Issues

Missing file deploy.tpl.prototxt

Hi,

Can you please add this file?

Also, in your paper you mention the network 'LiteFlowNetX'. Can you please add the caffemodel and prototxt for this network as well?

PyTorch version

Will you develop a PyTorch version of LiteFlowNet? Besides that, I look forward to the prototxt file of the network architecture.

LiteFlowNet cnn

Hi,

I need to apply LiteFlowNet to video data and extract flow features to feed into a CNN.

What kind of features can I get from videos, and how can I save them for each video file?
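
For reference, a minimal sketch (not from this repo; OpenCV is assumed) of one way to prepare a video for flow extraction: decode consecutive frame pairs and save them as numbered PNGs that a script such as test_iter.py can consume; the .flo file written for each pair then serves as the per-video flow features.

import os
import cv2  # assumes opencv-python is installed

def dump_frame_pairs(video_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    idx = 0
    while ok:
        ok, curr = cap.read()
        if not ok:
            break
        # Each pair (idx, idx + 1) is one input to the optical-flow network.
        cv2.imwrite(os.path.join(out_dir, '%06d_img0.png' % idx), prev)
        cv2.imwrite(os.path.join(out_dir, '%06d_img1.png' % idx), curr)
        prev = curr
        idx += 1
    cap.release()
    return idx  # number of frame pairs written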

Poor results from gray images with small motion like UCSD

I have tested all three models on the UCSD dataset, which consists of gray images. I converted them to RGB and fed them into the models, but got really bad results compared to FlowNet2.

I visualize the flow as in Baker et al., "A Database and Evaluation Methodology for Optical Flow" (ICCV 2007, http://vision.middlebury.edu/flow/flowEval-iccv07.pdf), using the code here.
It works well on other RGB datasets like CUHK Avenue, which have larger motion. Is there anything I missed, or are the models not suitable for gray images or small-motion image pairs?

[first image]

[second image]

[Sintel model result]

[default model result]

KITTI model result (this seems better, but still not practical):

[KITTI result]

The result from FlowNet2 is much better:

[FlowNet2 result]

The numbers in the .flo files are very small, but the visualization shows the failure.
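
For what it's worth, a small sketch for sanity-checking a Middlebury-format .flo file: read the header and report flow-magnitude statistics, so the "very small numbers" can be confirmed numerically before blaming the visualization ('result.flo' is a hypothetical output name).

import numpy as np

def read_flo(path):
    # Middlebury .flo: float32 magic 202021.25, int32 width, int32 height,
    # then interleaved (u, v) float32 values in row-major order.
    with open(path, 'rb') as f:
        magic = np.fromfile(f, np.float32, count=1)[0]
        assert magic == 202021.25, 'invalid .flo file'
        w = int(np.fromfile(f, np.int32, count=1)[0])
        h = int(np.fromfile(f, np.int32, count=1)[0])
        data = np.fromfile(f, np.float32, count=2 * w * h)
    return data.reshape(h, w, 2)

flow = read_flo('result.flo')
mag = np.sqrt((flow ** 2).sum(axis=2))
print('max |flow| = %.4f px, mean |flow| = %.4f px' % (mag.max(), mag.mean()))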

invalid literal for int() with base 10: "b'1241"

I use 'test_iter.py' to test the KITTI dataset with the model liteflownet-ft-kitti, and I get the following error:
Traceback (most recent call last):
File "test_iter.py", line 96, in
img1_size = get_image_size(images[0][idx])
File "test_iter.py", line 15, in get_image_size
dim_list = [int(dimstr) for dimstr in str(subprocess.check_output([img_size_bin, filename])).split(',')]
File "test_iter.py", line 15, in
dim_list = [int(dimstr) for dimstr in str(subprocess.check_output([img_size_bin, filename])).split(',')]
ValueError: invalid literal for int() with base 10: "b'1241"
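
This looks like a Python 2 vs. 3 problem rather than a model problem: under Python 3, str() applied to the bytes returned by subprocess.check_output() yields a literal "b'...'" string, which then fails int(). A hedged sketch of a fix for line 15 (img_size_bin and filename are the variables already defined in test_iter.py):

dim_list = [int(dimstr) for dimstr in
            subprocess.check_output([img_size_bin, filename])
                      .decode('utf-8').split(',')]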

Difficulty in Training LiteFlowNet

Hi,

I am trying to train LiteFlowNet using the methodology described in the paper, but I am facing difficulties running inference on the partially trained networks. Correct me if I am wrong: the method as I understand it is to first train up to the sub-pixel refinement part of layer 6, followed by the regularization part of layer 6, and then continue the same procedure for the other layers. However, after training this way, when I run inference on the sub-portion of the complete network (up to the part that has been trained), inference takes too much time and repeatedly outputs something like "Batch 0, img0/1_aug_L2/3/4/5 = some number". When I run inference on the whole network, inference is fast and this repetition does not take place. Can you tell me what I am doing wrong?

Thanks for reading the post.

find: ‘examples’: No such file or directory

After downloading the code, I only changed my CUDA path in Makefile.config, then:
cd LiteFlowNet
make -j 8 tools pycaffe
and something went wrong:
matlab/+caffe/private python/caffe src/gtest src/caffe src/caffe/test src/caffe/layers src/caffe/util src/caffe/proto src/caffe/solvers tools
find: ‘examples’: No such file or directory
find: ‘examples’: No such file or directory
find: ‘examples’: No such file or directory
touch python/caffe/proto/init.py
PROTOC src/caffe/proto/caffe.proto
CXX tools/convert_imageset.cpp

CUDA: 8.0
cuDNN: 5.1

Could you give me some advice?

Issue with large test images

I use 'test_iter.py' to test an image (3840 x 2160 px) with cnn_model = 'liteflownet', and I get the following error:

F0927 16:26:54.105952 15751 blob.cpp:34] Check failed: shape[i] <= 2147483647 / count_ (1920 vs. 1713) blob size exceeds INT_MAX
*** Check failure stack trace: ***
@ 0x7f564540ddaa (unknown)
@ 0x7f564540dce4 (unknown)
@ 0x7f564540d6e6 (unknown)
@ 0x7f5645410687 (unknown)
@ 0x7f5645aea3de caffe::Blob<>::Reshape()
@ 0x7f5645b54dc6 caffe::BaseConvolutionLayer<>::Reshape()
@ 0x7f5645bdfb7f caffe::CuDNNConvolutionLayer<>::Reshape()
@ 0x7f5645aa6c6c caffe::Net<>::Init()
@ 0x7f5645aa8188 caffe::Net<>::Net()
@ 0x4072b4 test()
@ 0x405c8c main
@ 0x7f5643f66f45 (unknown)
@ 0x40645d (unknown)
@ (nil) (unknown)

When I test another image (2720 x 1530 px), the error is "Check failed: error == cudaSuccess (2 vs. 0) out of memory". I use a single Titan X (Pascal). Is there any restriction on the test image size, or any requirement on the hardware? Some smaller images do pass the test, so it is not a configuration problem.
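
A hedged workaround sketch: downscale very large frames before inference so the intermediate blobs fit into GPU memory. The rounding to a multiple of 32 is an assumption based on the network's 6-level pyramid; the exact requirement may differ, and remember that recovered flow magnitudes scale with the resize factor.

import cv2

def shrink(img, max_side=1280, multiple=32):
    h, w = img.shape[:2]
    scale = min(1.0, float(max_side) / max(h, w))
    # Round the target size down to a multiple the network can handle.
    new_w = max(multiple, int(w * scale) // multiple * multiple)
    new_h = max(multiple, int(h * scale) // multiple * multiple)
    return cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_AREA)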

Is the input data color or gray images?

./tools/convert_imageset_and_flow.cpp shows that you load color images, but as far as I know, traditional optical-flow estimation algorithms are based on gray-scale images. Why do you use color images?

resample_layer.hpp missing

Hi, it seems resample_layer.hpp is not in layers/. I also hit an OpenCV-related error: ReadImagefromMat was not declared in this scope. Is the current repo incomplete, so that it cannot compile successfully? Thanks.

make tools and pycaffe

The make command gives the same error for both targets: "make: Nothing to be done for 'pycaffe'."

training templates

@twhui Hi, I really like your work and would like to use it in my GSoC project. Could you please provide the prototxt files soon?
TIA.

How to extract features from your trained model

I have followed the feature-extraction procedure given on the Caffe website; however, it gives me an error.

Modified code:

caffe_bin = 'bin/extract_features.bin'
args = [caffe_bin, '../trained/' + cnn_model + '.caffemodel', 'tmp/deploy.prototxt', 'conv4_R_L5', '/results', '1', 'leveldb', '-iterations', str(1), '-gpu', '0']

Error:

E0121 17:42:07.825711 10867 extract_features.cpp:62] Using CPU
E0121 17:42:09.102257 10867 extract_features.cpp:133] Extracting Features
F0121 17:42:09.102833 10867 data_augmentation_layer.cpp:211] Forward CPU Augmentation not implemented.

I have already compiled with GPU support, and test_iter.py works for flow creation, but feature extraction does not work.
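
For reference, assuming the stock Caffe extract_features tool: its usage is positional, extract_features <weights> <prototxt> <blob_names> <db_names> <num_mini_batches> <db_type> [CPU/GPU] [device_id], so the '-iterations' and '-gpu' flags are not recognized and the tool falls back to CPU (hence "Using CPU", followed by the unimplemented CPU augmentation). A sketch of a corrected argument list:

caffe_bin = 'bin/extract_features.bin'
cnn_model = 'liteflownet'  # as in the snippet above
args = [caffe_bin,
        '../trained/' + cnn_model + '.caffemodel',  # weights
        'tmp/deploy.prototxt',                      # net definition
        'conv4_R_L5',                               # blob to extract
        '/results',                                 # output database
        '1',                                        # num mini-batches
        'leveldb',                                  # db type
        'GPU', '0']                                 # run on GPU 0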

network training

What is the reason for the stage-wise training scheme in the pre-training procedure? I first tried training like FlowNet2, but a NaN loss occurred once flow regularization was added. How can I avoid this? I appreciate your reply.

Convert training dataset and training issues

Thanks for sharing your work; I tested your model and obtained excellent results. I want to train my own model on the Sintel dataset, but I encountered some problems.
First, I am confused about how to convert the dataset and flow files to LMDB. I downloaded the Sintel dataset and unzipped it into a training folder and a test folder. Then I modified make-lmdbs-train.sh into:

#!/bin/bash
../build/tools/convert_imageset_and_flow.bin training.list training_lmdb 0 lmdb
../build/tools/convert_imageset_and_flow.bin test.list test_lmdb 0 lmdb

I ran the shell script and no errors appeared. I would like to know whether this is the right way to generate new training LMDB files.

Second,
I followed the instructions and ran train.py with python train.py -gpu 0 2>&1 | tee ./log.txt. Here is the error info:

mkdir: cannot create directory ‘training’: File exists
I0718 16:23:17.060703 30561 upgrade_proto.cpp:1044] Attempting to upgrade input file specified using deprecated 'solver_type' field (enum)': ../solver.prototxt
I0718 16:23:17.060874 30561 upgrade_proto.cpp:1051] Successfully upgraded file specified using deprecated 'solver_type' field (enum) to 'type' field (string).
W0718 16:23:17.060883 30561 upgrade_proto.cpp:1053] Note that future Caffe releases will only support 'type' field (string) for a solver's type.
I0718 16:23:17.060953 30561 caffe.cpp:185] Using GPUs 0
I0718 16:23:17.364331 30561 caffe.cpp:190] GPU 0: Tesla K80
I0718 16:23:17.935869 30561 solver.cpp:48] Initializing solver from parameters:
test_iter: 160
test_interval: 5000
base_lr: 4e-05
display: 250
max_iter: 300000
lr_policy: "multistep"
gamma: 0.5
momentum: 0.9
weight_decay: 0.0004
snapshot: 30000
snapshot_prefix: "flow"
solver_mode: GPU
device_id: 0
net: "../model/train.prototxt"
stepvalue: 120000
stepvalue: 160000
stepvalue: 200000
stepvalue: 240000
momentum2: 0.999
type: "Adam"
I0718 16:23:17.936089 30561 solver.cpp:91] Creating training net from net file: ../model/train.prototxt
F0718 16:23:17.936120 30561 io.cpp:36] Check failed: fd != -1 (-1 vs. -1) File not found: ../model/train.prototxt
*** Check failure stack trace: ***
    @     0x7f3c1b986daa  (unknown)
    @     0x7f3c1b986ce4  (unknown)
    @     0x7f3c1b9866e6  (unknown)
    @     0x7f3c1b989687  (unknown)
    @     0x7f3c1c1ab7ed  caffe::ReadProtoFromTextFile()
    @     0x7f3c1c1b4ae4  caffe::ReadNetParamsFromTextFileOrDie()
    @     0x7f3c1bff8b3b  caffe::Solver<>::InitTrainNet()
    @     0x7f3c1bff9c0c  caffe::Solver<>::Init()
    @     0x7f3c1bff9f3a  caffe::Solver<>::Solver()
    @     0x7f3c1c02c203  caffe::Creator_AdamSolver<>()
    @           0x40ed2e  caffe::SolverRegistry<>::CreateSolver()
    @           0x407ec2  train()
    @           0x405cbc  main
    @     0x7f3c1a638ec5  (unknown)
    @           0x40648d  (unknown)
    @              (nil)  (unknown)
('args:', ['-gpu', '0'])
Executing ../bin/caffe train -model ../train.prototxt -solver ../solver.prototxt -gpu 0

I found that the errors come from the paths of solver.prototxt and train.prototxt. However, I just followed the steps in your instructions, so how should I deal with this?

Thank you! Looking forward to your reply.

warping each channel differently

Dear Sir,

Your paper mentions: "We can also use the f-warp layer to displace each channel differently when multiple flow fields are supplied. The usage, however, is beyond the scope of this work."

Is there any special way or parameter I have to set to use this?

Thanks and Regards,
Arnab

Help needed regarding training strategy used for finetuning with FlyingThings dataset

Hi,

I tried to replicate the LiteFlowNet Caffe model using the procedure described in the paper. For training on the FlyingChairs dataset, the accuracy reached was close to what was reported in the paper (33.68% as compared to 32.59%). However, training on the FlyingThings dataset does not increase accuracy significantly: I can only reach 32.63% with FlyingThings (not 28.59%), even after fine-tuning for more than 500k iterations. I had also removed the harmful data pointed out in FlowNet2. One thing I have noticed is that training is very slow for the last level (L2), and accuracy has not increased much after adding it. Moreover, the test loss for this level is much greater than for the other levels.

Can you guide me on the training procedure for fine-tuning with FlyingThings? I cannot figure out what configuration is used in the solver prototxt file for fine-tuning with FlyingThings.

Thanks for taking your time and reading the post :)

Thanks.

‘ReadImageToCVMat’ was not declared in this scope

When compiling, I got the following error. Please help! Thanks!
src/caffe/layers/imgreader_layer.cpp: In instantiation of ‘void caffe::ImgReaderLayer::ReadData() [with Dtype = double]’:
src/caffe/layers/imgreader_layer.cpp:80:1: required from here
src/caffe/layers/imgreader_layer.cpp:45:38: error: ‘ReadImageToCVMat’ was not declared in this scope
Makefile:576: recipe for target '.build_release/src/caffe/layers/imgreader_layer.o' failed

Convert training dataset and training issues

Hello,
I followed the instructions and ran train.py with python train.py -gpu 0 2>&1 | tee ./log.txt. Here is the error info:

F0901 13:01:17.908138 23503 custom_data_layer.cpp:361] Check failed: mdb_env_open(mdb_env_, this->layer_param_.data_param().source().c_str(), 0x20000|0x200000, 0664) == 0 (2 vs. 0) mdb_env_open failed
*** Check failure stack trace: ***
    @     0x7fdcf54b55cd  google::LogMessage::Fail()
    @     0x7fdcf54b7433  google::LogMessage::SendToLog()
    @     0x7fdcf54b515b  google::LogMessage::Flush()
    @     0x7fdcf54b7e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fdcf5ccf423  caffe::CustomDataLayer<>::LayerSetUp()
    @     0x7fdcf5d6fada  caffe::Net<>::Init()
    @     0x7fdcf5d712f1  caffe::Net<>::Net()
    @     0x7fdcf5d51d3a  caffe::Solver<>::InitTrainNet()
    @     0x7fdcf5d53077  caffe::Solver<>::Init()
    @     0x7fdcf5d5341a  caffe::Solver<>::Solver()
    @     0x7fdcf5d3c3a3  caffe::Creator_AdamSolver<>()
    @           0x40a6e8  train()
    @           0x4075a8  main
    @     0x7fdcf3e71830  __libc_start_main
    @           0x407d19  _start
    @              (nil)  (unknown)

How can I correct this mdb_env_open error? Thank you! Looking forward to your reply.

train/test prototxt

Would you be willing to share the train/test prototxt?

I noticed that in the training instructions we are asked to copy from this directory, which I cannot find:
2. Copy files from LiteFlowNet/models/training_template to a new model folder...

Additionally, in the test script it mentions "deploy.tpl.prototxt", which I also cannot find.

If you could point me in the right direction or share these files, that would be appreciated.

Training stopped at level L2

Dear Dr. Tak Wai HUI

Can you please let me know why you stopped at the level-L2 loss and did not train the network with a full-resolution loss (L1, I mean) at 768x384?

Thanks and Regards,
Arnab

"training_template" folder not found

According to the readme, there should be a folder called "training_template" under the "models" folder that contains the prototxt files and the script for training, but I have not found it there, nor anywhere in the repository.

question about the implementation of 'f-lcon'

Hi,

The paper lacks an introduction to the final implementation of 'f-lcon'; I cannot figure out a proper way to obtain f_lcon5_R (flow5_R, N x 2 x W x H) from softmax5_R (N x 9 x W x H) and flow5_S (N x 2 x W x H). Is the implementation of 'lcon' introduced in the reference paper 'DeepFace: Closing the Gap to Human-Level Performance in Face Verification'?
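
A numpy sketch of f-lcon pieced together from the NetE-R section of the train.prototxt quoted further down this page: each flow channel is unfolded into its 3x3 neighbourhoods (Im2col with pad 1), weighted element-wise by the softmax output (N x 9 x H x W), and the nine entries are summed by a fixed all-ones 1x1 convolution. Shapes follow the question's notation; this is a reading of the prototxt, not the author's reference code.

import numpy as np

def f_lcon(flow, weights):
    """flow: (N, 2, H, W); weights: (N, 9, H, W), softmax over axis 1."""
    N, _, H, W = flow.shape
    padded = np.pad(flow, ((0, 0), (0, 0), (1, 1), (1, 1)))  # zero padding
    out = np.empty_like(flow)
    for c in range(2):  # x and y flow components
        # Im2col: the 9 shifted views give each pixel's 3x3 neighbourhood.
        patches = np.stack([padded[:, c, i:i + H, j:j + W]
                            for i in range(3) for j in range(3)], axis=1)
        # Weighted sum = a local convolution with a per-pixel kernel.
        out[:, c] = (patches * weights).sum(axis=1)
    return out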

Access to training curves

I wanted to have a look at how the loss drops as training progresses.
Is it possible for you to give me access to the training-curve plots?

Compile error in tools/convert_imageset_and_flow.cpp

When I compile the project, I encounter the following error:
tools/convert_imageset_and_flow.cpp:144:27: error: ‘numeric_limits’ is not a member of ‘std’
tools/convert_imageset_and_flow.cpp:144:47: error: expected primary-expression before ‘short’
tools/convert_imageset_and_flow.cpp:144:47: error: expected ‘;’ before ‘short’
Could you help me?

OpenCV installation issue

First of all, great job! However, I have an installation error related to OpenCV.

After I run make -j 8 all tools pycaffe I get the following error:
#error "OpenCV 4.x+ requires enabled C++11 support"

My OpenCV version is 3.2.0, and I have followed your instructions. Do you have any idea what might cause this error and how I can get around it? Thank you for your time in advance.

Training converges but only to a minimum value of 30 for scaled_flow_R_L6_loss

Dear Sir,

Thank you for actively replying to our queries.
I am fine-tuning the network from the "liteflownet" weights. My dataset is different from the datasets used in your study, hence the need for fine-tuning. I am concentrating only on loss 6; the other losses have weight 0.

Liteflownet_training.log

The loss starts decreasing but stops improving beyond 30. I am attaching my log file. Do you expect the loss to go further down?

I0504 23:23:12.889382 2649 solver.cpp:245] Train net output #0: scaled_flow_D1_L2_loss = 12066.4
I0504 23:23:12.889410 2649 solver.cpp:245] Train net output #1: scaled_flow_D1_L3_loss = 2873.06
I0504 23:23:12.889416 2649 solver.cpp:245] Train net output #2: scaled_flow_D1_L4_loss = 654.38
I0504 23:23:12.889422 2649 solver.cpp:245] Train net output #3: scaled_flow_D1_L5_loss = 141.828
I0504 23:23:12.889436 2649 solver.cpp:245] Train net output #4: scaled_flow_D1_L6_loss = 33.7398 (* 0.32 = 10.7967 loss)
I0504 23:23:12.889442 2649 solver.cpp:245] Train net output #5: scaled_flow_D2_L2_loss = 12453.9
I0504 23:23:12.889447 2649 solver.cpp:245] Train net output #6: scaled_flow_D2_L3_loss = 2963.34
I0504 23:23:12.889453 2649 solver.cpp:245] Train net output #7: scaled_flow_D2_L4_loss = 663.826
I0504 23:23:12.889459 2649 solver.cpp:245] Train net output #8: scaled_flow_D2_L5_loss = 142.224
I0504 23:23:12.889483 2649 solver.cpp:245] Train net output #9: scaled_flow_D2_L6_loss = 33.0278 (* 0.32 = 10.5689 loss)
I0504 23:23:12.889489 2649 solver.cpp:245] Train net output #10: scaled_flow_R_L2_loss = 12529.9
I0504 23:23:12.889497 2649 solver.cpp:245] Train net output #11: scaled_flow_R_L3_loss = 3000.42
I0504 23:23:12.889502 2649 solver.cpp:245] Train net output #12: scaled_flow_R_L4_loss = 689.57
I0504 23:23:12.889508 2649 solver.cpp:245] Train net output #13: scaled_flow_R_L5_loss = 152.621
I0504 23:23:12.889531 2649 solver.cpp:245] Train net output #14: scaled_flow_R_L6_loss = 32.6008 (* 1 = 32.6008 loss)

The training does not converge

I followed the training steps in the paper.
First, I want to train L6, but I find that it does not converge.

The solver.prototxt is:

# THIS IS ONLY AN EXAMPLE. YOU CAN CHANGE THE SETTINGS, IF NECESSARY.

net: "train.prototxt"

base_lr: 1e-4
lr_policy: "multistep"
gamma: 0.5

# TRAIN Batch size: 8

# TEST Batch size: 4

test_iter: 160

test_interval: 5000
max_iter: 300000
snapshot: 30000

momentum: 0.9
weight_decay: 0.0004
display: 250

snapshot_prefix: "L6"
solver_mode: GPU
solver_type: ADAM
momentum2: 0.999

The train.prototxt is:

#######################################
# LiteFlowNet CVPR 2018
# by
# T.-W. Hui, CUHK
#######################################
layer {
name: "CustomData1"
type: "CustomData"
top: "blob0"
top: "blob1"
top: "blob2y"
include {
phase: TRAIN
}
data_param {
# THIS IS ONLY AN EXAMPLE. YOU CAN CHANGE THE SETTINGS, IF NECESSARY.
source: "/data/chair_lmdb"
batch_size: 6
backend: LMDB
rand_permute: true
rand_permute_seed: 77
slice_point: 3
slice_point: 6
encoding: UINT8
encoding: UINT8
encoding: UINT16FLOW
verbose: true
}
}
layer {
name: "CustomData2"
type: "CustomData"
top: "blob0"
top: "blob1"
top: "blob2y"
include {
phase: TEST
}
data_param {
# THIS IS ONLY AN EXAMPLE. YOU CAN CHANGE THE SETTINGS, IF NECESSARY.
source: "/data/chair_lmdb"
batch_size: 1
backend: LMDB
rand_permute: true
rand_permute_seed: 77
slice_point: 3
slice_point: 6
encoding: UINT8
encoding: UINT8
encoding: UINT16FLOW
verbose: true
}
}

layer {
name: "Eltwise1"
type: "Eltwise"
bottom: "blob2y"
top: "blob2"
eltwise_param {
operation: SUM
coeff: 0.01
}
}
#######################################
# Pre-processing
#######################################
layer {
name: "Eltwise1"
type: "Eltwise"
bottom: "blob0"
top: "blob4"
eltwise_param {
operation: SUM
coeff: 0.00392156862745
}
}
layer {
name: "Eltwise2"
type: "Eltwise"
bottom: "blob1"
top: "blob5"
eltwise_param {
operation: SUM
coeff: 0.00392156862745
}
}
layer {
name: "img0s_aug"
type: "DataAugmentation"
bottom: "blob4"
top: "img0_aug"
top: "blob7"
propagate_down: false
augmentation_param {
max_multiplier: 1
augment_during_test: false
recompute_mean: 3000
mean_per_pixel: false
translate {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.4
prob: 1.0
}
rotate {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.4
prob: 1.0
}
zoom {
rand_type: "uniform_bernoulli"
exp: true
mean: 0.2
spread: 0.4
prob: 1.0
}
squeeze {
rand_type: "uniform_bernoulli"
exp: true
mean: 0
spread: 0.3
prob: 1.0
}
lmult_pow {
rand_type: "uniform_bernoulli"
exp: true
mean: -0.2
spread: 0.4
prob: 1.0
}
lmult_mult {
rand_type: "uniform_bernoulli"
exp: true
mean: 0.0
spread: 0.4
prob: 1.0
}
lmult_add {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
sat_pow {
rand_type: "uniform_bernoulli"
exp: true
mean: 0
spread: 0.4
prob: 1.0
}
sat_mult {
rand_type: "uniform_bernoulli"
exp: true
mean: -0.3
spread: 0.5
prob: 1.0
}
sat_add {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
col_pow {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.4
prob: 1.0
}
col_mult {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.2
prob: 1.0
}
col_add {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.02
prob: 1.0
}
ladd_pow {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.4
prob: 1.0
}
ladd_mult {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0.0
spread: 0.4
prob: 1.0
}
ladd_add {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.04
prob: 1.0
}
col_rotate {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 1
prob: 1.0
}
crop_width: 448
crop_height: 320
chromatic_eigvec: 0.51
chromatic_eigvec: 0.56
chromatic_eigvec: 0.65
chromatic_eigvec: 0.79
chromatic_eigvec: 0.01
chromatic_eigvec: -0.62
chromatic_eigvec: 0.35
chromatic_eigvec: -0.83
chromatic_eigvec: 0.44
}
}
layer {
name: "aug_params1"
type: "GenerateAugmentationParameters"
bottom: "blob7"
bottom: "blob4"
bottom: "img0_aug"
top: "blob8"
augmentation_param {
augment_during_test: false
translate {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
rotate {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
zoom {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.03
prob: 1.0
}
gamma {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.02
prob: 1.0
}
brightness {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.02
prob: 1.0
}
contrast {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.02
prob: 1.0
}
color {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.02
prob: 1.0
}
}
coeff_schedule_param {
half_life: 50000
initial_coeff: 0.5
final_coeff: 1
}
}
layer {
name: "img1s_aug"
type: "DataAugmentation"
bottom: "blob5"
bottom: "blob8"
top: "img1_aug"
propagate_down: false
propagate_down: false
augmentation_param {
max_multiplier: 1
augment_during_test: false
recompute_mean: 3000
mean_per_pixel: false
crop_width: 448
crop_height: 320
chromatic_eigvec: 0.51
chromatic_eigvec: 0.56
chromatic_eigvec: 0.65
chromatic_eigvec: 0.79
chromatic_eigvec: 0.01
chromatic_eigvec: -0.62
chromatic_eigvec: 0.35
chromatic_eigvec: -0.83
chromatic_eigvec: 0.44
}
}
layer {
name: "FlowAugmentation1"
type: "FlowAugmentation"
bottom: "blob2"
bottom: "blob7"
bottom: "blob8"
top: "flow_gt_aug"
augmentation_param {
crop_width: 448
crop_height: 320
}
}
layer {
name: "FlowScaling"
type: "Eltwise"
bottom: "flow_gt_aug"
top: "scaled_flow_gt_aug"
eltwise_param {
operation: SUM
coeff: 0.05
}
}
#######################################
# NetC
#######################################
layer {
name: "conv1"
type: "Convolution"
bottom: "img0_aug"
bottom: "img1_aug"
top: "F0_L1"
top: "F1_L1"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 3
kernel_size: 7
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU1a"
type: "ReLU"
bottom: "F0_L1"
top: "F0_L1"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU1b"
type: "ReLU"
bottom: "F1_L1"
top: "F1_L1"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "F0_L1"
bottom: "F1_L1"
top: "F0_1_L2"
top: "F1_1_L2"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2a_1"
type: "ReLU"
bottom: "F0_1_L2"
top: "F0_1_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU2b_1"
type: "ReLU"
bottom: "F1_1_L2"
top: "F1_1_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "F0_1_L2"
bottom: "F1_1_L2"
top: "F0_2_L2"
top: "F1_2_L2"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2a_2"
type: "ReLU"
bottom: "F0_2_L2"
top: "F0_2_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU2b_2"
type: "ReLU"
bottom: "F1_2_L2"
top: "F1_2_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_3"
type: "Convolution"
bottom: "F0_2_L2"
bottom: "F1_2_L2"
top: "F0_L2"
top: "F1_L2"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2a_3"
type: "ReLU"
bottom: "F0_L2"
top: "F0_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU2b_3"
type: "ReLU"
bottom: "F1_L2"
top: "F1_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "F0_L2"
bottom: "F1_L2"
top: "F0_1_L3"
top: "F1_1_L3"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3a_1"
type: "ReLU"
bottom: "F0_1_L3"
top: "F0_1_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU3b_1"
type: "ReLU"
bottom: "F1_1_L3"
top: "F1_1_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "F0_1_L3"
bottom: "F1_1_L3"
top: "F0_L3"
top: "F1_L3"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3a_2"
type: "ReLU"
bottom: "F0_L3"
top: "F0_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU3b_2"
type: "ReLU"
bottom: "F1_L3"
top: "F1_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "F0_L3"
bottom: "F1_L3"
top: "F0_1_L4"
top: "F1_1_L4"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 96
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU4a_1"
type: "ReLU"
bottom: "F0_1_L4"
top: "F0_1_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU4b_1"
type: "ReLU"
bottom: "F1_1_L4"
top: "F1_1_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "F0_1_L4"
bottom: "F1_1_L4"
top: "F0_L4"
top: "F1_L4"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 96
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU4a_2"
type: "ReLU"
bottom: "F0_L4"
top: "F0_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU4b_2"
type: "ReLU"
bottom: "F1_L4"
top: "F1_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv5"
type: "Convolution"
bottom: "F0_L4"
bottom: "F1_L4"
top: "F0_L5"
top: "F1_L5"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU5a"
type: "ReLU"
bottom: "F0_L5"
top: "F0_L5"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU5b"
type: "ReLU"
bottom: "F1_L5"
top: "F1_L5"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv6"
type: "Convolution"
bottom: "F0_L5"
bottom: "F1_L5"
top: "F0_L6"
top: "F1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU6a"
type: "ReLU"
bottom: "F0_L6"
top: "F0_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU6b"
type: "ReLU"
bottom: "F1_L6"
top: "F1_L6"
relu_param { negative_slope: 0.1 }
}
#######################################
# NetE-M: L6
#######################################
layer {
name: "corr_L6"
type: "Correlation"
bottom: "F0_L6"
bottom: "F1_L6"
top: "corr_L6"
correlation_param {
pad: 3
kernel_size: 1
max_displacement: 3
stride_1: 1
stride_2: 1
}
}
layer {
name: "ReLU_corr_L6"
type: "ReLU"
bottom: "corr_L6"
top: "corr_L6"
relu_param {
negative_slope: 0.1
}
}
layer {
name: "conv1_D1_L6"
type: "Convolution"
bottom: "corr_L6"
top: "conv1_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU1_D1_L6"
type: "ReLU"
bottom: "conv1_D1_L6"
top: "conv1_D1_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_D1_L6"
type: "Convolution"
bottom: "conv1_D1_L6"
top: "conv2_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2_D1_L6"
type: "ReLU"
bottom: "conv2_D1_L6"
top: "conv2_D1_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_D1_L6"
type: "Convolution"
bottom: "conv2_D1_L6"
top: "conv3_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3_D1_L6"
type: "ReLU"
bottom: "conv3_D1_L6"
top: "conv3_D1_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "scaled_flow_D1_L6"
type: "Convolution"
bottom: "conv3_D1_L6"
top: "scaled_flow_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 2
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "Downsample_L6"
type: "Downsample"
bottom: "scaled_flow_gt_aug"
bottom: "scaled_flow_D1_L6"
top: "scaled_flow_label_L6"
propagate_down: false
propagate_down: false
}
layer {
name: "scaled_flow_D1_L6_loss"
type: "L1Loss"
bottom: "scaled_flow_D1_L6"
bottom: "scaled_flow_label_L6"
top: "scaled_flow_D1_L6_loss"
loss_weight: 0.32
l1_loss_param { l2_per_location: true }
}
#######################################
# NetE-S: L6
#######################################
layer {
name: "FlowUnscaling_L6_D2"
type: "Eltwise"
bottom: "scaled_flow_D1_L6"
top: "flow_D1_L6"
eltwise_param {
operation: SUM
coeff: 0.625
}
}
layer {
name: "gxy_L6"
type: "Grid"
top: "gxy_L6"
bottom: "flow_D1_L6"
propagate_down: false
}
layer {
name: "coords_D1_L6"
type: "Eltwise"
bottom: "flow_D1_L6"
bottom: "gxy_L6"
top: "coords_D1_L6"
eltwise_param { coeff: 1 coeff: 1 }
}
layer {
name: "warped_F1_L6"
type: "Warp"
bottom: "F1_L6"
bottom: "coords_D1_L6"
top: "warped_D1_F1_L6"
}
layer {
name: "F_D2_L6"
bottom: "F0_L6"
bottom: "warped_D1_F1_L6"
bottom: "scaled_flow_D1_L6"
top: "F_D2_L6"
type: "Concat"
concat_param { axis: 1 }
}
layer {
name: "conv1_D2_L6"
type: "Convolution"
bottom: "F_D2_L6"
top: "conv1_D2_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU1_D2_L6"
type: "ReLU"
bottom: "conv1_D2_L6"
top: "conv1_D2_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_D2_L6"
type: "Convolution"
bottom: "conv1_D2_L6"
top: "conv2_D2_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2_D2_L6"
type: "ReLU"
bottom: "conv2_D2_L6"
top: "conv2_D2_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_D2_L6"
type: "Convolution"
bottom: "conv2_D2_L6"
top: "conv3_D2_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3_D2_L6"
type: "ReLU"
bottom: "conv3_D2_L6"
top: "conv3_D2_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "scaled_flow_D2_res_L6"
type: "Convolution"
bottom: "conv3_D2_L6"
top: "scaled_flow_D2_res_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 2
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "scaled_flow_D2_L6"
type: "Eltwise"
bottom: "scaled_flow_D1_L6"
bottom: "scaled_flow_D2_res_L6"
top: "scaled_flow_D2_L6"
eltwise_param { operation: SUM }
}
layer {
name: "scaled_flow_D2_L6_loss"
type: "L1Loss"
bottom: "scaled_flow_D2_L6"
bottom: "scaled_flow_label_L6"
top: "scaled_flow_D2_L6_loss"
loss_weight: 0.32
l1_loss_param { l2_per_location: true }
}
#######################################
# NetE-R: L6
#######################################
layer {
name: "slice_scaled_flow_D_L6"
type: "Slice"
bottom: "scaled_flow_D2_L6"
top: "scaled_flow_D_L6_x"
top: "scaled_flow_D_L6_y"
slice_param { axis: 1 slice_point: 1 }
}
layer {
name: "reshaped_scaled_flow_D_L6_x"
type: "Im2col"
bottom: "scaled_flow_D_L6_x"
top: "reshaped_scaled_flow_D_L6_x"
convolution_param { pad: 1 kernel_size: 3 stride: 1 }
}
layer {
name: "reshaped_scaled_flow_D_L6_y"
type: "Im2col"
bottom: "scaled_flow_D_L6_y"
top: "reshaped_scaled_flow_D_L6_y"
convolution_param { pad: 1 kernel_size: 3 stride: 1 }
}
layer {
name: "mean_scaled_flow_D_L6_x"
type: "Reduction"
bottom: "scaled_flow_D_L6_x"
top: "mean_scaled_flow_D_L6_x"
reduction_param { operation: MEAN axis: 1 coeff: -1 }
}
layer {
name: "scaled_flow_D_nomean_L6_x"
type: "Bias"
bottom: "scaled_flow_D_L6_x"
bottom: "mean_scaled_flow_D_L6_x"
top: "scaled_flow_D_nomean_L6_x"
bias_param { axis: 0 }
}
layer {
name: "mean_scaled_flow_D_L6_y"
type: "Reduction"
bottom: "scaled_flow_D_L6_y"
top: "mean_scaled_flow_D_L6_y"
reduction_param { operation: MEAN axis: 1 coeff: -1 }
}
layer {
name: "scaled_flow_D_nomean_L6_y"
type: "Bias"
bottom: "scaled_flow_D_L6_y"
bottom: "mean_scaled_flow_D_L6_y"
top: "scaled_flow_D_nomean_L6_y"
bias_param { axis: 0 }
}
layer {
name: "FlowUnscaling_L6_R"
type: "Eltwise"
bottom: "scaled_flow_D2_L6"
top: "flow_D2_L6"
eltwise_param {
operation: SUM
coeff: 0.625
}
}
layer {
name: "Downsample_img0_aug_L6"
type: "Downsample"
bottom: "img0_aug"
bottom: "flow_D2_L6"
top: "img0_aug_L6"
propagate_down: false
propagate_down: false
}
layer {
name: "Downsample_img1_aug_L6"
type: "Downsample"
bottom: "img1_aug"
bottom: "flow_D2_L6"
top: "img1_aug_L6"
propagate_down: false
propagate_down: false
}
layer {
name: "coords_R_L6"
type: "Eltwise"
bottom: "flow_D2_L6"
bottom: "gxy_L6"
top: "coords_R_L6"
eltwise_param { coeff: 1 coeff: 1 }
}
layer {
name: "warped_img1_aug_L6"
type: "Warp"
bottom: "img1_aug_L6"
bottom: "coords_R_L6"
top: "warped_img1_aug_L6"
}
layer {
name: "img_diff_L6"
type: "Eltwise"
bottom: "img0_aug_L6"
bottom: "warped_img1_aug_L6"
top: "img_diff_L6"
eltwise_param {
operation: SUM
coeff: 1.0
coeff: -1.0
}
}
layer {
name: "channelNorm_L6"
type: "ChannelNorm"
bottom: "img_diff_L6"
top: "channelNorm_L6"
}
layer {
name: "concat_F0_R_L6"
type: "Concat"
bottom: "channelNorm_L6"
bottom: "scaled_flow_D_nomean_L6_x"
bottom: "scaled_flow_D_nomean_L6_y"
bottom: "F0_L6"
top: "concat_F0_R_L6"
concat_param { axis: 1 }
}
layer {
name: "conv1_R_L6"
type: "Convolution"
bottom: "concat_F0_R_L6"
top: "conv1_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv1_R_L6"
type: "ReLU"
bottom: "conv1_R_L6"
top: "conv1_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_R_L6"
type: "Convolution"
bottom: "conv1_R_L6"
top: "conv2_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv2_R_L6"
type: "ReLU"
bottom: "conv2_R_L6"
top: "conv2_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_R_L6"
type: "Convolution"
bottom: "conv2_R_L6"
top: "conv3_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv3_R_L6"
type: "ReLU"
bottom: "conv3_R_L6"
top: "conv3_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv4_R_L6"
type: "Convolution"
bottom: "conv3_R_L6"
top: "conv4_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv4_R_L6"
type: "ReLU"
bottom: "conv4_R_L6"
top: "conv4_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv5_R_L6"
type: "Convolution"
bottom: "conv4_R_L6"
top: "conv5_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv5_R_L6"
type: "ReLU"
bottom: "conv5_R_L6"
top: "conv5_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv6_R_L6"
type: "Convolution"
bottom: "conv5_R_L6"
top: "conv6_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv6_R_L6"
type: "ReLU"
bottom: "conv6_R_L6"
top: "conv6_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "dist_R_L6"
type: "Convolution"
bottom: "conv6_R_L6"
top: "dist_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 0 }
convolution_param {
num_output: 9
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" value: 0 }
engine: CUDNN
}
}
layer {
name: "sq_dist_R_L6"
type: "Power"
bottom: "dist_R_L6"
top: "sq_dist_R_L6"
power_param { power: 2 scale: 1 shift: 0 }
}
layer {
name: "neg_sq_dist_R_L6"
type: "Eltwise"
bottom: "sq_dist_R_L6"
top: "neg_sq_dist_R_L6"
eltwise_param { operation: SUM coeff: -1 }
}
layer {
name: "exp_kernel_R_L6"
type: "Softmax"
bottom: "neg_sq_dist_R_L6"
top: "exp_kernel_R_L6"
softmax_param { axis: 1 engine: CUDNN }
}
layer {
name: "f-lconv_L6_x"
type: "Eltwise"
bottom: "reshaped_scaled_flow_D_L6_x"
bottom: "exp_kernel_R_L6"
top: "f-lconv_L6_x"
eltwise_param { operation: PROD }
}
layer {
name: "scaled_flow_R_L6_x"
type: "Convolution"
bottom: "f-lconv_L6_x"
top: "scaled_flow_R_L6_x"
param { lr_mult: 0 }
param { lr_mult: 0 }
convolution_param {
num_output: 1
kernel_size: 1
weight_filler { type: "constant" value: 1 }
}
}
layer {
name: "f-lconv_L6_y"
type: "Eltwise"
bottom: "reshaped_scaled_flow_D_L6_y"
bottom: "exp_kernel_R_L6"
top: "f-lconv_L6_y"
eltwise_param { operation: PROD }
}
layer {
name: "scaled_flow_R_L6_y"
type: "Convolution"
bottom: "f-lconv_L6_y"
top: "scaled_flow_R_L6_y"
param { lr_mult: 0 }
param { lr_mult: 0 }
convolution_param {
num_output: 1
kernel_size: 1
weight_filler { type: "constant" value: 1 }
}
}
layer {
name: "scaled_flow_R_L6"
bottom: "scaled_flow_R_L6_x"
bottom: "scaled_flow_R_L6_y"
top: "scaled_flow_R_L6"
type: "Concat"
concat_param { axis: 1 }
}
layer {
name: "scaled_flow_R_L6_loss"
type: "L1Loss"
bottom: "scaled_flow_R_L6"
bottom: "scaled_flow_label_L6"
top: "scaled_flow_R_L6_loss"
loss_weight: 0.32
l1_loss_param { l2_per_location: true }
}

How to train on the MPI-Sintel dataset

I don't know how to use the MPI-Sintel dataset to generate the .list file. Can you tell me how to use this dataset with LiteFlowNet? Thank you!
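
A hedged sketch of generating the list, assuming convert_imageset_and_flow.bin expects one "img0 img1 flow" triplet per line (the FlowNet-style convention) and the standard MPI-Sintel layout (training/clean/<scene>/frame_XXXX.png, with training/flow/<scene>/frame_XXXX.flo giving the flow from frame XXXX to the next frame):

import os

root = 'MPI-Sintel/training'  # adjust to your dataset location
with open('training.list', 'w') as f:
    for scene in sorted(os.listdir(os.path.join(root, 'clean'))):
        frames = sorted(p for p in os.listdir(os.path.join(root, 'clean', scene))
                        if p.endswith('.png'))
        for img0, img1 in zip(frames[:-1], frames[1:]):
            flow = img0.replace('.png', '.flo')  # flow from img0 to img1
            f.write('%s %s %s\n' % (
                os.path.join(root, 'clean', scene, img0),
                os.path.join(root, 'clean', scene, img1),
                os.path.join(root, 'flow', scene, flow)))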

Issues during compilation.

Hi

When I compile with make, I get the following error: most of the Caffe code compiles properly, but it crashes towards the end.

A screenshot is attached; I am not sure what is wrong.

Also, can you tell me how much GPU memory the model uses while training?

[screenshot]

Caffe Test Net loss

While training LiteFlowNet in Caffe, what would be a good train and test net loss (considering the same loss weights: 0.32, 0.08, 0.02, 0.01, 0.005 and 1)?

LiteFlowNetX

Hi,
Thank you for your work!

Will you release the train.prototxt of LiteFlowNetX and the solver settings?

Thank you !

Data interface

I have downloaded the data, and Caffe is configured, but I do not know where the data interface is. The script file in the data folder requires two list files. Should I write a program myself to generate a .list file of the image pairs and the .flo files? This is my first contact with optical-flow experiments, so could you give a concrete example? Thank you!

Why does the input data need to subtract 0.4xx?

Hello, when I use the PyTorch version of the code, I found an operation similar to FlowNet's:

import torchvision.transforms as transforms

# Subtract the (training-set) per-channel mean from the input images.
input_transform = transforms.Compose([
    transforms.Normalize(mean=[0.411, 0.432, 0.45], std=[1, 1, 1])
])

I guess LiteFlowNet may have the same operation. What is its effect on the quality of the estimated flow? I'm sorry for my naive question, but I am very confused about this.

Cannot read all entries present in lmdb file

Dear Prof. Tak-Wai Hui,

I installed your software. I have 50,000 entries in my LMDB file, but when I run training with the prototxt file, it reads only 13,000 entries.
Can you please let me know the reason for that?
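
A minimal sketch (assuming the lmdb Python package) to double-check how many entries the LMDB file actually holds, independently of Caffe:

import lmdb

env = lmdb.open('/path/to/your_lmdb', readonly=True, lock=False)
print('entries:', env.stat()['entries'])
env.close()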

Thanks and Regards,
Arnab

Charbonnier loss in your paper

Dear Sir,

I was going through your paper. There is a statement saying:
"We also fine-tuned LiteFlowNet on a mixture of Sintel clean and final training data (LiteFlowNet-ft) using the generalized Charbonnier loss."
I am a little bit confused when I look at the default Caffe parameters:

// Message that stores parameters used by L1LossLayer
message L1LossParameter {
  optional bool l2_per_location = 1 [default = false];
  optional bool l2_prescale_by_channels = 2 [default = false]; // Old style
  optional bool normalize_by_num_entries = 3 [default = false]; // if we want to normalize not by batch size, but by the number of non-NaN entries
  optional float epsilon = 4 [default = 1e-2]; // constant for smoothing near zero
  optional float plateau = 3001 [default = 0]; // L1 errors smaller than plateau-value will result in zero loss and no gradient
  optional float power = 5 [default = 0.5]; // for robust loss, power < 0.5 => non-convex
}

The loss function always seems to be a Charbonnier loss with alpha = 1 and epsilon^2 = 1e-2.
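
For reference, a hedged reading of those defaults: with l2_per_location and the default power, each residual x appears to be penalized as

\rho(x) = \left(x^2 + \epsilon^2\right)^{q}, \quad \epsilon^2 = 10^{-2}, \quad q = \text{power} = 0.5,

which is the standard Charbonnier penalty; the generalized form varies the exponent q, and, per the comment in the proto, q < 0.5 makes the loss non-convex.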

Did you use different parameters where you explicitly mention the generalized Charbonnier loss?

Questions about finetuning on Kitti

Thank you for your outstanding work!
Here is my question about the fine-tuning process on KITTI. The network performs many augmentations, but the ground-truth flow in KITTI is very sparse. How do you implement augmentation during fine-tuning on the KITTI training images, or do you simply skip the augmentation steps?
