
LiteFlowNet

The network structure of LiteFlowNet. For the ease of representation, only a 3-level design is shown.

A cascaded flow inference module M:S in NetE.

This repository (https://github.com/twhui/LiteFlowNet) is the official release of LiteFlowNet for my paper LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation in CVPR 2018 (Spotlight paper, 6.6%). The up-to-date version of the paper is available on arXiv.

LiteFlowNet is a lightweight, fast, and accurate optical flow CNN. We develop several specialized modules including (1) pyramidal features, (2) cascaded flow inference (cost volume + sub-pixel refinement), (3) a feature warping (f-warp) layer, and (4) flow regularization by a feature-driven local convolution (f-lconv) layer. LiteFlowNet outperforms PWC-Net (CVPR 2018) on KITTI and has a smaller model size (about 40% smaller than PWC-Net). For more details about LiteFlowNet, you may visit my project page.

Oral presentation at CVPR 2018 is also available on YouTube.

| | KITTI12 Testing Set (Out-Noc) | KITTI15 Testing Set (Fl-all) | Model Size (M) |
|---|---|---|---|
| FlowNet2 (CVPR17) | 4.82% | 10.41% | 162.49 |
| PWC-Net (CVPR18) | 4.22% | 9.60% | 8.75 |
| LiteFlowNet (CVPR18) | 3.27% | 9.38% | 5.37 |

LiteFlowNet2

NEW! Our extended work (LiteFlowNet2, TPAMI 2020) is now available at https://github.com/twhui/LiteFlowNet2.

LiteFlowNet2 in TPAMI 2020, another lightweight convolutional network, evolved from LiteFlowNet (CVPR 2018) to better address the problem of optical flow estimation by improving flow accuracy and computation time. Compared to our earlier work, LiteFlowNet2 improves the optical flow accuracy on the Sintel clean pass by 23.3%, the Sintel final pass by 12.8%, KITTI 2012 by 19.6%, and KITTI 2015 by 18.8%, while its runtime is 2.2 times faster!

| | Sintel Clean Testing Set | Sintel Final Testing Set | KITTI12 Testing Set (Out-Noc) | KITTI15 Testing Set (Fl-all) | Model Size (M) | Runtime* (ms), GTX 1080 |
|---|---|---|---|---|---|---|
| FlowNet2 (CVPR17) | 4.16 | 5.74 | 4.82% | 10.41% | 162 | 121 |
| PWC-Net+ | 3.45 | 4.60 | 3.36% | 7.72% | 8.75 | 40 |
| LiteFlowNet2 | 3.48 | 4.69 | 2.63% | 7.62% | 6.42 | 40 |

Note: *Runtime is averaged over 100 runs on a Sintel image pair of size 1024 × 436.

LiteFlowNet3

NEW! Our extended work (LiteFlowNet3, ECCV 2020) is now available at https://github.com/twhui/LiteFlowNet3.

We ameliorate the issue of outliers in the cost volume by amending each cost vector through an adaptive modulation prior to the flow decoding. We further improve the flow accuracy by exploring local flow consistency. To this end, each inaccurate optical flow is replaced with an accurate one from a nearby position through a novel warping of the flow field. LiteFlowNet3 not only achieves promising results on public benchmarks but also has a small model size and a fast runtime.

| | Sintel Clean Testing Set | Sintel Final Testing Set | KITTI12 Testing Set (Avg-All) | KITTI15 Testing Set (Fl-fg) | Model Size (M) | Runtime* (ms), GTX 1080 |
|---|---|---|---|---|---|---|
| LiteFlowNet (CVPR18) | 4.54 | 5.38 | 1.6 | 7.99% | 5.4 | 88 |
| LiteFlowNet2 (TPAMI20) | 3.48 | 4.69 | 1.4 | 7.64% | 6.4 | 40 |
| HD3 (CVPR19) | 4.79 | 4.67 | 1.4 | 9.02% | 39.9 | 128 |
| IRR-PWC (CVPR19) | 3.84 | 4.58 | 1.6 | 7.52% | 6.4 | 180 |
| LiteFlowNet3 (ECCV20) | 3.03 | 4.53 | 1.3 | 6.96% | 5.2 | 59 |

Note: *Runtime is averaged over 100 runs on a Sintel image pair of size 1024 × 436.

License and Citation

This software and associated documentation files (the "Software"), and the research paper (LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation), including but not limited to the figures and tables (the "Paper"), are provided for academic research purposes only and without any warranty. Any commercial use requires my consent. When using any parts of the Software or the Paper in your work, please cite the following paper:

@InProceedings{hui18liteflownet,    
 author = {Tak-Wai Hui and Xiaoou Tang and Chen Change Loy},    
 title = {{LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation}},    
 booktitle = {{Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},    
 year = {2018},  
 pages = {8981--8989},
 url = {http://mmlab.ie.cuhk.edu.hk/projects/LiteFlowNet/} 
}

Datasets

  1. FlyingChairs dataset (31GB) and train-validation split.
  2. RGB image pairs (clean pass) (37GB) and flow fields (311GB) for Things3D dataset.
  3. Sintel dataset (clean + final passes) (5.3GB).
  4. KITTI12 dataset (2GB) and KITTI15 dataset (2GB) (Simple registration is required).
| | FlyingChairs | FlyingThings3D | Sintel | KITTI |
|---|---|---|---|---|
| Crop size | 448 x 320 | 768 x 384 | 768 x 384 | 896 x 320 |
| Batch size | 8 | 4 | 4 | 4 |
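
For quick reference when editing the training templates, the per-dataset crop and batch sizes above can be kept in a small lookup. This is only a convenience sketch (the `DATASET_SETTINGS` dict and `settings_for` helper are not part of the release); the values correspond to the crop_width/crop_height (augmentation_param) and batch_size (data_param) fields of the training prototxt.

```python
# Hypothetical convenience lookup mirroring the table above.
DATASET_SETTINGS = {
    "FlyingChairs":   {"crop": (448, 320), "batch_size": 8},
    "FlyingThings3D": {"crop": (768, 384), "batch_size": 4},
    "Sintel":         {"crop": (768, 384), "batch_size": 4},
    "KITTI":          {"crop": (896, 320), "batch_size": 4},
}

def settings_for(dataset):
    """Return (crop_width, crop_height, batch_size) for a dataset name."""
    s = DATASET_SETTINGS[dataset]
    return s["crop"][0], s["crop"][1], s["batch_size"]

print(settings_for("FlyingChairs"))  # (448, 320, 8)
```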

Prerequisite

The code package is the modified Caffe from DispFlowNet and FlowNet2, extended with our new layers, scripts, and trained models.

Reimplementations in PyTorch and TensorFlow are also available.

Installation was tested under Ubuntu 14.04.5/16.04.2 with CUDA 8.0, cuDNN 5.1 and OpenCV 2.4.8/3.1.0.

Edit Makefile.config (and Makefile) if necessary in order to fit your machine's settings.

For OpenCV 3+, you may need to change opencv2/gpu/gpu.hpp to opencv2/cudaarithm.hpp in /src/caffe/layers/resample_layer.cu.

If your machine has a newer version of cuDNN installed, you do not need to downgrade it. You can use the following trick:

  1. Download cudnn-8.0-linux-x64-v5.1.tgz and untar it to a temp folder, say cuda-8-cudnn-5.1

  2. Rename cudnn.h to cudnn-5.1.h in the folder /cuda-8-cudnn-5.1/include

  3. $ sudo cp cuda-8-cudnn-5.1/include/cudnn-5.1.h /usr/local/cuda/include/	
    $ sudo cp cuda-8-cudnn-5.1/lib64/lib* /usr/local/cuda/lib64/
  4. Replace #include <cudnn.h> with #include <cudnn-5.1.h> in /include/caffe/util/cudnn.hpp.

Compiling

$ cd LiteFlowNet
$ make -j 8 tools pycaffe

Feature warping (f-warp) layer

The source files include /src/caffe/layers/warp_layer.cpp, /src/caffe/layers/warp_layer.cu, and /include/caffe/layers/warp_layer.hpp.

The grid pattern that is used by f-warp layer is generated by a grid layer. The source files include /src/caffe/layers/grid_layer.cpp and /include/caffe/layers/grid_layer.hpp.
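
Conceptually, the f-warp layer backward-warps the second feature map at sampling coordinates formed by adding the flow to the regular grid produced by the grid layer, followed by bilinear interpolation. The NumPy sketch below is for illustration only and is not the Caffe implementation:

```python
import numpy as np

def f_warp(feat, flow):
    """Backward-warp a (C, H, W) feature map by a (2, H, W) flow field (illustrative sketch).

    flow[0] is the horizontal and flow[1] the vertical displacement in pixels.
    """
    C, H, W = feat.shape
    gy, gx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Sampling coordinates = regular grid + flow (what the Grid + Eltwise layers build).
    x = np.clip(gx + flow[0], 0, W - 1)
    y = np.clip(gy + flow[1], 0, H - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    # Bilinear interpolation of the four neighbouring feature vectors.
    return ((1 - wx) * (1 - wy) * feat[:, y0, x0] + wx * (1 - wy) * feat[:, y0, x1] +
            (1 - wx) * wy * feat[:, y1, x0] + wx * wy * feat[:, y1, x1])
```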

Feature-driven local convolution (f-lconv) layer

It is implemented using off-the-shelf components. More details can be found in /models/testing/deploy.prototxt or /models/training_template/train.prototxt.template by locating the code segment NetE-R.
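
In essence, that code segment predicts per-pixel distances with a small CNN, turns them into a 3×3 kernel through a softmax over the negative squared distances, and applies the kernel to each (mean-removed) flow channel via Im2col, an element-wise product, and a summing 1×1 convolution. A rough NumPy sketch of the combined operation, for illustration only:

```python
import numpy as np

def f_lconv(flow_channel, dist):
    """Feature-driven local convolution of one flow channel (illustrative sketch).

    flow_channel: (H, W) one channel of the flow to be regularized.
    dist:         (9, H, W) per-pixel distances predicted by the small CNN.
    """
    H, W = flow_channel.shape
    # Softmax over -dist^2 gives a per-pixel 3x3 kernel that sums to one
    # (the Power / Eltwise / Softmax layers in the prototxt).
    g = np.exp(-dist ** 2)
    kernel = g / g.sum(axis=0, keepdims=True)                    # (9, H, W)
    # Gather the 3x3 neighbourhood of every pixel (what Im2col produces).
    padded = np.pad(flow_channel, 1)                             # zero padding, pad: 1
    patches = np.stack([padded[dy:dy + H, dx:dx + W]
                        for dy in range(3) for dx in range(3)])  # (9, H, W)
    # Per-pixel weighted sum = adaptive local filtering of the flow.
    return (kernel * patches).sum(axis=0)
```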

Other layers

Two custom layers (ExpMax and NegSquare) are speed-optimized for the forward pass. The generalized Charbonnier loss is implemented in l1loss_layer. The power factor (alpha) can be adjusted in l1_loss_param { power: alpha l2_per_location: true }.
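
As a rough reference, one common formulation of the generalized Charbonnier penalty with l2_per_location applies the power factor to the per-pixel L2 distance between predicted and ground-truth flow. The sketch below assumes that formulation; the exact normalization and smoothing inside l1loss_layer may differ.

```python
import numpy as np

def generalized_charbonnier(flow, flow_gt, alpha=0.5, eps=1e-2):
    """Generalized Charbonnier loss over a (2, H, W) flow field (illustrative sketch).

    alpha corresponds to the power factor in l1_loss_param { power: alpha };
    eps is a small smoothing constant near zero.
    """
    d2 = np.sum((flow - flow_gt) ** 2, axis=0)       # per-pixel squared L2 distance
    return np.mean((d2 + eps ** 2) ** alpha)
```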

Training

  1. Prepare the training set. In /data/make-lmdbs-train.sh, change YOUR_TRAINING_SET and YOUR_TESTING_SET to your favourite dataset (a sketch of the expected list file format is given after these steps).
$ cd LiteFlowNet/data
$ ./make-lmdbs-train.sh
  2. Copy files from /models/training_template to a new model folder (e.g. NEW). Edit all the files and make sure the settings are correct for your application. A model definition for the complete network is provided. LiteFlowNet uses stage-wise training to boost the performance. Please refer to my paper for more details.
$ mkdir LiteFlowNet/models/NEW
$ cd LiteFlowNet/models/NEW
$ cp ../training_template/solver.prototxt.template solver.prototxt	
$ cp ../training_template/train.prototxt.template train.prototxt
$ cp ../training_template/train.py.template train.py
  3. Create a soft link in your new model folder.
$ ln -s ../../build/tools bin
  4. Run the training script.
$ ./train.py -gpu 0 2>&1 | tee ./log.txt
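
The list files consumed by make-lmdbs-train.sh in step 1 pair the two input images with a ground-truth .flo file. The sketch below assumes the FlowNet-style list format of one "img1 img2 flow" triplet per line and FlyingChairs-like file names; both are assumptions, not guarantees of the release.

```python
import glob
import os

def write_list(image_dir, flow_dir, out_path):
    """Write 'img1 img2 flow' triplets, one per line (assumed FlowNet-style list format)."""
    flows = sorted(glob.glob(os.path.join(flow_dir, "*_flow.flo")))
    with open(out_path, "w") as f:
        for flo in flows:
            prefix = os.path.basename(flo).replace("_flow.flo", "")
            img1 = os.path.join(image_dir, prefix + "_img1.ppm")
            img2 = os.path.join(image_dir, prefix + "_img2.ppm")
            f.write("{} {} {}\n".format(img1, img2, flo))

# Example (FlyingChairs-style naming is an assumption):
# write_list("FlyingChairs_release/data", "FlyingChairs_release/data", "chairs_train.list")
```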

Trained models

The trained models (liteflownet, liteflownet-ft-sintel, liteflownet-ft-kitti) are available in the folder /models/trained. Untar the files to the same folder before using them.

liteflownet: Trained on Chairs and then fine-tuned on Things3D.

liteflownet-ft-sintel: Model used for Sintel benchmark.

liteflownet-ft-kitti: Model used for KITTI benchmark.

Testing

  1. Open the testing folder
$ cd LiteFlowNet/models/testing
  2. Create a soft link in the folder /testing.
$ ln -s ../../build/tools bin
  3. Replace MODE in ./test_MODE.py with batch if all the images have the same resolution (e.g. Sintel dataset); otherwise replace it with iter (e.g. KITTI dataset).

  4. Replace MODEL in lines 9 and 10 of test_MODE.py with one of the trained models (e.g. liteflownet-ft-sintel).

  5. Run the testing script. Flow fields (MODEL-0000000.flo, MODEL-0000001.flo, etc.) are stored in the folder /testing/results in the same order as the image pair sequence.

$ test_MODE.py img1_pathList.txt img2_pathList.txt results
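
The resulting .flo files follow the Middlebury format: a float32 magic value 202021.25, the width and height as int32, then interleaved (u, v) float32 values. A minimal NumPy reader is sketched below as a convenience; it is not part of the release.

```python
import numpy as np

def read_flo(path):
    """Read a Middlebury .flo file into an (H, W, 2) float32 array."""
    with open(path, "rb") as f:
        magic = np.fromfile(f, np.float32, count=1)[0]
        assert magic == 202021.25, "invalid .flo file: bad magic number"
        w = int(np.fromfile(f, np.int32, count=1)[0])
        h = int(np.fromfile(f, np.int32, count=1)[0])
        data = np.fromfile(f, np.float32, count=2 * w * h)
    return data.reshape(h, w, 2)   # [..., 0] = u (horizontal), [..., 1] = v (vertical)

# flow = read_flo("results/MODEL-0000000.flo")
```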

Evaluation

Average end-point error can be computed using the provided script /models/testing/util/endPointErr.m.
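
Equivalently, the average end-point error is the mean per-pixel Euclidean distance between the estimated and ground-truth flow; a one-function NumPy sketch, for illustration:

```python
import numpy as np

def average_epe(flow, flow_gt):
    """Average end-point error between two (H, W, 2) flow fields."""
    return np.mean(np.sqrt(np.sum((flow - flow_gt) ** 2, axis=-1)))
```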

Reimplementations in PyTorch and TensorFlow

  1. A PyTorch-based reimplementation of LiteFlowNet is available at https://github.com/sniklaus/pytorch-liteflownet.
  2. A TensorFlow-based reimplementation of LiteFlowNet is also available at https://github.com/keeper121/liteflownet-tf2.

Declaration

An early version of LiteFlowNet was submitted to ICCV 2017 for review in March 2017. The improved work was published at CVPR 2018.


liteflownet's Issues

issue about test image

I use 'test_iter.py' to test an image (3840px × 2160px) with cnn_model = 'liteflownet', and I get the following error:

F0927 16:26:54.105952 15751 blob.cpp:34] Check failed: shape[i] <= 2147483647 / count_ (1920 vs. 1713) blob size exceeds INT_MAX
*** Check failure stack trace: ***
@ 0x7f564540ddaa (unknown)
@ 0x7f564540dce4 (unknown)
@ 0x7f564540d6e6 (unknown)
@ 0x7f5645410687 (unknown)
@ 0x7f5645aea3de caffe::Blob<>::Reshape()
@ 0x7f5645b54dc6 caffe::BaseConvolutionLayer<>::Reshape()
@ 0x7f5645bdfb7f caffe::CuDNNConvolutionLayer<>::Reshape()
@ 0x7f5645aa6c6c caffe::Net<>::Init()
@ 0x7f5645aa8188 caffe::Net<>::Net()
@ 0x4072b4 test()
@ 0x405c8c main
@ 0x7f5643f66f45 (unknown)
@ 0x40645d (unknown)
@ (nil) (unknown)

When I test another image (2720px × 1530px), the error is "Check failed: error == cudaSuccess (2 vs. 0) out of memory". I use a single Titan X (Pascal). Is there any restriction on the test image size or any hardware requirement? Some images of smaller size do pass the test, so it is not a configuration problem.
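
A common workaround for inputs this large (not part of the release) is to run inference at a reduced resolution and then rescale the resulting flow, both spatially and in magnitude. A rough OpenCV sketch, where the factor of 0.5 is only an example:

```python
import cv2

def downscale_pair(img1, img2, factor=0.5):
    """Resize an image pair before inference to reduce memory usage."""
    small1 = cv2.resize(img1, None, fx=factor, fy=factor, interpolation=cv2.INTER_AREA)
    small2 = cv2.resize(img2, None, fx=factor, fy=factor, interpolation=cv2.INTER_AREA)
    return small1, small2

def upscale_flow(flow, out_size):
    """Resize an (H, W, 2) flow field to out_size=(W, H) and rescale its magnitude."""
    h, w = flow.shape[:2]
    up = cv2.resize(flow, out_size, interpolation=cv2.INTER_LINEAR)
    up[..., 0] *= out_size[0] / float(w)   # scale u by the horizontal resize ratio
    up[..., 1] *= out_size[1] / float(h)   # scale v by the vertical resize ratio
    return up
```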

LiteFlowNetX

Hi,
Thank you for your work!

Will you release the train.prototxt of LiteFlowNetX and the solver settings?

Thank you !

Issues during compilation.

Hi

When compiling with make, I get the following error. Most of the Caffe code compiles properly, but it crashes towards the end.

A screenshot is attached. I am not sure what is wrong.

Also, can you tell me how much GPU memory the model uses during training?

warping each channel differently

Dear Sir,

Your paper mentions: "We can also use f-warp layer to displace each channel differently when multiple flow fields are supplied. The usage, however, is beyond the scope of this work."

Is there any special way or parameter I have to set to use this?

Thanks and Regards,
Arnab

Cannot read all entries present in lmdb file

Dear Prof. Tak-Wai Hui,

I installed your software. I have 50000 entries in my lmdb file but when I run the training prototxt file, it reads only 13000 files.
Can you please let me know the reason for that?

Thanks and Regards,
Arnab

Difficulty in Training LiteFlowNet

Hi,

I am trying to train the LiteFlowNet model using the methodology described in the paper. However, I am having difficulty running inference on the partially trained networks. Correct me if I am wrong: the method, as I understand it, is to first train up to the sub-pixel refinement part of level 6, then the regularization part of level 6, and then continue the same procedure for the other levels. However, after training this way, when I run inference on the sub-portion of the complete network (up to the part that has been trained), the inference takes too long, repeatedly outputting something like "Batch 0, img0/1_aug_L2/3/4/5 = some number". When I run inference on the whole network, inference is fast and this repetition does not occur. Can you guide me on what I am doing wrong?

Thanks for reading the post.

training templates

@twhui Hi, I really liked your work. I wanted to use this in my GSoC project. Can you please provide the prototxt files soon?
TIA.

Training converges but only to a minimum value of 30 for scaled_flow_R_L6_loss

Dear Sir,

Thank you for actively replying to our queries.
I am fine-tuning the network based on the "liteflownet" weights. My dataset is different from the datasets used in your study, hence the need for fine-tuning. I am concentrating only on the level-6 loss; the other losses have weight 0.

Liteflownet_training.log

The loss starts decreasing but it stops improving beyond 30. I am attaching my log file. Do you expect the loss to go further down?

I0504 23:23:12.889382 2649 solver.cpp:245] Train net output #0: scaled_flow_D1_L2_loss = 12066.4
I0504 23:23:12.889410 2649 solver.cpp:245] Train net output #1: scaled_flow_D1_L3_loss = 2873.06
I0504 23:23:12.889416 2649 solver.cpp:245] Train net output #2: scaled_flow_D1_L4_loss = 654.38
I0504 23:23:12.889422 2649 solver.cpp:245] Train net output #3: scaled_flow_D1_L5_loss = 141.828
I0504 23:23:12.889436 2649 solver.cpp:245] Train net output #4: scaled_flow_D1_L6_loss = 33.7398 (* 0.32 = 10.7967 loss)
I0504 23:23:12.889442 2649 solver.cpp:245] Train net output #5: scaled_flow_D2_L2_loss = 12453.9
I0504 23:23:12.889447 2649 solver.cpp:245] Train net output #6: scaled_flow_D2_L3_loss = 2963.34
I0504 23:23:12.889453 2649 solver.cpp:245] Train net output #7: scaled_flow_D2_L4_loss = 663.826
I0504 23:23:12.889459 2649 solver.cpp:245] Train net output #8: scaled_flow_D2_L5_loss = 142.224
I0504 23:23:12.889483 2649 solver.cpp:245] Train net output #9: scaled_flow_D2_L6_loss = 33.0278 (* 0.32 = 10.5689 loss)
I0504 23:23:12.889489 2649 solver.cpp:245] Train net output #10: scaled_flow_R_L2_loss = 12529.9
I0504 23:23:12.889497 2649 solver.cpp:245] Train net output #11: scaled_flow_R_L3_loss = 3000.42
I0504 23:23:12.889502 2649 solver.cpp:245] Train net output #12: scaled_flow_R_L4_loss = 689.57
I0504 23:23:12.889508 2649 solver.cpp:245] Train net output #13: scaled_flow_R_L5_loss = 152.621
I0504 23:23:12.889531 2649 solver.cpp:245] Train net output #14: scaled_flow_R_L6_loss = 32.6008 (* 1 = 32.6008 loss)

Data interface

I have downloaded the data and Caffe is configured, but I don't know where the data interface is. The script in the data folder requires two list files. Do I have to write a program myself to generate a .list file for the image pairs and the .flo files? Thank you!
This is my first time working with optical flow experiments. Can you give a concrete example? Thank you!

Poor results from gray images with small motion like UCSD

I have tested all three models on the UCSD dataset, which consists of gray images. I converted them to RGB and fed them into the models, but the results are really bad compared to FlowNet2.

I visualize the flow as in Baker et al., "A Database and Evaluation Methodology for Optical Flow" (ICCV 2007, http://vision.middlebury.edu/flow/flowEval-iccv07.pdf), using the code here. It works well on other RGB datasets such as CUHK Avenue, which have larger motion. Is there anything I missed, or are the models not suitable for gray images or small-motion image pairs?

(Attached images: first image, second image, Sintel model result, default model result, KITTI model result (this one seems better but not practical), and the FlowNet2 result, which is much better.)

The numbers in the .flo file are very small, but the visualization shows the failure.

Access to training curves

I wanted to have a look at how the loss drops as training progresses.
Is it possible for you to give me access to the training curve plots?

make tools and pycaffe

The make command gives the same message for both targets: "make: Nothing to be done for 'pycaffe'."

LiteFlowNet cnn

Hi,

I need to apply LiteFlowNet to video data and extract flow features to feed to a CNN.

What kind of features can I get from videos, and how can I save them for each video file?

Training stopped at level L2

Dear Dr. Tak Wai HUI

Can you please let me know why you stopped at the level-L2 loss and did not train the network with a full-resolution loss (L1, i.e. 768x384)?

Thanks and Regards,
Arnab

Charbonnier loss in your paper

Dear Sir,

I was going through your paper. There is a statement saying: "We also fine-tuned LiteFlowNet on a mixture of Sintel clean and final training data (LiteFlowNet-ft) using the generalized Charbonnier loss."
I am a little bit confused when I look at the default Caffe parameters:

// Message that stores parameters used by L1LossLayer
message L1LossParameter {
optional bool l2_per_location = 1 [default = false];
optional bool l2_prescale_by_channels = 2 [default = false]; // Old style
optional bool normalize_by_num_entries = 3 [default = false]; // if we want to normalize not by batch size, but by the number of non-NaN entries
optional float epsilon = 4 [default = 1e-2]; // constant for smoothing near zero
optional float plateau = 3001 [default = 0]; // L1 Errors smaller than plateau-value will result in zero loss and no gradient
optional float power = 5 [default = 0.5]; // for robust loss, power < 0.5 => non-convex
}

The loss function always seems to be a Charbonnier loss with alpha = 1 and epsilon^2 = 1e-2.

Did you use different parameters when you explicitly mention the generalized Charbonnier loss?

Help needed regarding training strategy used for finetuning with FlyingThings dataset

Hi,

I tried to replicate the LiteFlowNet Caffe model using the procedure described in the paper. For training on the FlyingChairs dataset, the accuracy reached was close to what was reported in the paper (33.68% compared to 32.59%). However, training on the FlyingThings dataset does not increase the accuracy significantly: I can only reach 32.63% with the FlyingThings dataset (not 28.59%), even after fine-tuning for more than 500k iterations. I had also removed the harmful data as pointed out in FlowNet2. One thing I have noticed is that the training is very slow for the last layer (layer 2), and the accuracy has not increased much by adding this layer. Moreover, the test loss for this layer is much greater than for the other layers.

Can you guide me on the training procedure for fine-tuning with the FlyingThings dataset? I cannot figure out what configurations are used in the solver prototxt file for this fine-tuning.

Thanks for taking your time and reading the post :)

Thanks.

invalid literal for int() with base 10: "b'1241"

I use 'test_iter.py' to test the KITTI dataset with the liteflownet-ft-kitti model, and I get the following error:
Traceback (most recent call last):
File "test_iter.py", line 96, in
img1_size = get_image_size(images[0][idx])
File "test_iter.py", line 15, in get_image_size
dim_list = [int(dimstr) for dimstr in str(subprocess.check_output([img_size_bin, filename])).split(',')]
File "test_iter.py", line 15, in
dim_list = [int(dimstr) for dimstr in str(subprocess.check_output([img_size_bin, filename])).split(',')]
ValueError: invalid literal for int() with base 10: "b'1241"
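
The b'...' prefix suggests the script is running under Python 3, where subprocess.check_output returns bytes and wrapping it in str() keeps the b' literal. A possible fix is to decode the output before parsing; the sketch below assumes the size helper prints comma-separated dimensions, and the binary name is only a placeholder:

```python
import subprocess

def get_image_size(filename, img_size_bin="bin/get_image_size"):
    """Return the image dimensions reported by the size helper as a list of ints.

    Decoding the bytes returned by check_output avoids the "b'1241" parsing error
    under Python 3. The helper binary name here is only a placeholder.
    """
    out = subprocess.check_output([img_size_bin, filename]).decode().strip()
    return [int(dim) for dim in out.split(",")]
```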

how to train the MPI-Sintel dataset

I don't know how to generate the .list file for the MPI-Sintel dataset. Can you tell me how to use this dataset with LiteFlowNet? Thank you!

Questions about finetuning on Kitti

Thank you for your outstanding work!
Here is my question about the fine-tuning process on KITTI. The network performs many augmentations, but the ground-truth values in KITTI are very sparse, so how do you implement augmentation when fine-tuning on KITTI training images? Or do you simply skip the augmentation steps?

why does the input data need to subtract 0.4XX

Hello, when I use the PyTorch version of the code, I found a similar operation as in FlowNet:

input_transform = transforms.Compose([
transforms.Normalize(mean=[0.411,0.432,0.45], std=[1,1,1])
])

I guess LiteFlowNet may have the same operation. What is its effect on the quality of the estimated flow?
I'm sorry for the basic question, but I'm very confused about this.

How to extract features from your trained model

I have followed the feature extraction procedure given on the Caffe website. However, it gives me an error.
Modified code:
caffe_bin = 'bin/extract_features.bin'
args = [caffe_bin, '../trained/' + cnn_model + '.caffemodel', 'tmp/deploy.prototxt', 'conv4_R_L5', '/results', '1', 'leveldb', '-iterations', str(1), '-gpu', '0']

error

E0121 17:42:07.825711 10867 extract_features.cpp:62] Using CPU
E0121 17:42:09.102257 10867 extract_features.cpp:133] Extracting Features
F0121 17:42:09.102833 10867 data_augmentation_layer.cpp:211] Forward CPU Augmentation not implemented.

I already compiled it with GPU support, and test_iter.py works for flow creation, but feature extraction does not.

OpenCV installation issue

First of all, great job! However, I have an installation error related to OpenCV.

After I run make -j 8 all tools pycaffe I get the following error:
#error "OpenCV 4.x+ requires enabled C++11 support"

My OpenCV version is 3.2.0, and I have followed your instructions. Do you have any idea what might cause this error and how I can get past it? Thank you for your time in advance.

find: ‘examples’: No such file or directory

After downloading the code, I only changed my CUDA path in Makefile.config, then:
cd LiteFlowNet
make -j 8 tools pycaffe
something went wrong:
matlab/+caffe/private python/caffe src/gtest src/caffe src/caffe/test src/caffe/layers src/caffe/util src/caffe/proto src/caffe/solvers tools
find: ‘examples’: No such file or directory
find: ‘examples’: No such file or directory
find: ‘examples’: No such file or directory
touch python/caffe/proto/init.py
PROTOC src/caffe/proto/caffe.proto
CXX tools/convert_imageset.cpp

CUDA: 8.0
cuDNN: 5.1

Could you give me some advice?

‘ReadImageToCVMat’ was not declared in this scope

When compiling, I got this error. Please help, thanks!
src/caffe/layers/imgreader_layer.cpp: In instantiation of ‘void caffe::ImgReaderLayer::ReadData() [with Dtype = double]’:
src/caffe/layers/imgreader_layer.cpp:80:1: required from here
src/caffe/layers/imgreader_layer.cpp:45:38: error: ‘ReadImageToCVMat’ was not declared in this scope
Makefile:576: recipe for target '.build_release/src/caffe/layers/imgreader_layer.o' failed

Convert training dataset and training issues

Hello,
I followed the instructions and ran train.py with python train.py -gpu 0 2>&1 | tee ./log.txt. Here is the error info:

F0901 13:01:17.908138 23503 custom_data_layer.cpp:361] Check failed: mdb_env_open(mdb_env_, this->layer_param_.data_param().source().c_str(), 0x20000|0x200000, 0664) == 0 (2 vs. 0) mdb_env_open failed *** Check failure stack trace: *** @ 0x7fdcf54b55cd google::LogMessage::Fail() @ 0x7fdcf54b7433 google::LogMessage::SendToLog() @ 0x7fdcf54b515b google::LogMessage::Flush() @ 0x7fdcf54b7e1e google::LogMessageFatal::~LogMessageFatal() @ 0x7fdcf5ccf423 caffe::CustomDataLayer<>::LayerSetUp() @ 0x7fdcf5d6fada caffe::Net<>::Init() @ 0x7fdcf5d712f1 caffe::Net<>::Net() @ 0x7fdcf5d51d3a caffe::Solver<>::InitTrainNet() @ 0x7fdcf5d53077 caffe::Solver<>::Init() @ 0x7fdcf5d5341a caffe::Solver<>::Solver() @ 0x7fdcf5d3c3a3 caffe::Creator_AdamSolver<>() @ 0x40a6e8 train() @ 0x4075a8 main @ 0x7fdcf3e71830 __libc_start_main @ 0x407d19 _start @ (nil) (unknown)

How do I correct this address alignment error? Thank you, and looking forward to your reply.

Pytorch version

Will you develop a PyTorch version of LiteFlowNet? Besides, I look forward to the prototxt file of the network architecture.

question about the implementation of 'f-lconv'

Hi,

The paper lacks an introduction to the final implementation of 'f-lconv': I cannot figure out a proper way to obtain 'f_lcon5_R' (flow5_R, N×2×W×H) from 'softmax5_R' (N×9×W×H) and 'flow5_S' (N×2×W×H). Is the implementation of the local convolution introduced in the reference paper 'DeepFace: Closing the gap to human-level performance in face verification'?

train/test prototxt

Would you be willing to share the train/test prototxt?

I noticed in the train section we're asked to copy from this directory, which I cannot find:
2. Copy files from LiteFlowNet/models/training_template to a new model folder...

Additionally, in the test script it mentions "deploy.tpl.prototxt", which I also cannot find.

If you could point me in the right direction or share these files, that would be appreciated.

input data is color or gray image

./tools/convert_imageset_and_flow.cpp shows that you load color images, but as far as I know, traditional optical flow estimation algorithms are based on grayscale images. Why do you use color images?

"training_template" folder not found

According to the README, there should be a folder called "training_template" under the "models" folder which contains the prototxt files and scripts for training, but I have not found it there, or anywhere in the repository.

Convert training dataset and training issues

Thanks for sharing your work; I tested your model and obtained excellent results. I want to train my own model on the Sintel dataset, but I encountered some problems.
First, I am confused about how to convert the dataset and flow into LMDB files. I downloaded the Sintel dataset, unzipped it into a training folder and a test folder, and then modified make-lmdbs-train.sh into:

#!/bin/bash
../build/tools/convert_imageset_and_flow.bin training.list training_lmdb 0 lmdb
../build/tools/convert_imageset_and_flow.bin test.list test_lmdb 0 lmdb

I ran the shell script and no errors appeared. I would like to know whether this is the right way to generate new training LMDB files.

Second, I followed the instructions and ran train.py with python train.py -gpu 0 2>&1 | tee ./log.txt. Here is the error info:

mkdir: cannot create directory ‘training’: File exists
I0718 16:23:17.060703 30561 upgrade_proto.cpp:1044] Attempting to upgrade input file specified using deprecated 'solver_type'                      field (enum)': ../solver.prototxt
I0718 16:23:17.060874 30561 upgrade_proto.cpp:1051] Successfully upgraded file specified using deprecated 'solver_type' field                      (enum) to 'type' field (string).
W0718 16:23:17.060883 30561 upgrade_proto.cpp:1053] Note that future Caffe releases will only support 'type' field (string) f                     or a solver's type.
I0718 16:23:17.060953 30561 caffe.cpp:185] Using GPUs 0
I0718 16:23:17.364331 30561 caffe.cpp:190] GPU 0: Tesla K80
I0718 16:23:17.935869 30561 solver.cpp:48] Initializing solver from parameters:
test_iter: 160
test_interval: 5000
base_lr: 4e-05
display: 250
max_iter: 300000
lr_policy: "multistep"
gamma: 0.5
momentum: 0.9
weight_decay: 0.0004
snapshot: 30000
snapshot_prefix: "flow"
solver_mode: GPU
device_id: 0
net: "../model/train.prototxt"
stepvalue: 120000
stepvalue: 160000
stepvalue: 200000
stepvalue: 240000
momentum2: 0.999
type: "Adam"
I0718 16:23:17.936089 30561 solver.cpp:91] Creating training net from net file: ../model/train.prototxt
F0718 16:23:17.936120 30561 io.cpp:36] Check failed: fd != -1 (-1 vs. -1) File not found: ../model/train.prototxt
*** Check failure stack trace: ***
    @     0x7f3c1b986daa  (unknown)
    @     0x7f3c1b986ce4  (unknown)
    @     0x7f3c1b9866e6  (unknown)
    @     0x7f3c1b989687  (unknown)
    @     0x7f3c1c1ab7ed  caffe::ReadProtoFromTextFile()
    @     0x7f3c1c1b4ae4  caffe::ReadNetParamsFromTextFileOrDie()
    @     0x7f3c1bff8b3b  caffe::Solver<>::InitTrainNet()
    @     0x7f3c1bff9c0c  caffe::Solver<>::Init()
    @     0x7f3c1bff9f3a  caffe::Solver<>::Solver()
    @     0x7f3c1c02c203  caffe::Creator_AdamSolver<>()
    @           0x40ed2e  caffe::SolverRegistry<>::CreateSolver()
    @           0x407ec2  train()
    @           0x405cbc  main
    @     0x7f3c1a638ec5  (unknown)
    @           0x40648d  (unknown)
    @              (nil)  (unknown)
('args:', ['-gpu', '0'])
Executing ../bin/caffe train -model ../train.prototxt -solver ../solver.prototxt -gpu 0

I found that there are some errors related to the paths of solver.prototxt and train.prototxt. However, I just followed the steps in your instructions, so how should I deal with this?

Thank you! Looking forward to your reply.

resample_layer.hpp missing

Hi, it seems resample_layer.hpp is not in layers/. I also met an error about OpenCV: ReadImagefromMat not declared in the scope. Is the current repo incomplete, so that it cannot compile successfully? Thanks.

the training is not convergent

I follow the training steps in the paper.
First, I want to train L6, but I found that it does not converge.

The solver.prototxt is:

# THIS IS ONLY AN EXAMPLE. YOU CAN CHANGE THE SETTINGS, IF NECESSARY.
net: "train.prototxt"

base_lr: 1e-4
lr_policy: "multistep"
gamma: 0.5

# TRAIN Batch size: 8
# TEST Batch size: 4
test_iter: 160
test_interval: 5000
max_iter: 300000
snapshot: 30000

momentum: 0.9
weight_decay: 0.0004
display: 250

snapshot_prefix: "L6"
solver_mode: GPU
solver_type: ADAM
momentum2: 0.999

The train.prototxt is:

#######################################
# LiteFlowNet CVPR 2018
# by T.-W. Hui, CUHK
#######################################
layer {
name: "CustomData1"
type: "CustomData"
top: "blob0"
top: "blob1"
top: "blob2y"
include {
phase: TRAIN
}
data_param {
# THIS IS ONLY AN EXAMPLE. YOU CAN CHANGE THE SETTINGS, IF NECESSARY.
source: "/data/chair_lmdb"
batch_size: 6
backend: LMDB
rand_permute: true
rand_permute_seed: 77
slice_point: 3
slice_point: 6
encoding: UINT8
encoding: UINT8
encoding: UINT16FLOW
verbose: true
}
}
layer {
name: "CustomData2"
type: "CustomData"
top: "blob0"
top: "blob1"
top: "blob2y"
include {
phase: TEST
}
data_param {
# THIS IS ONLY AN EXAMPLE. YOU CAN CHANGE THE SETTINGS, IF NECESSARY.
source: "/data/chair_lmdb"
batch_size: 1
backend: LMDB
rand_permute: true
rand_permute_seed: 77
slice_point: 3
slice_point: 6
encoding: UINT8
encoding: UINT8
encoding: UINT16FLOW
verbose: true
}
}

layer {
name: "Eltwise1"
type: "Eltwise"
bottom: "blob2y"
top: "blob2"
eltwise_param {
operation: SUM
coeff: 0.01
}
}
#######################################
# Pre-processing
#######################################
layer {
name: "Eltwise1"
type: "Eltwise"
bottom: "blob0"
top: "blob4"
eltwise_param {
operation: SUM
coeff: 0.00392156862745
}
}
layer {
name: "Eltwise2"
type: "Eltwise"
bottom: "blob1"
top: "blob5"
eltwise_param {
operation: SUM
coeff: 0.00392156862745
}
}
layer {
name: "img0s_aug"
type: "DataAugmentation"
bottom: "blob4"
top: "img0_aug"
top: "blob7"
propagate_down: false
augmentation_param {
max_multiplier: 1
augment_during_test: false
recompute_mean: 3000
mean_per_pixel: false
translate {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.4
prob: 1.0
}
rotate {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.4
prob: 1.0
}
zoom {
rand_type: "uniform_bernoulli"
exp: true
mean: 0.2
spread: 0.4
prob: 1.0
}
squeeze {
rand_type: "uniform_bernoulli"
exp: true
mean: 0
spread: 0.3
prob: 1.0
}
lmult_pow {
rand_type: "uniform_bernoulli"
exp: true
mean: -0.2
spread: 0.4
prob: 1.0
}
lmult_mult {
rand_type: "uniform_bernoulli"
exp: true
mean: 0.0
spread: 0.4
prob: 1.0
}
lmult_add {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
sat_pow {
rand_type: "uniform_bernoulli"
exp: true
mean: 0
spread: 0.4
prob: 1.0
}
sat_mult {
rand_type: "uniform_bernoulli"
exp: true
mean: -0.3
spread: 0.5
prob: 1.0
}
sat_add {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
col_pow {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.4
prob: 1.0
}
col_mult {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.2
prob: 1.0
}
col_add {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.02
prob: 1.0
}
ladd_pow {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.4
prob: 1.0
}
ladd_mult {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0.0
spread: 0.4
prob: 1.0
}
ladd_add {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.04
prob: 1.0
}
col_rotate {
rand_type: "uniform_bernoulli"
exp: false
mean: 0
spread: 1
prob: 1.0
}
crop_width: 448
crop_height: 320
chromatic_eigvec: 0.51
chromatic_eigvec: 0.56
chromatic_eigvec: 0.65
chromatic_eigvec: 0.79
chromatic_eigvec: 0.01
chromatic_eigvec: -0.62
chromatic_eigvec: 0.35
chromatic_eigvec: -0.83
chromatic_eigvec: 0.44
}
}
layer {
name: "aug_params1"
type: "GenerateAugmentationParameters"
bottom: "blob7"
bottom: "blob4"
bottom: "img0_aug"
top: "blob8"
augmentation_param {
augment_during_test: false
translate {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
rotate {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.03
prob: 1.0
}
zoom {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.03
prob: 1.0
}
gamma {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.02
prob: 1.0
}
brightness {
rand_type: "gaussian_bernoulli"
exp: false
mean: 0
spread: 0.02
prob: 1.0
}
contrast {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.02
prob: 1.0
}
color {
rand_type: "gaussian_bernoulli"
exp: true
mean: 0
spread: 0.02
prob: 1.0
}
}
coeff_schedule_param {
half_life: 50000
initial_coeff: 0.5
final_coeff: 1
}
}
layer {
name: "img1s_aug"
type: "DataAugmentation"
bottom: "blob5"
bottom: "blob8"
top: "img1_aug"
propagate_down: false
propagate_down: false
augmentation_param {
max_multiplier: 1
augment_during_test: false
recompute_mean: 3000
mean_per_pixel: false
crop_width: 448
crop_height: 320
chromatic_eigvec: 0.51
chromatic_eigvec: 0.56
chromatic_eigvec: 0.65
chromatic_eigvec: 0.79
chromatic_eigvec: 0.01
chromatic_eigvec: -0.62
chromatic_eigvec: 0.35
chromatic_eigvec: -0.83
chromatic_eigvec: 0.44
}
}
layer {
name: "FlowAugmentation1"
type: "FlowAugmentation"
bottom: "blob2"
bottom: "blob7"
bottom: "blob8"
top: "flow_gt_aug"
augmentation_param {
crop_width: 448
crop_height: 320
}
}
layer {
name: "FlowScaling"
type: "Eltwise"
bottom: "flow_gt_aug"
top: "scaled_flow_gt_aug"
eltwise_param {
operation: SUM
coeff: 0.05
}
}
#######################################
# NetC
#######################################
layer {
name: "conv1"
type: "Convolution"
bottom: "img0_aug"
bottom: "img1_aug"
top: "F0_L1"
top: "F1_L1"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 3
kernel_size: 7
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU1a"
type: "ReLU"
bottom: "F0_L1"
top: "F0_L1"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU1b"
type: "ReLU"
bottom: "F1_L1"
top: "F1_L1"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "F0_L1"
bottom: "F1_L1"
top: "F0_1_L2"
top: "F1_1_L2"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2a_1"
type: "ReLU"
bottom: "F0_1_L2"
top: "F0_1_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU2b_1"
type: "ReLU"
bottom: "F1_1_L2"
top: "F1_1_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "F0_1_L2"
bottom: "F1_1_L2"
top: "F0_2_L2"
top: "F1_2_L2"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2a_2"
type: "ReLU"
bottom: "F0_2_L2"
top: "F0_2_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU2b_2"
type: "ReLU"
bottom: "F1_2_L2"
top: "F1_2_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_3"
type: "Convolution"
bottom: "F0_2_L2"
bottom: "F1_2_L2"
top: "F0_L2"
top: "F1_L2"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2a_3"
type: "ReLU"
bottom: "F0_L2"
top: "F0_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU2b_3"
type: "ReLU"
bottom: "F1_L2"
top: "F1_L2"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "F0_L2"
bottom: "F1_L2"
top: "F0_1_L3"
top: "F1_1_L3"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3a_1"
type: "ReLU"
bottom: "F0_1_L3"
top: "F0_1_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU3b_1"
type: "ReLU"
bottom: "F1_1_L3"
top: "F1_1_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "F0_1_L3"
bottom: "F1_1_L3"
top: "F0_L3"
top: "F1_L3"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3a_2"
type: "ReLU"
bottom: "F0_L3"
top: "F0_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU3b_2"
type: "ReLU"
bottom: "F1_L3"
top: "F1_L3"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "F0_L3"
bottom: "F1_L3"
top: "F0_1_L4"
top: "F1_1_L4"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 96
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU4a_1"
type: "ReLU"
bottom: "F0_1_L4"
top: "F0_1_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU4b_1"
type: "ReLU"
bottom: "F1_1_L4"
top: "F1_1_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "F0_1_L4"
bottom: "F1_1_L4"
top: "F0_L4"
top: "F1_L4"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 96
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU4a_2"
type: "ReLU"
bottom: "F0_L4"
top: "F0_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU4b_2"
type: "ReLU"
bottom: "F1_L4"
top: "F1_L4"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv5"
type: "Convolution"
bottom: "F0_L4"
bottom: "F1_L4"
top: "F0_L5"
top: "F1_L5"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU5a"
type: "ReLU"
bottom: "F0_L5"
top: "F0_L5"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU5b"
type: "ReLU"
bottom: "F1_L5"
top: "F1_L5"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv6"
type: "Convolution"
bottom: "F0_L5"
bottom: "F1_L5"
top: "F0_L6"
top: "F1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
stride: 2
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU6a"
type: "ReLU"
bottom: "F0_L6"
top: "F0_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "ReLU6b"
type: "ReLU"
bottom: "F1_L6"
top: "F1_L6"
relu_param { negative_slope: 0.1 }
}
#######################################
# NetE-M: L6
#######################################
layer {
name: "corr_L6"
type: "Correlation"
bottom: "F0_L6"
bottom: "F1_L6"
top: "corr_L6"
correlation_param {
pad: 3
kernel_size: 1
max_displacement: 3
stride_1: 1
stride_2: 1
}
}
layer {
name: "ReLU_corr_L6"
type: "ReLU"
bottom: "corr_L6"
top: "corr_L6"
relu_param {
negative_slope: 0.1
}
}
layer {
name: "conv1_D1_L6"
type: "Convolution"
bottom: "corr_L6"
top: "conv1_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU1_D1_L6"
type: "ReLU"
bottom: "conv1_D1_L6"
top: "conv1_D1_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_D1_L6"
type: "Convolution"
bottom: "conv1_D1_L6"
top: "conv2_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2_D1_L6"
type: "ReLU"
bottom: "conv2_D1_L6"
top: "conv2_D1_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_D1_L6"
type: "Convolution"
bottom: "conv2_D1_L6"
top: "conv3_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3_D1_L6"
type: "ReLU"
bottom: "conv3_D1_L6"
top: "conv3_D1_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "scaled_flow_D1_L6"
type: "Convolution"
bottom: "conv3_D1_L6"
top: "scaled_flow_D1_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 2
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "Downsample_L6"
type: "Downsample"
bottom: "scaled_flow_gt_aug"
bottom: "scaled_flow_D1_L6"
top: "scaled_flow_label_L6"
propagate_down: false
propagate_down: false
}
layer {
name: "scaled_flow_D1_L6_loss"
type: "L1Loss"
bottom: "scaled_flow_D1_L6"
bottom: "scaled_flow_label_L6"
top: "scaled_flow_D1_L6_loss"
loss_weight: 0.32
l1_loss_param { l2_per_location: true }
}
#######################################
# NetE-S: L6
#######################################
layer {
name: "FlowUnscaling_L6_D2"
type: "Eltwise"
bottom: "scaled_flow_D1_L6"
top: "flow_D1_L6"
eltwise_param {
operation: SUM
coeff: 0.625
}
}
layer {
name: "gxy_L6"
type: "Grid"
top: "gxy_L6"
bottom: "flow_D1_L6"
propagate_down: false
}
layer {
name: "coords_D1_L6"
type: "Eltwise"
bottom: "flow_D1_L6"
bottom: "gxy_L6"
top: "coords_D1_L6"
eltwise_param { coeff: 1 coeff: 1 }
}
layer {
name: "warped_F1_L6"
type: "Warp"
bottom: "F1_L6"
bottom: "coords_D1_L6"
top: "warped_D1_F1_L6"
}
layer {
name: "F_D2_L6"
bottom: "F0_L6"
bottom: "warped_D1_F1_L6"
bottom: "scaled_flow_D1_L6"
top: "F_D2_L6"
type: "Concat"
concat_param { axis: 1 }
}
layer {
name: "conv1_D2_L6"
type: "Convolution"
bottom: "F_D2_L6"
top: "conv1_D2_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU1_D2_L6"
type: "ReLU"
bottom: "conv1_D2_L6"
top: "conv1_D2_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_D2_L6"
type: "Convolution"
bottom: "conv1_D2_L6"
top: "conv2_D2_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU2_D2_L6"
type: "ReLU"
bottom: "conv2_D2_L6"
top: "conv2_D2_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_D2_L6"
type: "Convolution"
bottom: "conv2_D2_L6"
top: "conv3_D2_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "ReLU3_D2_L6"
type: "ReLU"
bottom: "conv3_D2_L6"
top: "conv3_D2_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "scaled_flow_D2_res_L6"
type: "Convolution"
bottom: "conv3_D2_L6"
top: "scaled_flow_D2_res_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 2
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "scaled_flow_D2_L6"
type: "Eltwise"
bottom: "scaled_flow_D1_L6"
bottom: "scaled_flow_D2_res_L6"
top: "scaled_flow_D2_L6"
eltwise_param { operation: SUM }
}
layer {
name: "scaled_flow_D2_L6_loss"
type: "L1Loss"
bottom: "scaled_flow_D2_L6"
bottom: "scaled_flow_label_L6"
top: "scaled_flow_D2_L6_loss"
loss_weight: 0.32
l1_loss_param { l2_per_location: true }
}
#######################################
# NetE-R: L6
#######################################
layer {
name: "slice_scaled_flow_D_L6"
type: "Slice"
bottom: "scaled_flow_D2_L6"
top: "scaled_flow_D_L6_x"
top: "scaled_flow_D_L6_y"
slice_param { axis: 1 slice_point: 1 }
}
layer {
name: "reshaped_scaled_flow_D_L6_x"
type: "Im2col"
bottom: "scaled_flow_D_L6_x"
top: "reshaped_scaled_flow_D_L6_x"
convolution_param { pad: 1 kernel_size: 3 stride: 1 }
}
layer {
name: "reshaped_scaled_flow_D_L6_y"
type: "Im2col"
bottom: "scaled_flow_D_L6_y"
top: "reshaped_scaled_flow_D_L6_y"
convolution_param { pad: 1 kernel_size: 3 stride: 1 }
}
layer {
name: "mean_scaled_flow_D_L6_x"
type: "Reduction"
bottom: "scaled_flow_D_L6_x"
top: "mean_scaled_flow_D_L6_x"
reduction_param { operation: MEAN axis: 1 coeff: -1 }
}
layer {
name: "scaled_flow_D_nomean_L6_x"
type: "Bias"
bottom: "scaled_flow_D_L6_x"
bottom: "mean_scaled_flow_D_L6_x"
top: "scaled_flow_D_nomean_L6_x"
bias_param { axis: 0 }
}
layer {
name: "mean_scaled_flow_D_L6_y"
type: "Reduction"
bottom: "scaled_flow_D_L6_y"
top: "mean_scaled_flow_D_L6_y"
reduction_param { operation: MEAN axis: 1 coeff: -1 }
}
layer {
name: "scaled_flow_D_nomean_L6_y"
type: "Bias"
bottom: "scaled_flow_D_L6_y"
bottom: "mean_scaled_flow_D_L6_y"
top: "scaled_flow_D_nomean_L6_y"
bias_param { axis: 0 }
}
layer {
name: "FlowUnscaling_L6_R"
type: "Eltwise"
bottom: "scaled_flow_D2_L6"
top: "flow_D2_L6"
eltwise_param {
operation: SUM
coeff: 0.625
}
}
layer {
name: "Downsample_img0_aug_L6"
type: "Downsample"
bottom: "img0_aug"
bottom: "flow_D2_L6"
top: "img0_aug_L6"
propagate_down: false
propagate_down: false
}
layer {
name: "Downsample_img1_aug_L6"
type: "Downsample"
bottom: "img1_aug"
bottom: "flow_D2_L6"
top: "img1_aug_L6"
propagate_down: false
propagate_down: false
}
layer {
name: "coords_R_L6"
type: "Eltwise"
bottom: "flow_D2_L6"
bottom: "gxy_L6"
top: "coords_R_L6"
eltwise_param { coeff: 1 coeff: 1 }
}
layer {
name: "warped_img1_aug_L6"
type: "Warp"
bottom: "img1_aug_L6"
bottom: "coords_R_L6"
top: "warped_img1_aug_L6"
}
layer {
name: "img_diff_L6"
type: "Eltwise"
bottom: "img0_aug_L6"
bottom: "warped_img1_aug_L6"
top: "img_diff_L6"
eltwise_param {
operation: SUM
coeff: 1.0
coeff: -1.0
}
}
layer {
name: "channelNorm_L6"
type: "ChannelNorm"
bottom: "img_diff_L6"
top: "channelNorm_L6"
}
layer {
name: "concat_F0_R_L6"
type: "Concat"
bottom: "channelNorm_L6"
bottom: "scaled_flow_D_nomean_L6_x"
bottom: "scaled_flow_D_nomean_L6_y"
bottom: "F0_L6"
top: "concat_F0_R_L6"
concat_param { axis: 1 }
}
layer {
name: "conv1_R_L6"
type: "Convolution"
bottom: "concat_F0_R_L6"
top: "conv1_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv1_R_L6"
type: "ReLU"
bottom: "conv1_R_L6"
top: "conv1_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv2_R_L6"
type: "Convolution"
bottom: "conv1_R_L6"
top: "conv2_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv2_R_L6"
type: "ReLU"
bottom: "conv2_R_L6"
top: "conv2_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv3_R_L6"
type: "Convolution"
bottom: "conv2_R_L6"
top: "conv3_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv3_R_L6"
type: "ReLU"
bottom: "conv3_R_L6"
top: "conv3_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv4_R_L6"
type: "Convolution"
bottom: "conv3_R_L6"
top: "conv4_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv4_R_L6"
type: "ReLU"
bottom: "conv4_R_L6"
top: "conv4_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv5_R_L6"
type: "Convolution"
bottom: "conv4_R_L6"
top: "conv5_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv5_R_L6"
type: "ReLU"
bottom: "conv5_R_L6"
top: "conv5_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "conv6_R_L6"
type: "Convolution"
bottom: "conv5_R_L6"
top: "conv6_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 1 decay_mult: 0 }
convolution_param {
num_output: 32
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" }
engine: CUDNN
}
}
layer {
name: "relu_conv6_R_L6"
type: "ReLU"
bottom: "conv6_R_L6"
top: "conv6_R_L6"
relu_param { negative_slope: 0.1 }
}
layer {
name: "dist_R_L6"
type: "Convolution"
bottom: "conv6_R_L6"
top: "dist_R_L6"
param { lr_mult: 1 decay_mult: 1 }
param { lr_mult: 0 }
convolution_param {
num_output: 9
pad: 1
kernel_size: 3
stride: 1
weight_filler { type: "msra" }
bias_filler { type: "constant" value: 0 }
engine: CUDNN
}
}
layer {
name: "sq_dist_R_L6"
type: "Power"
bottom: "dist_R_L6"
top: "sq_dist_R_L6"
power_param { power: 2 scale: 1 shift: 0 }
}
layer {
name: "neg_sq_dist_R_L6"
type: "Eltwise"
bottom: "sq_dist_R_L6"
top: "neg_sq_dist_R_L6"
eltwise_param { operation: SUM coeff: -1 }
}
layer {
name: "exp_kernel_R_L6"
type: "Softmax"
bottom: "neg_sq_dist_R_L6"
top: "exp_kernel_R_L6"
softmax_param { axis: 1 engine: CUDNN }
}
layer {
name: "f-lconv_L6_x"
type: "Eltwise"
bottom: "reshaped_scaled_flow_D_L6_x"
bottom: "exp_kernel_R_L6"
top: "f-lconv_L6_x"
eltwise_param { operation: PROD }
}
layer {
name: "scaled_flow_R_L6_x"
type: "Convolution"
bottom: "f-lconv_L6_x"
top: "scaled_flow_R_L6_x"
param { lr_mult: 0 }
param { lr_mult: 0 }
convolution_param {
num_output: 1
kernel_size: 1
weight_filler { type: "constant" value: 1 }
}
}
layer {
name: "f-lconv_L6_y"
type: "Eltwise"
bottom: "reshaped_scaled_flow_D_L6_y"
bottom: "exp_kernel_R_L6"
top: "f-lconv_L6_y"
eltwise_param { operation: PROD }
}
layer {
name: "scaled_flow_R_L6_y"
type: "Convolution"
bottom: "f-lconv_L6_y"
top: "scaled_flow_R_L6_y"
param { lr_mult: 0 }
param { lr_mult: 0 }
convolution_param {
num_output: 1
kernel_size: 1
weight_filler { type: "constant" value: 1 }
}
}
layer {
name: "scaled_flow_R_L6"
bottom: "scaled_flow_R_L6_x"
bottom: "scaled_flow_R_L6_y"
top: "scaled_flow_R_L6"
type: "Concat"
concat_param { axis: 1 }
}
layer {
name: "scaled_flow_R_L6_loss"
type: "L1Loss"
bottom: "scaled_flow_R_L6"
bottom: "scaled_flow_label_L6"
top: "scaled_flow_R_L6_loss"
loss_weight: 0.32
l1_loss_param { l2_per_location: true }
}

Compile error. tools/convert_imageset_and_flow.cpp:

When I compile the project, I encounter the following error:
tools/convert_imageset_and_flow.cpp:144:27: error: ‘numeric_limits’ is not a member of ‘std’
tools/convert_imageset_and_flow.cpp:144:47: error: expected primary-expression before ‘short’
tools/convert_imageset_and_flow.cpp:144:47: error: expected ‘;’ before ‘short’
Could you help me?

network training

What is the reason for the stage-wise training scheme in the pre-training procedure? I first tried training like FlowNet2, but a NaN loss occurred with flow regularization. How can I avoid this? I appreciate your reply.

Missing file deploy.tpl.prototxt

hi,

Can you please add this file?

Also, in your paper you mentioned the net 'LiteFlowNetX'. Can you please add the caffemodel and prototxt of this net?

Caffe Test Net loss

While training LiteFlowNet in Caffe, what would be a good train and test net loss (considering the same loss weights: 0.32, 0.08, 0.02, 0.01, 0.005 and 1)?
