
hidden-two-stream's Introduction

Hidden Two-Stream Convolutional Networks for Action Recognition

This is the Caffe implementation of "Hidden Two-Stream Convolutional Networks for Action Recognition". Please refer to the paper on arXiv for more details.

Dependencies

OpenCV 3 (installation instructions can be found here)

Tested on Ubuntu 16.04 with a Titan X GPU and cuDNN 5.1

Compiling

To get started, first compile Caffe by configuring a

"Makefile.config" 

then make with

$ make -j 6 all

Training

(this assumes you compiled the code successfully)

Here, we take UCF101 split 1 as an example.

First, go to the folder

cd models/ucf101_split1_unsup_end

Then change the FRAME_PATH in train_rgb_split1.txt and val_rgb_split1.txt to the directory where you store the extracted video frames, e.g.

/FRAME_PATH/WallPushups/v_WallPushups_g21_c06 111 98

This follows the same format as TSN: 111 is the number of frames in that video clip, and 98 is the action label. For more details about how to construct the file lists for training and validation, we refer you to here. A sketch of one way to generate lines in this format is shown below.
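
The official split files already define which clips belong to training and validation, so normally you only need to prepend your own FRAME_PATH. If you want to regenerate such a list from scratch, the following is a minimal, hypothetical Python sketch (not part of the repository); it assumes frames are stored as FRAME_PATH/ClassName/clip_name/image_XXXX.jpg and that class_index is a dict you build yourself mapping class names to labels:

import os

def write_file_list(frame_path, class_index, out_file):
    # one line per clip: "<frame_dir> <num_frames> <label>", matching the format above
    with open(out_file, 'w') as f:
        for class_name in sorted(os.listdir(frame_path)):
            class_dir = os.path.join(frame_path, class_name)
            if not os.path.isdir(class_dir):
                continue
            for clip in sorted(os.listdir(class_dir)):
                clip_dir = os.path.join(class_dir, clip)
                num_frames = len([x for x in os.listdir(clip_dir) if x.endswith('.jpg')])
                f.write('%s %d %d\n' % (clip_dir, num_frames, class_index[class_name]))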

Then you need to download the initialization models (pre-trained temporal stream CNN stacked upon pre-trained MotionNet),

UCF101 split1

Then tune the parameters in end_train_val.prototxt and end_solver.prototxt as needed, or leave them as they are.

Finally, you can simply run

../../build/tools/caffe train -solver=end_solver.prototxt -weights=ucf101_split1_vgg16_init.caffemodel

NOTE: You may well obtain better performance than we report if you carefully tune hyper-parameters such as the loss weights and the learning rate.

Testing

(this assumes you compiled the code successfully)

First, download our trained models:

UCF101 split 1 UCF101 split 2 UCF101 split 3

HMDB51 split 1 HMDB51 split 2 HMDB51 split 3

Then go to this folder

cd models/ucf101_split1_unsup_end/eval_ucf101

Then run

python demo_hidden.py

Before running it, you may need to set the paths in demo_hidden.py correctly, such as model_def_file and model_file (see the sketch below), and also change the FRAME_PATH in testlist01_with_labels.txt.
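
For reference, the assignments to adjust in demo_hidden.py look roughly like the following; the variable names come from the paragraph above, while the values are placeholders for your own paths, not the repository defaults:

model_def_file = './stack_motionnet_vgg16_deploy.prototxt'    # network definition (placeholder path)
model_file = './ucf101_split1_hidden.caffemodel'               # downloaded trained weights (placeholder name)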

After you have both the spatial and the hidden (temporal) predictions, the late fusion code is in the ./test folder; run late_fusion.m to get the final two-stream predictions. A rough Python equivalent of the fusion step is sketched below.
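
If MATLAB is not available, note that the late fusion itself is just a weighted average of the spatial and hidden temporal score matrices. Below is a sketch of that step in Python; the .mat file names and variable keys are illustrative assumptions, not taken from late_fusion.m:

import numpy as np
import scipy.io as sio

# illustrative names; check late_fusion.m for the actual file and variable names
spatial = sio.loadmat('spatial_split1.mat')['scores']             # (num_classes, num_videos)
temporal = sio.loadmat('temporal_hidden_split1.mat')['scores']    # (num_classes, num_videos)
labels = sio.loadmat('labels_split1.mat')['labels'].ravel()       # ground-truth labels

fused = 0.5 * spatial + 0.5 * temporal    # equal-weight late fusion; the weights can be tuned
pred = np.argmax(fused, axis=0)           # predicted class per video
print('two-stream accuracy: %.2f%%' % (100.0 * np.mean(pred == labels)))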

MotionNet

The training and testing code for MotionNet is in the folder

cd models/multiframe_MotionNet

The pretrained model can be downloaded at MotionNet.

Misc

  1. There is a chance that you may get slightly higher or lower accuracy on UCF101 and HMDB51 than the numbers reported in our paper, even when using our provided trained models. This is normal, because your extracted video frames may not be identical to ours, and image quality has an impact on the final performance. Thus, there is no need to raise an issue unless the performance gap is large, e.g. larger than 1%.

  2. Since there are many losses to compute, you may encounter model divergence at the very beginning of training. You can simply reduce the learning rate at first to get a good initialization and then return to the normal schedule, or just rerun the training several times.

TODO

  • Experiment on large-scale action datasets, like Sports-1M and Kinetics

License and Citation

Please cite the following paper in your publications if you use this code or the precomputed results in your research:

@article{hidden_ar_zhu_2017,
  title={{Hidden Two-Stream Convolutional Networks for Action Recognition}},
  author={Yi Zhu and Zhenzhong Lan and Shawn Newsam and Alexander G. Hauptmann},
  journal={arXiv preprint arXiv:1704.00389},
  year={2017}
}

Related Projects

GuidedNet: Guided Optical Flow Learning

Two_Stream Pytorch: PyTorch implementation of two-stream networks for video action recognition

Acknowledgement

The code base borrows from TSN, DispNet and UnsupFlownet. Thanks to them for open-sourcing their code.

hidden-two-stream's Issues

Segmentation fault: while training

The error occurs when starting training.

Training data: UCF 101

System:
Cuda 8.0
Cudnn 5.1
Ubuntu 16.04
OpenCV 3.3
GPU = 2 x nvidia 1060

layer {
name: "FlowDeltasUClean6"
type: "Concat"
bottom: "FlowDeltasUClean6_0"
bottom: "FlowDeltasUClean6_1"
bottom: "FlowDeltasUClean6_2"
bottom: "FlowDeltasUClean6_3"
bottom: "FlowDeltasUClean6_4"
bottom:
I0314 16:56:57.780658 12026 layer_factory.hpp:77] Creating layer data
I0314 16:56:57.780689 12026 net.cpp:91] Creating Layer data
I0314 16:56:57.780700 12026 net.cpp:400] data -> data
I0314 16:56:57.780725 12026 net.cpp:400] data -> label
I0314 16:56:57.780798 12026 multi_frame_data_layer.cpp:33] Opening file: ./train_rgb_split1.txt
I0314 16:56:57.785921 12026 multi_frame_data_layer.cpp:49] A total of 9537 videos.
*** Aborted at 1521043017 (unix time) try "date -d @1521043017" if you are using GNU date ***
PC: @ 0x7fc548430e67 cv::findDecoder()
*** SIGSEGV (@0x49) received by PID 12026 (TID 0x7fc558e71b00) from PID 73; stack trace: ***
@ 0x7fc55274e4b0 (unknown)
@ 0x7fc548430e67 cv::findDecoder()
@ 0x7fc548431a01 cv::imread_()
@ 0x7fc548433e03 cv::imread()
@ 0x7fc557d4ef31 caffe::ReadSegmentMultiRGBToDatum()
@ 0x7fc557bcb49e caffe::MultiFrameDataLayer<>::DataLayerSetUp()
@ 0x7fc557b42753 caffe::BasePrefetchingDataLayer<>::LayerSetUp()
@ 0x7fc557afaac2 caffe::Net<>::Init()
@ 0x7fc557afc2e1 caffe::Net<>::Net()
@ 0x7fc557adbc3a caffe::Solver<>::InitTrainNet()
@ 0x7fc557adcf77 caffe::Solver<>::Init()
@ 0x7fc557add31a caffe::Solver<>::Solver()
@ 0x7fc557d06183 caffe::Creator_SGDSolver<>()
@ 0x40a728 train()
@ 0x4075e8 main
@ 0x7fc552739830 __libc_start_main
@ 0x407d59 _start
@ 0x0 (unknown)
Segmentation fault

Thanks in advance

libprotobuf WARNING during training

Hi, as you suggested, I created my own train_rgb_split1.txt and val_rgb_split1.txt and then started to train the model, but I end up with the following error. Could you please help me?

V6
I0520 11:17:13.860306 16923 net.cpp:262] This network produces output accuracy
I0520 11:17:13.860309 16923 net.cpp:262] This network produces output action_loss
I0520 11:17:13.860812 16923 net.cpp:275] Network initialization done.
I0520 11:17:13.862537 16923 solver.cpp:60] Solver scaffolding done.
I0520 11:17:13.870612 16923 caffe.cpp:129] Finetuning from ucf101_split1_vgg16_init.caffemodel
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 708026968
I0520 11:17:15.177098 16923 net.cpp:753] Ignoring source layer input
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 708026968
I0520 11:17:15.739457 16923 net.cpp:753] Ignoring source layer input
I0520 11:17:15.890053 16923 caffe.cpp:219] Starting Optimization
I0520 11:17:15.890076 16923 solver.cpp:280] Solving end_to_end_train
I0520 11:17:15.890079 16923 solver.cpp:281] Learning Rate Policy: multistep
F0520 11:17:16.851193 16923 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7f8d0e62e5cd google::LogMessage::Fail()
@ 0x7f8d0e630433 google::LogMessage::SendToLog()
@ 0x7f8d0e62e15b google::LogMessage::Flush()
@ 0x7f8d0e630e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f8d0ef59240 caffe::SyncedMemory::to_gpu()
@ 0x7f8d0ef58209 caffe::SyncedMemory::mutable_gpu_data()
@ 0x7f8d0ed738c3 caffe::Blob<>::mutable_gpu_diff()
@ 0x7f8d0ef7f35f caffe::CuDNNConvolutionLayer<>::Backward_gpu()
@ 0x7f8d0ed5cfcb caffe::Net<>::BackwardFromTo()
@ 0x7f8d0ed5d02f caffe::Net<>::Backward()
@ 0x7f8d0ed505a4 caffe::Solver<>::Step()
@ 0x7f8d0ed51029 caffe::Solver<>::Solve()
@ 0x40b497 train()
@ 0x4075a8 main
@ 0x7f8d0d0e2830 __libc_start_main
@ 0x407d19 _start
@ (nil) (unknown)
Aborted (core dumped)

Where were the classes (101, 51) declared?

Hello,
I've been trying to find all occurrences where the class number is declared. So far I have found it in end_train_val.prototxt, stack_motionnet_vgg16_deploy.prototxt and demo_hidden.py. I would like to train the net on my own data, with fewer classes. Is that even possible? Where else do I have to change the class number? I still want to use your pretrained net with similar videos.

Thanks so much in advance.

training & testing time

Hi, when I train, it takes 2 minutes per iteration, and testing is slow as well. Can you give me some advice? I am using one GTX 1080 Ti.

dataset problem

hello, @bryanyzhu
Thanks for sharing your work!
I am an undergraduate working on my graduation project, but I don't have enough GPU power and memory to extract the images. Could you share your data with me through a cloud drive? I'd like the RGB frames and optical flow images of UCF101. Thank you!

about compile

hi,
When I compile with the Makefile, some errors occur saying that the examples file or directory cannot be found, but I cannot find any such file or directory in your project.
I hope to get your help. Thanks!

Data loading for MotionNet training

Hi,

Thanks for sharing your work. I am wondering how you load the 11 frames for MotionNet training. For each video clip, do you take one set of 11 frames for training or multiple sets of 11 frames?

Also, in the supplementary material, you trained MotionNet for 400k iterations. I am wondering how long that takes, how big your training dataset for MotionNet is, and how many GPUs you are using?

Thanks,
Jeff Wang

question about memory

I am using 4 GTX 1080 Ti GPUs, but I still get the error "Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered".
I searched online for a while; it seems my GPU runs out of memory and I should decrease the batch_size. But the error persists even after I decreased the batch_size to 1.
I have seen that the author used a Titan X GPU. I wonder how much memory this process occupies.

about the supervised and unsupervised learning MotionNet

Thank you for your work.
Here is my question: in your other paper, "Guided Optical Flow Learning", you compare supervised and unsupervised optical flow learning; there you first do supervised learning and then unsupervised learning, and unsupervised-only learning gives worse EPE. So why does this paper use only unsupervised learning?

An error in convert_imageset_and_disparity.cpp

I am implementing HTS on a server and I am using CMake instead of Makefile.config, since I have to load the modules I need instead of specifying paths.

System config:
Cudnn 8.0-v5
cuda 7.5
opencv 3.1

But while running 'make all', I get the following error:

[ 98%] Linking CXX executable convert_imageset_and_flow
In file included from /scratch/caffe/Hidden-Two-Stream/include/thirdparty/CImg/CImg.h:208:0,
from /scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp:39:
/scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp: In function ‘int main(int, char**)’:
/scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp:336:14: error: expected unqualified-id before ‘int’
leveldb::Status status =leveldb::DB::Open(options, argv[arg_offset+2], &db);
^
/scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp:336:14: error: expected ‘;’ before ‘int’
In file included from /scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp:17:0:
/scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp:337:11: error: ‘status’ was not declared in this scope
CHECK(status.ok()) << "Failed to open leveldb " << argv[arg_offset+2];
^
/scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp:337:11: note: suggested alternatives:
In file included from /usr/include/boost/filesystem.hpp:17:0,
from /scratch/caffe/Hidden-Two-Stream/include/caffe/util/io.hpp:4,
from /scratch/caffe/Hidden-Two-Stream/tools/convert_imageset_and_disparity.cpp:37:
/usr/include/boost/filesystem/operations.hpp:281:15: note: ‘boost::filesystem::status’
file_status status(const path& p, system::error_code& ec)
^
/usr/include/boost/filesystem/operations.hpp:204:17: note: ‘boost::filesystem::detail::status’
file_status status(const path&p, system::error_code* ec=0);
^
make[2]: *** [tools/CMakeFiles/convert_imageset_and_disparity.dir/convert_imageset_and_disparity.cpp.o] Error 1
make[1]: *** [tools/CMakeFiles/convert_imageset_and_disparity.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 98%] Built target convert_imageset_and_flow

model for motionnet restricts the flow size

Sorry to bother you, but I have trouble with MotionNet: the flow size is restricted by the network, so if I simply resize the flow using numpy, the flow will be wrong. What should I do if I want flow that is the same size as the original images? Thanks!
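
Not an official answer, but a common workaround is to run MotionNet at the size it accepts, then resize the predicted flow back to the original resolution and rescale the flow values by the same ratios, since the u/v displacements are measured in pixels. A minimal sketch, assuming the flow is a float H x W x 2 array:

import cv2

def resize_flow(flow, new_w, new_h):
    # flow: H x W x 2 float array of (u, v) displacements in pixels
    h, w = flow.shape[:2]
    resized = cv2.resize(flow, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
    resized[:, :, 0] *= float(new_w) / w   # scale horizontal displacements
    resized[:, :, 1] *= float(new_h) / h   # scale vertical displacements
    return resized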

list index out of range

Hi sir,
I am ending up with the following error when I try to plot the graphs.

adi@adi-Inspiron-7577:~/Hidden-Two-Stream/tools/extra$ python plot_training_log.py 1 /home/adi/new1.png model_1_train.log
Traceback (most recent call last):
File "plot_training_log.py", line 196, in
plot_chart(chart_type, path_to_png, path_to_logs)
File "plot_training_log.py", line 122, in plot_chart
data = load_data(data_file, x, y)
File "plot_training_log.py", line 93, in load_data
data[1].append(float(fields[field_idx1].strip()))
IndexError: list index out of range

I parsed the log data into the model_1_train.log file and tried to get the train.log and test.log files, but I got the above error. This error happened because it was not able to read lines after 15000 iterations, so I skipped those lines. I solved it by only parsing lines with at least 4 fields (if len(fields) >= 4):
def load_data(data_file, field_idx0, field_idx1):
    data = [[], []]
    with open(data_file, 'r') as f:
        for line in f:
            line = line.strip()
            if line[0] != '#':
                fields = line.split()
                # print "line: " + str(line)
                # print "fields: " + str(fields)
                # print "one: " + str(fields[field_idx0])
                # print "two: " + str(fields[field_idx1])
                if len(fields) >= 4:
                    data[0].append(float(fields[field_idx0].strip()))
                    data[1].append(float(fields[field_idx1].strip()))
    return data

After solving the above error I was able to get the train.log and test.log files.
But when I ran the command python plot_training_log.py 1 /home/adi/new1.png train.log test.log I got the following error:

Traceback (most recent call last):
File "/home/adi/Hidden-Two-Stream/tools/extra/extract_seconds.py", line 64, in
extract_seconds(sys.argv[1], sys.argv[2])
File "/home/adi/Hidden-Two-Stream/tools/extra/extract_seconds.py", line 49, in extract_seconds
assert start_datetime, 'Start time not found'
AssertionError: Start time not found
paste: aux4.txt: No such file or directory
rm: cannot remove 'aux4.txt': No such file or directory
Traceback (most recent call last):
File "/home/adi/Hidden-Two-Stream/tools/extra/extract_seconds.py", line 64, in
extract_seconds(sys.argv[1], sys.argv[2])
File "/home/adi/Hidden-Two-Stream/tools/extra/extract_seconds.py", line 49, in extract_seconds
assert start_datetime, 'Start time not found'
AssertionError: Start time not found
paste: aux3.txt: No such file or directory
rm: cannot remove 'aux3.txt': No such file or directory

But when I ran the command python plot_training_log.py 1 /home/adi/new1.png model_1_train.log, I got the following graphs, which I guess are completely wrong.

[attached plots: test_accuracy, test_loss, test, train_learning_rate]

Could you please help me figure out where I made a mistake, or whether I need to change something so that the start time is found?

thank you..

Comparison on THUMOS14

Hi, I am greatly inspired by your work, but I am confused about some details.
In Table 4 of your paper, there are some experiments on THUMOS14.
I'd like to know exactly what the train and test data are,
because the training data of THUMOS14 is UCF101.
Or is the training data the 1010-video validation set and the test data the 1574-video test set?
Did you use the background data?

Looking forward to your reply. Thanks.

how did you suppress background?

Hi,
With your help I managed to train and test the model properly. Now when I test the model on real-time actions, it predicts actions correctly irrespective of the background and objects, which I did not understand. Could you please tell me how you suppressed the background and focused only on the actions? I just want to know more about the approach you used for this.

Thank you..

Tiny MotionNet

Hello, thank you for the great work!
I want to ask whether the available MotionNet model is Tiny MotionNet or not, and if not, do you have it available elsewhere ready to be used for fine-tuning?

Error While testing

Hi, I'm getting the following error while testing; this is because I changed batch_size from 50 to 5 due to lack of memory. Could you please tell me where I need to change the shape to get it right?
net.blobs['data'].data[...] = np.transpose(rgb[:,:,:,span], (3,2,1,0))
ValueError: could not broadcast input array from shape (5,33,224,224) into shape (50,33,224,224)
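
Not an official fix, but in pycaffe the usual way to handle this is to reshape the network's input blob to the new batch size before copying data, so that the data blob matches the smaller batches. A minimal sketch using the variable names from HiddenTemporalPrediction.py (the reshape calls are standard pycaffe API; exactly where to place them in the script is an assumption):

batch_size = 5
net.blobs['data'].reshape(batch_size, 33, 224, 224)   # N, C, H, W; 33 channels = 11 stacked RGB frames
net.reshape()                                          # propagate the new shape through the network
# the last batch may hold fewer than batch_size clips, so it may need padding or a
# separate reshape to len(span) before the forward pass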

Training Motion Net

Hi Bryan,

Thank you so much for your great work.
I've read your paper and I'm trying to follow your implementation.
I have a few quick questions:

  1. What's the purpose of the file generate_flow_for_ucf101.py? How can I use it to train MotionNet from scratch (if possible, could you please provide instructions for training MotionNet)?

  2. In your paper, you mention stacking MotionNet with a temporal stream, which means stacking two Caffe model files. How can we do that?

Best regards,
John Huang

Not able to run late_fusion.m

Hi Sir,
I am new to Ubuntu and am not able to run the late_fusion.m file with the command
math -script late_fusion.m. How did you run this file?
Thank you..

where is the spatial stream cnn?

hello @bryanyzhu ,
Thanks for your sharing, but your code brings me a lot of confusion.
1. In demo_hidden.py there are a lot of "spatial" words, but the script is for MotionNet and the temporal stream CNN.
2. For late fusion, I cannot find any code to generate the "spatial_quality100_split1.mat" and "temporal_hidden_split1.mat" files; can you help me?
3. What about the speed? Testing the stack_motionnet_vgg16 model is very slow on a 2.4 GHz CPU and a Titan X GPU, about 60 s per video, sometimes even 1400 s.

Timeout Error while Testing

System:
CUDA 8.0
CuDNN 5.1
GPU Nvidia 1060 6GB

The training process worked fine, so I wanted to test it according to your description, but I get the following error:

I0323 16:50:23.343171 19562 net.cpp:753] Ignoring source layer action_loss
we got 3783 test videos
1
F0323 16:50:33.064666 19562 math_functions.cu:79] Check failed: error == cudaSuccess (6 vs. 0)  the launch timed out and was terminated
*** Check failure stack trace: ***
Aborted

I printed the 1 to see where it seems to stop, which is in file Hidden-Two-Stream/models/ucf101_split1_unsup_end/eval_ucf101/HiddenTemporalPrediction.py:

for bb in range(num_batches):
    span = range(batch_size*bb, min(rgb.shape[3],batch_size*(bb+1)))
    net.blobs['data'].data[...] = np.transpose(rgb[:,:,:,span], (3,2,1,0))
    print 1
    output = net.forward()
    print 2
    prediction[:, span] = np.transpose(output[feature_layer])

The error message shows that the error gets thrown in Hidden-Two-Stream/src/caffe/util/math_functions.cu:

void caffe_gpu_memcpy(const size_t N, const void* X, void* Y) {
  if (X != Y) {
    CUDA_CHECK(cudaMemcpy(Y, X, N, cudaMemcpyDefault));  // NOLINT(caffe/alt_fn)
  }
}

The problem is that I can't seem to find the connection from net.forward() to caffe_gpu_memcpy and where it gets called. Can you help me solve this error and tell me where caffe_gpu_memcpy is called?

Thank you in advance.

out of memory while testing

I'm using a GTX 1080 and ran out of memory while training, which I solved by reducing the batch size from 8 to 4. But I still run out of memory while testing, so is there any way to cut the batch size in testing?

question

I0712 16:28:16.027132 24234 multi_frame_data_layer.cpp:33] Opening file: ./train_rgb_split1.txt
I0712 16:28:16.027184 24234 multi_frame_data_layer.cpp:49] A total of 1 videos.
*** buffer overflow detected ***: ../../build/tools/caffe terminated

what is action_loss

Hi Sir,
During training I encountered the term action_loss; what is it and how is it evaluated? I would also like to plot a graph of train loss vs. iterations, but I am not able to plot graphs using the default scripts. Could you please tell me which parameter in the log file I should use as the train_loss?

thanks..

About Flow loss

Hi Sir,
Could you please explain the unsupervised loss? In the paper, motion information is calculated from two frames, and the image is reconstructed using the predicted flow and frame 2. Here, I understood that "img10Norm" is frame 1, "predict_flow6_0" is frame 2, and "downsampled_img10_6" is the predicted field; is that correct? I did not understand how the image is reconstructed from "downsampled_img10_6".
layer {
  name: "WarpDownsample6_10"
  type: "Downsample"
  bottom: "img10Norm"
  bottom: "predict_flow6_0"
  top: "downsampled_img10_6"
  propagate_down: false
  propagate_down: false
}

# warp frame 2 back to frame 1

layer {
  name: "FlowScale6_0"
  type: "Scale"
  bottom: "predict_flow6_0"
  top: "FlowScale6_0"
  scale_param {
    filler {
      type: "constant"
      value: 0.625
    }
    bias_term: false
  }
  param {
    lr_mult: 0
  }
}

Some confuse about the definition of "bordermask"

Hello Yi,
Thanks for your fantastic work and code!
I am new to video tasks and I have a question about the definition of "bordermask":

  1. What are the DummyData layers "flowBorderMask" and "borderMask" used for?
  2. Why do we need these masks when computing the CharbonnierLoss?

Thanks so much in advance

About training question

Dear Sir,
During training I run into the following problem:

I0717 14:36:00.550038  2451 layer_factory.hpp:77] Creating layer data
I0717 14:36:00.550062  2451 net.cpp:91] Creating Layer data
I0717 14:36:00.550067  2451 net.cpp:400] data -> data
I0717 14:36:00.550086  2451 net.cpp:400] data -> label
I0717 14:36:00.550129  2451 multi_frame_data_layer.cpp:33] Opening file: /home/zh/Hidden-Two-Stream-master/models/ucf101_split1_unsup_end/train_rgb_split1.txt
I0717 14:36:00.554026  2451 multi_frame_data_layer.cpp:49] A total of 9537 videos.
E0717 14:36:00.554062  2451 io.cpp:453] Could not load file /home/zh/Hidden-Two-Stream-master/UCF-101/WallPushups/v_WallPushups_g21_c06/image_0060.jpg
F0717 14:36:00.554143  2451 multi_frame_data_layer.cpp:65] Check failed: ReadSegmentMultiRGBToDatum(lines_[lines_id_].first, lines_[lines_id_].second, offsets, new_height, new_width, new_length, &datum) 

How is image_0060.jpg generated by this code? Also, I cannot download your pre-trained models. If it is convenient, could you send them to my email? My email: [email protected]
Thanks!

Error while running demo_hidden.py

Hello, thanks for your sharing. I got the following error while running demo_hidden.py:
we got 3783 test videos
/home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01
img_file is /home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01/image_0001.jpg
OpenCV Error: Assertion failed (ssize.width > 0 && ssize.height > 0) in resize, file /tmp/binarydeb/ros-kinetic-opencv3-3.3.1/modules/imgproc/src/resize.cpp, line 3939
Traceback (most recent call last):
File "demo_hidden.py", line 87, in
main()
File "demo_hidden.py", line 61, in main
start_frame)
File "/home/user/workspace/action_recognition/Hidden-Two-Stream-master/models/ucf101_split1_unsup_end/eval_ucf101/HiddenTemporalPrediction.py", line 39, in HiddenTemporalPrediction
img = cv2.resize(img, dims[1::-1])
cv2.error: /tmp/binarydeb/ros-kinetic-opencv3-3.3.1/modules/imgproc/src/resize.cpp:3939: error: (-215) ssize.width > 0 && ssize.height > 0 in function resize

I have changed the FRAME_PATH in testlist01_with_labels.txt like:
/home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01 1
/home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c02 1
/home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c03 1
It seems it can't find the frame. I printed img_file in HiddenTemporalPrediction.py, and it prints:
/home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01/image_0001.jpg
However, I only have v_ApplyEyeMakeup_g01_c01.avi rather than v_ApplyEyeMakeup_g01_c01/image_0001.jpg. So how can I get image_0001.jpg?
Thanks. Waiting for your reply.
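
The repository expects the frames to be extracted from the .avi files beforehand. One hypothetical way to do this with OpenCV (a sketch only; the authors' exact extraction tool and settings are not specified here, but the image_%04d.jpg naming matches the paths printed above):

import cv2
import os

def extract_frames(video_path, out_dir):
    # dump every frame of the video as image_0001.jpg, image_0002.jpg, ...
    if not os.path.isdir(out_dir):
        os.makedirs(out_dir)
    cap = cv2.VideoCapture(video_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        idx += 1
        cv2.imwrite(os.path.join(out_dir, 'image_%04d.jpg' % idx), frame)
    cap.release()
    return idx

# e.g. extract_frames('/home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi',
#                     '/home/user/UCF101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01')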

information about pre-requisites

Hi, thank you so much for the code. This project is very interesting and I want to implement it on my computer.
My PC specifications:
ubuntu 16.04
cuda 8.0, cudnn 5.1
nvidia Gtx1060
RAM 16GB.
I installed Caffe and compiled it, and I am almost ready to start the project. Before that, I want to know a few things about the extracted video frames, and whether it is possible to test real-time action recognition with this project. I searched for the extracted frames in your repository but couldn't find any. Could you please tell me where the extracted frames folder is in the repository and how those frames relate to the lists in train_rgb_split1.txt?
I think these questions may be basic for you, but I am a beginner. I hope you will answer patiently.
thank you.

when running demo_hidden.py

Hi,
When I run demo_hidden.py following your tutorial, it will have the following error:
I0323 16:50:23.343171 19562 net.cpp:753] Ignoring source layer action_loss
we got 3783 test videos
/home/zh/Hidden-Two-Stream-master/UCF-101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01 (0, 13)
/home/zh/Hidden-Two-Stream-master/UCF-101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c02 (0, 0)
Killed
I don't know why it kills the process (my card is a GTX 1080 Ti, and I changed the batch_size to 10).

the accuracy I get is 69%

I followed the readme to train the model without changing any parameter, but the accuracy I got was about 69% on UCF101 split 1, which is much lower than the 84.88% mentioned in your paper.
Could you please give me some advice? Thanks!

Two-frame MotionNet

Hi Yi,

Amazing work, and thanks a ton for the code. Just a question: to replicate the MotionNet (2-frame) experiment in Table III of your paper, do we just need to change new_length from 11 to 2 (and, of course, wherever it propagates)? Or do we also need to change other parameters, such as the number of filters in the 'Convolution' layers?

Cheers!

getting error during training

I followed the steps given for training, such as changing the frame path in train_rgb_split1.txt and val_rgb_split1.txt,
and downloaded the UCF101 split1 model. When I run the given training command, I get the following error. I'm new to Caffe; can you please tell me the steps for training a model? Thank you.

F0513 15:45:57.349095 15471 solver.cpp:440] Cannot write to snapshot prefix './logs_end/vgg16_end'. Make sure that the directory exists and is writeable.
*** Check failure stack trace: ***
@ 0x7fa4e2e435cd google::LogMessage::Fail()
@ 0x7fa4e2e45433 google::LogMessage::SendToLog()
@ 0x7fa4e2e4315b google::LogMessage::Flush()
@ 0x7fa4e2e45e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7fa4e376a3ed caffe::Solver<>::CheckSnapshotWritePermissions()
@ 0x7fa4e376d6c4 caffe::Solver<>::Init()
@ 0x7fa4e376da7a caffe::Solver<>::Solver()
@ 0x7fa4e34f6ca3 caffe::Creator_SGDSolver<>()
@ 0x40a6e8 train()
@ 0x4075a8 main
@ 0x7fa4e18f7830 __libc_start_main
@ 0x407d19 _start
@ (nil) (unknown)
Aborted (core dumped)
