

pytorch-video-recognition

Introduction

This repo contains several models for video action recognition, including C3D, R2Plus1D, and R3D, implemented in PyTorch (0.4.0). Currently, we train these models on the UCF101 and HMDB51 datasets. More models and datasets will be available soon!

Note: An interesting online web game based on the C3D model is available here.

Installation

The code was tested with Anaconda and Python 3.5. After installing the Anaconda environment:

  1. Clone the repo:

    git clone https://github.com/jfzhang95/pytorch-video-recognition.git
    cd pytorch-video-recognition
  2. Install dependencies:

    For PyTorch dependency, see pytorch.org for more details.

    For custom dependencies:

    conda install opencv
    pip install tqdm scikit-learn tensorboardX
  3. Download the pretrained model from BaiduYun or GoogleDrive. Currently, only a pretrained C3D model is provided.

  4. Configure your dataset and pretrained model path in mypath.py.

  5. You can choose different models and datasets in train.py.

    To train the model, please do:

    python train.py

Datasets:

I used two different datasets: UCF101 and HMDB51.

The dataset directory tree is shown below.

  • UCF101: make sure to put the files in the following structure:

    UCF-101
    ├── ApplyEyeMakeup
    │   ├── v_ApplyEyeMakeup_g01_c01.avi
    │   └── ...
    ├── ApplyLipstick
    │   ├── v_ApplyLipstick_g01_c01.avi
    │   └── ...
    └── Archery
        ├── v_Archery_g01_c01.avi
        └── ...

After pre-processing, the output dir's structure is as follows:

ucf101
├── ApplyEyeMakeup
│   ├── v_ApplyEyeMakeup_g01_c01
│   │   ├── 00001.jpg
│   │   └── ...
│   └── ...
├── ApplyLipstick
│   ├── v_ApplyLipstick_g01_c01
│   │   ├── 00001.jpg
│   │   └── ...
│   └── ...
└── Archery
    ├── v_Archery_g01_c01
    │   ├── 00001.jpg
    │   └── ...
    └── ...

Note: The HMDB51 dataset's directory tree is similar to UCF101's.
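
For orientation, the video-to-frames step can be sketched with OpenCV as below. The repo's actual pre-processing lives in dataset.py (which also resizes frames and splits the data), so this is only an illustration, with function and path names of my own choosing:

    import os
    import cv2

    def extract_frames(video_path, out_dir):
        """Dump every frame of one video as zero-padded JPEGs (00001.jpg, ...)."""
        os.makedirs(out_dir, exist_ok=True)
        capture = cv2.VideoCapture(video_path)
        index = 0
        while True:
            retained, frame = capture.read()
            if not retained:
                break
            index += 1
            # zero padding keeps lexicographic order equal to temporal order
            cv2.imwrite(os.path.join(out_dir, '{:05d}.jpg'.format(index)), frame)
        capture.release()

    # e.g. extract_frames('UCF-101/Archery/v_Archery_g01_c01.avi',
    #                     'ucf101/Archery/v_Archery_g01_c01')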

Experiments

These models were trained on a machine with an NVIDIA TITAN X 12GB GPU. Note that I split the train/val/test data for each dataset using sklearn. If you want to train models using the official train/val/test splits, look in dataset.py and modify it to your needs.

Currently, I have only trained the C3D model on the UCF101 and HMDB51 datasets. The train/val/test accuracy and loss curves for each experiment are shown below:

  • UCF101 (accuracy/loss curves figure)

  • HMDB51 (accuracy/loss curves figure)

Experiments for other models will be updated soon ...


pytorch-video-recognition's Issues

load checkpoint

    optimizer.step()
      File "/home/z/anaconda3/envs/py3/lib/python3.6/site-packages/torch/optim/sgd.py", line 101, in step
        buf.mul_(momentum).add_(1 - dampening, d_p)
    RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'
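
This error usually means the optimizer state (e.g. SGD momentum buffers) was restored on the CPU while the model parameters live on the GPU. A hedged workaround, applied after loading the checkpoint (it assumes the optimizer state was already restored via load_state_dict):

    import torch

    # move any restored optimizer buffers onto the GPU so they match the
    # device of the parameters they update; `optimizer` is the existing
    # optimizer object from the training script
    for state in optimizer.state.values():
        for key, value in state.items():
            if torch.is_tensor(value):
                state[key] = value.cuda()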

Inference.py question about frame processing

Hello @jfzhang95, thanks for sharing your C3D implementation.

When I run inference with the trained C3D model, I notice that you apply some processing to the centrally cropped frame,

listed here:

    tmp = tmp_ - np.array([[[90.0, 98.0, 102.0]]])

Could you kindly explain the purpose of this operation?

Thanks again.
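
A likely explanation (my inference, not the author's answer): this is per-channel mean subtraction, zero-centering the BGR input the same way the C3D pretraining data was centered, so inference must apply the same shift. A runnable annotation:

    import numpy as np

    # stand-in for the centrally cropped frame: (H, W, 3) in OpenCV's BGR order
    tmp_ = np.zeros((112, 112, 3), dtype=np.float32)

    # the (1, 1, 3) array broadcasts across H and W, subtracting one mean per
    # channel; inputs end up roughly zero-centered, matching pretraining
    tmp = tmp_ - np.array([[[90.0, 98.0, 102.0]]])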

KeyError: 'state_dict'

I ran inference.py with the pretrained model:

    Traceback (most recent call last):
      File "inference.py", line 78, in <module>
        main()
      File "inference.py", line 32, in main
        model.load_state_dict(checkpoint['state_dict'])
    KeyError: 'state_dict'
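
The pretrained c3d-pretrained.pth appears to be a raw weights file, while checkpoints saved by train.py wrap the weights under a 'state_dict' key. A hedged sketch that tolerates both layouts (it does not address the separate key-name remapping that C3D_model's __load_pretrained_weights performs for the pretrained file):

    import torch

    def load_weights(model, path):
        """Load either a raw state_dict or a checkpoint dict wrapping one."""
        checkpoint = torch.load(path, map_location=lambda storage, loc: storage)
        # fall back to the loaded object itself if there is no 'state_dict' key
        state_dict = checkpoint.get('state_dict', checkpoint)
        model.load_state_dict(state_dict)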

Decreasing Test Accuracy in README .png

In your README, the tensorboardX plot for Test Acc is steadily decreasing over all 100 epochs. Is that just a 'typo', or were those your actual results? (figure omitted)

PS Thanks a ton for the code. It has been very helpful!

C3D training from scratch

Hi @jfzhang95, thanks for your code. I'm trying to train the C3D model from scratch using your code. I haven't changed any settings. After several epochs, the training loss becomes NaN and stays there. What should I do to train the C3D model from scratch? I'm using the UCF101 dataset.
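
Not an authoritative fix, but a common mitigation worth noting here: NaN losses when training from scratch often come from exploding gradients, which can be capped by clipping between backward() and step() in train.py (the names and threshold below are assumptions):

    # sketch: inside the existing training loop in train.py; `loss`, `model`
    # and `optimizer` are the loop's objects, and max_norm is a guess to tune
    loss.backward()
    # cap the global gradient norm so one bad batch cannot blow up the weights
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()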

RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR

When I run train.py, this problem occurs. Can anyone solve it?

    Traceback (most recent call last):
      File "/home/common1/huangjing/MyCode/PythonCode/pytorch-video-recognition/train.py", line 201, in <module>
        train_model()
      File "/home/common1/huangjing/MyCode/PythonCode/pytorch-video-recognition/train.py", line 138, in train_model
        loss.backward()
      File "/home/huangjing/miniconda3/envs/PyCharm/lib/python3.7/site-packages/torch/tensor.py", line 150, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/home/huangjing/miniconda3/envs/PyCharm/lib/python3.7/site-packages/torch/autograd/__init__.py", line 99, in backward
        allow_unreachable=True)  # allow_unreachable flag
    RuntimeError: cuDNN error: CUDNN_STATUS_MAPPING_ERROR

The order of frame sequences in VideoDataset is wrong

    frames = sorted([os.path.join(file_dir, img) for img in os.listdir(file_dir)])

This operation will produce something like:

    0001
    00010
    00011
    00012
    ...
    00019
    0002
    00020
    00021
    ...

The correct code should be:

    frames = sorted([os.path.join(file_dir, img) for img in os.listdir(file_dir)], key=lambda x: int(x.split('/')[-1][:-4]))

which yields:

    0001
    0002
    0003
    0004
    ...
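
To see the failure mode and the fix concretely, here is a standalone demo (using os.path helpers instead of string slicing, which also survives Windows path separators):

    import os

    names = ['0001.jpg', '0002.jpg', '00010.jpg', '00011.jpg']
    print(sorted(names))
    # -> ['0001.jpg', '00010.jpg', '00011.jpg', '0002.jpg']  (interleaved)

    # sort on the integer value of the file stem to restore temporal order
    print(sorted(names, key=lambda n: int(os.path.splitext(os.path.basename(n))[0])))
    # -> ['0001.jpg', '0002.jpg', '00010.jpg', '00011.jpg']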

No such file or directory: '/path/to/Models/c3d-pretrained.pth'

When running train.py, I encountered the following problem:

    Traceback (most recent call last):
      File "/tmp/pycharm_project_466/train.py", line 202, in <module>
        train_model()
      File "/tmp/pycharm_project_466/train.py", line 61, in train_model
        model = C3D_model.C3D(num_classes=num_classes, pretrained=True)
      File "/tmp/pycharm_project_466/network/C3D_model.py", line 42, in __init__
        self.__load_pretrained_weights()
      File "/tmp/pycharm_project_466/network/C3D_model.py", line 109, in __load_pretrained_weights
        p_dict = torch.load(Path.model_dir())
      File "/home/zhanghao/anaconda3/envs/python35/lib/python3.5/site-packages/torch/serialization.py", line 356, in load
        f = open(f, 'rb')
    FileNotFoundError: [Errno 2] No such file or directory: '/path/to/Models/c3d-pretrained.pth'

How can I solve it?

KeyError: 'state_dict' when running inference.py file

    ---> 32 model.load_state_dict(checkpoint['state_dict'])
         33 # model.load_state_dict(torch.load("C:\Users####\pytorch-video-recognition\c3d-pretrained.pth.tar", map_location=lambda storage, loc: storage))
         34

    KeyError: 'state_dict'

Unbelievable results: train acc is about 100%!

Your work is rather impressive! I got a train acc of 0.9987220447284345, val acc of 0.9923857868020304, and test acc of 0.9851063829787234. It seems too good to be true; could you check whether these results are correct?

Your pretrained model works very badly in inference.py on any video

Your pretrained model works very badly in inference.py on any video. Why?
I initialize the model as follows:

    model = C3D_model.C3D(num_classes=101, pretrained=True)
    # checkpoint = torch.load('./models/C3D_ucf101_epoch-39.pth', map_location=lambda storage, loc: storage)
    # model.load_state_dict(checkpoint['state_dict'])
    model.to(device)
    model.eval()

Evaluation on official split 01 of UCF101

Hello,

As mentioned in the description, sklearn is used to split the train/val/test data for each dataset. Has anybody tried to train and evaluate the C3D model on the official split 01 of UCF101? I gave it a try, with the validation set identical to the test set, and got the following results (figure omitted):

Is the official split so different from the sklearn split as to justify the much lower accuracy?

The accuracy of C3D trained from scratch is low

The accuracy of C3D trained from scratch is 30% with lr=1e-5, and below 1% with lr=1e-3, both lower than the paper claims. I also noticed that someone added BN to C3D and got about 45% accuracy. Does anybody know why?

train_test_split on ucf 101 dataset

Hi. Thank you for uploading your code.

I have a question about the dataset.py code.

As far as I know, the UCF101 dataset has official train/test list files, but this code divides the data randomly with train_test_split.

So it may cause an overlap problem between the train and test sets.

What do you think about this problem?
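
For reference, one way to avoid that overlap (a sketch, not the repo's code) is to split by UCF101 group id with sklearn's GroupShuffleSplit, so clips cut from the same source video never straddle train and test:

    from sklearn.model_selection import GroupShuffleSplit

    # hypothetical clip names; the gXX token identifies the source video group
    fnames = ['v_Archery_g01_c01', 'v_Archery_g01_c02',
              'v_Archery_g02_c01', 'v_Archery_g02_c02']
    labels = [0, 0, 0, 0]
    groups = [name.split('_')[2] for name in fnames]  # 'g01', 'g01', 'g02', 'g02'

    splitter = GroupShuffleSplit(n_splits=1, test_size=0.5, random_state=0)
    train_idx, test_idx = next(splitter.split(fnames, labels, groups=groups))
    # clips from the same group land entirely in train or entirely in test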

VideoDataset has a bug about data normalization

There is a bug in the normalize(self, buffer) function in dataset.py: it does not normalize the data to [0, 1], which we usually do in a deep learning training process with PyTorch.
I also tested it: if we don't normalize, training fails completely when using the official train/test split of UCF101; after 54 epochs the test accuracy was only around 5%.
If we do normalize, training is fine; after 5 epochs it already reached 8.2% test accuracy.

    def normalize(self, buffer):
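
A minimal sketch of what the fixed method might look like (my assumption about the intended behavior, not the author's patch): subtract the per-channel means, then scale so values land near [0, 1]:

    import numpy as np

    def normalize(self, buffer):
        # buffer: float array of shape [T, H, W, C], BGR channel order
        # zero-center with the per-channel means, then scale down from the
        # [0, 255] pixel range so values land near [0, 1]
        buffer = (buffer - np.array([[[90.0, 98.0, 102.0]]])) / 255.0
        return buffer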

Resuming training from a saved epoch

    optimizer.step()
      File "/home/z/anaconda3/envs/py3/lib/python3.6/site-packages/torch/optim/sgd.py", line 101, in step
        buf.mul_(momentum).add_(1 - dampening, d_p)
    RuntimeError: Expected object of type torch.FloatTensor but found type torch.cuda.FloatTensor for argument #4 'other'

something wrong?

out of memory?

Hi! I tried the code with a TITAN Xp 12GB, but before training starts it reports "RuntimeError: CUDA out of memory".
Did you guys meet the same issue?
Could anyone give me some ideas? Thanks!

Memory Error for Batch Size > 50

The only extra piece of code I added is DataParallel, to use both of my GPUs. So I am training with a batch size of 50 on my 2 GPUs (GeForce GTX TITAN X, 12GB each). It runs the first 10-15 epochs but then spits out a MemoryError. When I tried to train with a batch size of 200, it gave an error from the beginning. I know that reducing the batch size will avoid the error, but can someone suggest an alternative solution?

Downsample step

I was checking your Pytorch implementation of the R2Plus1D model against the implementation in Caffe2 in the repository of the original paper (https://github.com/facebookresearch/VMZ), and I was wondering why you chose to implement the downsample step as a SpatioTemporalConv layer, while in the original implementation they seem to use only one Conv3D layer. They have coded it as follows:

    if (num_filters != input_filters) or down_sampling:
        shortcut_blob = self.model.ConvNd(
            shortcut_blob,
            'shortcut_projection_%d' % self.comp_count,
            input_filters,
            num_filters,
            [1, 1, 1],
            weight_init=("MSRAFill", {}),
            strides=use_striding,
            no_bias=self.no_bias,
        )
        if spatial_batch_norm:
            shortcut_blob = self.model.SpatialBN(
                shortcut_blob,
                'shortcut_projection_%d_spatbn' % self.comp_count,
                num_filters,
                epsilon=1e-3,
                is_test=self.is_test,
            )

Was this design choice on purpose, and if so, could you perhaps tell me why?

Thanks!
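
For comparison, a direct PyTorch transcription of that Caffe2 shortcut would be a single 1x1x1 Conv3d followed by BatchNorm3d; a minimal sketch (function and parameter names are mine, not from either repo):

    import torch.nn as nn

    def projection_shortcut(in_channels, out_channels, stride):
        """1x1x1 3D conv projection, strided when downsampling, plus batch norm."""
        return nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=1,
                      stride=stride, bias=False),
            nn.BatchNorm3d(out_channels, eps=1e-3),  # epsilon matches the Caffe2 code
        )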

The problem in load_frames

    def load_frames(self, file_dir):
        frames = sorted([os.path.join(file_dir, img) for img in os.listdir(file_dir)])
        frame_count = len(frames)
        buffer = np.empty((frame_count, self.resize_height, self.resize_width, 3), np.dtype('float32'))
        for i, frame_name in enumerate(frames):
            frame = np.array(cv2.imread(frame_name)).astype(np.float64)
            frame -= np.array([[[90.0, 98.0, 102.0]]])
            # frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            buffer[i] = frame

        # convert from [T, H, W, C] format to [C, T, H, W] (what PyTorch uses)
        # T = Time, H = Height, W = Width, C = Channels
        buffer = buffer.transpose((3, 0, 1, 2))

        return buffer

In load_frames, why do you subtract [90, 98, 102]?

Pretraining dataset

Hi!
Thanks for the repo! I've recently implemented your model, and I was wondering if you could tell me what dataset was used to obtain the pretrained weights?
Thanks

Train from scratch

Has anyone tried to train C3D from scratch on UCF101? The accuracy stays at 1%. I used other models implemented by myself and the accuracy is also 1%. The learning rate is 1e-5. Does anyone have any ideas?

Why is the training loss always NaN?

I got losses like this:


    100%|██████████| 424/424 [04:10<00:00, 2.24it/s]
    [train] Epoch: 22/100 Loss: nan Acc: 0.010870849580527
    Execution time: 250.25667172999238

    100%|██████████| 108/108 [00:26<00:00, 5.16it/s]
    [val] Epoch: 22/100 Loss: nan Acc: 0.011121408711770158
    Execution time: 26.448329468010343

    100%|██████████| 424/424 [04:09<00:00, 2.23it/s]
    [train] Epoch: 23/100 Loss: nan Acc: 0.010870849580527
    Execution time: 249.90277546200377

    100%|██████████| 108/108 [00:26<00:00, 5.09it/s]
    [val] Epoch: 23/100 Loss: nan Acc: 0.011121408711770158
    Execution time: 26.87914375399123

    100%|██████████| 424/424 [04:09<00:00, 2.24it/s]
    [train] Epoch: 24/100 Loss: nan Acc: 0.010870849580527
    Execution time: 249.9237438449927

    100%|██████████| 108/108 [00:26<00:00, 5.16it/s]
    [val] Epoch: 24/100 Loss: nan Acc: 0.011121408711770158
    Execution time: 26.460865497996565

It's all NaN. What could be the reason?

about mypath.py

I don't know what the directory tree for root_dir and output_dir should look like. When I run train.py, it just reports that files were not found. Please show the directory tree. Thanks.

Dataset not loading

I encountered a problem running train.py, and I think it's related to the dataset. After downloading the UCF101 dataset, I put it in the UCF-101 folder and ran train.py, but I'm getting errors. Does anyone have a solution? Thanks.

C3D training from scratch met RuntimeError: CUDNN_STATUS_EXECUTION_FAILED

Hello @jfzhang95, first of all, thanks for your code.

I'm trying to train C3D from scratch on my own UCF101-style dataset.

I changed the class count from 101 to 2 and set num_workers=1 in train.py, and updated the dataset path in mypath.py; apart from that, I didn't change any settings.

When I run 'python train.py', I get the runtime error below and don't know what happened.

    Traceback (most recent call last):
      File "C:/Users/google/Desktop/pytorch-video-recognition-master/train.py", line 203, in <module>
        train_model()
      File "C:/Users/google/Desktop/pytorch-video-recognition-master/train.py", line 131, in train_model
        outputs = model(inputs)
      File "D:\Anaconda3\envs\video\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "C:\Users\google\Desktop\pytorch-video-recognition-master\network\C3D_model.py", line 46, in forward
        x = self.relu(self.conv1(x))
      File "D:\Anaconda3\envs\video\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__
        result = self.forward(*input, **kwargs)
      File "D:\Anaconda3\envs\video\lib\site-packages\torch\nn\modules\conv.py", line 421, in forward
        self.padding, self.dilation, self.groups)
    RuntimeError: CUDNN_STATUS_EXECUTION_FAILED


The environment is Windows 10, CUDA 9, torch 0.4.0. I'm not sure whether I should run this under Linux instead.

Thanks if anyone can help.

About the Features

Sorry to bother you! I used your pretrained model to extract video features on the HMDB51 dataset. However, I find that every video has very similar features, with each dimension around the value 0.7.
