eriklindernoren / action-recognition

Exploration of different solutions to action recognition in video, using neural networks implemented in PyTorch.

Shell 0.76% Python 99.24%

action-recognition video-classification pytorch

action-recognition's Issues

Terminology mistake

Your model is different from the ConvLSTM proposed in this paper: https://arxiv.org/abs/1506.04214, where a 2D LSTM is applied to the output of each convolution layer in a CNN, an architecture usually used for pixel-level video prediction.

This implementation should be called CNN-LSTM.

Inspired by your work, I updated this repo to use fastai2

You can have a look here:
https://github.com/tcapelle/action_recognition
Thank you!

test error

Hello, I have all the requirements to run test_on_video.py but I keep getting a path_to_video error.

Here it is:

Traceback (most recent call last):
  File "test_on_video.py", line 38, in <module>
    labels = sorted(list(set(os.listdir(opt.video_path))))
NotADirectoryError: [WinError 267] The directory name is invalid: 'C:/Users/Windows/Documents/Action-Recognition/test/v_Surfing_g03_c04.avi'

I have tried everything but it is still not working.

What can be done?
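The traceback shows `os.listdir` being called on `opt.video_path`, which here is a single `.avi` file rather than a folder, so the call fails. Below is a minimal sketch of a guard, assuming the class labels are meant to be read from a directory that holds one sub-folder per class (for example the dataset directory). The helper name `list_labels` and that directory layout are assumptions, not code from the repository.

```python
import os


def list_labels(labels_dir):
    """Return sorted class names, one per sub-folder of labels_dir (hypothetical helper)."""
    if not os.path.isdir(labels_dir):
        # os.listdir raises NotADirectoryError when handed a file such as a video.
        raise NotADirectoryError(
            f"Expected a directory of class folders, got: {labels_dir!r}"
        )
    # Keep only sub-folders so stray files do not become class names.
    return sorted(
        name
        for name in os.listdir(labels_dir)
        if os.path.isdir(os.path.join(labels_dir, name))
    )
```

With a helper like this, the video file would be passed separately for decoding, and only a folder path would ever reach `os.listdir`.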

size mismatch for output_layers.3.bias: copying a param with shape torch.Size([101]) from checkpoint, the shape in current model is torch.Size([105]).

I get this error when I run the test_on_video.py file.

RuntimeError Traceback (most recent call last)
in ()
----> 1 model.load_state_dict(torch.load(checkpoint_model))
2 model.eval()

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
828 if len(error_msgs) > 0:
829 raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
--> 830 self.__class__.__name__, "\n\t".join(error_msgs)))
831 return _IncompatibleKeys(missing_keys, unexpected_keys)
832

RuntimeError: Error(s) in loading state_dict for ConvLSTM:
size mismatch for output_layers.3.weight: copying a param with shape torch.Size([101, 1024]) from checkpoint, the shape in current model is torch.Size([105, 1024]).
size mismatch for output_layers.3.bias: copying a param with shape torch.Size([101]) from checkpoint, the shape in current model is torch.Size([105]).
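The two messages say the checkpoint's final layer has 101 outputs while the freshly built model has 105. Another issue on this page shows the model being built with `num_classes=len(labels)`, so one plausible cause is that the label list contains more entries than the 101 classes the checkpoint was trained on. The sketch below compares parameter shapes before loading. Plain dictionaries of shapes stand in for real state dicts here. With PyTorch they could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`. The helper name `shape_mismatches` is hypothetical.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """List (name, checkpoint_shape, model_shape) for shared parameters whose shapes differ."""
    shared = sorted(checkpoint_shapes.keys() & model_shapes.keys())
    return [
        (name, checkpoint_shapes[name], model_shapes[name])
        for name in shared
        if checkpoint_shapes[name] != model_shapes[name]
    ]


# Shapes copied from the error message above.
checkpoint = {"output_layers.3.weight": (101, 1024), "output_layers.3.bias": (101,)}
model = {"output_layers.3.weight": (105, 1024), "output_layers.3.bias": (105,)}

for name, ckpt_shape, model_shape in shape_mismatches(checkpoint, model):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If only the classifier layer differs, as here, checking how many labels the script found is a reasonable first step.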

latent_att not defined before.

Action-Recognition/models.py

Line 68 in b43ec09

latent_att = self.latent_attention(latent_att)

train.py: error: ambiguous option: --img_dim could match --img_dim_H, --img_dim_W

I get this error when I run the train.py file.

    print(sequence[0])
IndexError: list index out of range
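The "ambiguous option" message in this issue's title comes from `argparse`, which by default accepts any unambiguous prefix of a long option. `--img_dim` is a prefix of both `--img_dim_H` and `--img_dim_W`, so the parser cannot choose between them. Here is a small sketch of the behaviour. The two option names are taken from the error message, while the defaults and the `build_parser` helper are assumptions.

```python
import argparse


def build_parser(allow_abbrev=True):
    # Option names come from the error message; the defaults are assumed.
    parser = argparse.ArgumentParser(allow_abbrev=allow_abbrev)
    parser.add_argument("--img_dim_H", type=int, default=112)
    parser.add_argument("--img_dim_W", type=int, default=112)
    return parser


# Spelling out both options parses cleanly.
opt = build_parser().parse_args(["--img_dim_H", "112", "--img_dim_W", "160"])
print(opt.img_dim_H, opt.img_dim_W)

# "--img_dim" matches both options, so argparse reports an error and exits.
try:
    build_parser().parse_args(["--img_dim", "112"])
except SystemExit:
    print("ambiguous option rejected")
```

Passing `allow_abbrev=False` turns prefix matching off, in which case `--img_dim` is reported as an unrecognized argument instead. Either way the command line has to name the full options.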

Testing performance on official UCF-101 split 1

I can only get about 76% on the UCF-101 split 1 test set, and the model seems to be overfitting.
How can I fix the overfitting problem?

Running test_on_video.py encountered "unexpected keyword argument 'input_shape'" error

python3 test_on_video.py --video_path data/UCF-101/SoccerPenalty/v_SoccerPenalty_g01_c01.avi --checkpoint_model model_checkpoints/ConvLSTM_150.pth

Namespace(channels=3, checkpoint_model='model_checkpoints/ConvLSTM_150.pth', dataset_path='data/UCF-101-frames', image_dim=112, latent_dim=512, video_path='data/UCF-101/SoccerPenalty/v_SoccerPenalty_g01_c01.avi')
Traceback (most recent call last):
  File "test_on_video.py", line 41, in <module>
    model = ConvLSTM(input_shape=input_shape, num_classes=len(labels), latent_dim=opt.latent_dim)
TypeError: __init__() got an unexpected keyword argument 'input_shape'

list index out of range when start training

Hello, I'm trying to start training with the UCF-101 dataset.

I've made a few adaptations to your code to get where I am now.

I downloaded the UCF-101 dataset in .avi format. I then extracted all of the frames using extract_frames.py.

After that, I downloaded the train and test split files for the dataset from here:

yjxiong/temporal-segment-networks#177

For the split_path argument I'm passing the path to the folder (named ucfTrainTestlist) containing classInd.txt, testlist01.txt, and trainlist01.txt.

Here are the args I'm passing to start training:

python train.py --dataset_path data/frames/-frames/data/frames --split_path ucfTrainTestlist/ --split_number 1

and here is the error I'm getting:

--- Epoch 0 ---
Traceback (most recent call last):
  File "train.py", line 115, in <module>
    for batch_i, (X, y) in enumerate(train_dataloader):
  File "C:\Users\Windows\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __next__
    return self._process_data(data)
  File "C:\Users\Windows\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 846, in _process_data
    data.reraise()
  File "C:\Users\Windows\Anaconda3\lib\site-packages\torch\_utils.py", line 369, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\Windows\Anaconda3\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Windows\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Windows\Anaconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Windows\Documents\Action-Recognition\dataset.py", line 78, in __getitem__
    image_paths = self._pad_to_length(image_paths)
  File "C:\Users\Windows\Documents\Action-Recognition\dataset.py", line 67, in _pad_to_length
    left_pad = sequence[0]
IndexError: list index out of range

I point the dataset_path to the folder called frames and inside this folder the video frames are divided in sub folders. These sub folders are named after the names of each video.
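The failing line, `left_pad = sequence[0]` inside `_pad_to_length`, can only raise this IndexError when the list of frame paths for a video is empty. That suggests the dataset class found no frames where it expected them, for instance because the folder layout differs from what it assumes. Below is a sketch of a guarded padding function that fails with a clearer message. The signature and the padding strategy (repeating the first frame on the left, inferred from the name `left_pad`) are assumptions, not the repository's code.

```python
def pad_to_length(sequence, length):
    """Left-pad a list of frame paths to `length` by repeating its first element."""
    if not sequence:
        # An empty list means no frames were found for this video.
        raise ValueError(
            "No frames found for this video; check dataset_path and the split files."
        )
    missing = max(0, length - len(sequence))
    return [sequence[0]] * missing + list(sequence)
```

For example, `pad_to_length(["f0.jpg", "f1.jpg"], 4)` repeats `f0.jpg` twice at the front, while an empty list now reports the likely cause instead of a bare index error.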

When I try to use your pretrained model, it gives some errors

Missing key(s) in state_dict: "lstm.lstm.weight_ih_l0_reverse", "lstm.lstm.weight_hh_l0_reverse", "lstm.lstm.bias_ih_l0_reverse", "lstm.lstm.bias_hh_l0_reverse", "output_layers.0.weight", "output_layers.0.bias", "output_layers.1.weight", "output_layers.1.bias", "output_layers.1.running_mean", "output_layers.1.running_var", "output_layers.3.weight", "output_layers.3.bias", "attention_layer.weight", "attention_layer.bias".
Unexpected key(s) in state_dict: "lstm.final.0.weight", "lstm.final.0.bias", "lstm.final.1.weight", "lstm.final.1.bias", "lstm.final.1.running_mean", "lstm.final.1.running_var", "lstm.final.1.num_batches_tracked", "lstm.final.3.weight", "lstm.final.3.bias".
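The checkpoint contains `lstm.final.*` keys, while the current model expects `output_layers.*`, `attention_layer.*` and the `_reverse` LSTM weights. This pattern is consistent with a checkpoint saved from a different version of the model definition than the one being instantiated, although the issue does not confirm it. The sketch below lists both sides of the difference before loading. Plain lists of key names stand in for real state dicts. With PyTorch they would typically come from the loaded checkpoint's keys and `model.state_dict().keys()`. The helper name `diff_keys` is hypothetical.

```python
def diff_keys(checkpoint_keys, model_keys):
    """Return (missing, unexpected) key lists, mirroring the wording of the error."""
    checkpoint_keys, model_keys = set(checkpoint_keys), set(model_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model wants, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)   # checkpoint has, model lacks
    return missing, unexpected


# A few key names copied from the error message above.
checkpoint_keys = ["lstm.final.0.weight", "lstm.final.0.bias"]
model_keys = ["output_layers.0.weight", "output_layers.0.bias", "attention_layer.weight"]

missing, unexpected = diff_keys(checkpoint_keys, model_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

When whole groups of layers differ like this, renaming keys is unlikely to be enough. The model code and the checkpoint need to come from matching versions.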

Softmax in Model Output, then using CE Loss

Thank you for the interesting work here.

I've just encountered one issue with the code. The ConvLSTM model outputs softmax as the last layer, but then in the training script CrossEntropyLoss is performed. CE Loss already performs a softmax on the input, so you do not want to do softmax on a softmax twice. Instead, the ConvLSTM should output the classification (Linear) layer prior to the Softmax to put into CE loss. The softmax probabilities can be computed later in the test set evaluation step to determine the test accuracy.

Please let me know if others agree with this small change to the code.
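To illustrate the point numerically, here is a small pure-Python sketch. Cross-entropy is written as log-softmax followed by negative log-likelihood, which is how PyTorch documents `CrossEntropyLoss`. The example scores are arbitrary and the function names are made up for this illustration.

```python
import math


def softmax(values):
    # Subtract the maximum for numerical stability.
    shifted = [v - max(values) for v in values]
    exps = [math.exp(v) for v in shifted]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, target):
    """Cross-entropy as log-softmax followed by negative log-likelihood."""
    return -math.log(softmax(scores)[target])


logits = [4.0, 1.0, -2.0]  # arbitrary example scores; class 0 is the correct one
loss_on_logits = cross_entropy(logits, 0)
loss_on_probs = cross_entropy(softmax(logits), 0)  # softmax applied twice
print(loss_on_logits, loss_on_probs)
```

Because probabilities lie in [0, 1], the second softmax can never produce a confident prediction. For K classes the loss cannot fall below log(1 + (K - 1)/e), so training keeps seeing a large loss even when the model is already correct.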

Also, what type of Attention is being used? Is it the dot-product?

ValueError: not enough values to unpack (expected 2, got 1)

Namespace(dataset_path='UCF-101')
Traceback (most recent call last):
  File "extract_frames.py", line 31, in <module>
    sequence_type, sequence_name = video_path.split(".avi")[0].split("/")[-2:]
ValueError: not enough values to unpack (expected 2, got 1)
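The expression `split("/")[-2:]` returns a single element whenever the path contains no forward slash, which happens with Windows-style backslash separators or when a video sits directly in the working directory. Which case applies here is not stated in the issue. The sketch below extracts the class folder and video name in a separator-independent way. The helper name `split_video_path` and the `<class>/<video>.avi` layout are assumptions.

```python
from pathlib import PurePosixPath


def split_video_path(video_path):
    """Return (class_folder, video_name) for a path like 'UCF-101/Surfing/v_x.avi'."""
    # Normalise Windows separators so the same logic works on both platforms.
    path = PurePosixPath(str(video_path).replace("\\", "/"))
    if not path.parent.name:
        raise ValueError(f"Expected <class>/<video>.avi, got: {video_path!r}")
    return path.parent.name, path.stem


print(split_video_path("UCF-101/Surfing/v_Surfing_g03_c04.avi"))
print(split_video_path("UCF-101\\Surfing\\v_Surfing_g03_c04.avi"))
```

Both calls yield the same pair, and a bare file name raises an error that names the expected layout.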

Small bug fix

Thanks for sharing the repo!

extract_frames(video_path, time_left)
should be
extract_frames(video_path)

Action-Recognition/data/extract_frames.py

Line 42 in b43ec09

extract_frames(video_path, time_left),

Hello Mr.Linder-Norén

Hello Mr.Linder-Norén, I am very sorry to bother you.
I have downloaded your Action-Recognition code and have learned a lot.
But I still have a question: can you teach me how to use the attention module in your model?
Thank you very much for your reply.

How can I solve it?

RuntimeError: Error(s) in loading state_dict for ConvLSTM:
Missing key(s) in state_dict: "lstm.lstm.weight_ih_l0_reverse", "lstm.lstm.weight_hh_l0_reverse", "lstm.lstm.bias_ih_l0_reverse", "lstm.lstm.bias_hh_l0_reverse", "output_layers.0.weight", "output_layers.0.bias", "output_layers.1.weight", "output_layers.1.bias", "output_layers.1.running_mean", "output_layers.1.running_var", "output_layers.3.weight", "output_layers.3.bias", "attention_layer.weight", "attention_layer.bias".
Unexpected key(s) in state_dict: "lstm.final.0.weight", "lstm.final.0.bias", "lstm.final.1.weight", "lstm.final.1.bias", "lstm.final.1.running_mean", "lstm.final.1.running_var", "lstm.final.1.num_batches_tracked", "lstm.final.3.weight", "lstm.final.3.bias"

Error: Regarding Loading Pre-Trained Weights

I am using the pre-trained weights that you provided, but I am facing this problem. Could you please guide me to resolve this issue? Thank you.

Namespace(channels=3, checkpoint_model='ConvLSTM_150.pth', dataset_path='data/UCF-101-frames', image_dim=224, latent_dim=512, video_path='1.mp4')
Traceback (most recent call last):
  File "test_on_video.py", line 49, in <module>
    model.load_state_dict(torch.load(opt.checkpoint_model))
  File "/home/naeem/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for ConvLSTM:
Missing key(s) in state_dict: "lstm.lstm.weight_ih_l0_reverse", "lstm.lstm.weight_hh_l0_reverse", "lstm.lstm.bias_ih_l0_reverse", "lstm.lstm.bias_hh_l0_reverse", "lstm.output_layers.0.weight", "lstm.output_layers.0.bias", "lstm.output_layers.1.weight", "lstm.output_layers.1.bias", "lstm.output_layers.1.running_mean", "lstm.output_layers.1.running_var", "lstm.output_layers.3.weight", "lstm.output_layers.3.bias".
Unexpected key(s) in state_dict: "lstm.final.0.weight", "lstm.final.0.bias", "lstm.final.1.weight", "lstm.final.1.bias", "lstm.final.1.running_mean", "lstm.final.1.running_var", "lstm.final.1.num_batches_tracked", "lstm.final.3.weight", "lstm.final.3.bias".

AttributeError: 'Namespace' object has no attribute 'sequence_length'

While testing with test.py, I get the error "'Namespace' object has no attribute 'sequence_length'". I trained my own model and am not using the default one. If I specify sequence_length myself, for example 40, it gives the error "IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)". Please help.

How long does it take to predict one video on a CPU machine?

Can anybody share the prediction time?

Train issue

This project is really interesting.

I tried to train the model, but I always get a random "list index out of range" error during the training phase.

I used torch 1.2 through 1.3.1 with CUDA 10.1, and always get the same error.

Does anyone have an idea how to fix that?

python3 train.py --dataset_path data/UCF-101-frames/ --split_path data/ucfTrainTestlist --num_epochs 200 --sequence_length 20 --img_dim 112 --latent_dim 512 --batch_size 64
Namespace(batch_size=64, channels=3, checkpoint_interval=5, checkpoint_model='', dataset_path='data/UCF-101-frames/', img_dim=112, latent_dim=512, num_epochs=200, sequence_length=20, split_number=1, split_path='data/ucfTrainTestlist')
cuda
--- Epoch 0 ---
[Epoch 0/200] [Batch 22/150] [Loss: 4.612639 (4.613988), Acc: 4.69% (2.31%)] ETA: 8:49:23.620145
Traceback (most recent call last):
  File "train.py", line 116, in <module>
    for batch_i, (X, y) in enumerate(train_dataloader):
  File "/home/gary/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 801, in __next__
    return self._process_data(data)
  File "/home/gary/.local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 846, in _process_data
    data.reraise()
  File "/home/gary/.local/lib/python3.6/site-packages/torch/_utils.py", line 385, in reraise
    raise self.exc_type(msg)
IndexError: Caught IndexError in DataLoader worker process 3.
Original Traceback (most recent call last):
  File "/home/gary/.local/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/gary/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/gary/.local/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/opt/4tbdrive1/experiments/Action-Recognition/dataset.py", line 83, in __getitem__
    image_paths = self._pad_to_length(image_paths)
  File "/opt/4tbdrive1/experiments/Action-Recognition/dataset.py", line 67, in _pad_to_length
    left_pad = sequence[0]
IndexError: list index out of range
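Since training runs for 22 batches before failing, most videos load fine and the error is probably triggered by a few samples with no frames. One way to check is to scan the frames directory for empty folders before training. This is a sketch that assumes frames are stored in one leaf folder per video. The helper name `find_empty_frame_dirs` is hypothetical.

```python
import os


def find_empty_frame_dirs(dataset_path):
    """Return leaf directories under dataset_path that contain no files."""
    empty = []
    for root, dirs, files in os.walk(dataset_path):
        # A leaf folder with no files means no frames were extracted for that video.
        if not dirs and not files:
            empty.append(root)
    return sorted(empty)
```

Running it on the frames directory (for example `find_empty_frame_dirs("data/UCF-101-frames")`) lists the folders worth re-extracting or excluding from the split.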

Paper link for this repository!!!

Hey, I am going to publish my work pretty soon and I want to cite your work. How can I cite it? Is there a paper link for this repository?