craston / MARS
MARS: Motion-Augmented RGB Stream for Action Recognition
License: MIT License
I sincerely appreciate the provided code. I want to know how to get the label files for HMDB51; I only downloaded the video files.
Thank you very much.
Sorry for disturbing you again, but I ran into another problem.
The HMDB51 dataset worked fine; to train on the UCF101 dataset I just changed the train part:
print("Preprocessing train data ...")
train_data = globals()['{}_test'.format(opt.dataset)](split = opt.split, train = 0, opt = opt)  # changed train from 1 to 0
Everything seems fine, however I get the error below.
Is it related to the code, or did I make a mistake?
Preprocessing train data ...
Length of train data = 3678
Preprocessing validation data ...
Length of validation data = 3678
Preparing datatloaders ...
Length of train datatloader = 114
Length of validation datatloader = 114
Loading model... resnext 101
loading pretrained model trained_models/kinetics/RGB_Kinetics_16f.pth
Layers to finetune : ['layer4', 'fc']
Initializing the optimizer ...
lr = 0.001 momentum = 0.9 dampening = 0.9 weight_decay = 1e-05, nesterov = False
LR patience = 10
run
Traceback (most recent call last):
File "train.py", line 119, in
for i, (inputs, targets) in enumerate(train_dataloader):
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/fazlik/Desktop/MARS/dataset/dataset.py", line 288, in getitem
clip = get_train_video(self.opt, frame_path, Total_frames)
File "/home/fazlik/Desktop/MARS/dataset/dataset.py", line 96, in get_train_video
start_frame = np.random.randint(0, Total_frames)
File "mtrand.pyx", line 992, in mtrand.RandomState.randint
ValueError: Range cannot be empty (low >= high) unless no samples are taken
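For anyone hitting the same error: np.random.randint(0, Total_frames) raises this ValueError whenever Total_frames is 0, which usually means frame extraction produced an empty directory for some video. A minimal sketch of a guard; get_start_frame is a hypothetical helper, not the repo's code:

import numpy as np

def get_start_frame(total_frames, sample_duration=16):
    # Fail loudly on videos with no extracted frames (the cause of the
    # "Range cannot be empty" error) instead of inside the worker loop.
    if total_frames <= 0:
        raise RuntimeError('No frames found for this video; re-check frame extraction.')
    # np.random.randint requires low < high, so keep the range non-empty
    # even when the clip is shorter than the sampling window.
    high = max(total_frames - sample_duration + 1, 1)
    return np.random.randint(0, high)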
Hi,
thank you for your excellent work. In your paper the weight decay is set to 5e-4, but it is 1e-5 in your code, which is quite different. Can you tell me which setting is correct, or which works better? Thanks.
Figure 2: Training to mimic the Flow stream. We first train the Flow stream to classify actions using optical flow clips with cross entropy loss and freeze its weights. To mimic flow features using RGB frames, in step 1, we backpropagate the MSE loss through all the layers of MERS except the last layer. In step 2, we separately train the last layer of MERS with a cross entropy loss.
Dear @craston, I am a little confused about your paper after looking at your code.
According to your paper, I understood that the model, except for the last layer, is first trained with the MSE loss, and that this model is then used to train the last layer with the cross entropy loss: two separate processes.
However, according to your code, you perform both steps within each epoch.
So is the correct procedure to perform both steps in every epoch?
Line 224 in 578e40f
Shouldn't the "else" be matched to the "if"?
Hi,
I have reproduced the test results on HMDB51 and UCF101 successfully.
However, I am having trouble running the SomeThingSomeThingV1 test.
I downloaded the frame_folder from the SomeThingSomeThingV1 main page (https://20bn.com/datasets/something-something/v1), then added code for reading SomeThingSomeThingV1 in dataset.py. Finally, I ran the single-stream RGB-only 64f test on SomeThingSomeThingV1:
python3 test_single_stream.py --batch_size 1 --n_classes 174 --model resnext --model_depth 101 --log 1 --dataset SmtSmt --modality RGB --sample_duration 64 --split 1 --only_RGB --resume_path1 "/host/mars/models/SMTSMT/RGB_Something_Something_64f.pth" --frame_dir "/host/SomethingSomethingV1_validation_frames/" --annotation_path "/host/mars/dataset/SmtSmt_labels/" --result_path "/host/mars/results_test_smtsmt/"
It executed successfully, but the result is completely wrong (the accuracy is only 0.3%). Can you identify what I did wrong? Do I need to modify the frame_folder after downloading? Or could you walk me through the steps for running the single-stream RGB-only 64f test on SomeThingSomeThingV1?
(I have attached my dataset.py with the SomeThingSomeThingV1 code added, in case you need to check it.)
3D CNNs work with video, MRI, and scan datasets. Can anyone tell me how to feed a video input to the proposed 3D CNN and train its weights, given that a 3D CNN expects 5-dimensional inputs:
[batch size, channels, depth, height, width]
How can I extract the depth dimension from the videos?
Say I have 10 videos of 10 different classes, each 6 seconds long. I extract 2 frames per second, which gives 12 frames per video.
The RGB frames are 112x112: Height = 112, Width = 112, Channels = 3.
If I keep the batch size equal to 2:
1 video --> 6 seconds --> 12 frames (1 sec == 2 frames) [each frame (3, 112, 112)]
10 videos (10 classes) --> 60 seconds --> 120 frames
So the 5 dimensions will be something like [2, 3, 12, 112, 112]:
2 --> two videos processed per batch
3 --> RGB channels
12 --> frames per video (the depth)
112 --> frame height
112 --> frame width
First, I label all 10 videos as [3, 12, 112, 112] --> [channels, frames (depth), height, width] tensors, then feed them to a PyTorch DataLoader to batch them into [2, 3, 12, 112, 112].
With the DataLoader batch size set to 2 (processing 2 videos at a time during training), my 10 videos are covered in 5 batches per epoch.
Am I right? Or can you suggest another method? (See the sketch below.)
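For what it's worth, a minimal sketch of that pipeline in PyTorch; VideoDataset and the dummy tensors are hypothetical, not from this repo:

import torch
from torch.utils.data import Dataset, DataLoader

class VideoDataset(Dataset):
    """Hypothetical dataset: each item is one video as a (C, T, H, W) tensor."""
    def __init__(self, videos, labels):
        self.videos = videos   # list of (3, 12, 112, 112) tensors
        self.labels = labels   # list of class indices

    def __len__(self):
        return len(self.videos)

    def __getitem__(self, idx):
        return self.videos[idx], self.labels[idx]

# 10 dummy videos: 3 channels, 12 frames (the "depth" axis), 112x112 pixels
videos = [torch.randn(3, 12, 112, 112) for _ in range(10)]
labels = list(range(10))

loader = DataLoader(VideoDataset(videos, labels), batch_size=2, shuffle=True)
for clips, targets in loader:
    print(clips.shape)   # torch.Size([2, 3, 12, 112, 112]) -> [B, C, T, H, W]

The DataLoader stacks the per-video (C, T, H, W) tensors along a new batch axis, which is where the fifth dimension comes from.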
The code below receives a variable called tensor and outputs the tensor with nothing applied to it.
class Normalize(object):
    """Normalize a tensor image with mean and standard deviation.
    Given mean: (R, G, B) and std: (R, G, B),
    this will normalize each channel of the torch.*Tensor, i.e.
    channel = (channel - mean) / std

    Args:
        mean (sequence): Sequence of means for R, G, B channels respectively.
        std (sequence): Sequence of standard deviations for R, G, B channels
            respectively.
    """

    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):
        """
        Args:
            tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
        Returns:
            Tensor: Normalized image.
        """
        # TODO: make efficient
        for t, m, s in zip(tensor, self.mean, self.std):
            t.sub_(m).div_(s)  # in-place per-channel normalization
        return tensor

    def randomize_parameters(self):
        pass
Can you tell me why I am getting this?
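One likely explanation (an assumption, since the calling code isn't shown): if the transform is constructed with mean (0, 0, 0) and std (1, 1, 1), then (channel - 0) / 1 is the identity, so the output equals the input even though the in-place subtraction and division do run:

import torch

norm = Normalize(mean=(0, 0, 0), std=(1, 1, 1))
x = torch.rand(3, 112, 112)
y = norm(x.clone())
print(torch.equal(x, y))   # True: subtracting 0 and dividing by 1 changes nothing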
Hello, I'm trying to train on HMDB51 splits 2 and 3, using the HMDB51 split-1 model weights for pre-training. The accuracy obtained was less than 76% on the test set. I would like to know whether you still have trained models for these splits.
Hello, I have a question: are the parameters for training the 64f MARS model on the HMDB dataset the same as for 16f?
Is the MiniKinetics trained model available? Could you share it?
When I run extract_frames_flows.py to extract flows and frames, it shows me:
sh: /home/ncrasto/code/workspace/action-recog-release/utils1/tvl1_videoframes: No such file or directory
I have run g++ -std=c++11 tvl1_videoframes.cpp -o tvl1_videoframes -I${OPENCV}include/opencv4/ -L${OPENCV}lib64 -lopencv_objdetect -lopencv_features2d -lopencv_imgproc -lopencv_highgui -lopencv_core -lopencv_imgcodecs -lopencv_cudaoptflow -lopencv_cudaarithm
Could you tell me what I should do? Thank you very much.
Is it [batch_size, clips, x-y channels, width, height] --> [64, 16, 20, 224, 224]?
Are 10 channels for the x-axis and 10 for the y-axis?
I have another question.
In your paper, the RGB result for the UCF-101 dataset is 95.2% for 64f clips.
In the 'Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?' CVPR 2018 paper, the result is 94.5%.
However, the approach is the same in both cases (ResNeXt-101).
Could you please explain why your RGB result is higher than 94.5%?
I could not find any difference between your source code and that paper's source code, which is why I am wondering how two different results were obtained. Maybe you added something in your code that I have not noticed.
Thanks
In dataset.py, it seems the Something-Something loading code is missing?
Has anyone managed to train on the UCF-101 dataset?
I cannot train it.
If you have trained it, please leave a comment with the details.
Thank you
When I tested MARS+Flow+RGB, my accuracy was only 94.8%. May I know how you tested it? I would appreciate it if you could tell me.
Why can the ResNeXt model handle 2-channel optical flow inputs?
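For context: a 3D CNN doesn't care how many input channels a clip has, as long as its first convolution is built to match. A minimal sketch of the idea; the layer hyperparameters are illustrative, not copied from this repo:

import torch
import torch.nn as nn

# First layer of an RGB 3D CNN: expects 3 input channels
conv1_rgb = nn.Conv3d(3, 64, kernel_size=7, stride=(1, 2, 2), padding=3, bias=False)

# The same layer rebuilt for optical flow: 2 input channels (x and y displacement)
conv1_flow = nn.Conv3d(2, 64, kernel_size=7, stride=(1, 2, 2), padding=3, bias=False)

flow_clip = torch.randn(1, 2, 16, 112, 112)   # [B, C=2, T, H, W]
print(conv1_flow(flow_clip).shape)            # [1, 64, 16, 56, 56]: same feature maps as RGB

Everything after the first convolution is identical; only the input channel count of conv1 changes.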
Thank you for your good effort.
I want to ask about the validation accuracy while training the RGB stream, which is basically unchanged. I tried to fine-tune the RGB stream on the UCF101 dataset, and the validation accuracy has been stable at around 0.86 since epoch 40.
Is my training process right? I did not modify the hyperparameters.
Thank you for publishing this good work!
In your paper, you mention results for the two-stream RGB+Flow model. However, I didn't find a way to train the two-stream model in this repo.
Can you share the training script?
Thanks!
How should I read the accuracy? The accuracy displayed is the accuracy over a single batch. How can I display the accuracy over a whole epoch?
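In case it helps, the usual approach is to accumulate correct predictions over the whole epoch instead of printing per-batch accuracy; a minimal sketch with hypothetical variable names:

correct, total = 0, 0
for inputs, targets in train_dataloader:
    outputs = model(inputs)
    preds = outputs.argmax(dim=1)                # predicted class per sample
    correct += (preds == targets).sum().item()   # running count of correct samples
    total += targets.size(0)
epoch_accuracy = correct / total                 # accuracy over the whole epoch
print('epoch accuracy = {:.4f}'.format(epoch_accuracy))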
Hi,
Firstly, thanks for the great work.
However, I have an issue.
To extract optical flow I need to install OpenCV with GPU support,
and I could not install OpenCV as you described.
Could you please help me with this issue?
Thank you
My data structure is as below.
Here is my pre-trained model.
/workspace/MARS/MARS/trained_models/RGB_HMDB51_64f.pth
My test script is below.
python3 test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 --log 0 --dataset HMDB51 --modality RGB --sample_duration 64 --split 1 --only_RGB --resume_path1 "trained_models/RGB_HMDB51_64f.pth" --frame_dir "dataset/HMDB51/" --annotation_path "dataset/HMDB51_labels/testTrainMulti_7030_splits/"
It proceeds as in the screenshot below and then stops making progress.
MARS/dataset/preprocess_data.py
Line 308 in ae2749d
Thank you for publishing this amazing work!
In your paper, you mention results for the three-stream RGB+MARS+Flow model. However, I didn't find a way to test the three-stream result in this code.
Could you, or anyone else following this work, publish the code for testing three-stream results?
Could you please give me some tips on how to train your models on the UCF101 dataset?
Thanks
May I ask one more question? When training the 64f Flow model on HMDB51 from the pretrained Kinetics model, the accuracy of the model at epoch 400 cannot reach your reported result. Did you choose the last-epoch model or a different one? Can you give me some advice on this?
Hello,
I want to fine-tune MARS and MERS on my own dataset using the pre-trained Kinetics-400 weights you provide. For that, I need to extract optical flow from my dataset, right?
Thank you, for sharing this wonderful repo online! Hope to hear from you soon!
Regards,
Ishan
Hi there, can you share the class activation map script used to visualize how the model shifts its attention over time throughout a video?
Thanks in anticipation.
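While waiting for the authors' script, a rough Grad-CAM-style sketch for a 3D CNN: hook the last convolutional block, weight its activations by the gradients of the predicted class pooled over space and time, and average over channels to get one heatmap per time step. 'model.layer4' and 'clip' are assumptions; hook whatever the last conv block is called in your model:

import torch
import torch.nn.functional as F

activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    activations['feat'] = out.detach()           # [1, C', T', H', W']

def bwd_hook(module, grad_in, grad_out):
    gradients['feat'] = grad_out[0].detach()     # gradient w.r.t. the activations

model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

logits = model(clip)                             # clip: [1, C, T, H, W]
logits[0, logits.argmax()].backward()            # backprop the top predicted class

w = gradients['feat'].mean(dim=(2, 3, 4), keepdim=True)   # pooled gradient weights
cam = F.relu((w * activations['feat']).sum(dim=1))        # [1, T', H', W']
cam = cam / (cam.max() + 1e-8)                   # one normalized heatmap per time step

Upsampling each of the T' maps back to the input resolution and overlaying them on the corresponding frames shows how the attention moves through the video.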
Line 278 in 578e40f
should be changed as below:
frame_path = video[0]
Hi,
Thank you for the models and code. I am doing research on 3D activity recognition models like MARS/MERS. On your g-drive you have provided Kinetics MARS_64f.pth; I was wondering if you could also provide Kinetics MERS_64f.pth so that I could compare them, please.
Thank you
When I execute the following commands:
For RGB stream:
python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth"
--frame_dir "dataset/HMDB51"
--annotation_path "dataset/HMDB51_labels"
--result_path "results/"
For single stream MARS:
python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB
--resume_path1 "trained_models/HMDB51/MARS_HMDB51_16f.pth"
--frame_dir "dataset/HMDB51"
--annotation_path "dataset/HMDB51_labels"
--result_path "results/"
For two streams RGB+MARS:
python test_two_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth"
--resume_path2 "trained_models/HMDB51/MARS_HMDB51_16f.pth"
--frame_dir "dataset/HMDB51"
--annotation_path "dataset/HMDB51_labels"
--result_path "results/"
the top-1 accuracy is lower than 70%.
Hi,
Line 96 in b706443
Am I missing anything?
Thank you very much
Thank you for publishing the code !
I have a question about the flow. In extract_frames_flows.py, lines 95 to 96:
cv2.imwrite(os.path.join(outdir, 'TVL1jpg_x_%05d.jpg' % (i)), iflow[:, :, 0])
cv2.imwrite(os.path.join(outdir, 'TVL1jpg_y_%05d.jpg' % (i)), iflow[:, :, 1])
so the shapes of TVL1jpg_x.jpg and TVL1jpg_y.jpg are both (256, 256), while in test_single_stream.py, lines 35 to 36:
if opt.modality=='RGB': opt.input_channels = 3
elif opt.modality=='Flow': opt.input_channels = 2
When I run test_single_stream.py, I get an error: Length of validation data = 0.
I want to know how the two grayscale pictures are processed to satisfy input_channels = 2.
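As far as I can tell, the two grayscale JPEGs are simply read back and stacked along the channel axis at load time; a hedged sketch of that step (the file names follow the writer above; load_flow_frame itself is hypothetical):

import os
import cv2
import numpy as np

def load_flow_frame(outdir, i):
    # Read the x and y flow JPEGs written by extract_frames_flows.py
    # and stack them into a single (H, W, 2) array, matching input_channels = 2.
    fx = cv2.imread(os.path.join(outdir, 'TVL1jpg_x_%05d.jpg' % i), cv2.IMREAD_GRAYSCALE)
    fy = cv2.imread(os.path.join(outdir, 'TVL1jpg_y_%05d.jpg' % i), cv2.IMREAD_GRAYSCALE)
    return np.stack([fx, fy], axis=-1)    # e.g. (256, 256, 2)

The "Length of validation data = 0" error is more likely an annotation_path or frame_dir issue than a channel issue.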
When I train the flow stream, I can't get the accuracy you report in the paper; I only get 66.45%. Why?
Hi, I would like to use the pre-trained HMDB weights to run the model on a single arbitrary video and get the actions from that video. Can you give me a little help with the required steps? What pre-processing does the video require?
Petru
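Until the authors reply, a rough outline of the preprocessing their test setup implies: decode frames, center-crop to 112x112, and stack a 16-frame clip into a [1, 3, T, H, W] tensor (scaling and mean subtraction should follow preprocess_data.py; the helper below is a sketch, not the repo's code):

import cv2
import numpy as np
import torch

def video_to_clip(path, sample_duration=16, size=112):
    # Decode a video, take the first sample_duration frames,
    # center-crop each to size x size, and return a [1, 3, T, H, W] tensor.
    cap = cv2.VideoCapture(path)
    frames = []
    while len(frames) < sample_duration:
        ok, frame = cap.read()
        if not ok:
            break
        h, w = frame.shape[:2]
        s = min(h, w)
        y, x = (h - s) // 2, (w - s) // 2
        frame = cv2.resize(frame[y:y + s, x:x + s], (size, size))
        frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    cap.release()
    clip = np.stack(frames).astype(np.float32)          # [T, H, W, 3]
    clip = torch.from_numpy(clip).permute(3, 0, 1, 2)   # [3, T, H, W]
    return clip.unsqueeze(0)                            # [1, 3, T, H, W]

Passing the resulting tensor through the RGB model and taking the argmax over the class logits gives the predicted action.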
Hi, thank you very much for publishing the code!
I think the dataset code might be outdated, though. Here, it uses a variable which is not defined:
Line 255 in ae2749d
There's an obvious fix for this, but now I wonder whether you might not have published the latest and greatest version of the code.
Furthermore, the code that parses annotation files seems to be wrong as well.
First, do you use the action detection annotations from here?
If this is the case, each line has the format:
or
In other words, there's no class ID for test entries, but there's a class ID for training.
Thus, the following two lines fail silently for training entries.
Line 264 in ae2749d
Could you share all of the fine-tuning hyperparameters for HMDB51? For example, momentum, epochs, learning rate...
Thank you very much for the good research.
Thank you very much for your work @craston. We really appreciate it.
I tested the 'RGB_HMDB_51_16f' model on HMDB split 1, but I only got an accuracy of 55.7%, versus 66.7% in your paper. The command is as follows:
python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB
--resume_path1 "../pretrained_model/RGB_HMDB51_16f.pth"
--frame_dir "../hmdb-51-1f"
--annotation_path "dataset/HMDB51_labels"
--result_path "results/"
--n_workers 4
Did I make a mistake somewhere? Thank you very much.
The speed of extracting the flow maps is too slow > _ <
Namespace(MARS=False, MARS_alpha=50.0, MARS_pretrain_path='', MARS_resume_path='', annotation_path='/media/cqq/Data/vicky/code/1action/MARS_dataset/dataset/HMDB51_labels/', batch_size=2, begin_epoch=1, checkpoint=1, dampening=0.9, dataset='HMDB51', frame_dir='/media/cqq/Data/vicky/code/1action/MARS_dataset/dataset/HMDB51_1/', freeze_BN=False, ft_begin_index=4, input_channels=3, learning_rate=0.1, log=1, lr_patience=10, manual_seed=1, modality='RGB_Flow', model='resnext', model_depth=101, momentum=0.9, n_classes=400, n_epochs=400, n_finetune_classes=51, n_workers=4, nesterov=False, only_RGB=False, optimizer='sgd', output_layers=["'avgpool'"], pretrain_path='/media/cqq/Data/vicky/code/1action/MARS/trained_models/MARS_Kinetics_64f.pth', random_seed=1, resnet_shortcut='B', resnext_cardinality=32, result_path='/media/cqq/Data/vicky/code/1action/MARS/results/', resume_path1='/media/cqq/Data/vicky/code/1action/MARS/trained_models/Flow_HMDB51_64f.pth', resume_path2='', resume_path3='', sample_duration=64, sample_size=112, split='1', training=True, weight_decay=0.001)
Preprocessing train data ...
Length of train data = 3570
Preprocessing validation data ...
Length of validation data = 1530
Preparing datatloaders ...
Length of train datatloader = 1785
Length of validation datatloader = 765
Loading MARS model... resnext 101
loading pretrained model /media/cqq/Data/vicky/code/1action/MARS/trained_models/MARS_Kinetics_64f.pth
Layers to finetune : ['layer4', 'fc']
Loading Flow model... resnext 101
loading checkpoint /media/cqq/Data/vicky/code/1action/MARS/trained_models/Flow_HMDB51_64f.pth
Initializing the optimizer ...
lr = 0.001 momentum = 0.9 dampening = 0.9 weight_decay = 1e-05, nesterov = False
LR patience = 10
run
Traceback (most recent call last):
File "/media/cqq/Data/vicky/code/1action/MARS/masr_train.py", line 186, in
outputs_Flow = model_Flow(inputs_Flow)[1].detach()
IndexError: list index out of range
The output of model_Flow has length 1, so the list index is out of range. How can I fix this?
Thanks
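One thing that stands out in the Namespace above (an observation, not a confirmed fix): output_layers is ["'avgpool'"], i.e. a pair of quotes got baked into the string itself. If the model appends intermediate feature outputs only for layer names that match exactly, that string never matches 'avgpool', so model_Flow(inputs_Flow) returns a single-element list and indexing [1] fails:

# The quotes ended up inside the argument value:
print("'avgpool'" == 'avgpool')   # False -> no feature output is appended

# Passing the layer name without the inner quotes may restore the
# two-element output:  --output_layers avgpool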
Thank you for your good effort.
I want to ask: for fine-tuning the models (MARS or MERS) on smaller datasets such as HMDB-51, did you use as the teacher network the Flow network trained on Kinetics-400, or the Flow network fine-tuned on HMDB-51?
Hi there.
To get the accuracy scores and related metrics, did you draw a confusion matrix?
If you did, can you share the script for UCF-101?
Thanks.
Hello, I want to know why my validation accuracy is lower than 72.2%; it's about 71.3%. I set the batch size to 125 and didn't change anything else. As far as I know, batch size shouldn't influence the result. Thanks in advance.