
mars's Issues

how to get the label file of HMDB51

I sincerely appreciate the provided code. I would like to know how to get the label file for HMDB51; I only downloaded the video files.
Thank you very much.

UCF101 dataset train error

Sorry to disturb you again, but I ran into another problem.

The HMDB51 dataset worked fine. To train on the UCF101 dataset, I changed only the train part:

print("Preprocessing train data ...")
train_data = globals()['{}_test'.format(opt.dataset)](split = opt.split, train = 0, opt = opt)  # changed train from 1 to 0

Everything seemed fine, but then I got the error below.

Is this related to the code, or did I make a mistake?

Preprocessing train data ...
Length of train data = 3678
Preprocessing validation data ...
Length of validation data = 3678
Preparing datatloaders ...
Length of train datatloader = 114
Length of validation datatloader = 114
Loading model... resnext 101
loading pretrained model trained_models/kinetics/RGB_Kinetics_16f.pth
Layers to finetune : ['layer4', 'fc']
Initializing the optimizer ...
lr = 0.001 momentum = 0.9 dampening = 0.9 weight_decay = 1e-05, nesterov = False
LR patience = 10
run
Traceback (most recent call last):
File "train.py", line 119, in
for i, (inputs, targets) in enumerate(train_dataloader):
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/fazlik/python36/local/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 99, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/fazlik/Desktop/MARS/dataset/dataset.py", line 288, in getitem
clip = get_train_video(self.opt, frame_path, Total_frames)
File "/home/fazlik/Desktop/MARS/dataset/dataset.py", line 96, in get_train_video
start_frame = np.random.randint(0, Total_frames)
File "mtrand.pyx", line 992, in mtrand.RandomState.randint
ValueError: Range cannot be empty (low >= high) unless no samples are taken
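A plausible reading of this traceback (an assumption, not confirmed by the authors): Total_frames is 0 for at least one UCF101 clip, typically because a frame folder is empty or frames were never extracted, so np.random.randint(0, 0) raises. A minimal defensive sketch of the start-frame sampling, with a hypothetical helper name:

import numpy as np

def sample_start_frame(total_frames, sample_duration):
    # np.random.randint(0, 0) raises "Range cannot be empty (low >= high)",
    # so reject clips whose frame folder produced no frames at all.
    if total_frames <= 0:
        raise ValueError("No extracted frames found; re-check frame extraction.")
    # For clips shorter than the sampling window, start at frame 0 and let
    # the loader loop or pad frames; otherwise pick a random valid start.
    if total_frames <= sample_duration:
        return 0
    return np.random.randint(0, total_frames - sample_duration + 1)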

the setting of weight_decay

Hi,
Thank you for your excellent work. In your paper the weight decay is set to 5e-4, but it is 1e-5 in your code, which is quite different. Can you tell me which setting is right, or which is better? Thanks.

How to train MERS model?

Figure 2: Training to mimic the Flow stream. We first train the Flow stream to classify actions using optical flow clips with cross entropy loss and freeze its weights. To mimic flow features using RGB frames, in step 1, we backpropagate the MSE loss through all the layers of MERS except the last layer. In step 2, we separately train the last layer of MERS with a cross entropy loss.

Dear @craston, I am a little confused by your paper after looking at your code.
According to the paper, I understood that the model (except its last layer) is first trained with the MSE loss, and that this model is then used to train the last layer with the cross-entropy loss, i.e. two separate stages.
However, according to your code, you perform both steps within each epoch.
So is the correct procedure to perform the two steps within each epoch?
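For readers puzzling over the same point, here is a minimal sketch of the two steps interleaved within one training iteration, as the questioner describes the code doing. The names (mers.features, mers.fc, flow_model.features) are placeholders rather than the repository's actual identifiers, and this is not the authors' confirmed procedure:

import torch
import torch.nn.functional as F

def train_iteration(mers, flow_model, clip_rgb, clip_flow, targets,
                    opt_backbone, opt_fc):
    # The Flow stream is frozen: its features are targets, not trainable.
    with torch.no_grad():
        flow_feats = flow_model.features(clip_flow)

    # Step 1: backpropagate the MSE loss through all layers except the last.
    feats = mers.features(clip_rgb)
    mse = F.mse_loss(feats, flow_feats)
    opt_backbone.zero_grad()
    mse.backward()
    opt_backbone.step()

    # Step 2: train only the last layer with cross-entropy; detaching the
    # features keeps the CE gradient out of the backbone.
    logits = mers.fc(feats.detach())
    ce = F.cross_entropy(logits, targets)
    opt_fc.zero_grad()
    ce.backward()
    opt_fc.step()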

Running SomeThingSomeThingV1 test_RGB problem.

Hi,
I have reproduced the test results on HMDB51 and UCF101 successfully.
However, I am having trouble running the SomeThingSomeThingV1 test.
I downloaded the frame folders from the SomeThingSomeThingV1 main page (https://20bn.com/datasets/something-something/v1), then added code for reading SomeThingSomeThingV1 to dataset.py. Finally, I ran the single-stream RGB-only 64f test:

python3 test_single_stream.py --batch_size 1 --n_classes 174 --model resnext --model_depth 101 --log 1 --dataset SmtSmt --modality RGB --sample_duration 64 --split 1 --only_RGB --resume_path1 "/host/mars/models/SMTSMT/RGB_Something_Something_64f.pth" --frame_dir "/host/SomethingSomethingV1_validation_frames/" --annotation_path "/host/mars/dataset/SmtSmt_labels/" --result_path "/host/mars/results_test_smtsmt/"

It executed successfully, but the result is completely wrong (the accuracy is only 0.3%). Can you identify what I did wrong? Do I need to reorganize the frame folders after downloading? Or could you outline the steps for running the single-stream RGB-only 64f test on SomeThingSomeThingV1?

(I have attached dataset.py with the SomeThingSomeThingV1 code added, in case you need to check it.)

How can I give input to a 3D CNN?

3D CNNs work with video, MRI, and scan datasets. If I want to feed a video to the proposed 3D CNN and train its weights, how can I do that? A 3D CNN expects 5-dimensional input:

[batch size, channels, depth, height, width]

How do I obtain the depth dimension from the videos?

Suppose I have 10 videos from 10 different classes, each 6 seconds long. I extract 2 frames per second, which gives 12 frames per video.

The size of the RGB frames is 112x112: Height = 112, Width = 112, Channels = 3.

If I set the batch size to 2:

1 video --> 6 seconds --> 12 frames (1sec == 2frames) [each frame (3,112,112)]

10 videos (10 classes) --> 60 seconds --> 120 frames
So the 5 dimensions will be [2, 3, 12, 112, 112]:

2 --> two videos processed per batch
3 --> RGB channels
12 --> frames per video (depth)
112 --> frame height
112 --> frame width

First, I label all 10 videos as [3, 12, 112, 112] ([channels, frames (depth), height, width]), then feed them to a PyTorch DataLoader to batch them into [2, 3, 12, 112, 112].

I use a PyTorch DataLoader with batch size 2, since I process 2 videos at a time during training; this way my 10 videos are covered in 5 batches per epoch.

Am I right? Or can you suggest another way to do this?
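The shape reasoning above is correct: PyTorch 3D convolutions expect (batch, channels, frames, height, width), and the DataLoader adds the batch dimension. A minimal sketch with a toy dataset (random tensors standing in for real decoded frames):

import torch
from torch.utils.data import Dataset, DataLoader

class ToyVideoDataset(Dataset):
    """Yields one (C, T, H, W) clip per video, plus its class label."""
    def __init__(self, n_videos=10, frames=12, size=112):
        self.n_videos, self.frames, self.size = n_videos, frames, size

    def __len__(self):
        return self.n_videos

    def __getitem__(self, idx):
        clip = torch.randn(3, self.frames, self.size, self.size)  # C, T, H, W
        label = idx  # toy label: one class per video
        return clip, label

loader = DataLoader(ToyVideoDataset(), batch_size=2, shuffle=True)
clips, labels = next(iter(loader))
print(clips.shape)  # torch.Size([2, 3, 12, 112, 112]) -> B, C, T, H, W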

Code error: Normalize code does not apply

The code below receives a variable called tensor and returns the tensor with nothing applied to it.

class Normalize(object):
    """Normalize a tensor image with mean and standard deviation.
    Given mean: (R, G, B) and std: (R, G, B),
    will normalize each channel of the torch.*Tensor, i.e.
    channel = (channel - mean) / std
    Args:
        mean (sequence): Sequence of means for R, G, B channels respectively.
        std (sequence): Sequence of standard deviations for R, G, B channels
            respectively.
    """

    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):
        """
        Args:
            tensor (Tensor): Tensor image of size (C, H, W) to be normalized.
        Returns:
            Tensor: Normalized image.
        """
        # TODO: make efficient
        for t, m, s in zip(tensor, self.mean, self.std):
            t.sub_(m).div_(s)
        return tensor

    def randomize_parameters(self):
        pass
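For reference, a quick check of what __call__ does in isolation on a (C, H, W) float tensor: the per-channel in-place sub_/div_ does change the values, so if nothing appears to be applied in the pipeline, it is worth verifying the mean/std the class is constructed with (for instance, mean (0, 0, 0) and std (1, 1, 1) would make it a no-op). The constants below are the common ImageNet statistics, used here only for illustration:

import torch

norm = Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
img = torch.ones(3, 4, 4)   # dummy (C, H, W) image of all ones
out = norm(img)
print(out[:, 0, 0])         # tensor([2.2489, 2.4286, 2.6400])
print(out is img)           # True: the tensor is modified in place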

extract flow problem

When I run extract_frames_flows.py to extract flow and frames, it shows me:

sh: /home/ncrasto/code/workspace/action-recog-release/utils1/tvl1_videoframes: No such file or directory

I have compiled the binary with:

g++ -std=c++11 tvl1_videoframes.cpp -o tvl1_videoframes -I${OPENCV}include/opencv4/ -L${OPENCV}lib64 -lopencv_objdetect -lopencv_features2d -lopencv_imgproc -lopencv_highgui -lopencv_core -lopencv_imgcodecs -lopencv_cudaoptflow -lopencv_cudaarithm

Could you tell me what I should do? Thank you very much.
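The "sh: ... No such file or directory" message indicates the script shells out to the tvl1_videoframes binary at an absolute path from the author's machine. A hypothetical sketch of invoking one's own build instead (the binary's argument list below is a placeholder; mirror whatever extract_frames_flows.py actually passes):

import os
import subprocess

# Point this at your own compiled binary rather than /home/ncrasto/...
TVL1_BIN = os.path.expanduser("~/MARS/utils1/tvl1_videoframes")  # adjust

def run_tvl1(video_path, out_dir):
    if not os.path.isfile(TVL1_BIN):
        raise FileNotFoundError("Compile tvl1_videoframes first: " + TVL1_BIN)
    # Placeholder arguments; check the script for the real invocation.
    subprocess.run([TVL1_BIN, video_path, out_dir], check=True)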

percentage difference for previous SOTA paper

I have another question

In your paper, the RGB result for the UCF-101 dataset is 95.2% for 64f clips.

In the 'Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?' paper (CVPR 2018), the result is 94.5%.

However, the approach is the same in both cases (ResNeXt-101).

Could you please explain why your RGB result is higher than 94.5%?

I could not find any difference between your source code and that paper's source code, which is why I am wondering how two different results were obtained. Maybe you added something to your code that I have not noticed.

Thanks

Training UCF-101 dataset

Has anyone managed to train on the UCF-101 dataset?

I cannot train it.

If you have, please leave a comment with the details of how.

Thank you

About MARS+Flow+RGB testing

When I tested MARS+Flow+RGB, my accuracy was only 94.8%. May I know how you tested it? I hope you can tell me.

The accuracy of validation during training RGB stream is basically unchanged

Thank you for your good effort.

I want to ask why the validation accuracy barely changes while training the RGB stream. I tried to fine-tune the RGB stream on the UCF101 dataset, and the validation accuracy has been stable at around 0.86 since epoch 40.

Is my training process right? I did not modify the hyperparameters.

how to train the two-stream (RGB+Flow) network?

Thank you for publishing this good work!
In your paper you mention two-stream RGB+Flow results, but I could not find a method for training the two-stream model in this repo.
Can you share the training script?

Thanks!

about Accuracy

How should I read the accuracy? The accuracy displayed is per batch; how can I display the accuracy over a whole epoch?
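A common pattern (generic PyTorch, not this repo's code) is to accumulate correct predictions over all batches and divide once at the end of the epoch:

import torch

def epoch_accuracy(model, dataloader, device="cuda"):
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, targets in dataloader:
            inputs, targets = inputs.to(device), targets.to(device)
            preds = model(inputs).argmax(dim=1)
            correct += (preds == targets).sum().item()
            total += targets.size(0)
    return correct / total  # accuracy over the whole epoch, not one batch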

Extracting optical flow

Hi,

Firstly, thanks for the great work.

However, I have an issue.

To extract optical flow I need to install OpenCV with GPU support, and I could not install OpenCV the way you described.

Could you please help me with this issue?

Thank you

The test script does not work.

My data structure is as below.

[screenshot of the data structure]

Here is my pre-trained model.
/workspace/MARS/MARS/trained_models/RGB_HMDB51_64f.pth

My test script is below.

python3 test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 --log 0 --dataset HMDB51 --modality RGB --sample_duration 64 --split 1 --only_RGB --resume_path1 "trained_models/RGB_HMDB51_64f.pth" --frame_dir "dataset/HMDB51/" --annotation_path "dataset/HMDB51_labels/testTrainMulti_7030_splits/"

It proceeds to the point shown in the screenshot below and then stops making progress.

[screenshot of the stalled run]

how to test the three-stream result for RGB+MARS+Flow

Thank you for publishing this amazing work!
In your paper you mention three-stream RGB+MARS+Flow results, but I could not find a way to test the three-stream combination in this code.
Could you, or anyone else following this work, publish the code for testing three-stream results?
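Since the repo ships only test_single_stream.py and test_two_stream.py, multi-stream numbers are presumably obtained by averaging the streams' class scores. A sketch of three-way score fusion with equal weights (the equal weighting is an assumption, and the streams could equally be fused on raw logits):

import torch
import torch.nn.functional as F

def fuse_three_streams(logits_rgb, logits_mars, logits_flow):
    """Average per-class probabilities from the three streams."""
    probs = (F.softmax(logits_rgb, dim=1)
             + F.softmax(logits_mars, dim=1)
             + F.softmax(logits_flow, dim=1)) / 3.0
    return probs.argmax(dim=1)  # fused class prediction per clip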

HMDB51 testing error!!!!

I extracted frames from the videos and used RGB_HMDB51_16f.pth as you mention in the README, but whenever I try to run the test script I get the following error. Can you please help me solve this issue? Thank you.

[screenshot of the error]

UCF 101 Training Guide

Could you please give me some tips on how to train your models on the UCF101 dataset?
Thanks

One more question about training the flow_HMDB51_64f model

May I ask one more question? When training the 64f Flow model on HMDB51 from the Kinetics pretrained weights, the accuracy of the epoch-400 model does not reach your reported result. Did you pick the last-epoch model or some other checkpoint? Can you give me some advice on this?

A small query about finetuning

Hello,

I want to finetune MARS and MERS on my own dataset using the Kinetics-400 pretrained weights you provide. For that I need to extract optical flow from my dataset, right?

Thank you, for sharing this wonderful repo online! Hope to hear from you soon!

Regards,
Ishan

Class Activation Map visualizing script.

Hi there, can you share the class activation map script, to visualize how the model shifts its attention over time throughout a video?
Thanks in anticipation.
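No such script ships with the repo, but a Grad-CAM-style visualization for a 3D CNN can be sketched with forward and backward hooks on the last convolutional block. This is a generic sketch, not the authors' method; model.layer4 assumes a ResNe(X)t-style attribute name, and the model is assumed to return plain logits:

import torch
import torch.nn.functional as F

def grad_cam_3d(model, clip, target_class):
    """clip: (1, C, T, H, W) input. Returns a (T, H, W) heatmap volume."""
    feats, grads = {}, {}
    layer = model.layer4  # last conv block (assumed attribute name)
    h1 = layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

    logits = model(clip)
    model.zero_grad()
    logits[0, target_class].backward()
    h1.remove(); h2.remove()

    w = grads["a"].mean(dim=(2, 3, 4), keepdim=True)  # channel importance
    cam = F.relu((w * feats["a"]).sum(dim=1))         # (1, T', H', W')
    cam = F.interpolate(cam.unsqueeze(1), size=clip.shape[2:],
                        mode="trilinear", align_corners=False)
    return cam.squeeze()  # upsampled to the clip's (T, H, W)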

MERS weights request

Hi,
Thank you for the models and code. I am doing research on 3D activity recognition models like MARS/MERS. On your Google Drive you have provided the Kinetics MARS_64f.pth; I was wondering if you could also provide the Kinetics MERS_64f.pth so that I can compare them.
Thank you

The accuracy of validation is low

When I execute the following commands:

For RGB stream:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For single stream MARS:

python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB \
--resume_path1 "trained_models/HMDB51/MARS_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

For two streams RGB+MARS:

python test_two_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB \
--resume_path1 "trained_models/HMDB51/RGB_HMDB51_16f.pth" \
--resume_path2 "trained_models/HMDB51/MARS_HMDB51_16f.pth" \
--frame_dir "dataset/HMDB51" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/"

the top-1 accuracy is below 70%.

test_resnext101_HMDB51_1_RGB_16.txt

For Flow stream use test_single_stream.py

Thank you for publishing the code!
I have a question about flow. In extract_frames_flows.py, lines 95 to 96:

cv2.imwrite(os.path.join(outdir, 'TVL1jpg_x_%05d.jpg' % (i)), iflow[:, :, 0])
cv2.imwrite(os.path.join(outdir, 'TVL1jpg_y_%05d.jpg' % (i)), iflow[:, :, 1])

So TVL1jpg_x.jpg and TVL1jpg_y.jpg both have shape (256, 256), while in test_single_stream.py, lines 35 to 36:

if opt.modality=='RGB': opt.input_channels = 3
elif opt.modality=='Flow': opt.input_channels = 2

When I run test_single_stream.py, I get the error: Length of validation data = 0.
I want to know how the two grayscale images are processed to form the input with input_channels = 2.
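"Length of validation data = 0" usually means no videos matched the expected frame/annotation layout, so the paths are worth checking first. As for the channels: the x- and y-flow images are read back as single-channel grayscale and stacked along a new channel axis. A minimal sketch (the file naming follows the TVL1jpg_x_%05d.jpg / TVL1jpg_y_%05d.jpg pattern above; the helper itself is illustrative, not the repo's loader):

import os
import cv2
import numpy as np

def load_flow_frame(frame_dir, i):
    fx = cv2.imread(os.path.join(frame_dir, 'TVL1jpg_x_%05d.jpg' % i),
                    cv2.IMREAD_GRAYSCALE)   # (H, W) horizontal flow
    fy = cv2.imread(os.path.join(frame_dir, 'TVL1jpg_y_%05d.jpg' % i),
                    cv2.IMREAD_GRAYSCALE)   # (H, W) vertical flow
    return np.stack([fx, fy], axis=0)       # (2, H, W): input_channels = 2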

Flow accuracy

When I train the Flow stream I cannot reach the accuracy you report in the paper; I only get 66.45%. Why?

Run pre-trained model on single video

Hi, I would like to use the pre-trained HMDB weights to run the model on a single arbitrary video and get the predicted actions for it. Can you give me a little help with the required steps? What preprocessing does the video need?

Petru
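Roughly, single-video inference needs the same preprocessing as training: decode frames, resize to 112x112, normalize, stack a 16- or 64-frame window into a (1, 3, T, 112, 112) tensor, and take the argmax of the logits. A hedged sketch with OpenCV (model loading and the normalization statistics are omitted and must match the checkpoint you use):

import cv2
import torch

def predict_video(model, video_path, n_frames=16, size=112, device="cuda"):
    cap = cv2.VideoCapture(video_path)
    frames = []
    while len(frames) < n_frames:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(cv2.resize(frame, (size, size)), cv2.COLOR_BGR2RGB)
        frames.append(torch.from_numpy(frame).float() / 255.0)  # (H, W, 3)
    cap.release()
    if not frames:
        raise ValueError("Could not decode any frames from " + video_path)
    while len(frames) < n_frames:       # pad short videos by repeating
        frames.append(frames[-1])
    clip = torch.stack(frames).permute(3, 0, 1, 2).unsqueeze(0)  # (1,3,T,H,W)
    model.eval()
    with torch.no_grad():
        logits = model(clip.to(device))
    return logits.argmax(dim=1).item()  # predicted class index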

The data set code seems to be out of date (and likely not functioning)

Hi, thank you very much for publishing the code!

I think the data set code might be outdated, though. Here it uses a variable that is not defined:

if self.train_valtest==1:

There's an obvious fix for this, but now I wonder whether the published code might not be the latest version.

Furthermore, the code that parses annotation files seems to be wrong as well.

First, do you use the action detection annotations from here?

If this is the case, each line has one of two formats:

<class_name>/<video_name>.avi <class_id>

or

<class_name>/<video_name>.avi

In other words, there is a class ID for training entries but none for test entries.

Thus, code like the following fails silently for training entries:

class_id = self.class_idx.get(line.split('/')[0]) - 1
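Assuming UCF101-style annotation files as described above (train lines carry a trailing class ID, test lines do not), a parser that handles both shapes and fails loudly instead of silently might look like this sketch (not the repository's code):

def parse_annotation_line(line, class_idx):
    parts = line.strip().split()
    rel_path = parts[0]                    # e.g. ApplyEyeMakeup/v_....avi
    class_name = rel_path.split('/')[0]
    if class_name not in class_idx:
        raise KeyError("Unknown class in annotation line: %r" % line)
    class_id = class_idx[class_name] - 1   # 0-based label
    if len(parts) > 1:                     # train lists carry a trailing ID
        assert int(parts[1]) - 1 == class_id, "ID mismatch: %r" % line
    return rel_path, class_id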

Different results of HMDB-51

Thank you very much for your work @craston. We really appreciate it.

I tested the 'RGB_HMDB51_16f' model on HMDB51 split 1, but I only got 55.7% accuracy, versus the 66.7% reported in your paper. The command is as follows:
python test_single_stream.py --batch_size 1 --n_classes 51 --model resnext --model_depth 101 \
--log 0 --dataset HMDB51 --modality RGB --sample_duration 16 --split 1 --only_RGB \
--resume_path1 "../pretrained_model/RGB_HMDB51_16f.pth" \
--frame_dir "../hmdb-51-1f" \
--annotation_path "dataset/HMDB51_labels" \
--result_path "results/" \
--n_workers 4

Did I make a mistake somewhere? Thank you very much.

HMDB51 training error!!!!

Namespace(MARS=False, MARS_alpha=50.0, MARS_pretrain_path='', MARS_resume_path='', annotation_path='/media/cqq/Data/vicky/code/1action/MARS_dataset/dataset/HMDB51_labels/', batch_size=2, begin_epoch=1, checkpoint=1, dampening=0.9, dataset='HMDB51', frame_dir='/media/cqq/Data/vicky/code/1action/MARS_dataset/dataset/HMDB51_1/', freeze_BN=False, ft_begin_index=4, input_channels=3, learning_rate=0.1, log=1, lr_patience=10, manual_seed=1, modality='RGB_Flow', model='resnext', model_depth=101, momentum=0.9, n_classes=400, n_epochs=400, n_finetune_classes=51, n_workers=4, nesterov=False, only_RGB=False, optimizer='sgd', output_layers=["'avgpool'"], pretrain_path='/media/cqq/Data/vicky/code/1action/MARS/trained_models/MARS_Kinetics_64f.pth', random_seed=1, resnet_shortcut='B', resnext_cardinality=32, result_path='/media/cqq/Data/vicky/code/1action/MARS/results/', resume_path1='/media/cqq/Data/vicky/code/1action/MARS/trained_models/Flow_HMDB51_64f.pth', resume_path2='', resume_path3='', sample_duration=64, sample_size=112, split='1', training=True, weight_decay=0.001)
Preprocessing train data ...
Length of train data = 3570
Preprocessing validation data ...
Length of validation data = 1530
Preparing datatloaders ...
Length of train datatloader = 1785
Length of validation datatloader = 765
Loading MARS model... resnext 101
loading pretrained model /media/cqq/Data/vicky/code/1action/MARS/trained_models/MARS_Kinetics_64f.pth
Layers to finetune : ['layer4', 'fc']
Loading Flow model... resnext 101
loading checkpoint /media/cqq/Data/vicky/code/1action/MARS/trained_models/Flow_HMDB51_64f.pth
Initializing the optimizer ...
lr = 0.001 momentum = 0.9 dampening = 0.9 weight_decay = 1e-05, nesterov = False
LR patience = 10
run
Traceback (most recent call last):
File "/media/cqq/Data/vicky/code/1action/MARS/masr_train.py", line 186, in
outputs_Flow = model_Flow(inputs_Flow)[1].detach()
IndexError: list index out of range

The output of model_Flow has length 1, hence the list index out of range. How can I fix this?
Thanks
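One detail visible in the Namespace dump above: output_layers=["'avgpool'"] contains literal quote characters inside the string. If the model's forward compares layer names exactly, "'avgpool'" will never match 'avgpool', so only the logits are returned and the output has length 1. This is an unconfirmed hypothesis; the sketch below merely illustrates the kind of gating that would behave this way, with hypothetical attribute names:

# Hypothetical forward(), illustrating why model_Flow(x)[1] can fail:
def forward(self, x, output_layers=()):
    feats = self.avgpool(self.backbone(x)).flatten(1)  # penultimate features
    logits = self.fc(feats)
    outputs = [logits]
    if 'avgpool' in output_layers:   # "'avgpool'" (with quotes) won't match
        outputs.append(feats)
    return outputs                   # length 1 unless the name matches exactly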

Flow Network For Fine-Tuning

Thank you for your good effort.

I want to ask: when fine-tuning the models (MARS or MERS) on smaller datasets such as HMDB-51, did you use as the teacher the Flow network trained on Kinetics-400, or the Flow network fine-tuned on HMDB-51?

the accuracy is lower than 72.2% in HMDB51-1

Hello, I want to know why my validation accuracy is lower than 72.2%; it is about 71.3%. I set the batch size to 125 and changed nothing else. As far as I know, batch size should not influence the result much. Thanks in advance.
