
mmsa's Issues

About the correlation coefficient (Corr)

Hello author, in the regression task on the SIMS dataset, what does a high or low Corr value indicate? Why does the Corr of EF-LSTM differ so much from the other models, and what range is considered normal?

About a test script

Hello author, thank you very much for your integrated work. I now have features for a single video generated by the MMSA-FET tool, as well as a model already trained on SIMS. Is there a script that can load the model and produce a prediction for a single video?
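Not an official answer, but a minimal sketch of how such a test might look with the MMSA_test API that appears in later issues in this list; the checkpoint path, feature path, and the choice of get_config_regression (and the 'sims' dataset key) are assumptions, and the paths are placeholders.

from MMSA import MMSA_test
from MMSA.config import get_config_regression

# Placeholder paths: a checkpoint trained on SIMS and a single-video
# feature file produced by MMSA-FET.
model_path = "./saved_models/misa-sims.pth"
feature_path = "./single_video_feature.pkl"

# Rebuild the config for the same model/dataset pair used at training time.
config = get_config_regression('misa', 'sims')
MMSA_test(config, model_path, feature_path, 0)  # last argument: GPU id, as used later in this thread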

Question about DataPre.py

Hi, @iyuge2
Thank you for your contribution to this project.
I downloaded the data from the address you provided and ran it. It did achieve the same result as result-stat.

Then I downloaded the SIMS/MOSI/MOSEI raw data and used DataPre.py to generate features.pkl, but the processed data cannot be used with run.py.

Let's take MOSI as an example (I downloaded the raw data and processed it with DataPre.py):

  1. The files you provided are aligned_50.pkl (367.3 MB) and unaligned_50.pkl (554.2 MB).
  2. I used DataPre.py to generate *.pkl for the MOSI dataset and got a 2.8 GB features.pkl, which is much larger than your *.pkl files.

Then I used features.pkl to run, but it failed... :sob:

Failure situation:

The error is: 'list' object has no attribute 'astype'.

Then I changed

MMSA/data/load_data.py

Lines 37 to 39 in b2e70bb

self.labels = {
'M': data[self.mode][self.args.train_mode+'_labels'].astype(np.float32)
}

to

self.labels = {
    'M': np.array(data[self.mode][self.args.train_mode+'_labels'], dtype=np.float32)
}

But it didn't work; I got a new error:
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THC/THCGeneral.cpp:216
There must be a problem with the data generated by DataPre.py. I really don't know how to solve it.

In DataPre.py I only modified the path so that the data could be found for processing; I did not change anything else.
@iyuge2 Please help me...
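Not an official fix, but a small inspection sketch that may help compare the structure of a self-generated features.pkl with the released aligned_50.pkl; the split and key names follow the load_data.py and pickle snippets quoted in this issue list, and anything beyond those is an assumption.

import pickle
import numpy as np

with open('features.pkl', 'rb') as f:
    data = pickle.load(f)

for split in ('train', 'valid', 'test'):
    for key, value in data.get(split, {}).items():
        # Arrays print shape and dtype; plain Python lists are the likely
        # source of the "'list' object has no attribute 'astype'" error above.
        if isinstance(value, np.ndarray):
            print(split, key, value.shape, value.dtype)
        else:
            print(split, key, type(value))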

About the Language Pre-trained Model

Hello author, the results I get when training with the bert-base-uncased model from Hugging Face differ a lot from the results obtained with the text features in the feature file you provide. What could be the reason?

Error of bert

@iyuge2 @Columbine21
Hi, I have a problem when I run the run.py file.
I hope to get your help.

The error is located in BertTextEncoder.py:

text:torch.Size([64, 39, 768])
input_ids:torch.Size([64, 768]) | input_mask:torch.Size([64, 768]) | segment_ids: torch.Size([64, 768])

last_hidden_states = self.model(input_ids=input_ids,
attention_mask=input_mask,
token_type_ids=segment_ids)[0] # Models outputs are now tuples

The error is at line 62; I still don't know how to solve this problem.

Detailed error message:

opt/conda/conda-bld/pytorch/work/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [79,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
  File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/trains/singleTask/MISA.py", line 66, in do_train
    outputs = model(text, audio, vision)
  File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/models/singleTask/MISA.py", line 281, in forward
    output = self.alignment(text, audio, video)
  File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/models/singleTask/MISA.py", line 195, in alignment
    bert_output = self.bertmodel(text) # [batch_size, seq_len, 768]
  File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
  File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/models/subNets/BertTextEncoder.py", line 68, in forward
    token_type_ids=segment_ids.to('cuda'))[0]  # Models outputs are now tuples
ib/python3.6/site-packages/torch/nn/functional.py", line 1371, in linear
    output = input.matmul(weight.t())
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch/work/aten/src/THC/THCGeneral.cpp:216
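Not an official diagnosis, but the shapes printed above suggest that input_ids carries the hidden-size dimension (768) where a sequence of token ids is expected, and the srcIndex < srcSelectDimSize assertion typically means an index in the embedding lookup is out of range. A hedged sanity check, assuming the Hugging Face transformers package (the repo itself pins pytorch_transformers, so adapt as needed):

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

# input_ids must be integer token indices of shape [batch, seq_len],
# and every id must be smaller than the embedding table size.
enc = tokenizer(["a short example sentence"], return_tensors='pt', padding=True)
input_ids = enc['input_ids']
vocab_size = model.embeddings.word_embeddings.num_embeddings
assert input_ids.dtype == torch.long
assert int(input_ids.max()) < vocab_size, "token id out of range for the embedding table"

last_hidden_states = model(input_ids=input_ids,
                           attention_mask=enc['attention_mask'],
                           token_type_ids=enc['token_type_ids'])[0]
print(last_hidden_states.shape)  # [batch, seq_len, 768]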

Question about the unaligned features of the MOSI dataset

Hello author, I noticed that the feature dimensions of the unaligned audio and visual feature sequences in the MOSI dataset differ from those used in the MulT paper, where they are 74 and 47 respectively. Could this be an error?

Issues for pytorch_transformers

I installed the same version of pytorch_transformers (pytorch_transformers==1.0.0, as in the requirements.txt file), but I got a "ModuleNotFoundError: No module named 'pytorch_transformers.amir_tokenization'" error from the "/models/singleTask/BERT_MAG.py" file. Can someone tell me why?

MMSA_test does not work properly

Hello! I successfully trained a MISA model on MOSEI with your MMSA_run (saved under the models folder), and I also extracted features for a new video with your MMSA-FET tool. But when I run MMSA_test it fails at the end.
This is the command I used for training:
python -m MMSA -d mosei -m misa --model-save-dir ./models --res-save-dir ./results

This is my test script:
from MMSA.config import get_config_tune
from MMSA import MMSA_test
model = "/home/ben/hdd/AI/misa.pth"
fea = "/home/ben/hdd/AI/feature.pkl"
config = get_config_tune('misa', 'mosei')
MMSA_test(config, model, fea, 0)

The error message:
Traceback (most recent call last):
  File "/home/ben/hdd/MMSA-master/run.py", line 8, in <module>
    MMSA_test(config,model,fea,0)
  File "/home/ben/hdd/anaconda3/envs/AI/lib/python3.9/site-packages/MMSA/run.py", line 323, in MMSA_test
    model.load_state_dict(torch.load(weights_path), strict=False)
  File "/home/ben/hdd/anaconda3/envs/AI/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for AMIO:
size mismatch for Model.vrnn1.weight_ih_l0: copying a param with shape torch.Size([140, 35]) from checkpoint, the shape in current model is torch.Size([708, 177]).
size mismatch for Model.vrnn1.weight_hh_l0: copying a param with shape torch.Size([140, 35]) from checkpoint, the shape in current model is torch.Size([708, 177]).
size mismatch for Model.vrnn1.bias_ih_l0: copying a param with shape torch.Size([140]) from checkpoint, the shape in current model is torch.Size([708]).

Is this size mismatch caused by a mismatched torch version, or by an incorrect config being passed to MMSA_test?
Could you share the torch version you used for testing, and the config required by MMSA_test (in case my usage is wrong), or all of the parameter files that MMSA_test needs?
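Not an official answer, but a shape mismatch like this usually means the config used to rebuild the model differs from the one used at training time, rather than a torch version issue. A hedged variation of the script above that uses get_config_regression (imported alongside get_config_tune in the package) so the model is rebuilt with the default config that MMSA_run uses; the paths are the same placeholders as in the script above.

from MMSA import MMSA_test
from MMSA.config import get_config_regression

model = "/home/ben/hdd/AI/misa.pth"
fea = "/home/ben/hdd/AI/feature.pkl"

# Rebuild the model with the default 'misa'/'mosei' config instead of a
# tuning config, whose sampled layer sizes may not match the checkpoint.
config = get_config_regression('misa', 'mosei')
MMSA_test(config, model, fea, 0)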

about audio from dataset MOSEI

Hi, I just want to know how the audio features in the MOSEI dataset were calculated.

I loaded one of the datasets,

filename = './aligned_50.pkl'
RawData = pickle.load(open(filename,'rb'),encoding='utf-8')
print(RawData['train']['audio'].shape)

The result is:

(16326, 50, 74)

So each audio clip has a feature matrix of shape (50, 74).
But how are these features calculated from the raw audio files (e.g. .mp3 or .wav files)?
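This is not the dataset's actual pipeline (a later issue in this list suggests the released MOSEI audio features come from COVAREP), but an illustrative, hedged sketch of extracting a frame-level audio feature matrix from a .wav file with librosa, just to show where a (num_frames, feature_dim) matrix like (50, 74) comes from; the file path and feature choice are placeholders.

import librosa

# Load a raw audio file (placeholder path) and compute 20 MFCCs per frame.
y, sr = librosa.load("example.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20).T  # shape: (num_frames, 20)
print(mfcc.shape)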

About the evaluation metrics

Hello author, the paper says the classification metrics are accuracy and F1. In Table 2 of the main experiments there are Acc-2, Acc-3, Acc-5 and F1. Which accuracy does the reported F1 correspond to: Acc-2, Acc-3, or Acc-5?

Cannot import 'MultimodalBertForSequenceClassification' from 'pytorch_transformers.modeling_bert'

Dear author,

It shows the error "ImportError: cannot import name 'MultimodalBertForSequenceClassification' from 'pytorch_transformers.modeling_bert'" when I run the 'run.py' command. I've installed pytorch_transformers version 1.0.0.

Could you please help to check this problem?

Supplementary note about the results

Hi,

The results listed in MMSA/results/result-stat.md were reproduced under the same tuning and running settings. First, we tried 50 sets of parameters for each model on each dataset with grid search. Then the parameters with the best performance on the validation set were selected as the final ones.

Unfortunately, we lost the original parameters from our paper when we re-ran all models and datasets. But you can try the following parameters, which achieve comparable or better results than our AAAI 2021 work (a sketch of the grid-search procedure is given after the parameters below).

Best wishes!
Thank you~

def __SELF_MM(self):
    tmp = {
        'commonParas':{
            'need_data_aligned': False,
            'need_model_aligned': False,
            'need_normalized': False,
            'use_bert': True,
            'use_finetune': True,
            'save_labels': False,
            'early_stop': 8,
            'update_epochs': 4
        },
        # dataset
        'datasetParas':{
            'mosi':{
                # the batch_size of each epoch is update_epochs * batch_size
                'batch_size': 16,
                'learning_rate_bert': 5e-5,
                'learning_rate_audio': 0.005,
                'learning_rate_video': 0.005,
                'learning_rate_other': 0.001,
                'weight_decay_bert': 0.001,
                'weight_decay_audio': 0.001,
                'weight_decay_video': 0.001,
                'weight_decay_other': 0.001,
                # feature subNets
                'a_lstm_hidden_size': 16,
                'v_lstm_hidden_size': 32,
                'a_lstm_layers': 1,
                'v_lstm_layers': 1,
                'text_out': 768, 
                'audio_out': 16,
                'video_out': 32, 
                'a_lstm_dropout': 0.0,
                'v_lstm_dropout': 0.0,
                't_bert_dropout':0.1,
                # post feature
                'post_fusion_dim': 128,
                'post_text_dim':32,
                'post_audio_dim': 16,
                'post_video_dim': 32,
                'post_fusion_dropout': 0.0,
                'post_text_dropout': 0.1,
                'post_audio_dropout': 0.1,
                'post_video_dropout': 0.0,
                # res
                'H': 3.0
            },
            'mosei':{
                # the batch_size of each epoch is update_epochs * batch_size
                'batch_size': 32,
                'learning_rate_bert': 5e-5,
                'learning_rate_audio': 0.005,
                'learning_rate_video': 1e-4,
                'learning_rate_other': 1e-3,
                'weight_decay_bert': 0.001,
                'weight_decay_audio': 0.0,
                'weight_decay_video': 0.0,
                'weight_decay_other': 0.01,
                # feature subNets
                'a_lstm_hidden_size': 16,
                'v_lstm_hidden_size': 32,
                'a_lstm_layers': 1,
                'v_lstm_layers': 1,
                'text_out': 768, 
                'audio_out': 16,
                'video_out': 32, 
                'a_lstm_dropout': 0.0,
                'v_lstm_dropout': 0.0,
                't_bert_dropout':0.1,
                # post feature
                'post_fusion_dim': 128,
                'post_text_dim':32,
                'post_audio_dim': 16,
                'post_video_dim': 32,
                'post_fusion_dropout': 0.1,
                'post_text_dropout': 0.0,
                'post_audio_dropout': 0.0,
                'post_video_dropout': 0.0,
                # res
                'H': 3.0
            },
            'sims':{
                # the batch_size of each epoch is update_epochs * batch_size
                'batch_size': 32,
                'learning_rate_bert': 5e-5,
                'learning_rate_audio': 5e-3,
                'learning_rate_video': 5e-3,
                'learning_rate_other': 1e-3,
                'weight_decay_bert': 0.001,
                'weight_decay_audio': 0.01,
                'weight_decay_video': 0.01,
                'weight_decay_other': 0.001,
                # feature subNets
                'a_lstm_hidden_size': 16,
                'v_lstm_hidden_size': 64,
                'a_lstm_layers': 1,
                'v_lstm_layers': 1,
                'text_out': 768, 
                'audio_out': 16,
                'video_out': 32, 
                'a_lstm_dropout': 0.0,
                'v_lstm_dropout': 0.0,
                't_bert_dropout':0.1,
                # post feature
                'post_fusion_dim': 128,
                'post_text_dim':64,
                'post_audio_dim': 16,
                'post_video_dim': 32,
                'post_fusion_dropout': 0.0,
                'post_text_dropout': 0.1,
                'post_audio_dropout': 0.1,
                'post_video_dropout': 0.0,
                # res
                'H': 1.0
            },
        },
    }
    return tmp
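For reference, a minimal, hedged sketch of the grid-search procedure described above (try a number of parameter sets and keep the one with the best validation performance); the parameter grid and the train_and_validate function are placeholders, not part of the MMSA code base.

import itertools
import random

# Hypothetical search space; the real tuning draws from the ranges in config.py.
grid = {
    'learning_rate_other': [1e-3, 5e-4],
    'batch_size': [16, 32],
    'post_fusion_dim': [64, 128],
}

def train_and_validate(params):
    # Placeholder: train the model with `params` and return a validation score.
    return random.random()

candidates = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
best_params, best_score = None, float('-inf')
for params in random.sample(candidates, min(50, len(candidates))):  # "50 sets of parameters"
    score = train_and_validate(params)
    if score > best_score:
        best_params, best_score = params, score

print(best_params, best_score)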

About the dataset

I downloaded the dataset from the Baidu net disk and put it into the project. Can the program run without specifying the dataset download path?

Problems with relative imports

Hi,

I'm having some problems running MMSA. I ran pip install . and it worked, but when I try to run the file with python3 src/MMSA/run.py or with python3 src/MMSA, I respectively get the following errors:

Traceback (most recent call last):
  File "run.py", line 18, in <module>
    from .config import get_config_regression, get_config_tune
ImportError: attempted relative import with no known parent package

or

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/work_directory/MMSA/src/MMSA/__main__.py", line 3, in <module>
    from .run import MMSA_run
ImportError: attempted relative import with no known parent package

I'm not running this script in a conda environment, and I've also tried prefixing the first relative imports of run.py (lines 18 to 22) with MMSA., which gave me the following message:

SENA_run is not loaded due to missing dependencies. This is ok if you are not using M-SENA.

Thank you for your help; I can provide more information if needed.
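For context, a hedged illustration: relative imports such as "from .config import ..." only resolve when the code runs as part of the installed package, which is why the training command quoted in the MMSA_test issue above invokes MMSA as a module rather than as a script. A tiny sketch of the same invocation from Python (flags copied from that command; adjust the dataset and model as needed):

import subprocess

# Equivalent to: python -m MMSA -d mosei -m misa --model-save-dir ./models --res-save-dir ./results
subprocess.run(
    ["python", "-m", "MMSA", "-d", "mosei", "-m", "misa",
     "--model-save-dir", "./models", "--res-save-dir", "./results"],
    check=True,
)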

Processed data for the SIMS dataset

Can the processed data for the SIMS dataset be shared on Google Drive? I can't access the Baidu link from the US. Thank you in advance!

About alignment

Hello!
I would like to know how you aligned the three modalities for MOSI and MOSEI.
Thank you!

About the dataset

Hello author, I have a question I'd like to confirm: the dataset was cropped from 60 videos, yielding 2,281 video clips. Within each video folder, are the individual clips arranged in the order of the original video? Can the dataset be treated as a conversation dataset? Looking forward to your reply.

index

train_index.csv
I want to know what the numbers in the index represent, because I am having trouble replacing MOSI. Thank you very much.

Hi, a query about the dataset

Hello (●'◡'●)

Thank you for your continued support of this code base. May I ask what the difference is between the aligned and unaligned versions of this dataset? ╥﹏╥... Thanks!

Best,
jun0

Missing raw videos in the dataset

Hello, I downloaded the dataset you provide and found that MOSEI is missing the folder for video_id Y7qkyMyanjU, i.e. four video clips are missing. I also have another question: could you provide the original audio files for the four datasets? At present there are only video and text, and the audio has to be extracted by ourselves.

Raw files for CMU MOSEI dataset

Can the raw files for the CMU-MOSEI dataset be shared? I saw that they are on a Baidu server, which I unfortunately cannot access.

About the PyTorch version

I see that the default PyTorch version is at least 1.9.1, and the CUDA version of my GPU does not support installing that version of PyTorch. Is there a good workaround?

A question about the accuracy calculation

In the scripts under the /trains/multiTask/ folder, why does line 63 (taking MTFN.py as an example) use y_true[m].append(labels['M'].cpu())?

If the goal is to compute metrics for the predictions of each modality's labels, shouldn't it be y_true[m].append(labels[m].cpu())?

Question about results after tuning

Hello author, after tuning the model parameters with is_tune=True, I picked the better-performing parameters saved in the csv file and retrained with them, but I cannot reach the accuracy recorded in the csv for those parameters. What could be the reason?

about GPU

Hello, I would like to know whether it is possible to run this program on four 2080 Ti GPUs.

About the hyperparameter selection section of the paper

In the hyperparameter selection part of Section 5.2 you state: "Empirically, we choose the average length plus three times the standard deviation as the maximum length of the sequence." Why do it this way, and what is the rationale?

How to download raw data for your CH-SIMS dataset?

You've done a great job! But if I want to try some other features for the three modalities, how can I download the complete dataset? The links provided on your home page only contain processed data for the CH-SIMS dataset.

About the SIMS dataset

Hello! In the paper, are the reported results on the SIMS dataset computed on the negative/non-negative test split or the negative/positive one?

More intuitive testing

  1. run.py only shows the scores in the *.log file. Does it support testing the emotion of a video supplied by the user?

Oh, I see~ All the data has been processed into *.pkl files for training. So if I want to test on my own input video, I should process it into a pkl file that contains the text, audio and video features.

  2. By the way, what is the performance of this method? Can it detect emotion in video in real time?

About APIs

Hello, could you please tell me why the open APIs interface is empty? How do I get the API documentation for this package?

data/DataPre.py line 20

output_dir = os.path.join(working_dir, output_dir)

should be

output_dir = os.path.join(self.working_dir, output_dir)

python run.py has these problems when I train the V1 version.

File "run.py", line 300, in <module>
    worker()
    run_normal(args)
File "run.py", line 174, in run_normal
    test_results = run(args)
    model = AMIO(args).to(device)
File "D:\net\MMSA-master\models\AMIO.py", line 44, in __init__
    lastModel = MODEL_MAP[args.modelName]
KeyError: 'MTFN'

About the released Processed Data

Thank you for your wonderful work!
But I have some questions about the released dataset on the BaiduYun disk: the processed data for MOSEI has 74 and 35 feature dimensions for the audio and vision modalities, which suggests these features were extracted by COVAREP and Facet. However, the processed data for MOSI has 5 and 20 feature dimensions for the audio and vision modalities. What feature extractor did you use for the MOSI data? As far as I know, MOSI audio and vision features extracted by COVAREP and Facet have 74 and 47 dimensions.
Thank you for your answer!

Question about training and testing

Hello author, I'd like to ask whether MMSA_run() runs both training and testing by default, and whether "tune" here refers to hyperparameter tuning during training or something else. Thanks!

Could you please share the hyperparameters/configure files that produce the results in "results/result-stat.md"?

Dear Authors,

Could you please share the hyperparameters/configure files that produce the results in "results/result-stat.md"?

I tried the default configuration files in this repo, and I also tried to search for the optimal configuration using the tuning function, but I cannot reproduce the good results shown in "results/result-stat.md" (so far I have only tried the MOSI dataset).

Thank you very much for your support in advance.

Question about reproduced results

I used your code directly, without any modification, and tried running bert_mag and mult on MOSI. The results are as follows.

Model,Has0_acc_2,Has0_F1_score,Non0_acc_2,Non0_F1_score,Mult_acc_5,Mult_acc_7,MAE,Corr,Loss
bert_mag,"(79.24, 4.4)","(88.35, 2.75)","(94.46, 1.58)","(97.15, 0.84)","(77.52, 1.7)","(77.52, 1.7)","(38.22, 3.04)","(67.88, 1.44)","(38.31, 3.02)"
mult,"(70.96, 9.93)","(82.62, 6.83)","(91.86, 4.26)","(95.7, 2.33)","(73.65, 0.69)","(73.65, 0.69)","(48.8, 0.85)","(54.8, 2.24)","(48.77, 0.83)"

These results are far from those in result-stat.md and in the corresponding papers. Why might that be?

Wrong input order of f1 score calculation?

Hi, I am a bit confused about the F1 score calculation in your code. Based on your code:

test_results = self.metrics(pred, true, exclude_zero=self.args.excludeZero)

and
f_score = f1_score(binary_preds, binary_truth, average='weighted')

the F1 score is calculated as sklearn.metrics.f1_score(pred, true), but according to the official documentation it should be sklearn.metrics.f1_score(true, pred): the ground truth comes first. Is this a mistake? Does it affect the final results of your paper?

sklearn official documentation: https://scikit-learn.org/0.21/modules/generated/sklearn.metrics.f1_score.html
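For illustration only, a small hedged check (not from the repo) showing that the argument order matters when average='weighted', because the class weights come from the support of whatever is passed as y_true:

from sklearn.metrics import f1_score

y_true = [0, 0, 0, 0, 1, 1]  # imbalanced ground truth
y_pred = [0, 1, 1, 1, 1, 1]

correct = f1_score(y_true, y_pred, average='weighted')  # ground truth first
swapped = f1_score(y_pred, y_true, average='weighted')  # arguments reversed
print(correct, swapped)  # the two values differ (~0.457 vs ~0.543)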

Data preprocessing

Dear Authors:

I was trying to reproduce your results, but I found that the data/DataPre.py script does not fit the raw videos in CH-SIMS.zip (downloaded from Google Drive). I tried to write my own scripts to process the raw videos, but the train/val/test split index files did not match the raw video indices.

Am I right that the train_index.csv files should be used to index the data.npz feature files? For example, if id '0' appears in train_index.csv, does the 0th feature in the data.npz file belong to the training set?

It would be very helpful if you could update the processed features or the whole folder so that data/DataPre.py can process it properly.

Some problems after installation

SENA_run is not loaded due to missing dependencies. This is ok if you are not using M-SENA.

How can I solve this problem?
Thanks for your help.

About reproducing MMIM

Hello!
When reproducing MMIM with your code, I observed that the main loss becomes NaN. I did not modify your code, and the data I used is the /mosi/unaligned data you provide.
MMSA - TRAIN-(mmim) [1/2/2] >> mmilb loss: 0.0042 main loss: nan Has0_acc_2: 0.5701 Has0_F1_score: 0.4140 Non0_acc_2: 0.5971 Non0_F1_score: 0.5708 Mult_acc_5: 0.2500 Mult_acc_7: 0.2313 MAE: 1.2234 Corr: 0.3217
What could be causing this?
