thuiar / mmsa
MMSA is a unified framework for Multimodal Sentiment Analysis.
License: MIT License
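Hello authors, in the regression task on the SIMS dataset, what does a high or low Corr value indicate? Why does EF-LSTM's Corr differ so much from the other models, and what range counts as normal?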
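For readers with the same question: in regression benchmarks of this kind, Corr usually denotes the Pearson correlation coefficient between predicted and annotated sentiment scores. The sketch below assumes that definition and uses made-up arrays; the function name `pearson_corr` is illustrative and not taken from MMSA. A value near 1 means predictions rise and fall with the labels, a value near 0 means no linear relationship, and a negative value means the ordering is inverted.

```python
import numpy as np

def pearson_corr(preds, labels):
    """Pearson correlation between two 1-D score arrays."""
    preds = np.asarray(preds, dtype=np.float64).ravel()
    labels = np.asarray(labels, dtype=np.float64).ravel()
    return float(np.corrcoef(preds, labels)[0, 1])

# Invented scores, for illustration only.
labels = np.array([-1.0, -0.4, 0.0, 0.6, 1.0])
scaled_preds = labels * 0.5 + 0.1   # same ordering, different scale and offset
flipped_preds = -labels             # exactly inverted ordering

corr_scaled = pearson_corr(scaled_preds, labels)
corr_flipped = pearson_corr(flipped_preds, labels)
print(corr_scaled, corr_flipped)
```

Note that Corr ignores scale and offset, so a model can have a high Corr and still a poor MAE.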
If the BaiduYun link is dead again, please reply under this issue. We'll update it as soon as possible.
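Hello authors, thank you very much for this integrated work. I now have the features for a single video generated with the MMSA-FET tool, along with a model already trained on SIMS. Is there a script that loads the model and outputs a prediction for a single video?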
Hi, @iyuge2
Thank you for your contribution to this project.
I downloaded the data from the address you provided and ran it. It did achieve the same result as result-stat.
Then I downloaded the SIMS, MOSI and MOSEI raw data and used DataPre.py to generate features.pkl, but the processed data cannot be used with run.py.
Let's take MOSI as an example (I downloaded the raw data and processed it with DataPre.py):
The provided files are aligned_50.pkl (367.3MB) and unaligned_50.pkl (554.2MB).
Running DataPre.py on the MOSI dataset gave me a features.pkl of 2.8G, which is much bigger than your *.pkl files.
Then I used features.pkl to run, but it failed... :sob:
The failure:
Error: 'list' object has no attribute 'astype'.
Then I changed
Lines 37 to 39 in b2e70bb
to
self.labels = { 'M': np.array(data[self.mode][self.args.train_mode+'_labels'], dtype=np.float32) }
But it didn't work; I got a new error:
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch_1565272279342/work/aten/src/THC/THCGeneral.cpp:216
There must be a problem with the data generated by DataPre.py. I really don't know how to solve it.
In DataPre.py I only modified the paths so that the data could be found for processing; I did not change anything else.
@iyuge2 Please help me...
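Hello authors, when I train with the bert-case-uncased model from Hugging Face, the results differ a lot from those obtained with the text features in the feature file you provide. What could be the reason?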
@iyuge2 @Columbine21
Hi, I have a problem when I run run.py.
I hope to get your help.
The error is located in BertTextEncoder.py:
text:torch.Size([64, 39, 768])
input_ids:torch.Size([64, 768]) | input_mask:torch.Size([64, 768]) | segment_ids: torch.Size([64, 768])
MMSA/models/subNets/BertTextEncoder.py
Lines 62 to 64 in b2e70bb
The error is at line 62, and I still don't know how to solve it.
Detailed error message:
opt/conda/conda-bld/pytorch/work/aten/src/THC/THCTensorIndex.cu:361: void indexSelectLargeIndex(TensorInfo<T, IndexType>, TensorInfo<T, IndexType>, TensorInfo<long, IndexType>, int, int, IndexType, IndexType, long) [with T = float, IndexType = unsigned int, DstDim = 2, SrcDim = 2, IdxDim = -2, IndexIsMajor = true]: block: [79,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Traceback (most recent call last):
File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/trains/singleTask/MISA.py", line 66, in do_train
outputs = model(text, audio, vision)
File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/models/singleTask/MISA.py", line 281, in forward
output = self.alignment(text, audio, video)
File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/models/singleTask/MISA.py", line 195, in alignment
bert_output = self.bertmodel(text) # [batch_size, seq_len, 768]
File "/media/yiwei/600G/anaconda3/envs/MMSA/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
File "/media/yiwei/yiwei-01/project/Emotion/MMSA/MMSA-old/models/subNets/BertTextEncoder.py", line 68, in forward
token_type_ids=segment_ids.to('cuda'))[0] # Models outputs are now tuples
ib/python3.6/site-packages/torch/nn/functional.py", line 1371, in linear
output = input.matmul(weight.t())
RuntimeError: cublas runtime error : resource allocation failed at /opt/conda/conda-bld/pytorch/work/aten/src/THC/THCGeneral.cpp:216
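A note for anyone hitting the same trace: the assertion `srcIndex < srcSelectDimSize` comes from an index-select kernel and fires when an index is larger than the table being indexed. With an embedding layer, a common cause is token ids outside the vocabulary, or a float feature tensor being passed where integer ids are expected. The sketch below is a hypothetical pre-flight check in NumPy, not part of MMSA; `check_token_ids` is an illustrative name, and the vocabulary size assumes the bert-base-uncased tokenizer.

```python
import numpy as np

def check_token_ids(input_ids, vocab_size):
    """Return a list of human-readable problems found in a batch of token ids."""
    ids = np.asarray(input_ids)
    problems = []
    if not np.issubdtype(ids.dtype, np.integer):
        problems.append(f"dtype is {ids.dtype}, expected an integer type")
    if ids.size and (ids.min() < 0 or ids.max() >= vocab_size):
        problems.append(
            f"values span [{ids.min()}, {ids.max()}], expected [0, {vocab_size - 1}]"
        )
    return problems

VOCAB_SIZE = 30522  # assumption: bert-base-uncased vocabulary

ok_ids = np.array([[101, 7592, 102, 0]])
out_of_range_ids = np.array([[101, 40000, 102, 0]])
float_features = np.zeros((2, 4), dtype=np.float32)  # features passed by mistake

print(check_token_ids(ok_ids, VOCAB_SIZE))
print(check_token_ids(out_of_range_ids, VOCAB_SIZE))
print(check_token_ids(float_features, VOCAB_SIZE))
```

Running a check like this on the CPU before the forward pass gives a readable message, whereas the CUDA assertion often surfaces later as an unrelated-looking cublas error.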
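Hello authors, I noticed that on the MOSI dataset the feature dimensions of the unaligned audio and vision feature sequences differ from those used in the MulT paper, where they are 74 and 47 respectively. Could there be an error?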
I installed the same version of pytorch_transformers as listed in requirements.txt (pytorch_transformers==1.0.0), but I got the error "ModuleNotFoundError: No module named 'pytorch_transformers.amir_tokenization'" in "/models/singleTask/BERT_MAG.py". Can anyone tell me why?
Hello! I successfully trained the MISA model on MOSEI with your MMSA_run (saved under the models folder), and I extracted features for a new video with your feature extraction tool. However, mmsa_test raises an error at the very end.
This is the command I used for training:
python -m MMSA -d mosei -m misa --model-save-dir ./models --res-save-dir ./results
This is my test script:
from MMSA.config import get_config_tune
from MMSA import MMSA_test
model = "/home/ben/hdd/AI/misa.pth"
fea = "/home/ben/hdd/AI/feature.pkl"
config = get_config_tune('misa', 'mosei')
MMSA_test(config, model, fea, 0)
The error:
Traceback (most recent call last):
File "/home/ben/hdd/MMSA-master/run.py", line 8, in
MMSA_test(config,model,fea,0)
File "/home/ben/hdd/anaconda3/envs/AI/lib/python3.9/site-packages/MMSA/run.py", line 323, in MMSA_test
model.load_state_dict(torch.load(weights_path), strict=False)
File "/home/ben/hdd/anaconda3/envs/AI/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for AMIO:
size mismatch for Model.vrnn1.weight_ih_l0: copying a param with shape torch.Size([140, 35]) from checkpoint, the shape in current model is torch.Size([708, 177]).
size mismatch for Model.vrnn1.weight_hh_l0: copying a param with shape torch.Size([140, 35]) from checkpoint, the shape in current model is torch.Size([708, 177]).
size mismatch for Model.vrnn1.bias_ih_l0: copying a param with shape torch.Size([140]) from checkpoint, the shape in current model is torch.Size([708]).
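...
Is this size mismatch caused by a different torch version, or by an incorrect config for the mmsa_test function?
Could you share the torch version you used for testing and the config file mmsa_test needs (in case my usage is wrong), or provide all the parameter files that mmsa_test requires?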
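The shapes in this traceback can be decoded by hand. In a PyTorch `nn.LSTM`, `weight_ih_l0` has shape `(4 * hidden_size, input_size)` and `weight_hh_l0` has shape `(4 * hidden_size, hidden_size)`. The sketch below applies that arithmetic to the reported shapes; the helper name is illustrative. It suggests the checkpoint was built for 35-dimensional input to `vrnn1` while the model created at test time expects 177, which points at a feature-dimension difference rather than a torch version issue. That reading is an inference from the shapes alone, not a confirmed diagnosis.

```python
def lstm_dims_from_shapes(weight_ih_shape, weight_hh_shape):
    """Recover (input_size, hidden_size) of LSTM layer 0 from its weight shapes."""
    gates_ih, input_size = weight_ih_shape
    gates_hh, hidden_size = weight_hh_shape
    # Both matrices stack the four gates along the first dimension.
    if gates_ih != 4 * hidden_size or gates_hh != 4 * hidden_size:
        raise ValueError("shapes are not consistent with a single LSTM layer")
    return input_size, hidden_size

# Shapes copied from the error message above.
checkpoint_dims = lstm_dims_from_shapes((140, 35), (140, 35))
current_dims = lstm_dims_from_shapes((708, 177), (708, 177))

print("checkpoint (input_size, hidden_size):", checkpoint_dims)
print("current    (input_size, hidden_size):", current_dims)
```

If the new feature file really has a different feature dimension than the training data, the checkpoint cannot be loaded into a model sized for the new features.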
Hi, I just want to know how the audio features of the MOSEI dataset are calculated.
I loaded one of the datasets:
filename = './aligned_50.pkl'
RawData = pickle.load(open(filename,'rb'),encoding='utf-8')
print(RawData['train']['audio'].shape)
The result is:
(16326, 50, 74)
So each audio clip has a feature of shape (50, 74),
but how are these features calculated from the raw audio files (such as .mp3 or .wav files)?
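When inspecting a feature file like this, it can help to list every array with its shape rather than printing one key at a time. The sketch below is a small hypothetical helper, run on a stand-in dictionary so it works without the real file. The `id` key is an invented placeholder; only `train` and `audio` come from the question. A shape such as (16326, 50, 74) reads as (samples, time steps, feature dimensions).

```python
import numpy as np

def describe(data, prefix=""):
    """Recursively list the arrays and other entries of a nested dict."""
    lines = []
    for key, value in data.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            lines.extend(describe(value, prefix=name + "/"))
        elif isinstance(value, np.ndarray):
            lines.append(f"{name}: ndarray shape={value.shape} dtype={value.dtype}")
        else:
            lines.append(f"{name}: {type(value).__name__}")
    return lines

# Stand-in for pickle.load(open('./aligned_50.pkl', 'rb')); sizes are shrunk.
fake_data = {
    "train": {
        "audio": np.zeros((4, 50, 74), dtype=np.float32),
        "id": ["a", "b", "c", "d"],  # hypothetical key
    }
}

summary = describe(fake_data)
print("\n".join(summary))
```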
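Hello authors, regarding evaluation metrics, the paper says classification tasks use Acc and F1. The main results in Table 2 report Acc-2, Acc-3, Acc-5 and F1. Is that F1 computed for the Acc-2, Acc-3 or Acc-5 setting?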
Dear author,
I got the error "ImportError: cannot import name 'MultimodalBertForSequenceClassification' from 'pytorch_transformers.modeling_bert'" when I ran run.py. I have installed pytorch_transformers version 1.0.0.
Could you please help check this problem?
Hi,
Results listed in MMSA/results/result-stat.md were reproduced under the same tuning and running settings. First, we tried 50 sets of parameters for each model on the same dataset with grid search. Then the parameters with the best performance on the validation set were selected as the final ones.
Unfortunately, we lost the original parameters from our paper when we re-ran all models and datasets. But you can try the following parameters, which give results comparable to or better than those in our AAAI 2021 work.
Best wishes!
Thank you~
def __SELF_MM(self):
    tmp = {
        'commonParas':{
            'need_data_aligned': False,
            'need_model_aligned': False,
            'need_normalized': False,
            'use_bert': True,
            'use_finetune': True,
            'save_labels': False,
            'early_stop': 8,
            'update_epochs': 4
        },
        # dataset
        'datasetParas':{
            'mosi':{
                # the batch_size of each epoch is update_epochs * batch_size
                'batch_size': 16,
                'learning_rate_bert': 5e-5,
                'learning_rate_audio': 0.005,
                'learning_rate_video': 0.005,
                'learning_rate_other': 0.001,
                'weight_decay_bert': 0.001,
                'weight_decay_audio': 0.001,
                'weight_decay_video': 0.001,
                'weight_decay_other': 0.001,
                # feature subNets
                'a_lstm_hidden_size': 16,
                'v_lstm_hidden_size': 32,
                'a_lstm_layers': 1,
                'v_lstm_layers': 1,
                'text_out': 768,
                'audio_out': 16,
                'video_out': 32,
                'a_lstm_dropout': 0.0,
                'v_lstm_dropout': 0.0,
                't_bert_dropout':0.1,
                # post feature
                'post_fusion_dim': 128,
                'post_text_dim':32,
                'post_audio_dim': 16,
                'post_video_dim': 32,
                'post_fusion_dropout': 0.0,
                'post_text_dropout': 0.1,
                'post_audio_dropout': 0.1,
                'post_video_dropout': 0.0,
                # res
                'H': 3.0
            },
            'mosei':{
                # the batch_size of each epoch is update_epochs * batch_size
                'batch_size': 32,
                'learning_rate_bert': 5e-5,
                'learning_rate_audio': 0.005,
                'learning_rate_video': 1e-4,
                'learning_rate_other': 1e-3,
                'weight_decay_bert': 0.001,
                'weight_decay_audio': 0.0,
                'weight_decay_video': 0.0,
                'weight_decay_other': 0.01,
                # feature subNets
                'a_lstm_hidden_size': 16,
                'v_lstm_hidden_size': 32,
                'a_lstm_layers': 1,
                'v_lstm_layers': 1,
                'text_out': 768,
                'audio_out': 16,
                'video_out': 32,
                'a_lstm_dropout': 0.0,
                'v_lstm_dropout': 0.0,
                't_bert_dropout':0.1,
                # post feature
                'post_fusion_dim': 128,
                'post_text_dim':32,
                'post_audio_dim': 16,
                'post_video_dim': 32,
                'post_fusion_dropout': 0.1,
                'post_text_dropout': 0.0,
                'post_audio_dropout': 0.0,
                'post_video_dropout': 0.0,
                # res
                'H': 3.0
            },
            'sims':{
                # the batch_size of each epoch is update_epochs * batch_size
                'batch_size': 32,
                'learning_rate_bert': 5e-5,
                'learning_rate_audio': 5e-3,
                'learning_rate_video': 5e-3,
                'learning_rate_other': 1e-3,
                'weight_decay_bert': 0.001,
                'weight_decay_audio': 0.01,
                'weight_decay_video': 0.01,
                'weight_decay_other': 0.001,
                # feature subNets
                'a_lstm_hidden_size': 16,
                'v_lstm_hidden_size': 64,
                'a_lstm_layers': 1,
                'v_lstm_layers': 1,
                'text_out': 768,
                'audio_out': 16,
                'video_out': 32,
                'a_lstm_dropout': 0.0,
                'v_lstm_dropout': 0.0,
                't_bert_dropout':0.1,
                # post feature
                'post_fusion_dim': 128,
                'post_text_dim':64,
                'post_audio_dim': 16,
                'post_video_dim': 32,
                'post_fusion_dropout': 0.0,
                'post_text_dropout': 0.1,
                'post_audio_dropout': 0.1,
                'post_video_dropout': 0.0,
                # res
                'H': 1.0
            },
        },
    }
    return tmp
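The comment in the config, "the batch_size of each epoch is update_epochs * batch_size", describes gradient accumulation: gradients from several mini-batches are summed before one optimizer step. The sketch below works through the arithmetic for the values above. The helper names and the exact stepping schedule, including the extra step for a trailing partial group, are assumptions for illustration and may differ from the trainer's actual loop.

```python
def effective_batch_size(batch_size, update_epochs):
    """Number of samples contributing to one optimizer step."""
    return batch_size * update_epochs

def optimizer_step_indices(num_batches, update_epochs):
    """Mini-batch indices after which an optimizer step would happen.

    Assumed schedule: step every `update_epochs` mini-batches, plus one
    final step if the last group is incomplete.
    """
    steps = [i for i in range(num_batches) if (i + 1) % update_epochs == 0]
    if num_batches and num_batches % update_epochs != 0:
        steps.append(num_batches - 1)
    return steps

mosi_effective = effective_batch_size(batch_size=16, update_epochs=4)
mosei_effective = effective_batch_size(batch_size=32, update_epochs=4)
steps_for_10_batches = optimizer_step_indices(num_batches=10, update_epochs=4)

print(mosi_effective, mosei_effective, steps_for_10_batches)
```

With these settings one update on MOSI sees 64 samples and one update on MOSEI or SIMS sees 128, which matters when comparing learning rates with other codebases.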
After downloading the dataset from Baidu Netdisk, how do I load it into the program? Can the program run if it is not given the path to the downloaded dataset?
Hi,
I'm having some problems running MMSA. I ran pip install . and it worked, but when I try to run the file with python3 src/MMSA/run.py or even python3 src/MMSA, I get the following errors respectively:
Traceback (most recent call last):
File "run.py", line 18, in <module>
from .config import get_config_regression, get_config_tune
ImportError: attempted relative import with no known parent package
or
Traceback (most recent call last):
File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/work_directory/MMSA/src/MMSA/__main__.py", line 3, in <module>
from .run import MMSA_run
ImportError: attempted relative import with no known parent package
I'm not running this script in a conda environment. I also tried putting MMSA. in front of the first relative imports in run.py (lines 18 to 22), which gave me the following message:
SENA_run is not loaded due to missing dependencies. This is ok if you are not using M-SENA.
Thank you for your help. I can provide more information if you need it.
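Both tracebacks above are standard Python behaviour rather than something specific to MMSA. A file that uses relative imports (`from .config import ...`) only works when it is executed as part of a package, for example with `python -m MMSA` as in the training command quoted elsewhere on this page. The self-contained sketch below reproduces the difference with a throwaway package; `demo_pkg` and its files are invented for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

def run(cmd, cwd):
    """Run a command and capture its text output."""
    return subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)

with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "demo_pkg")  # hypothetical package
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("")
    with open(os.path.join(pkg, "config.py"), "w") as f:
        f.write("VALUE = 42\n")
    with open(os.path.join(pkg, "__main__.py"), "w") as f:
        f.write("from .config import VALUE\nprint(VALUE)\n")

    # Executed as a plain script: Python has no parent package for the file.
    as_script = run([sys.executable, os.path.join("demo_pkg", "__main__.py")], cwd=root)
    # Executed as a module: the package context exists, so the import resolves.
    as_module = run([sys.executable, "-m", "demo_pkg"], cwd=root)

print("script exit code:", as_script.returncode)
print("script error    :", as_script.stderr.strip().splitlines()[-1])
print("module output   :", as_module.stdout.strip())
```

The "SENA_run is not loaded" line is printed as an informational message and, by its own wording, can be ignored when M-SENA is not used.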
I found the dataset at http://immortal.multicomp.cs.cmu.edu/raw_datasets/CMU_MOSI.zip, but I can't find label.csv. Why? Or is it produced by a program? If so, where is that program?
Can the processed data for the SIMS dataset be shared on Google Drive? I can't access the Baidu link from the US. Thank you in advance!
Hello!
I would like to know how you align the three modalities for MOSI and MOSEI.
Thank you!
Hello authors, I have a question to confirm. The dataset was cut from 60 videos into 2281 video clips. Are the clips in each video folder arranged in their order within the original video? Can it be regarded as a conversational dataset? I look forward to your reply.
train_index.csv
I want to know what the numbers in the index represent, because I am having trouble replacing MOSI. Thank you very much.
Hello (●'◡'●)
Thank you for your continued support of this codebase. May I ask what the difference is between the aligned and unaligned versions of this dataset? ╥﹏╥ Thanks!
Best,
jun0
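Hello, I downloaded the dataset you provide and found that MOSEI is missing the folder for video_id Y7qkyMyanjU, so four video clips are missing. I also have another question: could you provide the raw audio files for the four datasets? Currently only video and text are available, and the audio has to be extracted by the user.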
Can the raw files for the CMU MOSEI dataset be shared? I saw that they are on a Baidu server, which unfortunately I cannot access.
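I see that your default PyTorch version is at least 1.9.1, but the CUDA version of my GPU does not support installing that PyTorch version. Is there a good workaround?
In the scripts under the /trains/multiTask/ folder, why does line 63 (taking MTFN.py as an example) use y_true[m].append(labels['M'].cpu())?
If the goal is to compute metrics for each modality's label predictions, shouldn't it be y_true[m].append(labels[m].cpu())?
Hello authors, after tuning model parameters with is_tune=True, I picked the better-performing parameters saved in the csv file and trained with them, but I cannot reach the accuracy recorded for those parameters in the csv file. What could be the cause?
Hello, I want to know whether it is possible to run this program with four 2080Ti GPUs.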
In Section 5.2, on hyperparameter selection, the paper says "Empirically, we choose the average length plus three times the standard deviation as the maximum length of the sequence". Why is it done this way? What is the rationale?
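The quoted rule is easy to state in code. The sketch below computes "mean plus three standard deviations" over a list of sequence lengths, then pads or truncates a sequence to that length. Rounding up with `ceil`, zero padding and the function names are assumptions made for the example. If lengths were roughly normally distributed, such a threshold would cover about 99.7% of sequences, so very few samples get truncated while padding stays bounded; whether that is the authors' reasoning is not stated on this page.

```python
import numpy as np

def max_seq_len(lengths, k=3.0):
    """Mean length plus k standard deviations, rounded up (rounding is an assumption)."""
    lengths = np.asarray(lengths, dtype=np.float64)
    return int(np.ceil(lengths.mean() + k * lengths.std()))

def pad_or_truncate(seq, max_len, pad_value=0.0):
    """Force a (time, features) array to exactly max_len time steps."""
    seq = np.asarray(seq, dtype=np.float32)
    out = np.full((max_len,) + seq.shape[1:], pad_value, dtype=np.float32)
    n = min(len(seq), max_len)
    out[:n] = seq[:n]
    return out

lengths = [10, 12, 14, 16, 18]   # invented sequence lengths
limit = max_seq_len(lengths)      # mean 14, std sqrt(8), 14 + 3 * 2.83 -> 23

long_seq = pad_or_truncate(np.ones((30, 5)), limit)   # truncated to the limit
short_seq = pad_or_truncate(np.ones((4, 5)), limit)   # zero-padded to the limit
print(limit, long_seq.shape, short_seq.shape)
```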
You've done a great job! But if I want to try out other features for the three modalities, how can I download the complete dataset? The links provided on your home page only contain processed data from the CH-SIMS dataset.
Hello! For the SIMS results reported in your paper, are the test-set results for negative/non-negative or for negative/positive classification?
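The two binary settings in this question can give different numbers on the same predictions. The result columns quoted elsewhere on this page are named `Has0_acc_2` and `Non0_acc_2`; the sketch below assumes these correspond to negative/non-negative (zero labels kept) and negative/positive (zero labels dropped) respectively. That mapping and the toy scores are assumptions for illustration.

```python
import numpy as np

def binary_accuracies(preds, labels):
    """Return (negative vs. non-negative accuracy, negative vs. positive accuracy)."""
    preds = np.asarray(preds, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    # Negative vs. non-negative: zero counts as the non-negative class.
    has0 = float(np.mean((preds >= 0) == (labels >= 0)))
    # Negative vs. positive: samples whose label is exactly zero are excluded.
    keep = labels != 0
    non0 = float(np.mean((preds[keep] > 0) == (labels[keep] > 0)))
    return has0, non0

# Invented regression scores.
labels = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
preds = np.array([-0.8, 0.2, -0.1, 0.4, 0.9])

has0_acc, non0_acc = binary_accuracies(preds, labels)
print(has0_acc, non0_acc)
```

Here the neutral sample is misclassified in the first setting and simply ignored in the second, which is why the two accuracies differ.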
SENA_run is not loaded due to missing dependencies. This is ok if you are not using M-SENA.
With run.py I can only see the scores in the *.log file. Does it support testing emotion on videos provided by users? Oh, I see. All the data have been processed into *.pkl files for training. So if I want to test on my own video, I should process it into a pkl file that contains text, audio and video.
Hello, could you please tell me why the APIs page is empty? How do I get the API documentation for this package?
Hi @iyuge2,
Your code runs 5 repeats with different seeds. Are the results reported in the "result-stat.md" file the mean values or the highest values over the 5 repeats?
Thanks.
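The difference between the two reporting conventions asked about here can be made concrete. The sketch below aggregates per-seed results into mean, standard deviation and best value, all scaled to percentages. The numbers are invented, the helper is hypothetical, and nothing here states which convention result-stat.md follows.

```python
import numpy as np

def summarize(per_seed_results):
    """Map each metric to (mean, std, best) over seeds, in percent, rounded to 2 places."""
    summary = {}
    for key in per_seed_results[0]:
        values = np.array([r[key] for r in per_seed_results], dtype=np.float64)
        summary[key] = (
            round(float(values.mean()) * 100, 2),
            round(float(values.std()) * 100, 2),
            round(float(values.max()) * 100, 2),
        )
    return summary

# Made-up accuracies from three seeded runs.
runs = [{"acc": 0.80}, {"acc": 0.82}, {"acc": 0.84}]
stats = summarize(runs)
print(stats)
```

Reporting the best run gives 84.0 here while the mean gives 82.0, so it matters which one a results table uses when comparing against your own runs.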
output_dir = os.path.join(working_dir, output_dir)
should be
output_dir = os.path.join(self.working_dir, output_dir)
File "run.py", line 300, in
worker()
run_normal(args)
File "run.py", line 174, in run_normal
test_results = run(args)
model = AMIO(args).to(device)
File "D:\net\MMSA-master\models\AMIO.py", line 44, in init
lastModel = MODEL_MAP[args.modelName]
KeyError: 'MTFN'
Thank you for your wonderful work!
But I have some questions about the dataset released on BaiduYun Disk. The processed MOSEI data have 74 and 35 feature dimensions for the audio and vision modalities, from which I infer that these features were extracted with COVAREP and Facet. However, the processed MOSI data have 5 and 20 feature dimensions for the audio and vision modalities. Which feature extractors did you use for the MOSI data? As far as I know, MOSI audio and vision features extracted with COVAREP and Facet have 74 and 47 dimensions.
Thank you for your answer!
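Hello authors, does the MMSA_run() function run both training and testing by default? And does "tune" here mean fine-tuning during training, or something else? Thanks!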
Dear Authors,
Could you please share the hyperparameters/configure files that produce the results in "results/result-stat.md"?
I tried the default configuration files in this repo and also tried to search for the optimal configuration using the fine-tune function, but I cannot reproduce the good results shown in "results/result-stat.md" (so far I have only tried the MOSI dataset).
Thank you very much for your support in advance.
I used your code directly, without any changes, and tried running bert-mag and mult on MOSI. The results are as follows.
Model,Has0_acc_2,Has0_F1_score,Non0_acc_2,Non0_F1_score,Mult_acc_5,Mult_acc_7,MAE,Corr,Loss
bert_mag,"(79.24, 4.4)","(88.35, 2.75)","(94.46, 1.58)","(97.15, 0.84)","(77.52, 1.7)","(77.52, 1.7)","(38.22, 3.04)","(67.88, 1.44)","(38.31, 3.02)"
mult,"(70.96, 9.93)","(82.62, 6.83)","(91.86, 4.26)","(95.7, 2.33)","(73.65, 0.69)","(73.65, 0.69)","(48.8, 0.85)","(54.8, 2.24)","(48.77, 0.83)"
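These are far from both result-stat.md and the results in the corresponding papers. Why is that?
Hello authors, I'm glad someone has built an integrated platform for multimodal sentiment analysis. Why don't I see any content on https://github.com/thuiar/MMSA/wiki/APIs at the moment? Also, can this framework take only two of the modalities as input?
Hello, in the CH-SIMS dataset, do the numbers in the index column of train_index.csv, val_index.csv and test_index.csv under metadata refer to row numbers in metadata/sentiment/label_M.csv?
Hello, can you tell me how to get label.csv for MOSI?
Hello, I downloaded the MOSEI data from Baidu Netdisk, but the csv only contains the three-class sentiment labels. Are the six emotion labels supposed to be derived from the label column by splitting its value range?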
Hi, I am a bit confused about the f1 score calculation of your code. Based on your code:
MMSA/trains/singleTask/BERT_MAG.py
Line 130 in bf88dfc
Line 111 in bf88dfc
sklearn official documentation: https://scikit-learn.org/0.21/modules/generated/sklearn.metrics.f1_score.html
Dear Authors:
I was trying to reproduce your results, but I found that the data/Datapre.py script does not fit the raw videos in CH-SIMI.zip (downloaded from Google Drive). I tried to write scripts to process the raw videos, but the train/val/test split index files do not match the raw video indices.
Is it right to use the train_index.csv files to split the data.npz feature file? For example, when id '0' appears in train_index.csv, does the 0th feature in data.npz belong to the training set?
It would be very helpful if you could update the processed features or the whole folder so that data/DataPre.py can process it properly.
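If the index files do hold zero-based row positions into the feature array, which is exactly the open question here and is not confirmed on this page, the split would amount to plain integer indexing. The sketch below shows that reading with invented data and adds two sanity checks (no overlap between splits, no out-of-range index) that would quickly reveal a mismatch such as one-based numbering. All names are hypothetical.

```python
import numpy as np

def split_by_index(features, train_idx, val_idx, test_idx):
    """Select rows of `features` for each split, assuming zero-based row indices."""
    splits = {"train": train_idx, "valid": val_idx, "test": test_idx}
    all_idx = np.concatenate([np.asarray(v) for v in splits.values()])
    if len(np.unique(all_idx)) != len(all_idx):
        raise ValueError("index files overlap")
    if all_idx.min() < 0 or all_idx.max() >= len(features):
        raise ValueError("index out of range for the feature array")
    return {name: features[np.asarray(idx)] for name, idx in splits.items()}

# Invented stand-in for an array loaded from data.npz: 5 samples, 2 features each.
features = np.arange(10).reshape(5, 2)
parts = split_by_index(features, train_idx=[0, 2], val_idx=[1], test_idx=[3, 4])
print({name: part.shape for name, part in parts.items()})
```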
Hello, could you please share the link again?
SENA_run is not loaded due to missing dependencies. This is ok if you are not using M-SENA.
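How can this problem be solved?
Could you please help? Thanks.
Hello!
While reproducing MMIM with your code, I observed the main loss becoming nan. I did not modify your code, and the data is the /mosi/unaligned data you provide.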
MMSA - TRAIN-(mmim) [1/2/2] >> mmilb loss: 0.0042 main loss: nan Has0_acc_2: 0.5701 Has0_F1_score: 0.4140 Non0_acc_2: 0.5971 Non0_F1_score: 0.5708 Mult_acc_5: 0.2500 Mult_acc_7: 0.2313 MAE: 1.2234 Corr: 0.3217
What could be causing this?
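One generic first step when a loss turns into nan is to rule out non-finite values in the input features, since a single `inf` or `nan` in a batch propagates through every later computation. The sketch below is a hypothetical NumPy check on invented arrays. It is a debugging aid, not a diagnosis of this particular MMIM run.

```python
import numpy as np

def non_finite_report(batch):
    """Count nan and inf entries for every array in a dict of features."""
    report = {}
    for name, array in batch.items():
        array = np.asarray(array, dtype=np.float64)
        report[name] = {
            "nan": int(np.isnan(array).sum()),
            "inf": int(np.isinf(array).sum()),
        }
    return report

# Invented batch with one bad value per modality.
batch = {
    "audio": np.array([[0.1, -np.inf], [0.3, 0.4]]),
    "vision": np.array([[np.nan, 0.2], [0.5, 0.6]]),
    "text": np.array([[1.0, 2.0], [3.0, 4.0]]),
}

report = non_finite_report(batch)
print(report)
```

If the inputs are clean, the next candidates are usually the learning rate and any log or division inside the loss terms.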