
multimodal-sentiment-analysis's Introduction

Hi there 👋

  • 🔭 I'm currently working at ECNU & ByteDance (now resigned)
  • 🌱 I'm currently learning Blockchain
  • 👯 I'm looking to collaborate on unique-picture methods and some blockchain projects
  • 🤔 I'm looking for help with how to deal with unique pictures
  • 💬 Ask me about Machine Learning, Deep Learning, Blockchain
  • 📫 How to reach me: QQ: 1102100299 / Mail: [email protected] (please star the repository you are asking about, thanks)


multimodal-sentiment-analysis's Issues

How was the architecture diagram drawn?

[screenshot]

Hi, sorry to bother you again~
I'd like to draw an architecture diagram for my model. The one you posted looks quite good, so I wanted to ask how you drew it. I'd also like to draw mine in a bit more detail.

Can the model be trained on Chinese data?

Hello, I'd like to run training on a Chinese dataset. Apart from the BERT pretrained model, is there anything else that needs to change? I'd really appreciate your advice and guidance, thank you!
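
For reference, a minimal sketch of the usual change for Chinese data, assuming the Config class exposes the pretrained checkpoint name as bert_name (as the code in the issues below does); 'bert-base-chinese' is only an example checkpoint:

```python
from transformers import AutoModel, AutoTokenizer

from Config import config

cfg = config()
cfg.bert_name = 'bert-base-chinese'  # example Chinese checkpoint; any Chinese BERT/RoBERTa works

# The tokenizer and the text encoder must come from the same Chinese checkpoint;
# the image branch, fusion modules, and label vocabulary are language-agnostic.
tokenizer = AutoTokenizer.from_pretrained(cfg.bert_name)
text_encoder = AutoModel.from_pretrained(cfg.bert_name)
```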

Question about the code in the trainer file

In `pred, loss = self.model(texts, texts_mask, imgs, labels=labels)` and the similar calls on the model instance below it, `forward()` never seems to be called explicitly, and none of the `Model` classes declare a `__call__` method.
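
For reference, this is standard PyTorch behavior rather than something missing from the repo: nn.Module defines __call__, which runs hooks and then dispatches to the subclass's forward(), so calling the instance is equivalent to calling forward(). A minimal sketch:

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

model = Toy()
x = torch.randn(3, 4)
# model(x) goes through nn.Module.__call__, which invokes forward() (plus any
# registered hooks), so the two calls below produce the same result.
assert torch.equal(model(x), model.forward(x))
```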

Hi, I want to use the trained model to classify the sentiment of user-supplied text-image input

Background
I'm new to AI, and my graduation project is multimodal sentiment analysis. I happened to find your project, so I downloaded it, read it, and trained it. What I'd like to do now is build a front end: the user enters text and an image, and the trained model classifies its sentiment.

Problem
However, I don't quite understand your data processing. Concretely, I can't process the user's text-image input: your data seems to be extracted from a JSON format (I may be misreading this), and I just can't get external data into that shape properly.

Code

  • 1. I created a web directory in the project root and a run_model.py file under it (the other files under web can be ignored for now), as follows:

[screenshot]

  • 2. I have already trained the model:

[screenshot]

  • 3. This is the content of run_model.py:

```python
import sys
import os

# Absolute path of the project, so its modules can be imported
custom_streamlit_path = 'E:/aGraduation_Program_File/Multimodal-Sentiment-Analysis-main/'

# Add the path to sys.path if it is not there yet
if custom_streamlit_path not in sys.path:
    sys.path.insert(0, custom_streamlit_path)

# Now import streamlit
import streamlit as st

from PIL import Image
import torch
from torchvision import transforms
# from transformers import AutoTokenizer
import io
from Models.OTEModel import Model            # adjust to the model actually used
from utils.DataProcess import LabelVocab     # LabelVocab is defined in DataProcess.py
from utils.APIs.APIDecode import api_decode  # decodes the model output
from Config import config                    # configuration

# Configuration
config = config()

# Load the model
def load_model(model_path):
    model = Model(config)  # the model is assumed to take the config at init time
    model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
    model.eval()
    return model

model_path = "E:/aGraduation_Program_File/Multimodal-Sentiment-Analysis-main/output/OTE/pytorch_model.bin"
model = load_model(model_path)

# Initialize the tokenizer and LabelVocab
# tokenizer = AutoTokenizer.from_pretrained(config.bert_name)
label_vocab = LabelVocab()

# Image preprocessing
def process_image(image):
    def get_resize(image_size):
        for i in range(20):
            if 2 ** i >= image_size:
                return 2 ** i
        return image_size

    img_transform = transforms.Compose([
        transforms.Resize(get_resize(config.image_size)),
        transforms.CenterCrop(config.image_size),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

    image = Image.open(io.BytesIO(image.read()))
    return img_transform(image).unsqueeze(0)

# Text preprocessing
'''def process_text(text):
    text = text.replace('#', '')
    tokens = tokenizer('[CLS]' + text + '[SEP]', return_tensors="pt", padding=True, truncation=True, max_length=512)
    return tokens['input_ids'], tokens['attention_mask']
'''
from transformers import RobertaTokenizer

# Load the 'roberta-base' tokenizer from a local path
tokenizer = RobertaTokenizer.from_pretrained('C:/Users/Lenovo/.cache/huggingface/hub/models--roberta-base')

def preprocess_text(text, max_length=512):
    """
    Preprocess the given text into input_ids and an attention_mask.

    Args:
        text (str): input text.
        max_length (int): maximum text length.

    Returns:
        input_ids (torch.Tensor): the text's input ids.
        attention_mask (torch.Tensor): the text's attention mask.
    """
    # Encode the text with encode_plus
    encoded_dict = tokenizer.encode_plus(
        text,                        # input text
        add_special_tokens=True,     # add the special tokens
        max_length=max_length,       # maximum text length
        padding='max_length',        # pad to max_length
        truncation=True,             # truncate anything beyond max_length
        return_attention_mask=True,  # return the attention mask
        return_tensors='pt',         # return PyTorch tensors
    )

    input_ids = encoded_dict['input_ids']
    attention_mask = encoded_dict['attention_mask']
    print("run_model")
    print(input_ids.shape, attention_mask.shape)
    return input_ids, attention_mask

# App title
st.title("Multimodal Sentiment Analysis")

# File uploader and text input
uploaded_image = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
uploaded_text = st.text_area("Enter text", "")

if uploaded_image is not None and uploaded_text != "":
    processed_image = process_image(uploaded_image)
    input_ids, attention_mask = preprocess_text(uploaded_text)

    # Call the model with the same positional order the trainer uses,
    # (texts, texts_mask, imgs); adjust if your Model's forward() differs.
    with torch.no_grad():
        prediction = model(input_ids, attention_mask, processed_image)
        sentiment = api_decode(prediction, label_vocab)  # assumes api_decode maps the prediction to a sentiment label

        st.write(f"Predicted sentiment: {sentiment}")
```
run_model.md

  • 4. The run result is as follows:

[screenshot]

A few small questions about the dataset and fusion methods

Hello, I'd like to ask whether the dataset was crawled by yourself or is a public dataset, and whether the several fusion methods come from papers or are your own design. If there is a public dataset or reference papers, could you share them? Thank you very much!

Question about attention

    attention_out = self.attention(torch.cat(
        [text_feature.unsqueeze(0), img_feature.unsqueeze(0)],
        dim=2)).squeeze()

May I ask: is the code above performing attention across different samples within the same batch?
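
For reference, a minimal shape sketch, assuming text_feature and img_feature are per-sample vectors of shape (batch, hidden) (an assumption about the surrounding code): after unsqueeze(0) and the dim=2 concatenation the tensor is (1, batch, 2*hidden), so whether samples attend to each other depends on which dimension self.attention treats as the sequence dimension.

```python
import torch

batch, hidden = 4, 8
text_feature = torch.randn(batch, hidden)  # assumed per-sample text vectors
img_feature = torch.randn(batch, hidden)   # assumed per-sample image vectors

fused = torch.cat([text_feature.unsqueeze(0), img_feature.unsqueeze(0)], dim=2)
print(fused.shape)  # torch.Size([1, 4, 16]) -> (1, batch, 2*hidden)

# If self.attention were, say, nn.MultiheadAttention with batch_first=False, it
# would read this as (seq_len=1, batch, embed_dim), i.e. a length-1 sequence per
# sample, so samples would NOT attend to each other; with batch_first=True it
# would read (batch=1, seq_len=batch, embed_dim) and the samples WOULD mix.
# Check how self.attention is defined in the Model class to settle the question.
```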

The labels output after calling the model look wrong

1. Hi, sorry to bother you again. My last round of changes still left quite a few problems (T~T), but thank you anyway! Thanks a lot for taking the time to look at this beginner's issue.
2. The problem I'm running into now is as follows:
Model input:

    # model input
    model_input = {
        'texts': text_input_ids,            # input_ids renamed to texts
        'texts_mask': text_attention_mask,  # attention_mask renamed to texts_mask
        'imgs': image_tensor,
        # 'guids' and 'labels'
        # 'guids': torch.tensor([guid]),
        # 'labels': torch.tensor([tokenizer.label_vocab.label_to_id(label)]),
    }

Output obtained:

[screenshot]

3. The full code is as follows:

```python
import torch
from PIL import Image
from torchvision import transforms
from transformers import RobertaTokenizer
from Models.OTEModel import Model  # adjust to the model actually used
from utils.DataProcess import Processor
from utils.DataProcess import LabelVocab

# Import the configuration; the Config class is assumed to be defined and to provide what is needed
from Config import config

# Initialize the configuration
config = config()

# Model path and configuration
model_path = "E:/aGraduation_Program_File/Multimodal-Sentiment-Analysis-main/output/OTE/pytorch_model.bin"
model = Model(config)

# Load the model weights
model.load_state_dict(torch.load(model_path, map_location=torch.device('cpu')))
model.eval()

# User-supplied data
guid = 0
label = 'null'
text = 'so fast'
image_path = 'E:/aGraduation_Program_File/Multimodal-Sentiment-Analysis-main/web/Cavendish.jpg'

# Text preprocessing
tokenizer = RobertaTokenizer.from_pretrained('C:/Users/Lenovo/.cache/huggingface/hub/models--roberta-base')  # initialize the tokenizer
text_tokens = tokenizer.encode(text, add_special_tokens=True)
text_input_ids = torch.tensor([text_tokens])  # convert to a tensor and add the batch dimension
text_attention_mask = torch.ones_like(text_input_ids, dtype=torch.long)

# Image preprocessing
image = Image.open(image_path)
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),  # resize the image to 224x224
    transforms.ToTensor(),          # convert the PIL image to a tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # normalize
])
image_tensor = preprocess(image).unsqueeze(0)  # add the batch dimension

# Model input
model_input = {
    'texts': text_input_ids,            # input_ids renamed to texts
    'texts_mask': text_attention_mask,  # attention_mask renamed to texts_mask
    'imgs': image_tensor,
    # 'guids' and 'labels'
    # 'guids': torch.tensor([guid]),
    # 'labels': torch.tensor([tokenizer.label_vocab.label_to_id(label)]),
}
processor = Processor(config)
label_vocab = LabelVocab()

# Run the prediction
with torch.no_grad():
    outputs = model(**model_input)

print('pred_label:', outputs)
print('Output type:', type(outputs))
```


run_model_noweb.md
4. I suspect pred_label needs to be decoded, but even after decoding it still doesn't work. So where is the mistake? Any help would be greatly appreciated!
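
For reference, a minimal decoding sketch that continues the script above. It assumes the model returns raw scores over the label classes when labels is omitted, and that api_decode / LabelVocab (both imported elsewhere in this thread) map class indices back to label strings; check the model's forward() and utils/APIs/APIDecode.py to confirm.

```python
import torch

with torch.no_grad():
    outputs = model(**model_input)

# If forward() returns a tuple such as (pred, loss), keep only the prediction part.
logits = outputs[0] if isinstance(outputs, (tuple, list)) else outputs

# Turn the scores into a class index per sample.
pred_ids = torch.argmax(logits, dim=-1)
print('predicted class ids:', pred_ids.tolist())

# api_decode(prediction, label_vocab) is used for exactly this step elsewhere in
# this thread, so it is probably the intended way to recover the label text:
# sentiment = api_decode(outputs, LabelVocab())
```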

Hello, I'd like to ask you about an error

Hello, when I run your code, NavieCombine runs fine, but all the other models such as CMAC and OTE report this same error. I believe the dataset itself is fine; I'd appreciate your guidance, thank you!
