Question: once an article is split into k segments, where in the code can I see them being fed to the model separately? about ccf-bdci-sentiment-analysis-baseline HOT 3 CLOSED

xuehui0725 commented on September 24, 2024

Once an article has been split into k segments, where in the code can I see that the segments are fed to the model separately?

from ccf-bdci-sentiment-analysis-baseline.

Comments (3)

guoday commented on September 24, 2024

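Could you explain how an article is fed into training after it has been split into k segments? I could not find where the segments are "fed into the language model separately". If that is how it works, then in theory an article of any length could be split into many segments and processed segment by segment, with no need to truncate it?
The forward method of BertForSequenceClassification:
def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None,
            position_ids=None, head_mask=None):
    # Collapse every leading dimension into one, keeping only the last (token) dimension.
    flat_input_ids = input_ids.view(-1, input_ids.size(-1))
    flat_position_ids = position_ids.view(-1, position_ids.size(-1)) if position_ids is not None else None
    flat_token_type_ids = token_type_ids.view(-1, token_type_ids.size(-1)) if token_type_ids is not None else None
    flat_attention_mask = attention_mask.view(-1, attention_mask.size(-1)) if attention_mask is not None else None
For example, with k = 2, input_ids contains the two segments of the article, and view flattens it again, so isn't the input just as long as before? The same as not splitting at all?

https://github.com/guoday/CCF-BDCI-Sentiment-Analysis-Baseline/blob/master/pytorch_transformers/modeling_bert.py
The operation is at lines 995-1004: the k segments are reshaped back and then passed through a GRU.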
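
To make the effect of the view call concrete, here is a minimal sketch. The sizes m, k and max_len are made-up values for illustration and do not come from the repository. It suggests that flattening merges the sample and segment dimensions into one batch dimension, while each row stays max_len tokens long, so the segments are not concatenated into one longer sequence.

```python
import torch

# Hypothetical sizes: m samples, k segments per sample, max_len tokens per segment.
m, k, max_len = 2, 3, 5

# Dummy token ids shaped (m, k, max_len), standing in for real input_ids.
input_ids = torch.arange(m * k * max_len).reshape(m, k, max_len)

# Same call as in the forward method quoted above.
flat_input_ids = input_ids.view(-1, input_ids.size(-1))

print(flat_input_ids.shape)  # torch.Size([6, 5])

# Row 1 of the flattened tensor is segment 1 of sample 0, unchanged.
assert torch.equal(flat_input_ids[1], input_ids[0, 1])
```

Each of the m*k rows is therefore an independent sequence from the encoder's point of view.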


xuehui0725 commented on September 24, 2024

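"The operation is at lines 995-1004: the k segments are reshaped back and then passed through a GRU." So the output is reshaped back into the original k segments and then passed through the GRU, right? That part I understand. What I am still unclear about is how a sample is fed to the BERT model once it has been split into k segments. Does it effectively become k samples (all with the same label)? In output = pooled_output.reshape(input_ids.size(0),input_ids.size(1),-1).contiguous(), is pooled_output the output for the k segments?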


guoday commented on September 24, 2024

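Quoting xuehui0725: what I am still unclear about is how a sample is fed to the BERT model once it has been split into k segments. Does it effectively become k samples (all with the same label)? In output = pooled_output.reshape(input_ids.size(0),input_ids.size(1),-1).contiguous(), is pooled_output the output for the k segments?

Suppose we start with m samples and each sample is split into k segments. The input is then (m*k, max_len). Feeding it to BERT gives (m*k, max_len, dim), and taking the [CLS] vector gives (m*k, dim). That is reshaped to (m, k, dim) and passed through the GRU, which yields (m, dim).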
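
The shape flow described above can be sketched roughly as follows. This is not the repository's code: the class name SegmentedClassifier is hypothetical, a plain nn.Embedding stands in for BERT so the snippet runs without pretrained weights, and using the GRU's final hidden state as the (m, dim) document vector is an assumption about how the k segment vectors are combined.

```python
import torch
import torch.nn as nn


class SegmentedClassifier(nn.Module):
    """Illustrative only: encode k segments per sample, then merge them with a GRU."""

    def __init__(self, vocab_size=100, dim=16, num_labels=3):
        super().__init__()
        # Stand-in for BERT (assumption): maps token ids to dim-sized vectors.
        self.encoder = nn.Embedding(vocab_size, dim)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.classifier = nn.Linear(dim, num_labels)

    def forward(self, input_ids):
        m, k, max_len = input_ids.shape
        flat = input_ids.view(-1, max_len)   # (m*k, max_len)
        hidden = self.encoder(flat)          # (m*k, max_len, dim)
        cls_vec = hidden[:, 0]               # (m*k, dim), first-token vector
        segments = cls_vec.reshape(m, k, -1) # (m, k, dim)
        _, h_n = self.gru(segments)          # h_n: (1, m, dim)
        doc_vec = h_n[-1]                    # (m, dim), one vector per sample
        return self.classifier(doc_vec)      # (m, num_labels)


model = SegmentedClassifier()
input_ids = torch.randint(0, 100, (4, 2, 8))  # m=4 samples, k=2 segments, max_len=8
logits = model(input_ids)
print(logits.shape)  # torch.Size([4, 3])
```

In this sketch the k segments only share the batch dimension while they pass through the encoder. They are regrouped per sample before the GRU, so each sample still produces a single prediction for its single label.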

