classifier-multi-label's Introduction

Text Classification Multi-Label: multi-label text classification

Python Pytorch

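I. Introduction

1. Multi-class classification

In a multi-class task, each example has exactly one label, but that label can take one of several classes. For example, a person's gender can only be classified as either "male" or "female". Likewise, the sentiment of a text can only be classified as one of "positive", "neutral", or "negative".

2. Multi-label classification

In a multi-label task, each example may carry several labels, and each label may have two or more classes. For example, a news article may be classified as both "entertainment" and "sports", or it may belong only to "entertainment" or to some other category.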
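The difference shows up directly in how targets are encoded. The following is a minimal sketch, assuming a hypothetical three-label set (`LABELS` and both helper names are illustrative and not part of this repository): a multi-class target has exactly one active position, while a multi-label target may have any number.

```python
LABELS = ["entertainment", "sports", "finance"]  # hypothetical label set


def one_hot(label, labels=LABELS):
    """Multi-class target: exactly one position is 1."""
    if label not in labels:
        raise ValueError(f"unknown label: {label}")
    return [1 if name == label else 0 for name in labels]


def multi_hot(active, labels=LABELS):
    """Multi-label target: any number of positions may be 1."""
    active = set(active)
    unknown = active - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in active else 0 for name in labels]


print(one_hot("sports"))                       # [0, 1, 0]
print(multi_hot(["entertainment", "sports"]))  # [1, 1, 0]
```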



II. Algorithms

Four implementations

├── classifier_multi_label
    └── classifier_multi_label
    └── classifier_multi_label_textcnn
    └── classifier_multi_label_denses
    └── classifier_multi_label_seq2seq

1. classifier_multi_label

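  • Uses the vector of BERT's first token, [CLS], with shape (batch_size, hidden_size).
  • Uses tf.nn.sigmoid_cross_entropy_with_logits as the loss function.
  • Uses tf.where to threshold the probabilities at 0.5: ids whose probability is below 0.5 are set to 0, and the remaining ids are the predicted labels.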
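The loss and the thresholding step can be illustrated without TensorFlow. This is a sketch in plain Python, not the repository's code: `sigmoid_cross_entropy` uses the numerically stable form `max(x, 0) - x * z + log(1 + exp(-|x|))`, and `predict_ids` assumes that an id counts as predicted when its probability is at least 0.5. All function names here are illustrative.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sigmoid_cross_entropy(logits, targets):
    """Per-label loss: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return [
        max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
        for x, z in zip(logits, targets)
    ]


def predict_ids(logits, threshold=0.5):
    """Return the label ids whose sigmoid probability reaches the threshold."""
    return [i for i, x in enumerate(logits) if sigmoid(x) >= threshold]


logits = [2.0, -1.0, 0.3]   # one logit per label
targets = [1, 0, 1]         # multi-hot target
losses = sigmoid_cross_entropy(logits, targets)
print(predict_ids(logits))  # [0, 2]
```

Each label gets its own independent probability, which is why several ids can pass the threshold at once.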

2. classifier_multi_label_textcnn

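  • Uses BERT's three-dimensional output, with shape (batch_size, sequence_length, hidden_size), as the input to a TextCNN layer.
  • Uses tf.nn.sigmoid_cross_entropy_with_logits as the loss function.
  • Uses tf.where to threshold the probabilities at 0.5: ids whose probability is below 0.5 are set to 0, and the remaining ids are the predicted labels.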
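To make the TextCNN step concrete, here is a toy sketch in plain Python for a single example (no batch dimension). It assumes the usual TextCNN recipe: each filter spans the full hidden size, slides along the token axis with VALID padding, passes through ReLU, and is max-pooled over time. The function names and the tiny hand-written kernels are illustrative only.

```python
def conv_max_pool(sequence, kernel, bias=0.0):
    """Apply one filter over the token axis, then ReLU and max-over-time.

    sequence: token vectors, shape (sequence_length, hidden_size)
    kernel:   filter weights, shape (filter_size, hidden_size)
    """
    filter_size = len(kernel)
    if len(sequence) < filter_size:
        raise ValueError("sequence is shorter than the filter")
    activations = []
    for start in range(len(sequence) - filter_size + 1):
        window = sequence[start:start + filter_size]
        total = bias + sum(
            w * x
            for w_row, x_row in zip(kernel, window)
            for w, x in zip(w_row, x_row)
        )
        activations.append(max(total, 0.0))  # ReLU
    return max(activations)                  # max-over-time pooling


def textcnn_features(sequence, kernels):
    """One pooled feature per filter, concatenated into a flat vector."""
    return [conv_max_pool(sequence, kernel) for kernel in kernels]


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernels = [
    [[1.0, 1.0], [1.0, 1.0]],                # filter_size 2
    [[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]],  # filter_size 3
]
features = textcnn_features(sequence, kernels)
print(features)  # [3.0, 1.0]
```

The length of the feature vector equals the total number of filters, regardless of sequence length, which is what lets a dense classification layer follow the pooling step.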

3. classifier_multi_label_denses

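  • Uses the vector of BERT's first token, [CLS], with shape (batch_size, hidden_size), and then solves the multi-label problem with several binary classifiers (fully connected layers).
  • Uses tf.nn.softmax_cross_entropy_with_logits as the loss function.
  • Uses tf.argmax to select the output with the highest probability.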
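The "one binary classifier per label" idea can be sketched as follows. This is an illustration in plain Python under stated assumptions: each label has its own pair of logits, and class index 1 is assumed to mean "label present" (the repository may order the classes differently). Function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one label's class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, target_index):
    """Loss for one binary head: negative log-probability of the true class."""
    return -math.log(softmax(logits)[target_index])


def predict_labels(per_label_logits, label_names):
    """Keep a label when its head's argmax is class 1 (assumed: 'present')."""
    predicted = []
    for name, logits in zip(label_names, per_label_logits):
        probs = softmax(logits)
        if probs.index(max(probs)) == 1:  # argmax over the two classes
            predicted.append(name)
    return predicted


heads = [[0.1, 2.0], [1.5, -0.5], [0.0, 0.2]]  # [absent, present] per label
print(predict_labels(heads, ["entertainment", "sports", "finance"]))
# ['entertainment', 'finance']
```

The total training loss in such a setup would typically be the sum or mean of the per-head losses.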

4. classifier_multi_label_seq2seq

  • Uses BERT's three-dimensional output, with shape (batch_size, sequence_length, hidden_size), as the input to a seq2seq + attention layer.
  • Uses tf.nn.softmax_cross_entropy_with_logits as the loss function.
  • Uses beam search to decode the output probabilities.
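In the seq2seq formulation the labels are emitted one after another as a sequence, and beam search keeps the best few partial sequences at each step. Below is a simplified sketch in plain Python. `step_log_probs` is a hypothetical callback standing in for the decoder, and `toy_model` is a hand-written probability table, not output from the actual model.

```python
import math


def beam_search(step_log_probs, beam_width=2, max_len=4, eos="<eos>"):
    """Return the best (sequence, log_prob) found by beam search.

    step_log_probs(prefix) must return {token: log_probability}.
    """
    beams = [((), 0.0)]  # unfinished hypotheses
    finished = []        # hypotheses that ended with eos
    for _ in range(max_len):
        candidates = [
            (prefix + (token,), score + logp)
            for prefix, score in beams
            for token, logp in step_log_probs(prefix).items()
        ]
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            (finished if seq[-1] == eos else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was hit
    if not finished:
        return (), 0.0
    return max(finished, key=lambda item: item[1])


def toy_model(prefix):
    """Hypothetical decoder: next-token probabilities given the prefix."""
    table = {
        (): {"entertainment": 0.6, "sports": 0.3, "<eos>": 0.1},
        ("entertainment",): {"sports": 0.7, "<eos>": 0.3},
        ("sports",): {"entertainment": 0.2, "<eos>": 0.8},
    }
    probs = table.get(prefix, {"<eos>": 1.0})
    return {token: math.log(p) for token, p in probs.items()}


sequence, score = beam_search(toy_model)
labels = [token for token in sequence if token != "<eos>"]
print(labels)  # ['entertainment', 'sports']
```

Decoding step by step like this is also why the seq2seq variant tends to be slower at inference than the single-pass classifiers.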

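III. Experiments

1. Training process

2. Results

3. Conclusions

  • If inference speed is not a strict requirement, multi-label text classification based on the ALBERT+Seq2Seq_Attention framework gives the best results.
  • If both inference speed and model quality matter a great deal, ALBERT+TextCNN is a good choice.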

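References

Introduction to multi-label text classification, with a training comparison
Multi-label text classification [ALBERT]
Multi-label text classification [ALBERT+TextCNN]
Multi-label text classification [ALBERT+Multi_Denses]
Multi-label text classification [ALBERT+Seq2Seq+Attention]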

classifier-multi-label's People

Contributors

hellonlp

classifier-multi-label's Issues

Large models

Has the author tried using large models for this kind of problem, for example the open-source Baichuan or ChatGLM?

The passed save_path is not a valid checkpoint

Traceback (most recent call last):
  File "predict.py", line 43, in <module>
    MODEL = ModelAlbertTextCNN()
  File "predict.py", line 26, in __init__
    self.albert, self.sess = self.load_model()
  File "predict.py", line 39, in load_model
    saver.restore(sess, ckpt.model_checkpoint_path)
ValueError: The passed save_path is not a valid checkpoint
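One possible cause of this error is that the `checkpoint` state file in the model directory points at a prefix whose files are missing, for instance after the directory was moved or only partly copied. The sketch below is a diagnostic helper, not part of the repository: it uses only the standard library and assumes TensorFlow's usual on-disk layout (a text file named `checkpoint` containing a `model_checkpoint_path: "..."` line, plus a `<prefix>.index` file). The function name `find_valid_checkpoint` is hypothetical.

```python
import os
import re


def find_valid_checkpoint(model_dir):
    """Return the recorded checkpoint prefix if its .index file exists, else None."""
    state_file = os.path.join(model_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None  # no checkpoint state file at all
    with open(state_file, encoding="utf-8") as handle:
        match = re.search(
            r'^model_checkpoint_path:\s*"(.*)"', handle.read(), re.MULTILINE
        )
    if match is None:
        return None  # state file exists but has no usable entry
    prefix = match.group(1)
    if not os.path.isabs(prefix):
        prefix = os.path.join(model_dir, prefix)  # relative to the model dir
    # Assumption: a usable checkpoint has at least a "<prefix>.index" file.
    return prefix if os.path.isfile(prefix + ".index") else None
```

Calling this on the directory that predict.py loads from, before `saver.restore`, would show whether the recorded path actually exists on the current machine.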

pytorch

May I ask whether a PyTorch version of this project will be released in the future?

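Problem using the TextCNN network with TF2

This is a question about porting from TF1 to TF2. Without the TextCNN network, both training and prediction work fine. Once TextCNN is added, the loss and accuracy look good during training, but every prediction is wrong. The TF2 TextCNN below is essentially a direct port. I also tried implementations based on tf.keras.layers.Conv2D() and on conv1d, but neither worked. I first wondered whether the number of training epochs or other parameters were the problem, but even with the same parameters as your project the trained model is still broken (dropout is applied), so I would like to ask for your advice.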

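import tensorflow as tf


def textcnn(x):
    """TextCNN over x with shape (batch_size, 60, 312); returns (batch_size, 768)."""
    # NOTE: W and b are created anew on every call. If this function runs on
    # each forward pass in TF2, the weights used at prediction time are not the
    # trained ones; creating them once (e.g. in a tf.Module or Keras layer) and
    # saving them with the model is a possible fix.
    pooled_outputs = []

    filter_sizes = [2, 3, 4, 5, 6, 7]
    # Add a channel axis: (batch_size, 60, 312, 1)
    inputs_expand = tf.expand_dims(x, -1)
    for filter_size in filter_sizes:
        # [height, width = hidden size 312, in_channels, out_channels]
        filter_shape = [filter_size, 312, 1, 128]
        W = tf.Variable(tf.random.truncated_normal(filter_shape, stddev=0.1), dtype=tf.float32, name="W")
        b = tf.Variable(tf.constant(0.1, shape=[128]), dtype=tf.float32, name="b")
        conv = tf.nn.conv2d(
            inputs_expand,
            W,
            strides=[1, 1, 1, 1],
            padding="VALID",
            name="conv")
        # Apply nonlinearity
        h = tf.nn.relu(tf.nn.bias_add(conv, b), name="relu")
        # Max-pool over all positions (sequence length is hard-coded as 60)
        pooled = tf.nn.max_pool(
            h,
            ksize=[1, 60 - filter_size + 1, 1, 1],
            strides=[1, 1, 1, 1],
            padding='VALID',
            name="pool")
        pooled_outputs.append(pooled)
    # Combine all the pooled features
    num_filters_total = 128 * len(filter_sizes)
    h_pool = tf.concat(pooled_outputs, 3)
    h_pool_flat = tf.reshape(h_pool, [-1, num_filters_total])

    return h_pool_flat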

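predict.py cannot get labels after training

Many people on Zhihu have also reported that predict.py returns empty labels. The cause is neither the training data nor too few epochs: the author's get_label function has a small logic problem. I made a simple change that retrieves the labels successfully. The new get_label function in predict.py is as follows: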

def get_label(sentence):
    """
    Predict the labels of a sentence.
    """
    feature = get_feature_test(sentence)
    fd = {MODEL.albert.input_ids: [feature[0]],
          MODEL.albert.input_masks: [feature[1]],
          MODEL.albert.segment_ids: [feature[2]],
          }
    # One prediction value per label id for the single input sentence.
    prediction = MODEL.sess.run(MODEL.albert.predictions, feed_dict=fd)[0]
    # Keep every label id whose prediction is non-zero, including id 0.
    return [id2label(i) for i, value in enumerate(prediction) if value != 0.0]
    # Original line, which skipped label id 0:
    # return [id2label(l) for l in np.where(prediction==1)[0] if l!=0]
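The difference between the two return lines can be reproduced without the model. In this sketch the `ID2LABEL` mapping and the prediction vector are made up for illustration. `labels_original` mirrors the commented-out line (ids equal to 1, with id 0 excluded) and `labels_fixed` mirrors the modified loop.

```python
ID2LABEL = {0: "entertainment", 1: "sports", 2: "finance"}  # hypothetical mapping


def labels_original(prediction):
    """Mirrors the commented-out line: label id 0 is always dropped."""
    return [ID2LABEL[i] for i, v in enumerate(prediction) if v == 1 and i != 0]


def labels_fixed(prediction):
    """Mirrors the modified loop: every non-zero id is kept."""
    return [ID2LABEL[i] for i, v in enumerate(prediction) if v != 0.0]


prediction = [1.0, 0.0, 1.0]
print(labels_original(prediction))  # ['finance']
print(labels_fixed(prediction))     # ['entertainment', 'finance']
```

With a prediction such as `[1.0, 0.0, 0.0]`, the original filter returns an empty list, which matches the symptom described in this issue for sentences whose only label has id 0.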
