

chilynn commented on September 12, 2024

1 The train, test, and validation files use the following format:

这 O
里 O
是 O
清 B
华 M
大 M
学 E
。 O

中 B
国 M
政 M
府 E
。 O

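B means the character is the beginning of an entity
M means the character is in the middle of an entity
E means the character is the end of an entity
O means the character does not belong to any entity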

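The example above contains two samples (two sentences); samples are separated by a blank line.
The first column is the character and the second column is its tag; the two columns are separated by \t.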
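
As a rough illustration of the format described above, here is a minimal sketch of a reader for such a file, plus a helper that recovers entity spans from the B/M/E/O tags. The function names `read_samples` and `extract_entities` are hypothetical and are not taken from the sequence-labeling repository; the sketch also assumes every entity has a B and an E tag, since the scheme above defines no tag for a single-character entity.

```python
def read_samples(lines):
    """Split tab-separated "char<TAB>tag" lines into samples on blank lines."""
    samples = []
    chars, tags = [], []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current sample, if there is one.
            if chars:
                samples.append((chars, tags))
                chars, tags = [], []
            continue
        char, tag = line.split("\t")
        chars.append(char)
        tags.append(tag)
    if chars:
        # The last sample may not be followed by a blank line.
        samples.append((chars, tags))
    return samples


def extract_entities(chars, tags):
    """Return (start, end_exclusive, text) for each B...E run."""
    entities = []
    start = None  # index of the currently open B tag, if any
    for i, tag in enumerate(tags):
        if tag == "B":
            start = i
        elif tag == "E" and start is not None:
            entities.append((start, i + 1, "".join(chars[start:i + 1])))
            start = None
        elif tag == "O":
            start = None  # assumption: silently drop an unterminated entity
        # "M" simply continues the open entity
    return entities


# The two sample sentences from the comment, with real tab separators.
raw = (
    "这\tO\n里\tO\n是\tO\n清\tB\n华\tM\n大\tM\n学\tE\n。\tO\n"
    "\n"
    "中\tB\n国\tM\n政\tM\n府\tE\n。\tO\n"
)
samples = read_samples(raw.splitlines())
for chars, tags in samples:
    print("".join(chars), extract_entities(chars, tags))
```

On the two sentences above this should yield one entity per sample: 清华大学 at positions 3 to 7 in the first, and 中国政府 at positions 0 to 4 in the second.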

2 Embedding format
Suppose there are 2 characters in total and each character is a 3-dimensional vector. The format is:
2 3
你 1 0 1
好 0 0 1
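The embedding format is the word2vec model output format from gensim; the function called is model.save_word2vec_format(output_path, binary=False)
The whole embedding file can be viewed as a 2x3 matrix: each row is a character, and each column is one dimension of the character vector
For example, the character "好" maps to the 3-dimensional vector [0, 0, 1]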
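
To make the layout concrete, here is a minimal sketch of a pure-Python reader for this text format: a header line with the vocabulary size and dimension, then one space-separated row per character. The name `load_embeddings` is hypothetical; in practice gensim's `KeyedVectors.load_word2vec_format(path, binary=False)` would likely be the more usual way to read such a file, and whether the repository does so is not stated here.

```python
def load_embeddings(lines):
    """Parse word2vec text format into a {character: vector} dict."""
    it = iter(lines)
    # Header: "<vocab_size> <dim>", e.g. "2 3".
    vocab_size, dim = map(int, next(it).split())
    vectors = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # assumption: skip blank or malformed rows
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError(
            f"header declares {vocab_size} rows, found {len(vectors)}"
        )
    return vectors, dim


# The 2x3 example from the comment.
raw = "2 3\n你 1 0 1\n好 0 0 1\n"
vectors, dim = load_embeddings(raw.splitlines())
print(dim, vectors["好"])
```

Each row of the file becomes one dictionary entry, so looking up "好" returns its 3-dimensional vector as a list of floats.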

from sequence-labeling.
