
Comments (27)

crownpku commented on August 22, 2024

Can you post the output from running your train_GRU.py training? I suspect this is a TensorFlow version issue; try again with tf 1.2.0.

from information-extraction-chinese.

Mariobai commented on August 22, 2024

/Users/bai/anaconda3/bin/python /Users/bai/python/pythonex/relationex/train_GRU.py
reading wordembedding
reading training data
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-09-22T10:45:52.839965: step 50, softmax_loss 82.7629, acc 0.5
2017-09-22T10:46:23.717158: step 100, softmax_loss 77.9403, acc 0.48
2017-09-22T10:46:54.196828: step 150, softmax_loss 51.8013, acc 0.62

Process finished with exit code 0

This is what my train_GRU.py training output looks like now. There are no errors, but the model is never saved.

crownpku commented on August 22, 2024

train_GRU.py line 125:

if current_step > 8000 and current_step % 100 == 0:
    print('saving model')
    path = saver.save(sess, save_path + 'ATT_GRU_model', global_step=current_step)
    tempstr = 'have saved model to ' + path
    print(tempstr)

To avoid wasting disk space, the current setting only saves a checkpoint every 100 steps once training passes step 8000.

Mariobai commented on August 22, 2024

Then how do I change the setting so a model is saved every 10 steps?

crownpku commented on August 22, 2024

if current_step % 10 == 0:
    print('saving model')
    path = saver.save(sess, save_path + 'ATT_GRU_model', global_step=current_step)
    tempstr = 'have saved model to ' + path
    print(tempstr)

Mariobai commented on August 22, 2024

One more question: if a model is saved every 10 steps, how do I overwrite the previously saved ones? Far too many checkpoints are piling up.

crownpku commented on August 22, 2024

Back up or delete the old models before starting a new training run, or change the model name passed to saver.save.

Checkpoints saved too early are useless because early models perform very poorly; that is why I only start saving after step 8000. The checkpoints saved every 100 steps after that can be compared against each other to pick the best one, since models from too late in training may overfit.
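As a side note, `tf.train.Saver` in TF 1.x also takes a `max_to_keep` argument (default 5), which automatically discards older checkpoints. A stand-alone sketch of the same housekeeping idea, pruning `ATT_GRU_model-<step>` files by their global step (the pruning helper is my own, not code from the repo):

```python
import re


def prune_checkpoints(filenames, keep=5, prefix='ATT_GRU_model'):
    """Return the checkpoint names to delete, keeping the `keep`
    files with the highest global step (e.g. 'ATT_GRU_model-8600')."""
    step_re = re.compile(re.escape(prefix) + r'-(\d+)$')
    steps = []
    for name in filenames:
        m = step_re.search(name)
        if m:
            steps.append((int(m.group(1)), name))
    steps.sort()  # ascending by global step
    return [name for _, name in steps[:-keep]] if len(steps) > keep else []


# Checkpoints at steps 8100..8600; keep only the newest 3.
ckpts = ['ATT_GRU_model-%d' % s for s in range(8100, 8700, 100)]
to_delete = prune_checkpoints(ckpts, keep=3)  # the three oldest files
```

The deletion itself (e.g. `os.remove`) is left to the caller so a dry run is possible first.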

Mariobai commented on August 22, 2024

Thanks, it runs now. Much appreciated. Have you done any work on entity relation extraction in the geography domain? My current problem is that I don't know how the dataset should be structured. Is it the same as in this project, e.g.: 赵玉芳 葛淑珍 父母 九九年前后时,女儿赵玉芳被大连外国语学院录取,而葛淑珍就再度成为陪读妈妈!
Or do I need some other format? I'm just getting started with this; do you have any advice?

crownpku commented on August 22, 2024

I'm not familiar with the geography domain...
The key is what your data looks like. If you have geographic data, post some samples here so we can discuss; otherwise I'm just as much in the dark as you are.

Mariobai commented on August 22, 2024

I see... We plan to crawl news articles from the web for our data. Do we then need to process them into the same format as this project's training set? I'm a bit lost at the moment.

Mariobai commented on August 22, 2024

Hello, may I ask how your data is fed into the neural network as word vectors? Do you use pretrained word vectors, or randomly initialized one-hot vectors?

crownpku commented on August 22, 2024

Pretrained Chinese character vectors.

Mariobai commented on August 22, 2024

Do you train the vectors directly on the data in train.txt, i.e. lines like this? --> 朱时茂 陈佩斯 合作 《水与火的缠绵》《低头不见抬头见》《天剑群侠》小品陈佩斯与朱时茂1984年《吃面条》合作者:陈佩斯 1985年《拍电影》合

Or do you train the vectors on separate data? Also, when feeding the network, do you input one sentence at a time, or something else?

crownpku commented on August 22, 2024

The character vectors were trained on Chinese Wikipedia. The input is just the sentence itself: the vector of each of its characters, concatenated in sequence.
I thought my blog had already explained this clearly...
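A minimal sketch of that lookup step, assuming a hypothetical `char_vectors` dict standing in for the pretrained Chinese-Wikipedia embeddings (real vectors would be ~100-dimensional floats, and unknown characters would map to a shared fallback vector):

```python
# Stand-in for the pretrained character embeddings (toy 2-dim vectors).
char_vectors = {
    '父': [0.1, 0.2],
    '母': [0.3, 0.4],
}
UNK = [0.0, 0.0]  # fallback for characters missing from the vocabulary


def sentence_to_vectors(sentence):
    """Look up each character directly -- no word segmentation needed.

    Iterating a Python str yields individual characters, so the whole
    'tokenization' is just the for-loop over the sentence.
    """
    return [char_vectors.get(ch, UNK) for ch in sentence]


vecs = sentence_to_vectors('父母')  # two vectors, one per character
```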

Mariobai commented on August 22, 2024

Character vectors? As in each individual character? Why not word vectors? Isn't word2vec for words? So with characters the pipeline is: take the sentence, skip word segmentation and POS tagging, and look up each character directly in the pretrained character vectors. (How are the character vectors trained?) Sorry for the flood of questions.

crownpku commented on August 22, 2024

In English, words are separated by spaces. Chinese characters naturally run together; you can run a segmenter first and then train word vectors with word2vec, but that introduces segmentation errors. Character vectors simply treat each Chinese character as its own "word" and train with word2vec in exactly the same way, except each resulting vector represents a single character.
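Preparing such a corpus is trivial in Python, since iterating a string yields characters. A sketch of turning one corpus line into the space-separated character tokens a standard word2vec trainer expects (the example sentence is illustrative):

```python
def to_char_tokens(line):
    """Split a Chinese sentence into space-separated characters so a
    standard word2vec trainer treats each character as a 'word'.
    Existing whitespace is dropped rather than emitted as a token."""
    return ' '.join(ch for ch in line if not ch.isspace())


# Feed lines like this to any word2vec implementation.
print(to_char_tokens('中文天生粘在一起'))  # 中 文 天 生 粘 在 一 起
```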

Mariobai commented on August 22, 2024

I see... So I need to separate every character in a sentence with spaces and train character vectors on that. Another question: when we feed in a sentence, we never give the relation type between its entities, so how does the network know which type is correct?

crownpku commented on August 22, 2024

The output at the very end of the network is compared against the label to compute a loss, which is backpropagated to train the whole network.
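That comparison is ordinary softmax cross-entropy over the relation classes. A pure-Python sketch of the computation (the logits and the label index here are made up for illustration):

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy between softmax(logits) and the true class index."""
    m = max(logits)                          # subtract max for stability
    exps = [math.exp(x - m) for x in logits]
    return -math.log(exps[label] / sum(exps))


# Three relation classes; the true relation is class 2, which the
# network already scores highest, so the loss is small.
loss = softmax_cross_entropy([1.0, 0.5, 3.0], label=2)
```

During training, gradients of this loss with respect to all network weights drive the updates.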

Mariobai commented on August 22, 2024

Label? Where is it? Is it the relation between the two entities in the training set?

Mariobai commented on August 22, 2024

Hello, if I want to add a validation set to your network, where do I add it? Your setup only seems to have a training set and a test set.

crownpku commented on August 22, 2024

The label is the relation between the entities. Adding a validation set does not require changing the network structure; you need to modify the training logic in train_GRU.py.
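One way to do that modification is to carve a validation split out of the training data before the training loop, then evaluate on it every N steps. A sketch of the split itself (this helper is my own, not code from train_GRU.py):

```python
import random


def train_val_split(examples, val_fraction=0.1, seed=42):
    """Shuffle once with a fixed seed, then hold out a fraction of the
    examples as the validation set; the rest remains for training."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n_val = max(1, int(len(examples) * val_fraction))
    return examples[n_val:], examples[:n_val]


train_part, val_part = train_val_split(range(100))  # 90 / 10 split
```

The fixed seed keeps the split reproducible across runs, so checkpoints remain comparable.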

Mariobai commented on August 22, 2024

How do you handle inputs of different lengths? My input sentences can't all be the same length. Also, the code could use a few more comments...

Mariobai commented on August 22, 2024

During training I see:
saving model
have saved model to ./model/ATT_GRU_model-1000
2017-11-07T16:51:29.607642: step 1050, softmax_loss 0.0269104, acc 1
2017-11-07T16:54:02.656800: step 1100, softmax_loss 0.0240574, acc 1

The accuracy has reached 1, which is clearly overfitting. How did you deal with this?

KobeLA24 commented on August 22, 2024

How did you solve the training problem? My run also stops once step reaches 150 without saving the model. How did you fix it?

KobeLA24 commented on August 22, 2024

Using print statements in train_GRU.py, I found that with the training set the author provides and the defaults num_epoch=10, big_num=50, training only runs a bit over 190 steps. So why does the author require step > 8000 before saving the model? No wonder the program terminates around step 150 every time.
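The step count follows directly from the data size: total steps ≈ num_epoch × ceil(num_examples / big_num), so ~190 steps with these defaults implies roughly 950 sample sentences (that example count is inferred from the numbers above, not stated in the repo):

```python
import math


def total_steps(num_examples, num_epoch=10, big_num=50):
    """Total train steps = batches per epoch (rounded up) * epochs."""
    return num_epoch * math.ceil(num_examples / big_num)


# With ~950 sample sentences the run ends around step 190,
# far below the step > 8000 threshold required for saving.
steps = total_steps(950)
```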

crownpku commented on August 22, 2024

@KobeLA24 The training data in this repo is only a sample and is nowhere near enough to train a usable model. step > 8000 is a setting meant for training on a sufficiently large dataset.

KobeLA24 commented on August 22, 2024

Got it, thanks!
