
chinesener's Introduction


chinesener's People

Contributors

yanwii


chinesener's Issues

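Python stops working

First of all, thank you for open-sourcing such a great project.
One small question: should the folder that holds the BERT model be 'model' or 'bert_model'? The instructions say 'bert_model', but when I run it I get the following error: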

Traceback (most recent call last):
  File "model.py", line 502, in <module>
    model.train()
  File "model.py", line 328, in train
    ARGS.init_checkpoint)
  File "D:\ProgramData\Anaconda3\lib\site-packages\bert_base-0.0.9-py3.6.egg\bert_base\bert\modeling.py", line 331, in get_assignment_map_from_checkpoint
    init_vars = tf.train.list_variables(init_checkpoint)
  File "D:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\checkpoint_utils.py", line 97, in list_variables
    reader = load_checkpoint(ckpt_dir_or_file)
  File "D:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\training\checkpoint_utils.py", line 65, in load_checkpoint
    "given directory %s" % ckpt_dir_or_file)
ValueError: Couldn't find 'checkpoint' file or checkpoints in given directory bert_model/

After I changed it to 'model', Python stopped working when I ran the script.
I am a beginner, so any pointers would be appreciated!

Missing checkpoint file

As the attached screenshot shows, the downloaded chinese_L-12_H-768_A-12.zip does not contain a checkpoint file. What is going on?

ValueError: max() arg is an empty sequence in bert_data_utils.py

Traceback (most recent call last):
  File "model.py", line 505, in <module>
    model.train()
  File "model.py", line 284, in train
    self.train_data = BertDataUtils(tokenizer, batch_size=5)
  File "C:\Users\admin\Desktop\ChineseNER\bert_data_utils.py", line 28, in __init__
    self.prepare_batch()
  File "C:\Users\admin\Desktop\ChineseNER\bert_data_utils.py", line 77, in prepare_batch
    pad_data = self.pad_data(self.data[-self.batch_size:])
  File "C:\Users\admin\Desktop\ChineseNER\bert_data_utils.py", line 87, in pad_data
    max_length = max([len(i[0]) for i in c_data] )
ValueError: max() arg is an empty sequence

How can this problem be solved?
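The traceback shows `max()` receiving an empty list, which happens when the slice `self.data[-self.batch_size:]` is empty, that is, when no training examples were loaded at all. A minimal sketch of a padding step that fails with a clearer message is given below. The function name `pad_batch`, the `(token_ids, tag_ids)` pair layout, and the pad id of 0 are assumptions for illustration, not the project's actual code.

```python
from typing import List, Sequence, Tuple


def pad_batch(batch: Sequence[Tuple[List[int], List[int]]], pad_id: int = 0):
    """Pad (token_ids, tag_ids) pairs to the longest sequence in the batch.

    Hypothetical helper: assumes token_ids and tag_ids have equal length.
    """
    if not batch:
        # An empty batch usually means the data file was missing, empty,
        # or in a format the loader could not parse.
        raise ValueError(
            "empty batch: no training examples were loaded; "
            "check the data path and the file format"
        )
    max_length = max(len(token_ids) for token_ids, _ in batch)
    padded = []
    for token_ids, tag_ids in batch:
        fill = [pad_id] * (max_length - len(token_ids))
        padded.append((token_ids + fill, tag_ids + fill))
    return padded
```

If this guard fires, the thing to inspect is the training file itself (path, encoding, one token and one tag per line) rather than the padding code.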

Error when training on the GPU, please help

Caused by op 'bert/encoder/layer_8/attention/self/Mul', defined at:
  File "model.py", line 503, in <module>
    model.train()
  File "model.py", line 316, in train
    self.__creat_model()
  File "model.py", line 42, in __creat_model
    self.bert_layer()
  File "model.py", line 130, in bert_layer
    use_one_hot_embeddings=False
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/bert_base/bert/modeling.py", line 217, in __init__
    do_return_all_layers=True)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/bert_base/bert/modeling.py", line 846, in transformer_model
    to_seq_length=seq_length)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/bert_base/bert/modeling.py", line 705, in attention_layer
    1.0 / math.sqrt(float(size_per_head)))
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 248, in multiply
    return gen_math_ops.mul(x, y, name)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5860, in mul
    "Mul", x=x, y=y, name=name)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "/home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[5,12,484,484] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node bert/encoder/layer_8/attention/self/Mul (defined at /home/.virtualenvs/ten_gpu/lib/python3.6/site-packages/bert_base/bert/modeling.py:705) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[node logits/Reshape (defined at model.py:199) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
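The tensor in the OOM message has shape [5, 12, 484, 484], which reads as batch size 5, 12 attention heads, and a padded sequence length of 484. The size of that tensor grows with the square of the sequence length, so long sentences cost far more memory than a larger batch of short ones. The arithmetic can be sketched as follows; the function name is made up for illustration, and 128 is only an example of a shorter maximum length, not a value taken from the project.

```python
def attention_scores_bytes(batch_size, num_heads, seq_length, bytes_per_value=4):
    """Bytes needed for one [batch, heads, seq, seq] float32 attention tensor."""
    return batch_size * num_heads * seq_length * seq_length * bytes_per_value


MIB = 1024 * 1024

# Shape reported in the error message: [5, 12, 484, 484], float32.
reported = attention_scores_bytes(5, 12, 484)
# Same batch, but with sequences truncated to a hypothetical length of 128.
shorter = attention_scores_bytes(5, 12, 128)

print(round(reported / MIB, 1))  # 53.6
print(shorter / MIB)             # 3.75
```

A tensor of this shape exists in every encoder layer, and training keeps intermediate activations for the backward pass, so truncating or splitting very long sentences is likely to free more memory than lowering batch_size further.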


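batch_size and training speed

When actually running the code, I found that setting batch_size to 8 already causes an OOM error. I am using a 1080 Ti GPU with 11 GB of memory. Is this normal?
If batch_size can only be set this small, training is very slow: with 15,000 samples, one epoch takes about 40 minutes. Is that training speed normal too? Is there any way to speed it up?
Thanks!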

Couldn't find 'checkpoint' file or checkpoints in given directory

ub16c9@ub16c9-gpu:~/ub16_prj/ChineseNER$ python3 model.py -e train -m bert

train data: 2757
nums of tags: 9
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2019-03-12 17:25:23.711894: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-12 17:25:23.825882: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-12 17:25:23.826463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.6575
pciBusID: 0000:01:00.0
totalMemory: 10.92GiB freeMemory: 10.11GiB
2019-03-12 17:25:23.826481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-03-12 17:25:24.069500: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-12 17:25:24.069536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-03-12 17:25:24.069542: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-03-12 17:25:24.069766: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9769 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Traceback (most recent call last):
  File "model.py", line 502, in <module>
    model.train()
  File "model.py", line 328, in train
    ARGS.init_checkpoint)
  File "/usr/local/lib/python3.5/dist-packages/bert_base/bert/modeling.py", line 331, in get_assignment_map_from_checkpoint
    init_vars = tf.train.list_variables(init_checkpoint)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/checkpoint_utils.py", line 95, in list_variables
    reader = load_checkpoint(ckpt_dir_or_file)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/checkpoint_utils.py", line 63, in load_checkpoint
    "given directory %s" % ckpt_dir_or_file)
ValueError: Couldn't find 'checkpoint' file or checkpoints in given directory bert_model/
ub16c9@ub16c9-gpu:~/ub16_prj/ChineseNER$ ll bert_
bert_data_utils.py bert_model/
ub16c9@ub16c9-gpu:~/ub16_prj/ChineseNER$ ll bert_model/
total 804848
drwxrwxr-x 3 ub16c9 ub16c9 4096 三月 12 17:25 ./
drwxrwxr-x 7 ub16c9 ub16c9 4096 三月 12 17:19 ../
-rw-r--r-- 1 ub16c9 ub16c9 520 三月 12 17:19 bert_config.json
-rw-r--r-- 1 ub16c9 ub16c9 411529768 三月 12 17:19 bert_model.ckpt.data-00000-of-00001
-rw-r--r-- 1 ub16c9 ub16c9 8512 三月 12 17:19 bert_model.ckpt.index
-rw-r--r-- 1 ub16c9 ub16c9 905069 三月 12 17:19 bert_model.ckpt.meta
drwxr-xr-x 2 ub16c9 ub16c9 4096 三月 12 17:16 chinese_L-12_H-768_A-12/
-rw-rw-r-- 1 ub16c9 ub16c9 411575980 三月 12 17:19 pytorch_model.bin
-rw-r--r-- 1 ub16c9 ub16c9 109540 三月 12 17:19 vocab.txt
ub16c9@ub16c9-gpu:~/ub16_prj/ChineseNER$
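The listing above contains `bert_model.ckpt.index` and `bert_model.ckpt.data-00000-of-00001` but no file named `checkpoint`. When `tf.train.list_variables` is given a directory, it looks for that `checkpoint` state file to find the latest checkpoint, and raises this ValueError when the file is absent. Passing the checkpoint prefix (for example `bert_model/bert_model.ckpt`) instead of the directory avoids the lookup. A minimal sketch of resolving such a prefix is shown below; `resolve_checkpoint_prefix` is a hypothetical helper, not part of the project or of TensorFlow.

```python
import glob
import os


def resolve_checkpoint_prefix(path):
    """Return a value that can be passed as the init checkpoint.

    Hypothetical helper: if `path` is a directory without a 'checkpoint'
    state file, fall back to the prefix of the first '*.index' file in it.
    """
    if not os.path.isdir(path):
        return path  # already a prefix such as 'bert_model/bert_model.ckpt'
    if os.path.exists(os.path.join(path, "checkpoint")):
        return path  # TensorFlow can resolve the directory on its own
    index_files = sorted(glob.glob(os.path.join(path, "*.index")))
    if not index_files:
        raise FileNotFoundError("no *.index checkpoint file in %r" % path)
    # Strip the '.index' suffix to obtain the checkpoint prefix.
    return index_files[0][: -len(".index")]
```

With the directory shown above, this would return `bert_model/bert_model.ckpt`, which can then be supplied wherever the script currently receives `bert_model/`.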

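All results are 0

Hello, I ran the model on my own data and every metric comes out as 0, yet the loss is quite low and acc is quite high. Am I printing the wrong values somewhere?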
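One common way to get high accuracy together with zero precision, recall, and F1 is a model that predicts the "O" tag for almost every token: token-level accuracy stays high because most tokens really are "O", while no entity is ever recovered. Whether that is what happened here cannot be told from the report, so the following is only a sketch of the effect on toy data, assuming BIO tags and entity-level exact-match scoring. The helper names are made up for illustration.

```python
def extract_entities(tags):
    """Return a set of (start, end, type) spans from a BIO tag sequence."""
    entities, start, kind = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            entities.add((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return entities


def scores(gold, pred):
    """Return (token accuracy, entity precision, entity recall, entity F1)."""
    token_acc = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    gold_ents, pred_ents = extract_entities(gold), extract_entities(pred)
    correct = len(gold_ents & pred_ents)
    precision = correct / len(pred_ents) if pred_ents else 0.0
    recall = correct / len(gold_ents) if gold_ents else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return token_acc, precision, recall, f1


# Toy data: 20 tokens, one two-token PER entity, model predicts "O" everywhere.
gold = ["O"] * 18 + ["B-PER", "I-PER"]
pred = ["O"] * 20
print(scores(gold, pred))  # (0.9, 0.0, 0.0, 0.0)
```

Printing a few predicted tag sequences next to the gold tags is a quick way to check whether the model has collapsed to a single tag or whether the tag set of the custom data differs from what the evaluation code expects.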

Is a GPU required to run it?

InvalidArgumentError (see above for traceback): Cannot assign a device for operation init: node init (defined at model.py:323) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0 ]. Make sure the device specification refers to a valid device. The requested device appears to be a GPU, but CUDA is not enabled.
[[node init (defined at model.py:323) ]]

Error related to the learning rate

epoch 0
Traceback (most recent call last):
  File "D:\ProgramFiles\anaconda\envs\tf\lib\site-packages\tensorflow\python\client\session.py", line 300, in __init__
    fetch, allow_tensor=True, allow_operation=True))
  File "D:\ProgramFiles\anaconda\envs\tf\lib\site-packages\tensorflow\python\framework\ops.py", line 3478, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "D:\ProgramFiles\anaconda\envs\tf\lib\site-packages\tensorflow\python\framework\ops.py", line 3567, in _as_graph_element_locked
    types_str))
TypeError: Can not convert a float into a Tensor or Operation.

TypeError: Fetch argument 5e-05 has invalid type <class 'float'>, must be a string or Tensor. (Can not convert a float into a Tensor or Operation.)

How should this be configured? Any solution would be appreciated, thanks.
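The message says that a plain Python float (5e-05, presumably the learning rate) was passed to the session as something to fetch, while a fetch has to be a tensor, an operation, or a string naming one. A constant does not need to be fetched at all, since its value is already known. One possible workaround, sketched below under that assumption, is to separate such constants from the real fetches before calling the session. `split_fetches` and the tensor name `"loss:0"` are hypothetical and not taken from the project.

```python
from numbers import Number


def split_fetches(fetches):
    """Separate plain Python numbers from values that need a session run.

    Returns (graph_fetches, constants); only graph_fetches should be
    handed to the session, and constants can be merged back afterwards.
    """
    constants = {k: v for k, v in fetches.items() if isinstance(v, Number)}
    graph_fetches = {k: v for k, v in fetches.items() if k not in constants}
    return graph_fetches, constants


# "loss:0" stands in for a real tensor; 5e-05 is the float from the error.
graph_fetches, constants = split_fetches({"loss": "loss:0", "learning_rate": 5e-05})
print(graph_fetches)  # {'loss': 'loss:0'}
print(constants)      # {'learning_rate': 5e-05}
```

The simpler fix, if the learning rate is always a fixed number in this setup, is to drop it from the list passed to the session and print the configured value directly.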

Problem when running BERT

Traceback (most recent call last):
not enough values to unpack (expected 2, got 1)
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 41, in load_data
    word, tag = line.split()
ValueError: not enough values to unpack (expected 2, got 1)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 119, in <module>
    bert_data_util = BertDataUtils(tokenizer)
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 26, in __init__
    self.load_data()
  File "D:/dogtime/mission/BiGRU_crf/bert_data_utils.py", line 49, in load_data
    inputs_ids = self.tokenizer.convert_tokens_to_ids(ntokens)
  File "D:\dogtime\mission\BiGRU_crf\bert_base\bert\tokenization.py", line 179, in convert_tokens_to_ids
    return convert_by_vocab(self.vocab, tokens)
  File "D:\dogtime\mission\BiGRU_crf\bert_base\bert\tokenization.py", line 140, in convert_by_vocab
    output.append(vocab[item])
KeyError: 'D'
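The traceback contains two separate events. The first, "not enough values to unpack", comes from `line.split()` on a line that does not hold exactly a token and a tag, such as a blank separator line. The second, `KeyError: 'D'`, is the one that stops the program: the token 'D' is looked up directly in the vocabulary and is not there, which would be consistent with a lower-cased vocabulary. A minimal sketch of more tolerant parsing and lookup follows. Both helper names, the toy vocabulary, and the decision to lower-case are assumptions; lower-casing is only appropriate if the vocabulary in use is itself lower-cased.

```python
def read_token_tag(line):
    """Return (token, tag) for a well-formed line, or None for a separator line."""
    parts = line.split()
    if len(parts) != 2:
        return None  # blank line or malformed line: treat as a sentence boundary
    return parts[0], parts[1]


def tokens_to_ids(vocab, tokens, unk_token="[UNK]"):
    """Look tokens up in vocab, lower-casing first and falling back to unk_token."""
    unk_id = vocab[unk_token]
    return [vocab.get(token.lower(), unk_id) for token in tokens]


# Toy vocabulary for illustration only; real ids come from vocab.txt.
toy_vocab = {"[UNK]": 0, "d": 1, "北": 2}
print(read_token_tag("北 B-LOC"))                   # ('北', 'B-LOC')
print(read_token_tag(""))                           # None
print(tokens_to_ids(toy_vocab, ["D", "北", "X"]))   # [1, 2, 0]
```

Mapping unknown tokens to `[UNK]` keeps the token and tag sequences the same length, which matters because each tag must stay aligned with its token.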

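The embedding matrix is re-initialized every time a batch of data is fed

Hello, and thank you for providing this Python implementation built on the pretrained BERT model. I have been reading your code recently and noticed that every time input_ids are fed to the bert_layer layer, embedding_table is re-initialized, so the same character in different sentences is very likely to get a different initial vector. Would it be more appropriate to fix the random seed here? I am new to BERT, so any pointers would be appreciated.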

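Can this project be run as a service with docker + tensorflow-serving?

What are the model's inputs and outputs?
I have recently been looking into exposing the model as a service and need these parameters.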

            labeling_signature = (
                tf.saved_model.signature_def_utils.build_signature_def(
                    inputs={
                        "input_ids":
                            bert_input_ids,
                        "segment_ids":
                            bert_segment_ids,
                        "input_mask":
                            bert_input_mask,
                        "dropout":
                            bert_dropout,
                    },
                    outputs={
                        "targets":
                            bert_targets,
                    },
                    method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

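Following tutorials found online, I rewrote it as shown above. Are these input and output parameters correct? I would appreciate any pointers, thanks!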
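Whether the signature is right depends on the placeholders the model actually defines, which this thread does not show. Assuming the four input names used in the signature above, a client-side request for TensorFlow Serving's REST predict endpoint could be sketched as follows. The columnar "inputs" form is used because "dropout" is a scalar while the other inputs are batched. The signature name, the fixed `max_length`, the pad id of 0, and the example ids are all assumptions for illustration.

```python
import json


def build_predict_request(input_ids, dropout, max_length,
                          signature_name="serving_default"):
    """Build a columnar TensorFlow Serving REST body for one sentence.

    Assumes the exported signature takes input_ids, segment_ids,
    input_mask, and a scalar dropout value, as in the snippet above.
    """
    ids = list(input_ids)[:max_length]
    pad = [0] * (max_length - len(ids))
    body = {
        "signature_name": signature_name,
        "inputs": {
            "input_ids": [ids + pad],              # shape [1, max_length]
            "segment_ids": [[0] * max_length],     # single-sentence input
            "input_mask": [[1] * len(ids) + pad],  # 1 for real tokens, 0 for padding
            "dropout": dropout,                    # scalar, semantics depend on the model
        },
    }
    return json.dumps(body)


# Placeholder token ids; a dropout of 1.0 assumes the model treats it as a keep probability.
print(build_predict_request([101, 2769, 102], dropout=1.0, max_length=5))
```

The string returned here would be sent as the body of a POST request to the server's `/v1/models/<model name>:predict` URL, and the response would carry whatever was registered under "targets" in the signature.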

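Recognizing names and relationships in contact lists

I am glad to have found such a great project and have been studying it for a few days.
I have a question I would like to ask:
Can this approach handle contact-list style text such as “大舅” (eldest maternal uncle), “杨三姐” (Third Sister Yang), or “王欢处长” (Director Wang Huan)?
I have thought about this for quite a while. The texts are very short, and the task is to find valid names, relationships, and similar information within these atomic snippets.
Looking forward to discussing this.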
