xueyouluo / ccks2021-track2-code

136.0 136.0 42.0 601 KB

"Intel Innovation Master Cup" Deep Learning Challenge, Track 2: CCKS2021 Chinese NLP Address Element Parsing

Dockerfile 0.20% Python 98.04% Shell 1.76%

address-parsing biaffine ccks2021 electra ner

ccks2021-track2-code's Issues

Hello, I don't have enough computing resources to train the model. Could you share a copy of the trained model weights?

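Could I get a copy of the competition dataset? The original has been taken down from the official website

Could I get a copy of the competition dataset? The original has been taken down from the official website. [email protected] Thank you!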

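What needs to change to support multi-GPU training?

I am running on a machine with several Tesla T4 (16 GB) cards and getting out-of-memory errors. Although there are multiple cards, only one of them is actually used.
After reducing the pretraining batch size, pretraining ran through, but the subsequent model training still ran out of GPU memory. In the end I got it to run with "simple".
Now I would like to make full use of all the cards. If I want multi-GPU training, which parts need to be changed? Thanks.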

`2022-05-07 21:34:05.589168: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats:
Limit: 14854298010
InUse: 14655244288
MaxInUse: 14662447616
NumAllocs: 8390
MaxAllocSize: 216728064

2022-05-07 21:34:05.589326: W tensorflow/core/common_runtime/bfc_allocator.cc:319] ****************************************************************************************************
2022-05-07 21:34:05.589344: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at cast_op.cc:109 : Resource exhausted: OOM when allocating tensor with shape[24,43,64,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

Using Gradient Accumulation with 3

Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[43,21128,1024] and type half on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node gradients/generator_predictions/MatMul_grad/MatMul_1}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[cond_1/LogicalAnd_1/_25229]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: OOM when allocating tensor with shape[43,21128,1024] and type half on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node gradients/generator_predictions/MatMul_grad/MatMul_1}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "run_pretraining.py", line 425, in
main()
File "run_pretraining.py", line 421, in main
args.model_name, args.data_dir, **hparams))
File "run_pretraining.py", line 384, in train_or_eval
max_steps=config.num_train_steps)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 367, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1158, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1192, in _train_model_default
saving_listeners)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1484, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 754, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1252, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1353, in run
raise six.reraise(*original_exc_info)
File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1338, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1411, in run
run_metadata=run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1169, in run
return self._sess.run(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[43,21128,1024] and type half on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node gradients/generator_predictions/MatMul_grad/MatMul_1 (defined at /code/electra-pretrain/model/optimization.py:66) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

 [[cond_1/LogicalAnd_1/_25229]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: OOM when allocating tensor with shape[43,21128,1024] and type half on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node gradients/generator_predictions/MatMul_grad/MatMul_1 (defined at /code/electra-pretrain/model/optimization.py:66) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.
`

Also, does "electra pretrain" mean that you re-pretrained ELECTRA on your own data?
Thanks.
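
The log above already contains enough numbers to check why the allocations fail. The following is a minimal sketch, not part of the repository: it takes `Limit` and `InUse` from the allocator stats and the two tensor shapes from the OOM messages, and assumes that "float" means 4 bytes per element and "half" means 2 bytes. The helper name `tensor_bytes` is hypothetical.

```python
from math import prod  # Python 3.8+

# Assumed bytes per element for the dtype names printed in the log.
DTYPE_BYTES = {"float": 4, "half": 2}


def tensor_bytes(shape, dtype):
    """Size in bytes of a dense tensor with the given shape and dtype."""
    return prod(shape) * DTYPE_BYTES[dtype]


# Values copied from the BFC allocator stats in the log.
limit = 14854298010
in_use = 14655244288
headroom = limit - in_use  # upper bound; fragmentation can make it smaller

# Shapes and dtypes copied from the two OOM messages.
requests = {
    "cast_op [24,43,64,1024] float": tensor_bytes((24, 43, 64, 1024), "float"),
    "MatMul_grad [43,21128,1024] half": tensor_bytes((43, 21128, 1024), "half"),
}

MIB = 1024 ** 2
print(f"headroom: {headroom / MIB:.1f} MiB")
for name, size in requests.items():
    verdict = "fits" if size <= headroom else "does not fit"
    print(f"{name}: {size / MIB:.1f} MiB -> {verdict}")
```

Under these assumptions the card has roughly 190 MiB left when the failure occurs, while the two requests need 258 MiB and about 1774 MiB, so neither can be satisfied on GPU:0. The second tensor has a dimension of 21128 and comes from `generator_predictions`, which would be consistent with a vocabulary-sized projection, although the log alone does not confirm that.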

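assemble_fake() raises an error

Could you share the complete .conll data?

Hi! I don't quite understand experiment 8 on the GitHub home page, biaffine+bert+electra. Why are there two pretrained models, and what does each of them do? Or are the two pretrained models...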