roformer-sim's Issues

The downloaded ft model's vocab.txt doesn't seem to load correctly?

Hello!

I downloaded the weights for Chinese_roformer-sim-char-ft_L12_H-768_A-12, but when loading them, vocab.txt consistently fails to load as UTF-8. By comparison, SimBERT v1's vocab file does not have this problem.

When I substitute SimBERT v1's vocab file and run generation, the output is mostly unreadable sentences made of obscure characters (it looks a lot like mojibake). This may be because my training was insufficient, or because the vocabularies differ.

Could you re-upload a UTF-8-encoded vocab.txt for Chinese_roformer-sim-char-ft_L12_H-768_A-12? I'm not sure whether SimBERT v1's vocabulary is identical to SimBERT v2's.

Many thanks!
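As a workaround while waiting for a re-upload, one can try to diagnose and re-encode the file locally. A minimal sketch, assuming the file is actually GBK-encoded (a common encoding for Chinese text files; the real encoding of the released file is unknown):

```python
# Sketch: detect a non-UTF-8 vocab file and re-save it as UTF-8.
# The GBK fallback is an assumption, not a confirmed fact about the release.

def reencode_vocab(src_path: str, dst_path: str) -> str:
    """Read src_path, trying UTF-8 first and GBK as a fallback,
    then write the contents back out as UTF-8. Returns the encoding used."""
    for encoding in ("utf-8", "gbk"):
        try:
            with open(src_path, encoding=encoding) as f:
                text = f.read()
            break
        except UnicodeDecodeError:
            continue
    else:
        raise ValueError("vocab file is neither UTF-8 nor GBK")
    with open(dst_path, "w", encoding="utf-8") as f:
        f.write(text)
    return encoding
```

If the function returns `"gbk"`, the converted copy should then load with the usual UTF-8-based tokenizer loading code.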

Output vectors of roformer-sim at inference time

When the input is [t1] versus [t1, t2], is there an essential difference in the 768-dim vector that roformer-sim outputs (the [CLS] sentence vector)? For the input [t1, t2], are t1 and t2 concatenated and treated as a single sentence?
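For context, in BERT-style input construction (which bert4keras's Tokenizer follows) a single text gets all-zero segment ids, while a text pair is concatenated with a second [SEP] and the second text's positions get segment id 1, so the pair is not just one long sentence. A minimal sketch with hypothetical toy token ids (only the [CLS]/[SEP]/segment layout mirrors the real tokenizer):

```python
# Sketch of BERT-style input construction for one text vs. a text pair.
# The numeric ids are toy placeholders; the layout is the standard one.

def encode(tokens1, tokens2=None):
    """Return (token_ids, segment_ids) in BERT layout."""
    CLS, SEP = 101, 102  # conventional special-token ids (toy values here)
    ids = [CLS] + tokens1 + [SEP]
    segs = [0] * len(ids)            # [CLS] and first text -> segment 0
    if tokens2 is not None:
        ids += tokens2 + [SEP]
        segs += [1] * (len(tokens2) + 1)  # second text and its [SEP] -> segment 1
    return ids, segs

ids1, segs1 = encode([7, 8, 9])          # single text [t1]
ids2, segs2 = encode([7, 8, 9], [4, 5])  # text pair [t1, t2]
```

Because the segment ids (and the extra [SEP]) differ between the two cases, the encoder sees different inputs, so the resulting [CLS] vectors can legitimately differ.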

The model uses a very large amount of memory

Running on a 16 GB V100, memory usage is close to 16 GB. Is the model itself really that large, or is something else going on?

Will the weights be released?

As the title says. I'd been wondering whether there would be a RoFormer + SimBERT version, and here it is :)

Hello, build_transformer_model reports a GPU out-of-memory error, but my GPU has 22 GB of memory

OS: Windows
tensorflow-gpu 1.14, CUDA 10.0; a TF GPU test confirms the GPU is usable
GPU memory: 22 GB

```python
from bert4keras.models import build_transformer_model

config_path = 'D:/pycharm/字段引用/chinese_simbert_L-4_H-312_A-12/bert_config.json'
checkpoint_path = 'D:/pycharm/字段引用/chinese_simbert_L-4_H-312_A-12/bert_model.ckpt'

roformer = build_transformer_model(
    config_path,
    checkpoint_path,
    model='roformer',
    application='unilm',
    with_pool='linear'
)
```

Even though this is already the tiny model, build_transformer_model still fails with:

```
2023-05-23 16:31:14.506402: E tensorflow/stream_executor/cuda/cuda_driver.cc:828] failed to allocate 19.69G (21146212608 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2023-05-23 16:31:14.705061: E tensorflow/stream_executor/cuda/cuda_driver.cc:828] failed to allocate 17.72G (19031590912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
```

nvidia-smi output:

```
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti   WDDM  | 00000000:01:00.0  On |                  N/A |
| 77%   77C    P8    41W / 260W           |    496MiB / 22528MiB |      2%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
```

Why does this happen?
The CPU version works fine.

Some problems running `generate.py` on GPU

Thank you for open-sourcing this!
When I run generate.py on GPU, I get the following error:

```
2022-01-18 20:50:55.799119: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2022-01-18 20:52:03.633057: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
2022-01-18 20:52:03.633118: E tensorflow/stream_executor/cuda/cuda_blas.cc:2301] Internal: failed BLAS call, see log for details
  File "/usr/local/miniconda3/lib/python3.6/site-packages/bert4keras/snippets.py", line 627, in random_sample
    inputs, output_ids, states, temperature, 'probas'
  File "/usr/local/miniconda3/lib/python3.6/site-packages/bert4keras/snippets.py", line 525, in new_predict
    prediction = predict(self, inputs, output_ids, states)
  File "example_generate.py", line 52, in predict
    return self.last_token(seq2seq).predict([token_ids, segment_ids])
  File "/usr/local/miniconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1462, in predict
    callbacks=callbacks)
  File "/usr/local/miniconda3/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 324, in predict_loop
    batch_outs = f(ins_batch)
  File "/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 3292, in __call__
    run_metadata=self.run_metadata)
  File "/usr/local/miniconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas xGEMMBatched launch failed : a.shape=[12,11,64], b.shape=[12,64,11], m=11, n=11, k=64, batch_size=12
	 [[{{node Transformer-0-MultiHeadSelfAttention/einsum/MatMul}}]]
	 [[lambda_1/strided_slice/_1229]]
  (1) Internal: Blas xGEMMBatched launch failed : a.shape=[12,11,64], b.shape=[12,64,11], m=11, n=11, k=64, batch_size=12
	 [[{{node Transformer-0-MultiHeadSelfAttention/einsum/MatMul}}]]
0 successful operations.
0 derived errors ignored.
```

Googling suggests adding code like the following:

```python
# Option 1
cfg = tf.ConfigProto()
cfg.gpu_options.allow_growth = True
cfg.gpu_options.per_process_gpu_memory_fraction = 0.90
sess = tf.Session(config=cfg)

# Option 2
import keras
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True      # let TensorFlow allocate GPU memory on demand
config.gpu_options.per_process_gpu_memory_fraction = 0.5  # cap the fraction of GPU memory used
keras.backend.tensorflow_backend.set_session(tf.Session(config=config))
```

With either option, I still get the error shown above. My GPU is an A100-SXM4-40GB, and my installed packages are:
Package Version


absl-py 0.15.0
asn1crypto 0.24.0
astor 0.8.1
astunparse 1.6.3
bert4keras 0.10.6
cached-property 1.5.2
cachetools 4.2.4
certifi 2018.4.16
cffi 1.11.5
chardet 3.0.4
charset-normalizer 2.0.10
clang 5.0
conda 4.5.4
cryptography 2.2.2
flatbuffers 1.12
gast 0.4.0
google-auth 1.35.0
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.43.0
h5py 3.1.0
idna 2.6
importlib-metadata 1.6.0
Keras 2.3.1
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
Markdown 3.2.2
numpy 1.19.5
oauthlib 3.1.1
opt-einsum 3.3.0
pip 20.1
protobuf 3.12.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycosat 0.6.3
pycparser 2.18
pyOpenSSL 18.0.0
PySocks 1.6.8
PyYAML 6.0
requests 2.27.1
requests-oauthlib 1.3.0
rsa 4.8
ruamel-yaml 0.15.37
scipy 1.5.4
setuptools 46.4.0
six 1.15.0
tensorboard 1.14.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow-estimator 1.14.0
tensorflow-gpu 1.14.0
termcolor 1.1.0
typing-extensions 3.7.4.3
urllib3 1.22
Werkzeug 1.0.1
wheel 0.37.1
wrapt 1.12.1
zipp 3.1.0

I tested my GPU with the following code, and it is available:

```python
# Check whether the GPU is available
import tensorflow as tf
tf.test.is_gpu_available()
```

Could you advise on how to run generate.py on GPU?
Many thanks, and happy new year!

Why does the supervised stage fine-tune with 5-way classification?

According to the blog: "Specifically, Sentence-BERT concatenates u, v, |u−v| (where |u−v| is the vector formed by taking the absolute value of each element of u−v) as the feature, followed by a fully-connected layer for 2-way classification (or 3-way for NLI datasets)."
So why does train/supervised.py have `output = keras.layers.Dense(5, use_bias=False)(output)`?
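For reference, the feature construction the quote describes can be sketched in NumPy. The hidden size 768 and the 2-way head are taken from the quote itself; the weights below are random placeholders, not the project's trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 768                      # sentence-vector dimension
u = rng.standard_normal(hidden)   # sentence-A embedding
v = rng.standard_normal(hidden)   # sentence-B embedding

# Sentence-BERT classification feature: [u; v; |u - v|]
feature = np.concatenate([u, v, np.abs(u - v)])   # shape (3 * hidden,)

# Fully-connected classification head (2-way here, as in the quote;
# supervised.py uses Dense(5) instead). Random weights, no bias.
W = rng.standard_normal((feature.shape[0], 2))
logits = feature @ W              # shape (num_classes,)
```

Only the output width of the final Dense layer changes between 2-, 3-, and 5-way setups; the concatenated feature is the same.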

How do I use my own dataset for data augmentation?

```python
data_path = './glue/data/'
datasets_1 = []
for task_name in ['ATEC', 'BQ', 'LCQMC', 'PAWSX', 'STS-B', 'SOHU21-SSB']:
    for f in ['train', 'valid']:
        threshold = 2.5 if task_name == 'STS-B' else 0.5
        filename = '%s%s/%s.%s.data' % (data_path, task_name, task_name, f)
        datasets_1 += load_data_1(filename, threshold)
```

I don't quite understand how to load my own dataset here. Also, must the data format be (text1, text2, label)?
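As an illustration of the expected shape, here is a hypothetical loader for a tab-separated file of (text1, text2, score) lines, binarized against a threshold as in the snippet above. The function name `load_pairs` and the file format are assumptions for illustration; the real `load_data_1` in the repo may differ:

```python
# Hypothetical loader: each line is "text1<TAB>text2<TAB>score",
# and the score is binarized against a threshold to yield
# (text1, text2, label) triples.

def load_pairs(filename, threshold):
    data = []
    with open(filename, encoding='utf-8') as f:
        for line in f:
            text1, text2, score = line.rstrip('\n').split('\t')
            label = int(float(score) > threshold)
            data.append((text1, text2, label))
    return data
```

With a loader shaped like this, a custom dataset only needs to be exported as tab-separated sentence pairs with a similarity score (or 0/1 label) per line.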

A question about the type of the input X in encoder.predict([X, S])

The results differ greatly depending on whether X is 1-D or 2-D. With sequence_padding, X must be converted to 2-D; a 1-D input yields multiple predictions of shape (4, 384), while a 2-D input yields (1, 384). And if I take the first row of the 1-D result as the sentence vector, it still differs greatly from the 2-D result. Why is that?

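The difference comes down to the batch dimension: Keras treats the first axis of X as the batch axis, so a 1-D array of 4 token ids is read as 4 separate samples, while a 2-D array of shape (1, 4) is one sample of length 4. This is consistent with the (4, 384) vs. (1, 384) outputs reported above. A NumPy sketch of just the shapes involved (no model is loaded; 384 is the output size from the question):

```python
import numpy as np

token_ids = np.array([101, 2769, 1962, 102])  # one tokenized sentence, 4 ids

x_1d = token_ids           # shape (4,)  -> interpreted as a batch of 4 samples
x_2d = token_ids[None, :]  # shape (1, 4) -> one sample of sequence length 4

# predict() output rows follow the batch size, which would explain
# (4, 384) for the 1-D input vs. (1, 384) for the 2-D input.
batch_1d = x_1d.shape[0]   # 4
batch_2d = x_2d.shape[0]   # 1
```

So the first row of the 1-D result is not the sentence vector of the full sentence; it is the encoding of a degenerate one-token input, which is why it differs from the 2-D result.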

About the "Blas xGEMMBatched launch failed" error

When I test roformer-sim on a GPU with 16 GB of memory, I always get the following error:

```
tensorflow.python.framework.errors_impl.InternalError: Blas xGEMMBatched launch failed : a.shape=[192,17,64], b.shape=[192,17,64], m=17, n=17, k=64, batch_size=192
	 [[node model_3/Transformer-0-MultiHeadSelfAttention/einsum/Einsum (defined at /anaconda3/envs/nlp_tf/lib/python3.7/site-packages/bert4keras/layers.py:445) ]] [Op:__inference_predict_function_22308]

Errors may have originated from an input operation.
Input Source operations connected to node model_3/Transformer-0-MultiHeadSelfAttention/einsum/Einsum:
 model_3/Transformer-0-MultiHeadSelfAttention/add_1 (defined at /anaconda3/envs/nlp_tf/lib/python3.7/site-packages/bert4keras/layers.py:443)
 model_3/Transformer-0-MultiHeadSelfAttention/add (defined at /anaconda3/envs/nlp_tf/lib/python3.7/site-packages/bert4keras/layers.py:440)

Function call stack:
predict_function
```

What could be causing this?

Data used in the distillation stage

As the title says: is the data used in the distillation stage the train and valid splits of the public datasets (BQ, ATEC, etc.)?

The metrics I reproduce differ somewhat from those reported in the article
