muqiujun-ai / bert4pytorch
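An ultra-lightweight PyTorch implementation of BERT, with extensive Chinese comments, an easy-to-modify structure, and ongoing updates.
Can it load the models provided by bert4keras?
Does anyone know how to solve this problem?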
File "F:\software\Anaconda\envs\tensor\lib\site-packages\bert4pytorch\modeling.py", line 71, in load_weights_from_pytorch_checkpoint
state_dict[new_key] = state_dict.pop(old_key)
KeyError: 'bert.embeddings.LayerNorm.gamma'
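The mapping in the variable mapping function of the modeling file has a bug.
The Hugging Face model files name the LayerNorm parameters gamma and beta, while the author's mapping uses weight and bias, so the keys do not match.
The tokenizer in the current version does not seem to support batched data loading, is that right?
For example, with bert-base-chinese: has the author run any evaluation tests on this?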
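One way to sidestep this kind of mismatch is to let the renaming step accept either naming scheme. The following is a minimal sketch, not the repository's code: `layernorm_aliases` and `rename_keys` are hypothetical helpers, the `{new_key: old_key}` direction of the mapping is an assumption inferred from the traceback line `state_dict[new_key] = state_dict.pop(old_key)`, and the target key names in the usage note are made up for illustration.

```python
def layernorm_aliases(key):
    """Return the key itself plus its gamma/beta <-> weight/bias alias, if any."""
    pairs = [(".gamma", ".weight"), (".beta", ".bias")]
    aliases = [key]
    if "LayerNorm" in key:
        for a, b in pairs:
            if key.endswith(a):
                aliases.append(key[: -len(a)] + b)
            elif key.endswith(b):
                aliases.append(key[: -len(b)] + a)
    return aliases


def rename_keys(state_dict, mapping):
    """Rename checkpoint keys; mapping is assumed to be {new_key: old_key}.

    For LayerNorm parameters, both the gamma/beta and the weight/bias
    spelling of old_key are tried before giving up.
    """
    result = dict(state_dict)  # leave the caller's dict untouched
    for new_key, old_key in mapping.items():
        candidates = layernorm_aliases(old_key)
        for candidate in candidates:
            if candidate in result:
                result[new_key] = result.pop(candidate)
                break
        else:
            raise KeyError(f"none of {candidates} found in checkpoint")
    return result
```

For instance, a checkpoint that stores `bert.embeddings.LayerNorm.weight` can still be loaded through a mapping entry written against `bert.embeddings.LayerNorm.gamma`, and vice versa. With real checkpoints the values would be tensors; plain Python values work the same way here because only the keys are touched.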
Cannot find 'config.json'
if conditional:
self.dense1 = nn.Linear(2 * hidden_size, hidden_size, bias=False)
self.dense.weight.data.uniform_(0, 0) -------> this should be self.dense1; the same applies to self.dense2 below
Could you provide the installation version for this?
Transformer Quality in Linear Time
Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
We revisit the design choices in Transformers, and propose methods to address their weaknesses in handling long sequences. First, we propose a simple layer named gated attention unit, which allows the use of a weaker single-head attention with minimal quality loss. We then propose a linear approximation method complementary to this new layer, which is accelerator-friendly and highly competitive in quality. The resulting model, named FLASH, matches the perplexity of improved Transformers over both short (512) and long (8K) context lengths, achieving training speedups of up to 4.9× on Wiki-40B and 12.1× on PG-19 for auto-regressive language modeling, and 4.8× on C4 for masked language modeling.
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Cite as: arXiv:2202.10447 [cs.LG]
(or arXiv:2202.10447v1 [cs.LG] for this version)
sentence = "北京[MASK]安门"
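With the bert-base-uncased model downloaded from Hugging Face, the predicted result is always "the". Why?
In the total loss returned by the LabelSmoothingCrossEntropy function, the first part is loss*self.eps/c, where c is the number of classes. I have found that some formulas say this should be divided by the number of classes minus one.
Could someone clarify whether the minus one is needed?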
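Both conventions appear in practice, and the difference comes down to whether the smoothing mass is spread over all classes or only over the non-target classes. The following is a minimal pure-Python sketch, not the repository's code; all function names are hypothetical, and it assumes that `loss` in `loss*self.eps/c` is the sum of negative log-probabilities over all c classes. Under that assumption, dividing by c corresponds to a target distribution that sums to one, while dividing by c - 1 is the matching choice only when the sum skips the true class.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def smoothed_targets(num_classes, target, eps, exclude_target=False):
    """Build the smoothed target distribution under either convention."""
    if exclude_target:
        # eps is shared among the other c - 1 classes only.
        q = [eps / (num_classes - 1)] * num_classes
        q[target] = 1.0 - eps
    else:
        # eps is shared among all c classes, including the true one.
        q = [eps / num_classes] * num_classes
        q[target] = 1.0 - eps + eps / num_classes
    return q


def cross_entropy(logits, q):
    """Cross-entropy between a target distribution q and softmax(logits)."""
    return -sum(qi * lp for qi, lp in zip(q, log_softmax(logits)))


def loss_div_c(logits, target, eps):
    """(1 - eps) * nll + eps / c * (sum over ALL classes of -log p)."""
    log_p = log_softmax(logits)
    return (1 - eps) * -log_p[target] + eps * -sum(log_p) / len(logits)


def loss_div_c_minus_1(logits, target, eps):
    """(1 - eps) * nll + eps / (c - 1) * (sum over NON-target classes of -log p)."""
    log_p = log_softmax(logits)
    others = -sum(lp for i, lp in enumerate(log_p) if i != target)
    return (1 - eps) * -log_p[target] + eps * others / (len(logits) - 1)
```

Each closed form equals the cross-entropy against its own smoothed target, so neither is wrong on its own. What has to stay consistent is the pairing: a sum over all classes goes with c, a sum that excludes the true class goes with c - 1.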