Comments (3)
I was thinking of using left padding at training time as well. But in most cases we don't need batch inference, so inference needs no padding at all, and results could be thrown off if the model had been trained with left padding.
from gpt-2-tensorflow2.0.
Hi @bshao001,
The padding mask also works at inference time, so you can run inference with right padding.
With left padding, this implementation will not give you the correct result.
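As a rough illustration of right padding with a padding mask, here is a minimal pure-Python sketch. It is not code from this repository: the helper name `right_pad`, the pad id `0`, and the mask convention (1.0 marks a padded slot to be ignored) are all assumptions made for the example.

```python
def right_pad(batch, pad_id=0):
    """Right-pad every sequence to the length of the longest one.

    pad_id=0 and the mask convention below are assumptions for this sketch,
    not values taken from the repository.
    """
    max_len = max(len(seq) for seq in batch)
    # Real tokens stay at the start, padding is appended at the end.
    padded = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
    # 1.0 marks a padded slot that attention should ignore, 0.0 a real token.
    mask = [[0.0] * len(seq) + [1.0] * (max_len - len(seq)) for seq in batch]
    return padded, mask


batch = [[12, 7, 93], [5, 41]]  # hypothetical token ids
padded, mask = right_pad(batch)

# With right padding, the last real token of each row sits at len(seq) - 1,
# which is where the next-token prediction would be read from.
last_real = [len(seq) - 1 for seq in batch]
```

With right padding, every real token keeps the same position index it would have in an unpadded, single-sequence run, which is why the same mask can be reused for batched inference.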
Thanks for your quick response. I will give it a try once the model trained on a much larger dataset is ready.