Comments (2)
Hey @SimonHFL
- As per our email exchange, the pre-release model (prithivida/grammar_error_correcter) was trained on filtered WikiEdits data plus a slice of WI&Locness. At the time, WI&Locness was available as a HuggingFace dataset with no license; in fact, the license was marked as "unknown" (screenshots attached as proof). I already mentioned this in the email thread, and you replied that it was probably an unintentional omission by the people who uploaded the dataset to HuggingFace. So, to reiterate: I had no intention of undermining anyone's academic work or violating a valid license policy. I merely used the dataset based on the license info shown ("Unknown") at that point in time.
- (I can see that the license info has been updated recently, by you or at your request.)
- After you pointed out a possible gap in the license info on the HuggingFace page, I acknowledged it in the email (and mentioned that I am in any case gathering more WikiEdits data to train subsequent models) and did the following: (A) explicitly stated on GitHub that the pre-release model is not intended for commercial use, (B) did the same in the HuggingFace readme, and (C) trained a brand-new model excluding WI&Locness. That is the _V1 model.
- The _V1 model (prithivida/grammar_error_correcter_v1) is trained on WikiEdit pairs and other synthetic pairs (refer to the readme for details).
- Your script reports that the pre-release and V1 models are identical, most likely because I inadvertently picked the wrong checkpoints when uploading to the v1 tag.
- I have now refreshed the v1 tag with the right checkpoint files and double-checked. See below.
- Also, to avoid any future unintentional non-compliance by consumers of package versions <= v1.0, which rely on the pre-release model (prithivida/grammar_error_correcter), I can remove that model from HuggingFace.
Thanks
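For anyone who wants to repeat the "are these two checkpoints the same file?" check on their own machine, here is a minimal sketch. It is not the script referred to in this thread; it assumes both checkpoint files have already been downloaded locally, and the function names and paths are hypothetical. It only detects byte-for-byte identical files, so two files that differ here could in principle still hold the same weights serialized differently.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large checkpoint files are never
        # loaded into memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checkpoints_identical(path_a, path_b):
    """True when both files have byte-for-byte identical content."""
    return sha256_of_file(path_a) == sha256_of_file(path_b)
```

Usage would look like `checkpoints_identical("pre_release/pytorch_model.bin", "v1/pytorch_model.bin")`, where both paths are placeholders for wherever the two downloads were saved.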
from gramformer.
Thanks for fixing this! Now it seems there should not be any issue with commercial use.