Comments (7)
@gmorenz Thanks for your suggestions. I have changed the wording in the README per your suggestion. I meant only the pre-trained models. We received this ethics review after we stated that we would open source the code and pre-trained models, and apparently the LibriTTS model capable of voice cloning triggered the ethics review flags, so our intention is mainly toward the pre-trained model weights, not the code itself.
@EricRa I think the confusion comes from the voice cloning part. LibriTTS is a dataset that follows CC BY 4.0, so anyone has the right to use the voices inside the training set. However, the model can also synthesize in voices not seen during training, which is why I have to impose this rule to address some ethical issues such as deception. If you do not use the pre-trained models, however, you do not have to abide by these rules.
from styletts2.
It might help to clarify "synthesized by StyleTTS 2 models" as "synthesized by the pretrained StyleTTS 2 models". It seems that the entire project is titled StyleTTS2, so "StyleTTS 2 models" could be read as meaning all models trained using the source code, whether or not they are trained from scratch. I don't think that's what you intended or the most reasonable reading, but I think it's where the confusion comes from.
Separately, I believe it would be a good idea to explicitly grant rights to use the pretrained models subject to the terms you feel are necessary, similar to how the MIT license explicitly grants rights to use the source code. The default under copyright is "all rights reserved", so without a grant of rights, using these models seems legally dubious. That is also the means by which such restrictions would typically be made legally binding: a license that says "you can <do things> provided you comply with <restrictions>". For example, see the MIT license, where the restriction is simply including the notice in all copies or substantial portions of the software.
I'm not a lawyer though, and I don't think I can ethically propose precise terms - usually writing bespoke licenses like this is something done by lawyers.
I do not see this as an MIT violation in any way; the code is MIT without any extras.
As far as I understand, the models are free with that one exception, and you can retrain your own models with the included dataset, which is public domain.
These rules are to address the ethics reviews we received in NeurIPS 2023. The reviewer expressed concerns over the possibility of deception:
and we have addressed these issues by asking users to abide by certain rules when using the pre-trained models, which the reviewer accepted:
Clearly the source of the concern is the model, not the code itself. For example, the pre-trained model on LJSpeech cannot be used for deception, but models trained on larger datasets with many speakers can be. This is why we set the ethical rules for the models, not the code. The codebase itself is MIT, and any model you train yourself does not have to follow these rules.
However, I understand this can be misleading, and I appreciate your concern. Given the ethics reviews we received and our intention to address them in the rebuttal, what do you think are some better ways to make this more consistent and clear?
Agreed with gmorenz on both points. Regarding the first point, it is not clear to me from the wording whether the additional "permission" requirement applies only to the pre-trained models. It reads to me as though it applies to all models trained using the source code in the repo.
If the intention is that this only applies to the pre-trained models, I don't understand the relevance of this statement:
That is, you agree to only use voices whose speakers grant the permission to have their voice cloned, either directly or by license before making synthesized voices public
Presumably, you already have permission to use the voices of the people in your pre-trained models.
Moving the models to a separate repo with a separate license file might help. I realize you can't host the models on Github, but they could be linked from the model repo. This way, you could specify license terms for the pre-trained models separate from the code. I do not know if this is desirable or the best solution, just a possible suggestion.
My intention is not to be antagonistic, so I hope this doesn't come across that way. The project looks fantastic. I just think devs in both commercial and hobbyist settings would find it very useful to have a clear set of terms on the license requirements involved when using the code/models in their projects.
Thank you for the clarification!
Hi @yl4579, just for clarification: if somebody were to finetune the model, would the restriction still apply? Do finetuned models count as pretrained models?