Comments (7)
You can find an example of combining the unpruned and pruned losses in
https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/pruned_transducer_stateless2/model.py#L72
It uses a so-called "model warmup". The basic idea is to train with the unpruned loss first so that the model starts to converge, and then to introduce the pruned loss gradually.
Also, I was wondering what is the best way to understand the pruned loss other than reading the code?
We will post the paper soon.
from fast_rnnt.
Also, I was wondering what is the best way to understand the pruned loss other than reading the code?
The paper is online now:
https://arxiv.org/abs/2206.13236
Thanks for your quick response. Yeah, I can see that clearly in https://github.com/k2-fsa/icefall/blob/7100c33820c8c478e07d3435e25e4f1543b6eec7/egs/librispeech/ASR/pruned_transducer_stateless2/train.py#L557
But assuming I use the pruned_loss as shown here, model convergence is expected to be slow, right?
model convergence is expected to be slow, right?
It depends on the optimizer, the model, and the LR scheduler you are using.
The model, optimizer, and LR scheduler in icefall are specifically tuned for pruned RNN-T training.
Are you also including the simple_loss in your loss function? You need to include that so it trains the associated parameters and learns reasonable pruning bounds. As Fangjun mentioned, it makes sense to use only the simple loss during a warmup period.
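To make the warmup idea concrete, here is a minimal sketch of how the two losses could be combined. It is not copied from icefall: the function names, the thresholds (1.0 and 2.0), the intermediate scale of 0.1, and the default simple_loss_scale of 0.5 are all assumptions chosen for illustration, and `warmup` is assumed to be the current training step divided by the number of warmup steps.

```python
def pruned_loss_scale(warmup: float) -> float:
    """Weight applied to the pruned loss (hypothetical schedule).

    `warmup` is assumed to be current_step / warmup_steps, so it is
    below 1.0 during the warmup period and keeps growing afterwards.
    """
    if warmup < 1.0:
        return 0.0  # warmup: train with the simple (unpruned) loss only
    if warmup < 2.0:
        return 0.1  # just after warmup: bring the pruned loss in gently
    return 1.0      # afterwards: full weight on the pruned loss


def combine_losses(simple_loss, pruned_loss, warmup, simple_loss_scale=0.5):
    """Total loss; the simple loss is always included so that the
    parameters it trains keep producing reasonable pruning bounds."""
    return simple_loss_scale * simple_loss + pruned_loss_scale(warmup) * pruned_loss


if __name__ == "__main__":
    # Plain floats stand in for the loss tensors in this sketch.
    for w in (0.5, 1.5, 3.0):
        print(w, combine_losses(simple_loss=10.0, pruned_loss=4.0, warmup=w))
```

The same arithmetic works unchanged if simple_loss and pruned_loss are tensors. The point of the sketch is only that the simple loss never drops out of the total, while the pruned loss is switched on after the warmup period.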
Hey @danpovey @csukuangfj, thank you for your quick responses... You are my safety net here 😃
I've added the warmup functionality to my code the same way it was implemented here. Now, I'm gonna re-train the model and see if that improves the convergence. Will update this thread once it's done.
Since the paper is out now, I'm gonna close this issue. Much appreciated!