Comments (5)
This does feel slow. But just to establish a baseline: could you provide the speed of a comparable PyTorch/TF model on the same hardware with the same batch size?
Hello,
On a V100 with similar settings, tensor2tensor takes less than 30 seconds per 100 batches, i.e. 0.3 seconds per batch. My guess is that Reformer has to recompute activations during backpropagation, and this adds overhead. Another point is that Reformer probably runs much faster on TPUs (more compute cores, or better optimization of the addition/subtraction operations that Reformer relies on). In any case, I would like to know if there is any way to speed things up.
Regards.
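For intuition on the recomputation point above, here is a minimal sketch (not trax's actual code) of a reversible residual block in the RevNet style that Reformer uses: during backprop the inputs of each block are reconstructed from its outputs instead of being read from stored activations, which saves memory but costs roughly one extra pass through the sub-blocks per layer.

```python
import jax
import jax.numpy as jnp

def f(x, w):
    # Stand-in for the attention sub-block.
    return jnp.tanh(x @ w)

def g(x, w):
    # Stand-in for the feed-forward sub-block.
    return jnp.tanh(x @ w)

def rev_forward(x1, x2, wf, wg):
    # Forward pass of one reversible block; no activations need to be stored.
    y1 = x1 + f(x2, wf)
    y2 = x2 + g(y1, wg)
    return y1, y2

def rev_inverse(y1, y2, wf, wg):
    # Backward-pass reconstruction: recover the inputs from the outputs.
    # This recomputation is the extra work a non-reversible model avoids
    # by keeping activations in memory.
    x2 = y2 - g(y1, wg)
    x1 = y1 - f(x2, wf)
    return x1, x2

# Quick sanity check that the inverse really recovers the inputs.
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
x1, x2 = jax.random.normal(k1, (4, 8)), jax.random.normal(k2, (4, 8))
wf = wg = 0.1 * jnp.eye(8)
y1, y2 = rev_forward(x1, x2, wf, wg)
r1, r2 = rev_inverse(y1, y2, wf, wg)
assert jnp.allclose(r1, x1, atol=1e-5) and jnp.allclose(r2, x2, atol=1e-5)
```

If that guess is right, this trade of compute for activation memory accounts for part of the extra per-batch time on a single GPU, independently of the batch-size question discussed below.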
I think the batches for Reformer may be much larger, so it may take longer per batch, but the difference in speed per token isn't that big. Have you checked the exact batch sizes in both cases? Do they match?
Hi,
Now that you mention it, I noticed that the batch size is 256 SENTENCES per batch instead of the default 2048 TOKENS in Transformer big.
Assuming an average sentence length of 24 tokens, that's 6144 tokens per batch on average, so the batches are almost 3x larger. I would expect a V100 to be able to parallelize over larger batches, but I may be wrong. If you think this is what the issue is, then I am satisfied. :)
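To make the per-token comparison concrete, here is a small back-of-the-envelope helper using only the numbers quoted in this thread (the 0.3 s / 2048-token tensor2tensor baseline and the assumed ~24 tokens per sentence); the measured Reformer seconds-per-batch is left as an input.

```python
def sec_per_token(sec_per_batch, tokens_per_batch):
    # Normalize training speed to seconds per token so that runs with
    # different batch sizes can be compared directly.
    return sec_per_batch / tokens_per_batch

# tensor2tensor baseline quoted above: 0.3 s for a 2048-token batch.
t2t = sec_per_token(0.3, 2048)

# Reformer run: 256 sentences per batch, assuming ~24 tokens per sentence.
reformer_tokens = 256 * 24  # ~6144 tokens, about 3x the baseline batch

print(f"t2t: {t2t:.2e} s/token")
print(f"Reformer batch ~= {reformer_tokens} tokens "
      f"({reformer_tokens / 2048:.1f}x larger), so a fair per-batch target "
      f"is ~{0.3 * reformer_tokens / 2048:.2f} s at the same per-token speed")
```

Dividing the measured Reformer time per batch by ~6144 tokens gives the figure to compare against that per-token baseline.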
I think it is; good to have clarified that. :)
Related Issues (20)
- The colab button on Knowledge_Tracing_Transformer.ipynb is not open
- TypeError: float() argument must be a string or a number, not 'jaxlib.tpu_client_extension.PyTpuBuffer'
- Machine Translation Refromer model.pkl for trax 1.4.1?
- ImportError: cannot import name 'MergeHeads' from 'trax.layers.attention'
- how to use trax to translate other languages
- Limit the dataset from TFDS
- TypeError: unsupported operand type(s) for ==: 'Array' and 'tuple'
- SelfAttention - problem with tensorflow 2.11.0
- AttributeError: module 'jax.ops' has no attribute 'index_add'
- Unable to import trax
- Cannot import Trax
- Could not normally run trax using GPU in local computer
- Issue when running training_loop.run(2000) - message StopIteration in next_batch(self)
- Is possible Linformer algorithm ?
- Can I do simple tokenization?
- Can't run `bert_vocab_from_dataset` without `TypeError: Tensor is unhashable` when import `trax` with `tensorflow`
- Are any easy ways to use something like `train_test_split` from `sklearn`?
- AttributeError: 'function' object has no attribute 'n_steps_per_checkpoint' for NLP Machine translation model
- Error loading loop from a checkpoint
- Inconsistency in function's doc-string