Comments (4)
I haven't been able to train my models yet even with normal transformers at larger context lengths (my weird TTS + STT system). CTC loss isn't converging at all, so I haven't attempted a proper run with the RMT architecture in the STT model, but I'm setting it up with RMT while debugging the other one. I will let you know if I find success. I'm worried that training 2 transformers in tandem simply doesn't work, either because of stupidly slow convergence, too low a batch size, or other reasons... don't know. I've been looking at shifted tokens, scale_norm and other tricks to help with convergence, but I'm not having any luck. I'm tempted to try RWKV, as they claim really fast convergence. Either way, I'm going to need something like RMT in the end so I can have a well-defined streaming architecture on the STT side.
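Since the non-converging CTC loss is the blocker here, a minimal sketch of a known-good `torch.nn.CTCLoss` setup may help as a baseline to diff against. Everything below (model shapes, blank index, lengths) is an illustrative assumption, not the commenter's actual code; mis-set blank indices, feeding raw logits instead of log-probabilities, or wrong length tensors are common causes of CTC not converging at all.

```python
import torch
import torch.nn.functional as F
from torch import nn

# Hypothetical CTC setup, checking the usual pitfalls: log-probabilities
# (not raw logits), (T, B, C) layout, blank index matching the loss, and
# zero_infinity so a target longer than its input doesn't produce inf loss.

B, T, C = 8, 200, 32          # batch, input timesteps, classes (incl. blank)
blank = 0                     # class 0 reserved for the CTC blank

logits = torch.randn(B, T, C, requires_grad=True)          # model output, (B, T, C)
log_probs = F.log_softmax(logits, dim=-1).transpose(0, 1)  # CTCLoss expects (T, B, C)

targets = torch.randint(1, C, (B, 50))                     # labels 1..C-1, never blank
input_lengths = torch.full((B,), T, dtype=torch.long)      # frames per utterance
target_lengths = torch.randint(10, 50, (B,), dtype=torch.long)

ctc = nn.CTCLoss(blank=blank, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```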
@pfeatherstone ahh, yeah, I can look into that. Care to share what you are seeing on your dataset with this approach?
oh got it, makes sense
Gave this a go. It turns out that `torch.jit.trace()` doesn't accept `None` in `example_inputs`, so we cannot trace with `mems` not `None` and expect it to work when it is `None`, or vice versa. My workaround is to pass `mems=torch.zeros(B, num_memory_tokens, dim)` in the first pass, which means you're attending to `self.read_memory_emb` ONLY in the first pass. Don't know if that's allowed.
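For reference, here is a minimal sketch of that workaround under stated assumptions: a toy module stands in for the real RMT, and the `forward(x, mems) -> (logits, mems)` signature is an assumption for illustration, not the library's confirmed API. The point is only that `torch.jit.trace()` needs every example input to be a Tensor, so the first pass feeds zeros where `None` would have gone.

```python
import torch
from torch import nn

class ToyRMT(nn.Module):
    """Hypothetical stand-in for the real model: forward(x, mems) -> (logits, mems)."""

    def __init__(self, num_tokens=256, dim=64, num_memory_tokens=16):
        super().__init__()
        self.embed = nn.Embedding(num_tokens, dim)
        self.to_logits = nn.Linear(dim, num_tokens)
        self.num_memory_tokens = num_memory_tokens

    def forward(self, x, mems):
        h = self.embed(x)
        h = h + mems.mean(dim=1, keepdim=True)          # toy use of the running memory
        return self.to_logits(h), h[:, -self.num_memory_tokens:, :]

B, seq_len, dim, num_memory_tokens = 2, 128, 64, 16
model = ToyRMT(dim=dim, num_memory_tokens=num_memory_tokens)

x = torch.randint(0, 256, (B, seq_len))
zero_mems = torch.zeros(B, num_memory_tokens, dim)      # zeros in place of None

# trace once with concrete tensors; None is not a valid example input
traced = torch.jit.trace(model, (x, zero_mems))

logits, mems = traced(x, zero_mems)   # first pass: all-zero memories
logits, mems = traced(x, mems)        # later passes: feed memories back in
```

Whether attending to `self.read_memory_emb` alongside all-zero memories on the first step changes training dynamics is exactly the open question from the comment above.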
Related Issues (19)
- causal mask assert hit HOT 25
- Question: configuring scaled_dot_product_attention
- Attend : check mask isn't already 4D HOT 1
- Question: how does memory replay backpropagation work with multiple models in series HOT 8
- Question: why do we need read_memory_emb HOT 6
- Question: masks HOT 3
- Question: Global write tokens or recurrent HOT 7
- Bug: resiDual implementation HOT 3
- Question: first read memories HOT 12
- flash attention, and a potentially better improvement HOT 5
- What is the reasoning for no dropout? HOT 2
- token_shift HOT 4
- bptt depth implementation? HOT 3
- have you had a chance to train it yet? HOT 2
- What happens if texts from the dataset don't have equal lengths HOT 4
- Is rmt compatible with pretrain models like LLaMA? HOT 6
- Question: How to set seq_len ? HOT 1
- Question: how to adapt this for CTC loss HOT 2