Comments (6)
@feizc how are you approaching the problem of generating starting from a length that is less than the prefix?
Actually, I use a fixed-length conditional context, i.e., a prefix of prior music, to continue writing the next melody.
In my opinion, to start from zero, we can use a special token like [pad] to fill out the prefix length, or use only the decoder to generate an initial sentence and then generate conditioned on the latents.
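A minimal sketch of the [pad] idea above: left-pad a short prompt up to the fixed prefix length so the model always sees the same input shape. `PAD_ID` and `prefix_len` are illustrative names, not part of any actual API.

```python
# Hypothetical illustration of padding a short prompt to a fixed prefix length.
PAD_ID, prefix_len = 0, 8
prompt = [5, 9, 3]                      # shorter than the required prefix
prefix = [PAD_ID] * (prefix_len - len(prompt)) + prompt
assert len(prefix) == prefix_len        # → [0, 0, 0, 0, 0, 5, 9, 3]
```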
I read the source code and found that the authors begin with zeros :)
def gen_initial_events():
  events = np.zeros([device_count, batch_size, max_events_length], np.int32)
  events[:, :, 0] = dataset.SOS_ID
  return events
from perceiver-ar-pytorch.
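To make the "begin with zeros" behavior concrete, here is a toy rendering of the same idea with assumed shapes and IDs (the names `device_count`, `SOS_ID`, etc. mirror the snippet above but the values are made up): the events buffer starts as all zeros with SOS in slot 0, and a decoding loop then overwrites later slots one at a time.

```python
import numpy as np

# Assumed toy values, not those used by the official perceiver-ar code.
device_count, batch_size, max_events_length = 1, 2, 8
SOS_ID = 1

events = np.zeros([device_count, batch_size, max_events_length], np.int32)
events[:, :, 0] = SOS_ID          # every sequence starts with the SOS token

# A decoding loop would write sampled ids into events[:, :, t]; the values
# here are stand-ins for sampled token ids.
for t in range(1, 4):
    events[:, :, t] = t + 1
```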
After reviewing the current implementation (autoregressive_wrapper), it seems you generate each subsequent token one at a time, as would be the case in most architectures. The authors of the Perceiver AR paper outlined a strided approach (with a stride typically the size of the self-attention sequence length) where the sampled tokens would be cached up to a certain size before the buffer is freed. Have you considered implementing this? The officially released perceiver-ar implementation is relatively easy to follow.
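A framework-free sketch of the strided decoding schedule described above: sample tokens into a buffer of size `stride`, and only when the buffer fills do a full recompute of the cross-attention prefix. All names (`sample_fn`, `recompute_fn`) are illustrative placeholders, not the perceiver-ar-pytorch API.

```python
def strided_decode(sample_fn, recompute_fn, prompt, total_len, stride):
    """Hypothetical strided sampler.

    sample_fn(seq, cache) -> next token, using cached activations (cheap).
    recompute_fn(seq)     -> fresh cache via a full pass over the prefix
                             (expensive; called once per `stride` tokens).
    """
    seq = list(prompt)
    cache = recompute_fn(seq)              # initial full pass over the prefix
    buffered = 0
    while len(seq) < total_len:
        seq.append(sample_fn(seq, cache))  # cheap cached step
        buffered += 1
        if buffered == stride:             # buffer full: free it and
            cache = recompute_fn(seq)      # recompute from the new prefix
            buffered = 0
    return seq
```

With a stride of 4 and 8 tokens to generate, the expensive recompute runs three times (once up front, then after every fourth sampled token) instead of once per token.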
noo not yet, i haven't implemented their special caching strategy at inference
but if i keep hearing more positive results, i may implement it! have to admit i was doubtful about the architecture initially
I'm curious to see how well this would work at inference, particularly when using a VQ-VAE / VQGAN to encode images. If you could decode in only several steps, that would really speed up generation. I suspect quality would suffer, but the paper's ImageNet results seem promising.