Comments (3)
The easiest way to do this is:
- Put all your .wav files under `state-spaces/data/<your-dataset-name>/`.
- Create a copy of the `YouTubeMix` config (https://github.com/HazyResearch/state-spaces/blob/main/configs/dataset/youtubemix.yaml); you can just call it `<your-dataset-name>.yaml`. Change the `path` key in the config to `<your-dataset-name>` and adjust any other settings in the config.
You can then try to run training out of the box:
python -m train wandb=null experiment=sashimi-youtubemix dataset=<your-dataset-name>
If you use wandb, you can leave out the `wandb=null` flag. Let us know if you run into any bugs along the way. This should be a good starting point to tweak from.
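For anyone who wants to script the two setup steps, here is a minimal sketch in Python. It is not part of the repository: `prepare_dataset` is a hypothetical helper, and it assumes the template config has exactly one top-level `path:` key on a single line and that configs live under `configs/dataset/` as in the link above.

```python
import re
import shutil
from pathlib import Path


def prepare_dataset(repo_root, wav_dir, name, template="youtubemix.yaml"):
    """Hypothetical helper: copy .wav files and derive a dataset config."""
    repo_root, wav_dir = Path(repo_root), Path(wav_dir)

    # Step 1: put all .wav files under <repo_root>/data/<name>/
    data_dir = repo_root / "data" / name
    data_dir.mkdir(parents=True, exist_ok=True)
    for wav in sorted(wav_dir.glob("*.wav")):
        shutil.copy2(wav, data_dir / wav.name)

    # Step 2: copy the template config and point its `path` key at <name>.
    # Assumption: `path:` is a top-level key written on one line.
    config_dir = repo_root / "configs" / "dataset"
    text = (config_dir / template).read_text()
    text, count = re.subn(r"(?m)^path:.*$", lambda m: f"path: {name}", text)
    if count != 1:
        raise ValueError(f"expected exactly one top-level 'path:' key, found {count}")

    new_config = config_dir / f"{name}.yaml"
    new_config.write_text(text)
    return data_dir, new_config
```

After running it, any other settings in the new config still need to be reviewed by hand before launching the training command.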
Caveat: if you have a custom train-val-test split, you'll need to implement things in Python instead. The main things to look at are:
- The base datasets (`AbstractAudioDataset`, `QuantizedAudioDataset`) we use for audio: https://github.com/HazyResearch/state-spaces/blob/main/src/dataloaders/audio.py. I suggest reading this anyway, to get a sense for what the options in the `youtubemix.yaml` file mean.
- The SC09 dataset (same file; `SpeechCommands09`), which does use a predefined train-val-test split. You can create a class analogous to this.
- Finally, this dataset needs to plug into our `SequenceDataset` abstraction, so we additionally add a wrapper inside `datasets.py`, e.g. https://github.com/HazyResearch/state-spaces/blob/137e49d6930414d0b4ca2b0b7e72ef551b1129b0/src/dataloaders/datasets.py#L1541 for `SpeechCommands09Autoregressive`.
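As a rough illustration of what a predefined split involves, the sketch below partitions a folder of .wav files using manifest files. It is plain Python and does not use the repository's classes: `split_files` and the one-file-name-per-line manifest format are assumptions made for this example, not the layout that `SpeechCommands09` actually reads.

```python
from pathlib import Path


def split_files(data_dir, val_list=None, test_list=None):
    """Partition .wav files into train/val/test using optional manifests.

    Each manifest is a text file with one file name per line
    (a hypothetical format chosen for this sketch).
    """
    data_dir = Path(data_dir)

    def read_names(manifest):
        if manifest is None:
            return set()
        lines = Path(manifest).read_text().splitlines()
        return {line.strip() for line in lines if line.strip()}

    val_names, test_names = read_names(val_list), read_names(test_list)
    overlap = val_names & test_names
    if overlap:
        raise ValueError(f"files listed in both val and test: {sorted(overlap)}")

    # Anything not named in a manifest falls into the training split.
    splits = {"train": [], "val": [], "test": []}
    for wav in sorted(data_dir.glob("*.wav")):
        if wav.name in test_names:
            splits["test"].append(wav)
        elif wav.name in val_names:
            splits["val"].append(wav)
        else:
            splits["train"].append(wav)
    return splits
```

A custom dataset class modeled on `SpeechCommands09` could use a partition like this to decide which files each split loads; the wrapper in `datasets.py` would still be needed on top.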
from s4.
Closing this issue for now, feel free to reopen if you run into trouble.
@krandiash Thanks a lot. I got it to run. However, using a custom name threw an error, so I just used the `youtubemix.yaml` and copied my files to `data/youtubemix`.