Comments (13)
I just noticed that compute_loudness in spectral_ops.py outputs significantly lower loudness values when the sample rate is 48kHz. I did not have time to figure out what is causing this, but increasing the FFT size didn't seem to help much.
from ddsp.
Just FYI, all the sample-rate-agnostic preprocessing code should now be in (you can check if it works for you), but we don't have a working 44kHz model up as a demo yet.
from ddsp.
Good catches! Yah, we definitely haven't explored training many model configs at different rates yet. They seem like pretty straightforward fixes; we'll try to get to them when we can.
from ddsp.
Hi Andras,
Good recommendation! This is actually something that Hanoi is currently looking into, so we're on the case :). The tying to 16kHz is actually more for CREPE f0 detection than anything else, but it's not a hard constraint. We'll just need to change the data processing pipeline and tweak some model parameters (number of harmonics, FFT sizes).
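For readers wondering why the number of harmonics depends on the sample rate, here is a minimal sketch (not code from ddsp; the function name is hypothetical). It only counts how many integer multiples of f0 fit strictly below the Nyquist frequency, which gives a rough upper bound on how many harmonics are audible at a given rate.

```python
import math


def max_harmonics_below_nyquist(f0_hz: float, sample_rate: int) -> int:
    """Count harmonics k * f0 (k >= 1) that lie strictly below Nyquist.

    Hypothetical helper for illustration only; not part of ddsp.
    """
    nyquist = sample_rate / 2.0
    # k * f0 < nyquist  <=>  k < nyquist / f0, so the largest valid k is
    # ceil(nyquist / f0) - 1 (this also excludes a harmonic exactly at Nyquist).
    return max(0, math.ceil(nyquist / f0_hz) - 1)


if __name__ == "__main__":
    for sr in (16000, 44100, 48000):
        print(sr, max_harmonics_below_nyquist(440.0, sr))
```

For a 440 Hz note this gives 18 harmonics at 16kHz and 54 at 48kHz, so a harmonic count tuned for 16kHz leaves most of the upper spectrum unmodelled at higher rates.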
from ddsp.
Hi, I just wonder if there is any progress with this? I've seen some related pull requests failing tests...
from ddsp.
Thanks for the follow up. We just merged #44 which creates a data pipeline for creating datasets at arbitrary sample rates (with f0 CREPE detection still at 16kHz). We're working now on hammering out some details of training configs for higher sample rates (48kHz), and will add some details and configs to the colab notebooks when we get that figured out.
from ddsp.
Just wanted to add, that I have extensively tested 48 kHz training on the test branch that is waiting to be merged, and it works well (with some hyper-parameter tuning).
from ddsp.
That's great! Yah, sorry for all the delays. There's been some COVID-related bureaucracy slowing down our efforts in that direction, so it's cool to hear that it's working for you.
Do you have an example gin config / example you could share of it working? It could be helpful for us and others I think.
In terms of the branch PR, @lamtharnhantrakul is back on the case just now actually. The old branch (#57) had gotten pretty stale so he's splitting it up into two PRs, the first of which is now (#102). So hopefully we should have the code in master soon.
from ddsp.
Sorry for the slow answer. I don't have an example to show at this time; I am working on a different problem than your demo, mostly percussive sound resynthesis with plenty of inharmonicity. I am trying an approach that generates a lot of harmonically unrelated, tunable sine components plus noise to reproduce single-shot acoustic samples. I will let you know if I have anything cool to show.
from ddsp.
Okay, great, no worries. For what it's worth, I've also been developing sinusoidal + noise models (still focusing on harmonic-ish type instruments) but for self-supervised transcription.
I think we're going to do a code refactor to expose a lot of that internal code in the next week or two, so feel free to take a look :).
from ddsp.
It seems like the assumption of working with a 16kHz signal is still inextricably baked into this code in some places. A couple examples I've noticed:
- MfccTimeDistributedRnnEncoder.z_time_steps is constrained to be chosen from a set of values that reflect the assumption that the input signal will be a 4 second clip with 16kHz sample rate.
- spectral_ops.compute_mel(), a backbone to the other foundational functions in spectral_ops.py, is hardcoded to compute tf.signal.linear_to_mel_weight_matrix() with a 16kHz sample rate, putting the Nyquist frequency well below the upper bound of human hearing.
Please correct me if I'm wrong on these examples, since I'm still very much in the learning process. I'll put more examples here if I find them. And thank you for how helpful you've been @jesseengel
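To illustrate the second point, here is a small pure-Python sketch (not ddsp or TensorFlow code; all function names are hypothetical) of how mel band edges follow from the sample rate. It uses the HTK-style mel formula, which the TensorFlow documentation describes as the convention behind tf.signal.linear_to_mel_weight_matrix(), and it assumes the upper edge defaults to the Nyquist frequency.

```python
import math


def hz_to_mel(hz: float) -> float:
    """HTK-style mel scale: 1127 * ln(1 + f / 700)."""
    return 1127.0 * math.log(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


def mel_band_edges(sample_rate, n_bins, lo_hz=0.0, hi_hz=None):
    """Return the n_bins + 2 edge frequencies (Hz) of n_bins triangular bands.

    Assumption for this sketch: the upper edge defaults to Nyquist.
    """
    if hi_hz is None:
        hi_hz = sample_rate / 2.0
    lo_mel, hi_mel = hz_to_mel(lo_hz), hz_to_mel(hi_hz)
    step = (hi_mel - lo_mel) / (n_bins + 1)
    # Edges are equally spaced on the mel scale, then mapped back to Hz.
    return [mel_to_hz(lo_mel + i * step) for i in range(n_bins + 2)]


if __name__ == "__main__":
    for sr in (16000, 48000):
        edges = mel_band_edges(sr, n_bins=64)
        print(sr, round(edges[-1]))
```

With the rate fixed at 16kHz the top band ends at 8kHz no matter what audio is fed in; passing the real sample rate moves the top edge to that rate's Nyquist frequency.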
from ddsp.
@voodoohop I found the same problem, the loudness is too low at 44.1kHz audio! I am not sure of the status of the code for higher sample_rates, but I am trying to train on a custom guitar dataset at 44.1kHz and the results are quite poor!
Did you get a solution to this?
from ddsp.
@jesseengel In my case, I am training the model on a custom guitar monophonic dataset (44.1kHz) to learn the timbre embeddings. I have set the frame_rate=210 & 252 and got poor results. So I am going to try training with higher frame rates!
But I am not sure if the root cause of the problem is in the low frame rate I used or in other model hyperparameters like fft size, #harmonics, etc.
Below are my commands.
Data prep:
ddsp_prepare_tfrecord \
  --input_audio_filepatterns='~/buckets/pratik-ddsp-data/monophonic/*wav' \
  --output_tfrecord_path=~/tfrecord_441sr_700fr/train.tfrecord \
  --chunk_secs=0.0 \
  --num_shards=10 \
  --frame_rate=700 \
  --sample_rate=44100 \
  --alsologtostderr
Below is for training process
ddsp_run \
  --mode=train \
  --gin_file=~/ddsp/ddsp/training/gin/models/ae_mfccRnnEncoder_last.gin \
  --gin_file=~/ddsp/ddsp/training/gin/datasets/tfrecord.gin \
  --gin_file=~/ddsp/ddsp/training/gin/eval/basic_f0_ld.gin \
  --gin_param="TFRecordProvider.file_pattern='~/tfrecord_441sr_252fr/train.tfrecord*'" \
  --gin_param="batch_size=16" \
  --alsologtostderr \
  --gin_param="TFRecordProvider.sample_rate=44100" \
  --gin_param="Harmonic.sample_rate=44100" \
  --gin_param="FilteredNoise.n_samples=176400" \
  --gin_param="Harmonic.n_samples=176400" \
  --gin_param="Reverb.reverb_length=176400" \
  --gin_param='F0LoudnessPreprocessor.time_steps=2800' \
  --gin_param='F0LoudnessPreprocessor.frame_rate=700' \
  --gin_param='F0LoudnessPreprocessor.sample_rate=44100' \
  --gin_param="TFRecordProvider.frame_rate=700"
Am I missing something?
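As a sanity check on the numbers in these commands, here is a minimal sketch (a hypothetical helper, not part of ddsp) that derives the values that have to agree with each other. It assumes 4-second training clips, which is what 176400 samples at 44100 Hz and 2800 time steps at 700 frames per second both imply.

```python
def derived_params(sample_rate: int, frame_rate: int, clip_secs: int = 4) -> dict:
    """Derive mutually dependent settings from sample rate and frame rate.

    Hypothetical helper for illustration; clip_secs=4 is an assumption
    inferred from the values used in the commands.
    """
    return {
        "n_samples": sample_rate * clip_secs,    # audio samples per clip
        "time_steps": frame_rate * clip_secs,    # f0/loudness frames per clip
        "hop_size": sample_rate // frame_rate,   # samples per frame (floored)
        "integer_hop": sample_rate % frame_rate == 0,
    }


if __name__ == "__main__":
    for fr in (210, 252, 700):
        print(fr, derived_params(44100, fr))
```

For sample_rate=44100 and frame_rate=700 this gives n_samples=176400 and time_steps=2800, matching the gin params. For frame_rate=252 it gives time_steps=1008, so it may be worth confirming that the file pattern (which points at tfrecord_441sr_252fr) refers to records prepared with the same frame_rate=700 that the other parameters use.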
from ddsp.