Comments (8)
Thanks for looking into this!
It seems you're using a GPU with about half the memory of the ones we've been testing on (V100), so sorry you bumped into this edge case.
I am a little confused why that code snippet works (since we don't use sessions in 2.0), but I assume it's somehow tapping into the same backend. Can you try the TF 2.0 code from https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth and see if it works for you too?
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
from ddsp.
Can you add the exact command you're running and any other details (dataset etc.) that might be relevant?
from ddsp.
This is the command being run
ddsp_run --mode=train --alsologtostderr --model_dir="C:\Users\andrey\Desktop\winDDSP\MODEL" --gin_file="C:/Users/andrey/Desktop/winDDSP/soloinstrument.gin" --gin_file="C:/Users/andrey/Anaconda3/envs/test/lib/site-packages/ddsp/training/gin/datasets/tfrecord.gin" --gin_param="batch_size=16" --gin_param="TFRecordProvider.file_pattern='C:/Users/andrey/Desktop/winDDSP/data/train.tfrecord*'" --gin_param="train_util.train.num_steps=30000" --gin_param="train_util.train.steps_per_save=100" --gin_param="train_util.Trainer.checkpoints_to_keep=10"
And I believe that the dataset at the moment is just a single wav (around 15 seconds) that I prepared with ddsp_prepare_tfrecord. You can find the tfrecord files attached.
data.zip
As I said, the thing that confuses me most is that the same command runs perfectly fine when only the CPU is used to train. At the same time, judging from running a toy TensorFlow example, TensorFlow and CUDA seem to be configured correctly to work together.
from ddsp.
The problem was also discussed in this TensorFlow issue: tensorflow/tensorflow#24496
Pasting this code inside train_util.py has solved the problem.
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)
What was happening is that the process filled the GPU memory very quickly, and when it exceeded the available memory the aforementioned error popped up.
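For anyone who would rather not edit installed library files: the TensorFlow GPU guide also describes an environment variable, TF_FORCE_GPU_ALLOW_GROWTH, that requests the same on-demand allocation. Below is a minimal sketch of setting it from a launcher script. The helper name request_gpu_memory_growth is made up for illustration, and whether this resolves the crash in this thread has not been verified here.

```python
import os


def request_gpu_memory_growth(env=os.environ):
    """Ask TensorFlow to grow GPU memory on demand via an environment variable.

    Hypothetical helper: it only sets TF_FORCE_GPU_ALLOW_GROWTH when the user
    has not already chosen a value, and returns the value now in effect.
    """
    env.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")
    return env["TF_FORCE_GPU_ALLOW_GROWTH"]


# Intended use (assumption): call this before TensorFlow touches the GPU,
# most safely before the first `import tensorflow`:
#
#   request_gpu_memory_growth()
#   import tensorflow as tf
```

The variable can equally be exported in the shell before running the command, which avoids touching any Python code at all.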
from ddsp.
You're welcome :) Yes, that code also does the job for the training. By the way, ddsp_prepare_tfrecord also has the same (or similar) problem. The console output is different but still I can see that it just allocates the whole GPU memory and then crashes. Where should I put that fix in? I've put it everywhere I can think of (prepare_tfrecord.py, prepare_tfrecord_lib.py, spectral_ops.py, core.py) and it doesn't seem to work.
Edit: I was trying to prepare a bigger dataset when I got this error (970 audio files, 264 MB), and found out it didn't work even on CPU. A small dataset with only one wav is prepared correctly on both GPU and CPU. How can I work around this? Thank you very much.
(base) andrey@andrey-PC:~/Escritorio/voicemodIA/DDSP$ ddsp_prepare_tfrecord --input_audio_filepatterns="/media/andrey/DATOS/Datasets/english/train/voice/male/*" --output_tfrecord_path="data/train.tfrecord" --num_shards=10 --alsologtostderr
2020-02-26 14:33:58.648031: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer.so.6
2020-02-26 14:33:58.649092: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libnvinfer_plugin.so.6
2020-02-26 14:33:59.577649: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-02-26 14:33:59.598016: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.598330: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 SUPER computeCapability: 7.5
coreClock: 1.68GHz coreCount: 34 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2020-02-26 14:33:59.598359: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-26 14:33:59.598407: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-26 14:33:59.599519: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-26 14:33:59.599689: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-26 14:33:59.600654: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-26 14:33:59.601324: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-26 14:33:59.601352: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-26 14:33:59.601433: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.601771: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.602051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-26 14:33:59.602304: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-02-26 14:33:59.606149: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3699850000 Hz
2020-02-26 14:33:59.606322: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b42a7ac960 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-02-26 14:33:59.606332: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-02-26 14:33:59.670131: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.670429: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b42a79a280 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-02-26 14:33:59.670443: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2060 SUPER, Compute Capability 7.5
2020-02-26 14:33:59.670574: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.670790: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2060 SUPER computeCapability: 7.5
coreClock: 1.68GHz coreCount: 34 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2020-02-26 14:33:59.670810: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-26 14:33:59.670818: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-26 14:33:59.670834: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-26 14:33:59.670847: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-26 14:33:59.670858: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-26 14:33:59.670869: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-26 14:33:59.670877: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-26 14:33:59.670913: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.671140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.671336: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-26 14:33:59.671355: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-26 14:33:59.826876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-26 14:33:59.826903: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-02-26 14:33:59.826908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-02-26 14:33:59.827068: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.827327: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-26 14:33:59.827535: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7028 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
1 Physical GPUs, 1 Logical GPUs
1 Physical GPUs, 1 Logical GPUs
1 Physical GPUs, 1 Logical GPUs
1 Physical GPUs, 1 Logical GPUs
I0226 14:34:00.700125 140064621770560 fn_api_runner_transforms.py:540] ==================== <function annotate_downstream_side_inputs at 0x7f61f93db4d0> ====================
I0226 14:34:00.700693 140064621770560 fn_api_runner_transforms.py:540] ==================== <function fix_side_input_pcoll_coders at 0x7f61f93db5f0> ====================
I0226 14:34:00.700984 140064621770560 fn_api_runner_transforms.py:540] ==================== <function lift_combiners at 0x7f61f93db680> ====================
I0226 14:34:00.701103 140064621770560 fn_api_runner_transforms.py:540] ==================== <function expand_sdf at 0x7f61f93db710> ====================
I0226 14:34:00.701330 140064621770560 fn_api_runner_transforms.py:540] ==================== <function expand_gbk at 0x7f61f93db7a0> ====================
I0226 14:34:00.701719 140064621770560 fn_api_runner_transforms.py:540] ==================== <function sink_flattens at 0x7f61f93db8c0> ====================
I0226 14:34:00.701858 140064621770560 fn_api_runner_transforms.py:540] ==================== <function greedily_fuse at 0x7f61f93db950> ====================
I0226 14:34:00.702906 140064621770560 fn_api_runner_transforms.py:540] ==================== <function read_to_impulse at 0x7f61f93db9e0> ====================
I0226 14:34:00.703006 140064621770560 fn_api_runner_transforms.py:540] ==================== <function impulse_to_input at 0x7f61f93dba70> ====================
I0226 14:34:00.703125 140064621770560 fn_api_runner_transforms.py:540] ==================== <function inject_timer_pcollections at 0x7f61f93dbc20> ====================
I0226 14:34:00.703323 140064621770560 fn_api_runner_transforms.py:540] ==================== <function sort_stages at 0x7f61f93dbcb0> ====================
I0226 14:34:00.703435 140064621770560 fn_api_runner_transforms.py:540] ==================== <function window_pcollection_coders at 0x7f61f93dbd40> ====================
I0226 14:34:00.704764 140064621770560 statecache.py:150] Creating state cache with size 100
I0226 14:34:00.704909 140064621770560 fn_api_runner.py:1797] Created Worker handler <apache_beam.runners.portability.fn_api_runner.EmbeddedWorkerHandler object at 0x7f61f935ac10> for environment urn: "beam:env:embedded_python:v1"
I0226 14:34:00.705106 140064621770560 fn_api_runner.py:822] Running ((((ref_AppliedPTransform_Create/Impulse_3)+(ref_AppliedPTransform_Create/FlatMap(<lambda at core.py:2597>)_4))+(ref_AppliedPTransform_Create/MaybeReshuffle/Reshuffle/AddRandomKeys_7))+(ref_AppliedPTransform_Create/MaybeReshuffle/Reshuffle/ReshufflePerKey/Map(reify_timestamps)_9))+(Create/MaybeReshuffle/Reshuffle/ReshufflePerKey/GroupByKey/Write)
I0226 14:34:00.731557 140064621770560 fn_api_runner.py:822] Running ((((((((((Create/MaybeReshuffle/Reshuffle/ReshufflePerKey/GroupByKey/Read)+(ref_AppliedPTransform_Create/MaybeReshuffle/Reshuffle/ReshufflePerKey/FlatMap(restore_timestamps)_14))+(ref_AppliedPTransform_Create/MaybeReshuffle/Reshuffle/RemoveRandomKeys_15))+(ref_AppliedPTransform_Create/Map(decode)_16))+(ref_AppliedPTransform_Map(_load_audio)_17))+(ref_AppliedPTransform_Map(_add_f0_estimate)_18))+(ref_AppliedPTransform_Map(_add_loudness)_19))+(ref_AppliedPTransform_FlatMap(_split_example)_20))+(ref_AppliedPTransform_Reshuffle/AddRandomKeys_22))+(ref_AppliedPTransform_Reshuffle/ReshufflePerKey/Map(reify_timestamps)_24))+(Reshuffle/ReshufflePerKey/GroupByKey/Write)
I0226 14:34:00.753737 140058657015552 prepare_tfrecord_lib.py:43] Loading '/media/andrey/DATOS/Datasets/english/train/voice/male/V001_0001595577.wav'.
2020-02-26 14:34:01.440541: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-26 14:34:01.586315: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
/home/andrey/anaconda3/lib/python3.7/site-packages/librosa/core/time_frequency.py:1208: RuntimeWarning: divide by zero encountered in log10
- 0.5 * np.log10(f_sq + const[3]))
I0226 14:34:04.901932 140058657015552 prepare_tfrecord_lib.py:43] Loading '/media/andrey/DATOS/Datasets/english/train/voice/male/V001_0001866840.wav'.
Traceback (most recent call last):
File "apache_beam/runners/common.py", line 883, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 667, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 748, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1435, in <lambda>
wrapper = lambda x, *args, **kwargs: [fn(x, *args, **kwargs)]
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/training/data_preparation/prepare_tfrecord_lib.py", line 69, in _add_f0_estimate
f0_hz, f0_confidence = compute_f0(audio, sample_rate, frame_rate)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/spectral_ops.py", line 276, in compute_f0
assert n_padding % 1 == 0
AssertionError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/andrey/anaconda3/bin/ddsp_prepare_tfrecord", line 10, in <module>
sys.exit(console_entry_point())
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/training/data_preparation/prepare_tfrecord.py", line 105, in console_entry_point
app.run(main)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/absl/app.py", line 299, in run
_run_main(main, args)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/absl/app.py", line 250, in _run_main
sys.exit(main(argv))
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/training/data_preparation/prepare_tfrecord.py", line 100, in main
run()
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/training/data_preparation/prepare_tfrecord.py", line 95, in run
pipeline_options=FLAGS.pipeline_options)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/training/data_preparation/prepare_tfrecord_lib.py", line 170, in prepare_tfrecord
coder=beam.coders.ProtoCoder(tf.train.Example))
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/pipeline.py", line 481, in __exit__
self.run().wait_until_finish()
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/pipeline.py", line 461, in run
self._options).run(False)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/pipeline.py", line 474, in run
return self.runner.run_pipeline(self, self._options)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/direct/direct_runner.py", line 182, in run_pipeline
return runner.run_pipeline(pipeline, options)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 486, in run_pipeline
default_environment=self._default_environment))
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 494, in run_via_runner_api
return self.run_stages(stage_context, stages)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 583, in run_stages
stage_context.safe_coders)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 904, in _run_stage
result, splits = bundle_manager.process_bundle(data_input, data_output)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 2105, in process_bundle
for result, split_result in executor.map(execute, part_inputs):
File "/home/andrey/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 598, in result_iterator
yield fs.pop().result()
File "/home/andrey/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 435, in result
return self.__get_result()
File "/home/andrey/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/utils/thread_pool_executor.py", line 44, in run
self._future.set_result(self._fn(*self._fn_args, **self._fn_kwargs))
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 2102, in execute
return bundle_manager.process_bundle(part_map, expected_outputs)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 2025, in process_bundle
result_future = self._worker_handler.control_conn.push(process_bundle_req)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/portability/fn_api_runner.py", line 1358, in push
response = self.worker.do_instruction(request)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 352, in do_instruction
request.instruction_id)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/worker/sdk_worker.py", line 386, in process_bundle
bundle_processor.process_bundle(instruction_id))
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 812, in process_bundle
data.transform_id].process_encoded(data.data)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/runners/worker/bundle_processor.py", line 205, in process_encoded
self.output(decoded_value)
File "apache_beam/runners/worker/operations.py", line 302, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 304, in apache_beam.runners.worker.operations.Operation.output
File "apache_beam/runners/worker/operations.py", line 178, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 657, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 658, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 878, in apache_beam.runners.common.DoFnRunner.receive
File "apache_beam/runners/common.py", line 885, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 941, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 883, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 497, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1028, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 178, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 657, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 658, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 878, in apache_beam.runners.common.DoFnRunner.receive
File "apache_beam/runners/common.py", line 885, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 941, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 883, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 497, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1028, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 178, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 657, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 658, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 878, in apache_beam.runners.common.DoFnRunner.receive
File "apache_beam/runners/common.py", line 885, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 941, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 883, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 497, in apache_beam.runners.common.SimpleInvoker.invoke_process
File "apache_beam/runners/common.py", line 1028, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 178, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 657, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 658, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 878, in apache_beam.runners.common.DoFnRunner.receive
File "apache_beam/runners/common.py", line 885, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 941, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "apache_beam/runners/common.py", line 883, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 667, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 747, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "apache_beam/runners/common.py", line 1028, in apache_beam.runners.common._OutputProcessor.process_outputs
File "apache_beam/runners/worker/operations.py", line 178, in apache_beam.runners.worker.operations.SingletonConsumerSet.receive
File "apache_beam/runners/worker/operations.py", line 657, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/worker/operations.py", line 658, in apache_beam.runners.worker.operations.DoOperation.process
File "apache_beam/runners/common.py", line 878, in apache_beam.runners.common.DoFnRunner.receive
File "apache_beam/runners/common.py", line 885, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 956, in apache_beam.runners.common.DoFnRunner._reraise_augmented
File "/home/andrey/anaconda3/lib/python3.7/site-packages/future/utils/__init__.py", line 421, in raise_with_traceback
raise exc.with_traceback(traceback)
File "apache_beam/runners/common.py", line 883, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 667, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 748, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1435, in <lambda>
wrapper = lambda x, *args, **kwargs: [fn(x, *args, **kwargs)]
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/training/data_preparation/prepare_tfrecord_lib.py", line 69, in _add_f0_estimate
f0_hz, f0_confidence = compute_f0(audio, sample_rate, frame_rate)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/spectral_ops.py", line 276, in compute_f0
assert n_padding % 1 == 0
RuntimeError: AssertionError [while running 'Map(_add_f0_estimate)']
from ddsp.
Cool, any interest in adding that to the code? I think it should probably just be a function allow_memory_growth() in train_util.py that gets called from ddsp_run.py when a boolean flag --allow_memory_growth is set (default to false).
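A rough sketch of what that suggestion could look like, not the code that was eventually merged. It reuses the TF 2.0 calls quoted earlier in this thread. The argparse wiring, the tf_config parameter, and the parse_args and main helpers are assumptions added to keep the example self-contained; the real ddsp_run.py may define its flags differently.

```python
import argparse


def allow_memory_growth(tf_config=None):
    """Let TensorFlow grow GPU memory on demand instead of reserving it all.

    tf_config is a hypothetical injection point (defaults to
    tf.config.experimental) so the function can be exercised without a GPU.
    Returns the number of physical GPUs found.
    """
    if tf_config is None:
        import tensorflow as tf  # imported lazily; only needed when the flag is set
        tf_config = tf.config.experimental
    gpus = tf_config.list_physical_devices('GPU')
    try:
        # Memory growth needs to be the same across GPUs.
        for gpu in gpus:
            tf_config.set_memory_growth(gpu, True)
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized.
        print(e)
    return len(gpus)


def parse_args(argv=None):
    """Parse a boolean --allow_memory_growth flag that defaults to False."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--allow_memory_growth', action='store_true',
                        help='Grow GPU memory as needed instead of pre-allocating.')
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    if args.allow_memory_growth:
        allow_memory_growth()
    # ... training would start here ...
```

Keeping the default at false leaves existing runs unchanged, so only users on smaller GPUs need to opt in.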
The dataset creation seems to be a different issue perhaps as it's being caught by this assert:
File "apache_beam/runners/common.py", line 883, in apache_beam.runners.common.DoFnRunner.process
File "apache_beam/runners/common.py", line 667, in apache_beam.runners.common.PerWindowInvoker.invoke_process
File "apache_beam/runners/common.py", line 748, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
File "/home/andrey/anaconda3/lib/python3.7/site-packages/apache_beam/transforms/core.py", line 1435, in <lambda>
wrapper = lambda x, *args, **kwargs: [fn(x, *args, **kwargs)]
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/training/data_preparation/prepare_tfrecord_lib.py", line 69, in _add_f0_estimate
f0_hz, f0_confidence = compute_f0(audio, sample_rate, frame_rate)
File "/home/andrey/anaconda3/lib/python3.7/site-packages/ddsp/spectral_ops.py", line 276, in compute_f0
assert n_padding % 1 == 0
AssertionError
Would you like to create a different issue for that?
from ddsp.
I found out that the problem was with a specific .wav file, not the size of the dataset. It would be interesting to find out why the code crashes on it, so I will open a new issue later. I also created a PR with the fix for this issue in the way you suggested, so I'm closing it.
Thank you for your responsiveness!
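For readers hitting the same assertion on a large dataset, one way to isolate the offending file is to run the per-file step on each path and collect the failures instead of stopping at the first one. The sketch below is a generic helper, not part of ddsp: find_failing_files is a made-up name, and process_fn stands in for whatever loads a file and calls compute_f0 on it.

```python
def find_failing_files(paths, process_fn):
    """Run process_fn on every path and return (path, error) for each failure.

    Hypothetical helper: process_fn is any callable taking a file path, for
    example a function that loads the audio and computes its f0 estimate.
    """
    failures = []
    for path in paths:
        try:
            process_fn(path)
        except Exception as e:  # AssertionError is a subclass of Exception
            failures.append((path, repr(e)))
    return failures
```

Files reported here can then be inspected individually (sample rate, length, channel count) or left out of the input file pattern.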
from ddsp.
I got a similar issue while training on a T4
failed to initialize batched cufft plan with customized allocator: Failed to make cuFFT batched plan. Fatal Python error: Aborted
The code suggested by jesseengel (#29 (comment)) fixed the issue.
from ddsp.