twardoch / audiostretchy

AudioStretchy is a Python wrapper around the `audio-stretch` C library, which performs fast, high-quality time-stretching of WAV/MP3 files without changing their pitch. Works well for speech, can time-stretch silence separately.

Home Page: https://pypi.org/project/audiostretchy/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

audio-speed audio-stretching time-domain-harmonic-scaling wav-audio

audiostretchy's Issues

no matches found: audiostretchy[all]

I am trying to install audiostretchy with all dependencies as described in the README, but I get this error:
zsh: no matches found: audiostretchy[all]...
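
zsh treats unquoted square brackets as a glob pattern, which is the usual source of a "no matches found" message for an argument like this one. A minimal sketch of two ways around it, assuming a standard pip setup (the helper names are hypothetical and not part of audiostretchy): quote the requirement for the shell, or pass it as a plain argument list so no shell expansion happens at all.

```python
import shlex
import sys

REQUIREMENT = "audiostretchy[all]"


def quoted_install_command(requirement=REQUIREMENT):
    """Return a pip command line whose requirement is quoted for the shell."""
    return "pip install " + shlex.quote(requirement)


def pip_argv(requirement=REQUIREMENT):
    """Return an argument list for subprocess.run(); no shell, so no globbing."""
    return [sys.executable, "-m", "pip", "install", requirement]


print(quoted_install_command())
```

The printed command can be pasted into zsh as is. The list returned by pip_argv() is meant for subprocess.run(pip_argv(), check=True).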

Exception when reading WAV files

https://github.com/twardoch/audiostretchy/blob/48b54f244718ec4c3de5b89604e665b370a40206/src/audiostretchy/stretch.py#L105C12-L105C12

audio_file is a string (a file path), not a file object, so calling audio_file.close() raises an exception. The code seems to work when that line is removed.
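
A minimal sketch of one defensive pattern for this situation, assuming the value may be either a path string or an open file object. The helper name close_if_possible is hypothetical and is not taken from the audiostretchy source.

```python
import io


def close_if_possible(obj):
    """Call obj.close() only when obj actually has a callable close().

    Returns True if something was closed, False otherwise (e.g. for a path string).
    """
    close = getattr(obj, "close", None)
    if callable(close):
        close()
        return True
    return False


# A path string has no close() method, so nothing happens.
print(close_if_possible("input.wav"))

# A file-like object is closed as usual.
buffer = io.BytesIO(b"data")
print(close_if_possible(buffer), buffer.closed)
```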

OSError: exception: access violation writing 0x00000284C1557000

Hello, I keep getting this error when using audiostretchy for a WAV-to-WAV speed-up. Any help would be appreciated. I am running it on Windows.

code:
from audiostretchy.stretch import stretch_audio
stretch_audio("input.wav", "output.wav", ratio=1.1)

error:
OSError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_16828\3037458555.py in <module>
1 from audiostretchy.stretch import stretch_audio
2
----> 3 stretch_audio("input.wav", "output.wav", ratio=1.1)

~\anaconda3\lib\site-packages\audiostretchy\stretch.py in stretch_audio(input_path, output_path, ratio, gap_ratio, upper_freq, lower_freq, buffer_ms, threshold_gap_db, double_range, fast_detection, normal_detection, sample_rate)
354 audio_stretch = AudioStretch()
355 audio_stretch.open(input_path)
--> 356 audio_stretch.stretch(
357 ratio,
358 gap_ratio,

~\anaconda3\lib\site-packages\audiostretchy\stretch.py in stretch(self, ratio, gap_ratio, upper_freq, lower_freq, buffer_ms, threshold_gap_db, double_range, fast_detection, normal_detection)
314 stretcher.output_capacity(self.nframes, ratio), dtype=np.int16
315 )
--> 316 num_samples = stretcher.process_samples(
317 self.in_samples, len(self.in_samples), self.samples, ratio
318 )

~\anaconda3\lib\site-packages\audiostretchy\interface\tdhs.py in process_samples(self, samples, num_samples, output, ratio)
114 :return: The number of processed samples.
115 """
--> 116 return self.stretch_samples(self.handle, samples, num_samples, output, ratio)
117
118 def flush(self, output: np.ndarray) -> int:

OSError: exception: access violation writing 0x00000284C17F7000

Stretched videos not at the correct length

I am trying to sync translated audio segments with a video, using the timestamps that a speech-to-text package returns alongside each audio segment. However, even with the stretch ratio calculated correctly, some audio segments come out too long, mainly because of a strange long pause at the end of the segment. For example, the attached zip folder contains the original audio and the stretched one. Based on the timestamps, the resulting duration should be about 5-6 seconds, which corresponds to a stretch ratio of around 1.1. However, after passing it to the stretch_audio function, the output becomes 8 seconds long, with a 3-second pause. It would be great to know what is causing the problem and whether there is something I am unaware of. The relevant code and audio files are below. Thank you!

import os

from audiostretchy.stretch import stretch_audio
from pydub import AudioSegment  # assumed source of AudioSegment (len() is in milliseconds)

# output_dir, model and speed are defined elsewhere in the reporter's script.

def generate_segment_audio(segment, speaker_id):
    start, end, translated_text = segment  # Gets start and end timestamps from the audio segment
    segment_path = os.path.join(output_dir, f'segment_{start}_{end}.wav')
    stretched_path = os.path.join(output_dir, f'segment_{start}_{end}_stretched.wav')
    duration = end - start

    # Generate the audio file with the TTS model
    model.tts_to_file(translated_text, speaker_id, segment_path, speed=speed)

    # Adjust the audio speed to match the duration
    segment_audio = AudioSegment.from_file(segment_path)
    current_duration = len(segment_audio) / 1000  # Convert to seconds
    stretch_ratio = duration / current_duration
    print(f'{stretch_ratio} = {duration} / {current_duration}')
    stretch_audio(segment_path, stretched_path, ratio=stretch_ratio)
    return segment_path

audiofiles.zip
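
To narrow down a mismatch like this, it can help to measure the WAV files directly instead of relying on the player. The sketch below uses only the standard library and assumes uncompressed PCM WAV files. The helper names wav_duration_seconds and stretch_ratio_for are hypothetical, and the ratio formula is simply the one used in the report above (target duration divided by current duration).

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds, read from its header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def stretch_ratio_for(target_seconds, current_seconds):
    """Ratio as computed in the report: target duration / current duration."""
    if current_seconds <= 0:
        raise ValueError("current duration must be positive")
    return target_seconds / current_seconds
```

Comparing wav_duration_seconds() of the stretched file against the target duration shows whether the extra seconds are really in the file.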

Error when trying to use audiostretchy

Hello, I'm getting this error when trying to run audiostretchy 1.3.1.
I tried uninstalling and reinstalling it with "pip install audiostretchy[all]", but it doesn't seem to help. Could you tell me how to fix this, please?
I get this error when I run: audiostretchy m1.mp3 m12.mp3 -r1.25
Traceback (most recent call last):
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\Scripts\audiostretchy.exe\__main__.py", line 7, in <module>
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\__main__.py", line 10, in cli
    fire.Fire(stretch_audio)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\fire\core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\stretch.py", line 336, in stretch_audio
    audio_stretch.open(input_path)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\stretch.py", line 57, in open
    self.open_mp3(audio_file)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\stretch.py", line 72, in open_mp3
    with open(BytesIO(), "wb") as wav_io:
TypeError: expected str, bytes or os.PathLike object, not BytesIO

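The TypeError comes from passing a BytesIO object to the built-in open(), which only accepts a path. A BytesIO object is already an open, writable, file-like buffer, so it can be handed directly to code that writes WAV data. The sketch below shows the general pattern with the standard-library wave module. It is not the library's actual fix, and the function name write_silence_wav_to_memory is hypothetical.

```python
import wave
from io import BytesIO


def write_silence_wav_to_memory(num_frames, sample_rate=44100):
    """Write a mono 16-bit WAV of silence into an in-memory buffer and return it."""
    buffer = BytesIO()  # already file-like; do not wrap it in open()
    with wave.open(buffer, "wb") as wav_out:
        wav_out.setnchannels(1)
        wav_out.setsampwidth(2)
        wav_out.setframerate(sample_rate)
        wav_out.writeframes(b"\x00\x00" * num_frames)
    buffer.seek(0)  # rewind so the caller can read from the start
    return buffer


# Reading the buffer back confirms the frame count.
with wave.open(write_silence_wav_to_memory(100), "rb") as wav_in:
    print(wav_in.getnframes())
```
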
How does gap_ratio work?

Hi,
First of all, thank you so much for your work. Your library is super useful for my project with ElevenLabs.

I can't wrap my head around the gap_ratio parameter. If I understand correctly, it is meant to increase the length of the silences in the audio, isn't it?
So if I want to slow down speech by 30% and double the length of the silences (an increase of 100%), I would use this command:
audiostretchy tests_audio.wav test_audio-1.3.wav -r 1.3 -g 2
But it doesn't seem to increase the silence. I feel I'm missing something...
Thanks for your help!
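
For comparison, here is a sketch of the same request through the Python API. It assumes that -r and -g are shorthand for the ratio and gap_ratio parameters that appear in the stretch_audio signature in the tracebacks on this page. The helper cli_flags_to_kwargs is hypothetical, and the sketch does not show whether gap_ratio needs any other setting to take effect.

```python
def cli_flags_to_kwargs(ratio, gap_ratio):
    """Map the values given to -r and -g onto keyword arguments (assumed mapping)."""
    return {"ratio": ratio, "gap_ratio": gap_ratio}


def run_example():
    """Call audiostretchy with the same values as the command line above."""
    # Imported here so the sketch can be loaded without audiostretchy installed.
    from audiostretchy.stretch import stretch_audio

    stretch_audio(
        "tests_audio.wav",
        "test_audio-1.3.wav",
        **cli_flags_to_kwargs(ratio=1.3, gap_ratio=2),
    )
```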

_Stretch.so file not found

I ran a Linux Docker image in which I was using the audiostretchy package and got this error:

translation-audio-worker-1 | File "/app/libs/voice_clone.py", line 6, in <module>
translation-audio-worker-1 | from audiostretchy.stretch import stretch_audio
translation-audio-worker-1 | File "/usr/local/lib/python3.10/site-packages/audiostretchy/stretch.py", line 9, in <module>
translation-audio-worker-1 | from .interface.tdhs import TDHSAudioStretch
translation-audio-worker-1 | File "/usr/local/lib/python3.10/site-packages/audiostretchy/interface/tdhs.py", line 35, in <module>
translation-audio-worker-1 | stretch_lib = ctypes.cdll.LoadLibrary(str(lib_path))
translation-audio-worker-1 | File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
translation-audio-worker-1 | return self._dlltype(name)
translation-audio-worker-1 | File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
translation-audio-worker-1 | self._handle = _dlopen(self._name, mode)
translation-audio-worker-1 | OSError: /usr/local/lib/python3.10/site-packages/audiostretchy/interface/linux/_stretch.so: cannot open shared object file: No such file or directory

How can I fix this so that it works? The audiostretchy package was working on Mac, so it must be an OS or system package issue.

I ran ldd on the machine and got: not a dynamic executable

@twardoch
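
A "cannot open shared object file" error from the dynamic loader can mean either that the file itself is absent or that something it depends on cannot be loaded, so a first step is to check what is actually on disk inside the container. The sketch below is a hypothetical diagnostic helper, not part of audiostretchy. The path in the usage note is the one from the traceback above.

```python
import platform
from pathlib import Path


def describe_native_library(lib_path):
    """Collect basic facts about a shared library file before trying to load it."""
    path = Path(lib_path)
    exists = path.is_file()
    return {
        "path": str(path),
        "exists": exists,
        "size_bytes": path.stat().st_size if exists else None,
        # CPU architecture of the running interpreter, e.g. x86_64 or aarch64.
        "machine": platform.machine(),
    }
```

For example, describe_native_library("/usr/local/lib/python3.10/site-packages/audiostretchy/interface/linux/_stretch.so") shows whether the file is present and how large it is, alongside the architecture the container is running on.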

silence in the audios

When I stretch audio with a ratio in the interval ]0; 1[ other than 0.5, the audio is indeed sped up, but the duration doesn't change. For example, when I do this in Python:

stretch_audio("input.wav", "output.wav", ratio=0.7)

The output.wav file has the same duration as input.wav, but the beginning is correctly sped up. Silence fills the rest of the file.

I plotted a graph with the ratio on the x-axis (in steps of 0.1) and the duration on the y-axis. You can see that there are steps and that the duration doesn't change for "special" values:
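
One way to quantify the padding described here is to count how much exact silence sits at the end of the output file. The sketch below uses only the standard library and assumes 16-bit PCM WAV input. The function name trailing_silence_seconds and its threshold parameter are hypothetical.

```python
import array
import sys
import wave


def trailing_silence_seconds(path, threshold=0):
    """Return the length of the silent tail of a 16-bit PCM WAV file, in seconds.

    A sample counts as silent when its absolute value is <= threshold.
    """
    with wave.open(path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        channels = wav_file.getnchannels()
        rate = wav_file.getframerate()
        samples = array.array("h", wav_file.readframes(wav_file.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    silent_samples = 0
    for value in reversed(samples):
        if abs(value) > threshold:
            break
        silent_samples += 1
    # Convert interleaved samples to whole frames, then to seconds.
    return (silent_samples // channels) / rate
```

Running it on output.wav for several ratios would show whether the silent tail grows exactly where the plotted duration stays flat.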

no matches found: audiostretchy[all]

pip install audiostretchy[all]

This doesn't work and returns

no matches found: audiostretchy[all]
