
twardoch / audiostretchy


AudioStretchy is a Python wrapper around the `audio-stretch` C library, which performs fast, high-quality time-stretching of WAV/MP3 files without changing their pitch. Works well for speech, can time-stretch silence separately.

Home Page: https://pypi.org/project/audiostretchy/

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
audio-speed audio-stretching time-domain-harmonic-scaling wav-audio

audiostretchy's Issues

OSError: exception: access violation writing 0x00000284C1557000

Hello, I keep getting this error when using audiostretchy for a WAV-to-WAV speedup. Any help would be appreciated. I am running it on Windows.

code:
from audiostretchy.stretch import stretch_audio
stretch_audio("input.wav", "output.wav", ratio=1.1)

error:
OSError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_16828\3037458555.py in <module>
1 from audiostretchy.stretch import stretch_audio
2
----> 3 stretch_audio("input.wav", "output.wav", ratio=1.1)

~\anaconda3\lib\site-packages\audiostretchy\stretch.py in stretch_audio(input_path, output_path, ratio, gap_ratio, upper_freq, lower_freq, buffer_ms, threshold_gap_db, double_range, fast_detection, normal_detection, sample_rate)
354 audio_stretch = AudioStretch()
355 audio_stretch.open(input_path)
--> 356 audio_stretch.stretch(
357 ratio,
358 gap_ratio,

~\anaconda3\lib\site-packages\audiostretchy\stretch.py in stretch(self, ratio, gap_ratio, upper_freq, lower_freq, buffer_ms, threshold_gap_db, double_range, fast_detection, normal_detection)
314 stretcher.output_capacity(self.nframes, ratio), dtype=np.int16
315 )
--> 316 num_samples = stretcher.process_samples(
317 self.in_samples, len(self.in_samples), self.samples, ratio
318 )

~\anaconda3\lib\site-packages\audiostretchy\interface\tdhs.py in process_samples(self, samples, num_samples, output, ratio)
114 :return: The number of processed samples.
115 """
--> 116 return self.stretch_samples(self.handle, samples, num_samples, output, ratio)
117
118 def flush(self, output: np.ndarray) -> int:

OSError: exception: access violation writing 0x00000284C17F7000
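
The traceback shows the output buffer being allocated with dtype=np.int16, so one thing that may be worth ruling out (an assumption, not a confirmed cause of the crash) is an input file that is not 16-bit PCM. A minimal sketch using only the standard-library wave module; the helper name describe_wav is hypothetical and is not part of audiostretchy:

```python
import wave


def describe_wav(path):
    """Return the basic format parameters stored in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),  # 2 means 16-bit samples
            "frame_rate": wav.getframerate(),
            "frames": wav.getnframes(),
        }


def is_16_bit_pcm(path):
    """True when the file stores 2-byte samples, matching an np.int16 buffer."""
    return describe_wav(path)["sample_width_bytes"] == 2
```

Calling describe_wav("input.wav") before stretch_audio would show whether the file matches what the int16 buffer expects. Note that wave.open itself raises wave.Error for files it cannot parse, such as compressed or floating-point WAVs.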

Stretched videos not at the correct length

I am trying to sync translated audio segments with a video, using the timestamps that a speech-to-text package returns alongside each audio segment. However, even with the stretch ratio calculated correctly, the duration of certain audio segments becomes too long, mainly because of a strange long pause at the end of the segment. The attached zip folder contains one example: the original audio and the stretched one. Based on the timestamps, the resulting duration should be about 5-6 seconds, which corresponds to a stretch ratio of around 1.1. However, after passing it through the stretch_audio function, the audio is 8 seconds long, with a 3-second pause at the end. It would be great to know what is causing the problem and whether there is something I am unaware of. The relevant code and audio files are below. Thank you!

`

import os

from audiostretchy.stretch import stretch_audio
from pydub import AudioSegment  # assumed source of AudioSegment (not shown in the post)

# `model`, `speed`, and `output_dir` are defined elsewhere in the script.


def generate_segment_audio(segment, speaker_id):
    # Each segment carries its start and end timestamps plus the translated text
    start, end, translated_text = segment
    segment_path = os.path.join(output_dir, f'segment_{start}_{end}.wav')
    stretched_path = os.path.join(output_dir, f'segment_{start}_{end}_stretched.wav')
    duration = end - start  # target duration taken from the timestamps

    # Generate the audio file with the TTS model
    model.tts_to_file(translated_text, speaker_id, segment_path, speed=speed)

    # Adjust the audio speed to match the target duration
    segment_audio = AudioSegment.from_file(segment_path)
    current_duration = len(segment_audio) / 1000  # pydub lengths are in ms; convert to seconds
    stretch_ratio = duration / current_duration
    print(f'{stretch_ratio} = {duration} / {current_duration}')
    stretch_audio(segment_path, stretched_path, ratio=stretch_ratio)
    # As posted, this returns the path of the original (unstretched) segment
    return segment_path

`

audiofiles.zip
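
To compare the expected and actual lengths independently of the TTS and stretching code, the durations can be read straight from the WAV headers. This is a minimal sketch using the standard-library wave module; the helper names wav_duration_seconds and stretch_ratio are hypothetical, and the ratio follows the same target / current convention as the code above:

```python
import wave


def wav_duration_seconds(path):
    """Duration of a WAV file, computed from its frame count and frame rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def stretch_ratio(target_seconds, current_seconds):
    """Ratio > 1 lengthens the audio, ratio < 1 shortens it."""
    if current_seconds <= 0:
        raise ValueError("current duration must be positive")
    return target_seconds / current_seconds
```

Printing wav_duration_seconds for both the original and the stretched file, next to the target duration, shows whether the extra length comes from the stretch itself or was already present in the generated segment.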

Error when trying to use audiostretchy

Hello, I'm getting this error when trying to run audiostretchy 1.3.1.
I tried uninstalling and reinstalling it with "pip install audiostretchy[all]", but that doesn't seem to help. Could you tell me how to fix this, please?
I get this error when I run: audiostretchy m1.mp3 m12.mp3 -r1.25
Traceback (most recent call last):
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\Scripts\audiostretchy.exe\__main__.py", line 7, in <module>
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\__main__.py", line 10, in cli
    fire.Fire(stretch_audio)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\fire\core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\stretch.py", line 336, in stretch_audio
    audio_stretch.open(input_path)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\stretch.py", line 57, in open
    self.open_mp3(audio_file)
  File "C:\Users\bekaba\AppData\Local\Programs\Python\Python310\lib\site-packages\audiostretchy\stretch.py", line 72, in open_mp3
    with open(BytesIO(), "wb") as wav_io:
TypeError: expected str, bytes or os.PathLike object, not BytesIO

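The final frame of the traceback passes a BytesIO object to the built-in open(), which only accepts a path. A BytesIO is already a file-like object, so it can be handed directly to an API that accepts one, such as wave.open. The sketch below illustrates that general Python behaviour only; it is not the library's actual fix, and the helper names are hypothetical:

```python
import wave
from io import BytesIO


def builtin_open_rejects_bytesio():
    """Reproduce the TypeError: built-in open() needs a path, not a buffer."""
    try:
        open(BytesIO(), "wb")
    except TypeError:
        return True
    return False


def pcm_to_wav_bytes(pcm_frames, channels=1, sample_width=2, frame_rate=44100):
    """Wrap raw PCM frames in a WAV container held entirely in memory."""
    buffer = BytesIO()
    # wave.open accepts a file-like object and leaves it open on exit
    with wave.open(buffer, "wb") as wav_io:
        wav_io.setnchannels(channels)
        wav_io.setsampwidth(sample_width)
        wav_io.setframerate(frame_rate)
        wav_io.writeframes(pcm_frames)
    return buffer.getvalue()
```

The defaults for channels, sample_width, and frame_rate are illustrative values, not settings taken from audiostretchy.
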
How does gap_ratio work?

Hi,
First of all, thank you so much for your work. Your library is super useful for my project with ElevenLabs.

I can't wrap my head around the gap_ratio parameter. If I understand correctly, it's meant to increase the length of silences in the audio, isn't it?
So if I want to slow down speech by 30% and double the length of the silences (an increase of 100%), I would use this command:
audiostretchy tests_audio.wav test_audio-1.3.wav -r 1.3 -g 2
But it doesn't seem to lengthen the silences. I feel I'm missing something...
Thanks for your help!

_stretch.so file not found

I ran a Linux Docker image that uses the audiostretchy package and got this error:

translation-audio-worker-1 | File "/app/libs/voice_clone.py", line 6, in <module>
translation-audio-worker-1 | from audiostretchy.stretch import stretch_audio
translation-audio-worker-1 | File "/usr/local/lib/python3.10/site-packages/audiostretchy/stretch.py", line 9, in <module>
translation-audio-worker-1 | from .interface.tdhs import TDHSAudioStretch
translation-audio-worker-1 | File "/usr/local/lib/python3.10/site-packages/audiostretchy/interface/tdhs.py", line 35, in <module>
translation-audio-worker-1 | stretch_lib = ctypes.cdll.LoadLibrary(str(lib_path))
translation-audio-worker-1 | File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
translation-audio-worker-1 | return self._dlltype(name)
translation-audio-worker-1 | File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
translation-audio-worker-1 | self._handle = _dlopen(self._name, mode)
translation-audio-worker-1 | OSError: /usr/local/lib/python3.10/site-packages/audiostretchy/interface/linux/_stretch.so: cannot open shared object file: No such file or directory

How can I fix this so that it works? The audiostretchy package was working on Mac, so it must be an OS or system package issue.

I ran ldd on the machine and got: not a dynamic executable
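
A "not a dynamic executable" result from ldd may mean that the file is missing, is not an ELF binary, or was built for a different CPU architecture than the one the container runs on. The sketch below tells these cases apart by reading the ELF header directly with the standard library. The helper name describe_shared_object is hypothetical, the machine table covers only a few common architectures, and whether an architecture mismatch is the actual cause here is an assumption:

```python
import struct
from pathlib import Path

# A few common e_machine values from the ELF specification
ELF_MACHINES = {0x03: "x86", 0x28: "arm", 0x3E: "x86_64", 0xB7: "aarch64"}


def describe_shared_object(path):
    """Report whether a file exists, is ELF, and which CPU it targets."""
    file_path = Path(path)
    if not file_path.is_file():
        return "missing"
    with file_path.open("rb") as handle:
        header = handle.read(20)
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not an ELF file"
    # Byte 5 (EI_DATA) is 1 for little-endian and 2 for big-endian files
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown machine 0x{machine:x}")
```

Running it on the _stretch.so path from the error message and comparing the result with platform.machine() inside the container would show whether the bundled library matches the container's architecture.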

@twardoch

Silence in the audio

When I stretch audio with a ratio in the open interval (0, 1) other than 0.5, the audio is indeed sped up, but the duration of the file doesn't change. For example, when I run this in Python:

stretch_audio("input.wav", "output.wav", ratio=0.7)

The output.wav file has the same duration as input.wav. The beginning is correctly sped up, and the remainder is filled with silence.

I plotted a graph with the ratio on the x-axis (in steps of 0.1) and the output duration on the y-axis. You can see that there are steps, and that the duration doesn't change for certain "special" values:

[Figure: time evolution of a 47-second audio as a function of ratio]
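
The amount of padding described above can be measured rather than judged by ear. This is a minimal sketch that counts the silent samples at the end of a 16-bit PCM WAV file using only the standard library; the helper name trailing_silence_seconds and the threshold parameter are hypothetical:

```python
import array
import sys
import wave


def trailing_silence_seconds(path, threshold=0):
    """Length of the run of near-zero samples at the end of a 16-bit WAV."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        samples = array.array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    # Walk backwards until a sample exceeds the silence threshold
    last = len(samples) - 1
    while last >= 0 and abs(samples[last]) <= threshold:
        last -= 1
    silent_frames = (len(samples) - 1 - last) // channels
    return silent_frames / rate
```

Running this on output.wav for each ratio would show how much of the unchanged duration is padding. A small positive threshold may be needed if the padding contains low-level noise rather than exact zeros.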
