bastibe commented on August 30, 2024

That's very interesting, thank you!

If I understand this correctly, it means that (some variants of) the Windows audio API simply return no data when none is available. That is not a problem in itself, but it breaks soundcard's assumption that silence arrives as zeros rather than as no data. We can fudge that by just making up some zeros if no data is available.

However, the question then becomes: How many zeros should we return? Because the length of the output is how soundcard expresses how much time has passed. In this case, it is probably acceptable if the number of zeros is off by some margin of error. Ideally, we'd ask the audio driver for a current "time", but as far as I can tell, no such API is available.

As a workaround, change _record_chunk like this:

    def _record_chunk(self):
        # skip docstring for this example...
        start_time = 0 # in the real implementation, make this self.start_time so we don't skip processing time
        while not self._capture_available_frames():
            if start_time == 0:
                start_time = time.perf_counter_ns()
            now = time.perf_counter_ns()

            # no data for 50 ms: give up and return zeros.
            if now - start_time > 50_000_000:
                ppMixFormat = _ffi.new('WAVEFORMATEXTENSIBLE**')
                hr = self._ptr[0][0].lpVtbl.GetMixFormat(self._ptr[0], ppMixFormat)
                _com.check_error(hr)
                samplerate = ppMixFormat[0][0].nSamplesPerSec # in the real implementation, cache samplerate in self.
                num_samples = samplerate * (now - start_time) / 1_000_000
                return numpy.zeros([len(set(self.channelmap)) * num_samples], dtype='float32')
            time.sleep(0.001)
        # continue with the rest of the function below the while loop...

This should give you a reasonable estimate of the correct number of zeros. If this solves your problem, I'll code up a proper implementation.

from soundcard.

tez3998 commented on August 30, 2024

@bastibe
Thanks for the great info.
I was able to write code that works correctly on three output devices.

The result of testing your code on my machine

Debugging your code

I changed the following sections because there are errors.

    # your original code
    samplerate = ppMixFormat[0][0].nSamplesPerSec

    # modified code
    samplerate = ppMixFormat[0][0].Format.nSamplesPerSec

    # your original code
    num_samples = samplerate * (now - start_time) / 1_000_000

    # modified code
    num_samples = int(samplerate * (now - start_time) / 1_000_000)

Result

Your code ended immediately and recorded silence, because numpy.zeros() returned an array large enough for record() to finish right away.
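
A likely reason the array was so large is a unit mismatch: time.perf_counter_ns() counts nanoseconds, so converting an elapsed time to seconds needs a divisor of 1_000_000_000, while the snippet divides by 1_000_000. The sketch below works through the arithmetic, assuming a 48 kHz sample rate. frames_for_gap is a hypothetical helper for illustration, not part of soundcard.

```python
NS_PER_SECOND = 1_000_000_000

def frames_for_gap(samplerate, elapsed_ns, divisor=NS_PER_SECOND):
    """Number of frames that fit into elapsed_ns nanoseconds."""
    return samplerate * elapsed_ns // divisor

samplerate = 48_000      # assumed sample rate in Hz
gap_ns = 50_000_000      # the 50 ms give-up threshold from the snippet

as_posted = frames_for_gap(samplerate, gap_ns, divisor=1_000_000)
expected = frames_for_gap(samplerate, gap_ns)

print(as_posted)   # 2400000 frames, i.e. 50 s of silence
print(expected)    # 2400 frames, i.e. 50 ms of silence
```

With the divisor as posted, a single chunk of zeros would already exceed the 240,000 frames that a 5-second recording at 48 kHz asks for, which would match the behavior described above.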

The code which worked correctly

Code

The while loop of _record_chunk() in mediafoundation.py

    start_time = 0 # in the real implementation, make this self.start_time so we don't skip processing time
    while not self._capture_available_frames():
        if start_time == 0:
            start_time = time.perf_counter_ns()
        now = time.perf_counter_ns()

        # no data for 50 ms: give up and return 50 ms worth of zeros.
        if now - start_time > 50_000_000:
            ppMixFormat = _ffi.new('WAVEFORMATEXTENSIBLE**')
            hr = self._ptr[0][0].lpVtbl.GetMixFormat(self._ptr[0], ppMixFormat)
            _com.check_error(hr)
            samplerate = ppMixFormat[0][0].Format.nSamplesPerSec # in the real implementation, cache samplerate in self.
            num_samples_per_ms = samplerate / 1_000
            num_channels = len(set(self.channelmap))
            giveup_ms = 50
            return numpy.zeros(int(num_samples_per_ms * giveup_ms * num_channels), dtype='float32')

        # busy-wait for 1 ms instead of calling time.sleep(0.001),
        # because time.sleep(0.001) sleeps for much longer than 1 ms on my machine.
        remaining_time = 1
        sleep_ms = 1
        _start = time.perf_counter()
        while remaining_time > 0:
            elapsed_time = (time.perf_counter() - _start) * 1_000
            remaining_time = sleep_ms - elapsed_time

Test code

I added some code that prints diagnostic info.

    import soundcard as sc
    import soundfile as sf
    import time

    OUTPUT_FILE_NAME = "out.wav"    # output file name.
    SAMPLE_RATE = 48_000            # [Hz]. sampling rate.
    RECORD_SEC = 5                  # [sec]. recording duration.

    print(f"output device: {str(sc.default_speaker().name)}")

    with sc.get_microphone(id=str(sc.default_speaker().name), include_loopback=True).recorder(samplerate=SAMPLE_RATE) as mic:
        _start_time: float = time.perf_counter()

        # record audio with loopback from default speaker.
        data = mic.record(numframes=SAMPLE_RATE*RECORD_SEC)

        # output info
        print("\n-- info --")
        print(f"len of data: {len(data)}")
        print(f"elapsed time: {time.perf_counter() - _start_time}s")
        print("-- -- -- --\n")

        sf.write(file=OUTPUT_FILE_NAME, data=data[:, 0], samplerate=SAMPLE_RATE)

Result

Initially, the code recorded silence and then recorded sound from YouTube.
In this demo, the code ended in 5.076047300011851s.

soundcard_bug.mp4

bastibe commented on August 30, 2024

Depending on the sound card, silence is either reported as no-data, or as silence. However, support for this in soundcard has not been published yet, as I didn't have a good test case yet.

Could you try running your code against the current Git master of soundcard? I believe your issue should be fixed on there. And if it is, I will publish it as a new version as soon as you confirm that it's working as intended.

tez3998 commented on August 30, 2024

@bastibe
I appreciate your quick response despite your busy schedule.

Result

I cloned the current master of soundcard and ran the code written above on three output devices.
The results are as shown in the following table.

Output device                     Was there sound at the start of the code?  Result
--------------------------------  -----------------------------------------  ---------------------------------------
AMD High Definition Audio Device  No                                         Ended immediately and recorded silence.
AMD High Definition Audio Device  Yes                                        Successfully recorded.
Realtek(R) Audio                  No                                         Successfully recorded.
Realtek(R) Audio                  Yes                                        Successfully recorded.
Pixel Buds A-Series               No                                         Ended immediately and recorded silence.
Pixel Buds A-Series               Yes                                        Successfully recorded.

I also encountered the following warning at random times on all output devices, although the code still behaved as shown in the table above. The timing was random, but the warning tended to occur when the output device had been switched before running the code.

    C:\Users\user\workspace\clone\bastibe\SoundCard\soundcard\mediafoundation.py:750: SoundcardRuntimeWarning: data discontinuity in recording
      warnings.warn("data discontinuity in recording", SoundcardRuntimeWarning)

bastibe commented on August 30, 2024

Oh, the endless vagaries of sound drivers on Windows.

Regrettably, I can't debug this issue on my machine, as my sound card behaves like your Realtek. Could you check how this fails in _record_chunk for the affected sound cards?

I could imagine that GetNextPacketSize in _capture_available_frames returns AUDCLNT_E_DEVICE_INVALIDATED.

Alternatively, you could try extending the empty-watcher to more than 10ms. In extreme cases, I have seen Windows sound cards take up to 4s to wake up, which might be the problem here. Perhaps we need to wait until AUDCLNT_E_SERVICE_NOT_RUNNING clears?

However, if so, I still don't know how to proceed in soundcard, as the API does not give an indication of how much silence there was. Soundcard operates on the assumption that you can get a fixed number of samples per second. WASAPI just refusing to return anything breaks that assumption. If you have a reasonable idea of how to deal with that, I'm all ears!

tez3998 commented on August 30, 2024

@bastibe
I checked the behavior of SoundCard when the output device was Pixel Buds A-Series and there was no sound at the start of the code.

The results of testing your suggestions

The value returned from GetNextPacketSize in _capture_available_frames()

Contrary to your expectation, GetNextPacketSize always returned 0.

Extending the empty-watcher to more than 4s

I extended the empty-watcher to 5s, and the code ended about 5s after it started.

Waiting until AUDCLNT_E_SERVICE_NOT_RUNNING clears

I don't know how to do this because of my limited knowledge of audio. Sorry about that.

What I noticed

time.sleep() cannot sleep for 1ms

I noticed that time.sleep(0.001) actually sleeps for about 5-15ms rather than 1ms. An answer on Stack Overflow says the smallest interval you can sleep for is about 10-13ms. If so, we need to use another method.
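
To check this on a given machine, the following sketch measures how long time.sleep(0.001) really takes and compares it with a busy-wait of the same length. measure_sleep and busy_wait are hypothetical helper names, and the numbers will differ between systems and Python versions, so no expected output is shown.

```python
import statistics
import time

def measure_sleep(requested_s=0.001, repeats=20):
    """Return the measured durations (in ms) of time.sleep(requested_s)."""
    durations_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        time.sleep(requested_s)
        durations_ms.append((time.perf_counter() - start) * 1_000)
    return durations_ms

def busy_wait(duration_s):
    """Spin until duration_s seconds have passed (precise, but keeps a CPU core busy)."""
    end = time.perf_counter() + duration_s
    while time.perf_counter() < end:
        pass

durations = measure_sleep()
print(f"time.sleep(0.001): median {statistics.median(durations):.2f} ms, "
      f"max {max(durations):.2f} ms")

start = time.perf_counter()
busy_wait(0.001)
busy_ms = (time.perf_counter() - start) * 1_000
print(f"busy_wait(0.001): {busy_ms:.2f} ms")
```

The busy-wait trades CPU time for timing precision, so it is only a reasonable choice for very short waits like the 1ms poll interval discussed here.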

The reason the code ends immediately and records silence when there is no sound at the start of the code on Pixel Buds A-Series

The behavior of SoundCard in this case is as follows.

  1. If there is no sound at the start of the code, _record_chunk() returns a zero-sized array.
  2. The check if len(chunk) == 0 in record() is therefore True.
  3. At this point, required_frames is 480000 and recorded_frames is 0, so chunk is replaced by an array of required_frames zeros.
  4. The condition while recorded_frames < required_frames: in record() is now False, so the loop exits and record() returns.
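
The four steps above can be reproduced with a simplified stand-in for the loop. This is a sketch reconstructed from the description, not the library's actual code: it uses plain lists instead of NumPy arrays to stay dependency-free, silent_device is a hypothetical replacement for _record_chunk(), and two channels are assumed, which would account for required_frames being 480000 for a 240000-frame request.

```python
def record(record_chunk, numframes, channels=2):
    """Simplified stand-in for the loop described above (not the library's code)."""
    required_frames = numframes * channels
    recorded_frames = 0
    recorded_data = []
    iterations = 0
    while recorded_frames < required_frames:
        iterations += 1
        chunk = record_chunk()
        if len(chunk) == 0:
            # no data: fill everything that is still missing with zeros
            chunk = [0.0] * (required_frames - recorded_frames)
        recorded_data.extend(chunk)
        recorded_frames += len(chunk)
    return recorded_data, iterations

def silent_device():
    # hypothetical device that reports silence as "no data"
    return []

data, iterations = record(silent_device, numframes=240_000)
print(len(data), iterations, any(data))   # 480000 1 False
```

A single empty chunk is enough to satisfy the whole request with zeros, so the loop runs once and the call returns almost immediately instead of after 5 seconds.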

bastibe commented on August 30, 2024

Please check out #167 for an implementation of this workaround, and thank you again for your analysis and examples!

If #167 works for you, I will try to publish it in a new version of soundcard next week.

tez3998 commented on August 30, 2024

It worked fine on my three output devices!

bastibe commented on August 30, 2024

Perfect! Thank you for your feedback!

tez3998 commented on August 30, 2024

@bastibe
Thank you too for your help despite your busy schedule!
