I'm having difficulty keeping consistent low latency while streaming audio (input and

Latency increasing over time, about bastibe/soundcard

Comments (18)

Chum4k3r commented on August 30, 2024

You are "appending", which means you are reallocating the memory at every iteration. After a while, the memory blocks become larger and larger, and it takes more time to reallocate them. This is not good practice. The memory allocation should be managed by a separate process. Read this article by Ross Bencina; it should help you understand how to deal with audio and syscalls, and how not to.

from soundcard.

bastibe commented on August 30, 2024

@Chum4k3r That's not actually the case in the code above: The code is using lists, which can be appended to without penalty. Numpy arrays would be a different matter.
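To illustrate the difference, here is a minimal sketch (not the original poster's code; `grow_list` and `grow_array` are made-up names). `list.append` stores a reference in place, while `numpy.append` builds a new, larger array on every call:

```python
import numpy as np


def grow_list(blocks):
    """Collect blocks in a Python list; append works in place (amortized O(1))."""
    collected = []
    for block in blocks:
        collected.append(block)  # stores a reference, copies no samples
    return collected


def grow_array(blocks):
    """Collect blocks with numpy.append; each call copies all previous samples."""
    collected = np.empty(0, dtype="float32")
    for block in blocks:
        collected = np.append(collected, block)  # new array on every call
    return collected


blocks = [np.zeros(4, dtype="float32") for _ in range(5)]
as_list = grow_list(blocks)
as_array = grow_array(blocks)
print(len(as_list), as_array.size)
```

With a list, the cost per block stays roughly constant. With `numpy.append`, the cost per block grows with the amount of audio already collected.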

This might be a buffering issue. Try record without numframes. It might be that pulseaudio is returning slightly-too-long buffers, which block play for an additional block, which causes record to return even longer buffers, which block play for yet another block, and so on. record without numframes simply returns whatever audio data is available at that moment, without incurring any delays. As long as this data is less than one blocksize, it should play without delay as well.

It might also be a performance issue. If pulseaudio repeatedly notices that your code doesn't provide/consume audio data on time, it can decide to raise the internal block size.

You could also time the execution of play and record to check which of them is actually consuming too much time.
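A rough sketch of such a measurement, using only the standard library. `fake_record` and `fake_play` are stand-ins so the example runs without audio hardware; in a real loop you would pass the recorder's record and the player's play instead:

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed time in seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-ins for the real record and play calls (assumptions for this sketch).
def fake_record(numframes):
    time.sleep(0.002)
    return [0.0] * numframes


def fake_play(data):
    time.sleep(0.001)


record_times, play_times = [], []
for _ in range(5):
    data, t_rec = timed(fake_record, numframes=480)
    _, t_play = timed(fake_play, data)
    record_times.append(t_rec)
    play_times.append(t_play)

print("max record: %.1f ms" % (1000 * max(record_times)))
print("max play:   %.1f ms" % (1000 * max(play_times)))
```

If one of the two regularly takes longer than one block duration, that call is the likely source of the growing delay.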

mpariente commented on August 30, 2024

Any update on this issue @tetelestia ?

tetelestia commented on August 30, 2024

@mpariente Not really. I never found any reliable audio streaming on Linux and am now focusing entirely on mobile development.

I tried the fix bastibe suggested, removing the numframes argument from record, but I get a similar result. I ran this a few times, and sometimes it seems stable, but only when latency approaches ~200 ms. The starting latency seems almost random.

I'd be very interested if you do find a solution though. The closest I got was with JACK Audio Connection Kit, but that had catastrophic failures when processing time on my frame was greater than the buffer size.

mpariente commented on August 30, 2024

I read your code too fast at first and didn't notice that the frame size you want and the block size you ask from pulseaudio (numframes and block_size) are the same. Try setting block_size to a smaller value; depending on the hardware, it will help. The graphs you show are not surprising given that these two values are the same: if pulseaudio returns a frame which is too short, the player has to wait until the next frame to get any data to play, thus waiting approximately num_frames, and this delay accumulates. I bet the jumps you observe in the graphs are typically of size num_frames, right?

Also, I'm not sure you understood the role of mic.flush(). It removes the remaining bytes that were received from pulseaudio but didn't fit in num_frames. I don't think it's intended to be called inside a recording stream.

Finally, I did manage to get reasonably stable audio streaming with soundcard, with short delays. But sadly I don't own the code so I cannot share it.
The main idea is to have the mic and the speaker share a queue and when the queue is too long, you can drop one frame or more. If you have an additional condition for dropping (like frame is silent or something), it can go unnoticed.
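Since the original code cannot be shared, here is only a minimal sketch of the idea as described. The names `enqueue`, `is_silent` and `MAX_QUEUED_FRAMES`, and the threshold values, are assumptions made for illustration:

```python
from collections import deque

MAX_QUEUED_FRAMES = 3  # assumed limit; tune it for the latency you can accept


def is_silent(frame, threshold=1e-3):
    """Treat a frame as silent if every sample is below the threshold."""
    return all(abs(sample) < threshold for sample in frame)


def enqueue(queue, frame, max_frames=MAX_QUEUED_FRAMES):
    """Add a recorded frame; if the queue grows too long, drop frames.

    Silent frames are dropped first, otherwise the oldest frame goes.
    Returns the number of dropped frames.
    """
    queue.append(frame)
    dropped = 0
    while len(queue) > max_frames:
        silent = next((i for i, f in enumerate(queue) if is_silent(f)), None)
        if silent is not None:
            del queue[silent]
        else:
            queue.popleft()
        dropped += 1
    return dropped


queue = deque()
for frame in ([0.1, 0.1], [0.0, 0.0], [0.2, 0.2]):
    enqueue(queue, frame)
dropped_silent = enqueue(queue, [0.3, 0.3])  # the silent frame is dropped
dropped_oldest = enqueue(queue, [0.4, 0.4])  # no silent frame left, oldest goes
```

The recording side would call `enqueue`, and the playing side would take frames from the left end of the same deque.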

bastibe commented on August 30, 2024

Did you find a solution to your problem?

ronny-rentner commented on August 30, 2024

First of all, many thanks for this great library. It's a real relief after using other audio libraries in Python.

Anyways, I have a similar issue with recording and playing at the same time (like a loopback). I'm using a null-sink as a test speaker to play the audio that I record in real time with a microphone.

When I run pulseaudio with -vvvv flags, I can see messages like the following in pulseaudio's debug log:

(11140.604|   2.035) D: [null-sink] protocol-native.c: Implicit underrun of 'outputstream'
(11140.605|   0.001) D: [null-sink] protocol-native.c: Requesting rewind due to end of underrun.
(11140.605|   0.000) D: [null-sink] module-null-sink.c: Requested to rewind 480 bytes.
(11140.605|   0.000) D: [null-sink] sink.c: Processing rewind...
(11140.605|   0.000) D: [null-sink] sink-input.c: Have to rewind 436 bytes on render memblockq.
(11140.605|   0.000) D: [null-sink] source.c: Processing rewind...
(11140.605|   0.000) D: [null-sink] source-output.c: Have to rewind 436 bytes on implementor.
(11140.605|   0.000) D: [null-sink] module-null-sink.c: Rewound 436 bytes.

As you can see my block sizes are really low to achieve real time.

When I initialize the player like

with speaker.player(samplerate=48000, channels=[0, 1], blocksize=BLOCKSIZE) as p:
I can measure this step alone already takes around 16-21 ms. With my 10 ms blocksize, I don't know how many frames to play. If I only play one frame, somehow pulseaudio comes to the conclusion that there's a buffer underrun which will make subsequent playing "hang" for 2 seconds.

If I buffer too many frames upfront, then there's too much latency for the player because it never catches up.

I've seen in the C library, there's a function to find out how much data can be sent to a speaker. I guess I'd need this exposed to not get buffer underruns.

Apart from all that, the suggestion from @mpariente does work for me: just dropping frames when the player is too slow. Unfortunately, it's slower than it would be without this rewinding and hanging for 2 seconds. Still, with dropping frames, I can catch up after those initial 2 seconds and everything runs smoothly. Just not in the first 2 seconds.

PS: Using Debian 11 Linux.

bastibe commented on August 30, 2024

Thank you for your kind words!

It might be that pulse is not honoring your requested block size. There's a latency property on players and recorders that might give you a hint on how block sizes change.

Also, you can query pulse at any time as to how many bytes are available to read or write with soundcard._pulse._pa_stream_readable_size(recorder.stream) and soundcard._pulse._pa_stream_writable_size(player.stream). This is what soundcard is using internally.

szlop commented on August 30, 2024

I stumbled upon similar problems when I used SoundCard in a real-time application. Whenever the process encounters a buffer underrun, Pulseaudio increases the internal buffer, until the latency reaches 2 seconds or more. I used a quick hack to set a hard limit in the Pulseaudio backend, since in my case occasional buffer underruns were acceptable but the ever-increasing latency was not.
You can find the changes I did to the code here:
szlop@860f41c

If not for personal reasons, I would have followed up with a proper solution and a merge request. However, I hope the linked commit points you in the right direction.

szlop commented on August 30, 2024

I also experimented with a try-record function which, instead of waiting, returns None if there are not enough frames in the Pulseaudio buffer:
szlop@97993f6

I'm not sure, though, if this helps with your problem.

bastibe commented on August 30, 2024

Thank you for your input, @szlop! I wonder how the try-record function is functionally different from record(numframes=None)?

If I understand this problem correctly, it is that recording returns just a tiny bit too little data for playing. Is that correct? Or is it just a mismatch of block sizes (which could be solved with caching)? The former is really unsolvable in soundcard itself, but quietly appending a frame or two of plausible data to the recording would probably work around the issue without too much trouble. The latter would need some more engineering as a solution.

szlop commented on August 30, 2024

Thank you for your input, @szlop! I wonder how the try-record function is functionally different from record(numframes=None)?

The only difference is that record(numframes=None) returns all the frames that are available, while try-record(numframes=n) returns either None or exactly n frames, so the user does not need another queue for the frames she or he doesn't need at the time. It was my intention to run the garbage collector if try-record came back None and call record afterward. That didn't work out: the garbage collection takes too long to be squeezed into the gap.
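The two behaviours can be contrasted with a toy buffer. This is only a sketch of the semantics described above, not the code from the linked commit; `FrameBuffer` and its methods are hypothetical stand-ins for the Pulseaudio record buffer:

```python
class FrameBuffer:
    """Toy stand-in for the frames Pulseaudio has buffered for recording."""

    def __init__(self):
        self._frames = []

    def feed(self, frames):
        """Simulate new audio arriving from the device."""
        self._frames.extend(frames)

    def record_available(self):
        """Like record(numframes=None): return everything that is buffered."""
        frames, self._frames = self._frames, []
        return frames

    def try_record(self, numframes):
        """Return exactly numframes frames, or None if too few are buffered."""
        if len(self._frames) < numframes:
            return None
        frames = self._frames[:numframes]
        del self._frames[:numframes]
        return frames


buf = FrameBuffer()
buf.feed(range(5))
too_early = buf.try_record(8)   # only 5 frames buffered so far
buf.feed(range(5, 10))
block = buf.try_record(8)       # exactly 8 frames, 2 stay buffered
rest = buf.record_available()   # whatever is left
```

The leftover frames stay inside the buffer, which is why the caller needs no extra queue.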

ronny-rentner commented on August 30, 2024

@szlop Thanks for your hint. My problem is actually not with the record function but with playing. I also cannot observe pulseaudio automatically increasing the latency. It doesn't do that in my installation, but maybe that's related to the type of device. I'm playing to a virtual null-sink device.

@bastibe Thanks for your help. I had found the play function and tested around a bit.

In https://github.com/bastibe/SoundCard/blob/master/soundcard/pulseaudio.py#L751 it tries to find out how many bytes are writable, just as you've said.

It does nwrite = _pulse._pa_stream_writable_size(self.stream) // 4

I assume nwrite should be the number of frames that are writable, hence the division by the 4 bytes of a float32, but it doesn't consider the channels. So it tries to write more bytes to pulseaudio than pulseaudio reports as writable.
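The arithmetic behind this, as a small sketch. It assumes nwrite is meant to count frames, as suggested above; `writable_frames` is a made-up helper, not a SoundCard function:

```python
BYTES_PER_SAMPLE = 4  # one float32 sample


def writable_frames(writable_bytes, channels):
    """Convert a writable size in bytes to whole frames.

    One frame holds one sample per channel, so a frame occupies
    BYTES_PER_SAMPLE * channels bytes.
    """
    return writable_bytes // (BYTES_PER_SAMPLE * channels)


writable_bytes = 3840                          # example value
samples = writable_bytes // BYTES_PER_SAMPLE   # what "// 4" alone yields
frames_stereo = writable_frames(writable_bytes, channels=2)
print(samples, frames_stereo)
```

For a stereo stream, dividing by 4 alone gives twice as many frames as actually fit into the writable space.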

Then again, this has no impact at all because according to the pulseaudio docs

It is possible to write more than this amount, up to the stream's buffer_attr.maxlength bytes. This is usually not desirable,
though, as it would increase stream latency to be higher than requested (buffer_attr.tlength).

SoundCard sets maxlength to a really high value, so maybe the while loop in the play() function is not necessary at all?

I've corrected the nwrite calculation in a local test, but unfortunately it doesn't improve the situation with the initial hiccups and rewinding for me.

PS: I've also monitored the latency for the player and recorder using the SoundCard feature, and they give really low values until the player hangs. The recorder is constantly < 1 ms, the player < 10 ms. Only in the beginning, after the first one or two writes, does the playing hang (due to rewinding) and report 2000 ms. Afterwards it goes down to < 10 ms again, as I'm dropping frames.

bastibe commented on August 30, 2024

Thank you for your analysis.

Good idea that pa_stream_writable_size might need to be divided by the number of configured channels! That might be true, and might even explain the latency hikes.

The maxlength used by SoundCard is (if I remember correctly) pulse's "default" value. It just means that pulse may freely choose. Does the situation improve if this is set to a fixed value?

ronny-rentner commented on August 30, 2024

Here's a link to the PulseAudio docs: https://freedesktop.org/software/pulseaudio/doxygen/structpa__buffer__attr.html

It says to get the default you should set the value of maxlength to -1 but SoundCard sets it to
bufattr.maxlength = 2**32-1 # max buffer length but that might eventually be the same.

I've tested around a bit more and I've solved my problem by removing the loop from the play function like

def play(self, data):
    # Convert to a C-contiguous float32 array, the sample format used for the stream.
    data = numpy.array(data, dtype='float32', order='C')
    if data.ndim == 1:
        data = data[:, None]  # force 2d
    if data.ndim != 2:
        raise TypeError('data must be 1d or 2d, not {}d'.format(data.ndim))
    if data.shape[1] == 1 and self.channels != 1:
        data = numpy.tile(data, [1, self.channels])  # copy mono data to all channels
    if data.shape[1] != self.channels:
        raise TypeError('second dimension of data must be equal to the number of channels, not {}'.format(data.shape[1]))
    # tobytes() replaces the deprecated tostring(); "raw" avoids shadowing the builtin bytes.
    raw = data.ravel().tobytes()
    # Write everything at once, without the loop that waits for writable space.
    _pulse._pa_stream_write(self.stream, raw, len(raw), _ffi.NULL, 0, _pa.PA_SEEK_RELATIVE)

Now I can record and play (so loopback) in realtime and there are no buffer underflows or overflows, also not in the beginning.

I now also suspect the

if nwrite == 0:
    time.sleep(0.001)
    continue

in the original version with the loop is causing the issue for the original poster.

This 1 ms will slowly add up given the calculation of nwrites.

I don't think it is a good idea under any circumstances to add 1 ms there out of nowhere. It will automatically cause latency issues if you are playing in real time, because that 1 ms of audio cannot appear from nowhere. So in the best case, with all the protection in PulseAudio (the rewinding stuff), it will cause some cracks in the output.

What I don't understand yet is why it didn't work for me when I had corrected the nwrites calculation but it only works now after I've removed the loop completely and write the whole data at once, no matter how much is writeable. I rely on the really huge maxlength and my blocks are really small anyways.

ronny-rentner commented on August 30, 2024

Hmm, ok, I was wrong about removing the loop solving my problem. I no longer get the 2 s delay, but in exchange the overall latency goes up in the interface. I'm afraid I'll need some more testing.

ronny-rentner commented on August 30, 2024

It's really mysterious. This morning I came back and ran the same test script from yesterday evening, and everything works smoothly without any changes. I kept the computer running and it's all the same code, at least as far as I am aware, but still the results are much better.

bastibe commented on August 30, 2024

Here's a link to the PulseAudio docs: https://freedesktop.org/software/pulseaudio/doxygen/structpa__buffer__attr.html

It says to get the default you should set the value of maxlength to -1 but SoundCard sets it to bufattr.maxlength = 2**32-1 # max buffer length but that might eventually be the same.

That's the same. In C you'd write -1 and cast it to unsigned, which would convert it to 2**32-1. But CFFI won't let you write an unsigned number to a signed struct member, so you have to write 2**32-1 manually instead.
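The equivalence is easy to check from Python. This is a small standard-library sketch, not SoundCard code:

```python
import ctypes

# Storing -1 in an unsigned 32-bit integer wraps around to the largest value.
as_unsigned = ctypes.c_uint32(-1).value

# The same wrap-around expressed as a bit mask.
masked = -1 & 0xFFFFFFFF

print(as_unsigned, masked, 2**32 - 1)
```

All three printed values are identical.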

Regarding your point about the millisecond sleep: I think the common use case is to either play or record. If you just want to play audio, waiting is the right thing to do. But in your case it is not.

Actually the same thing is implemented for recording as well, where the default case waits until the requested number of samples is available. But for recording, we also have the non-blocking record (with numframes=None). Perhaps a non-blocking version of playing would help, which just plays however much it can at the moment without waiting, and returns the number of played frames.
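Roughly, such a function could behave like the following sketch. It is not an existing SoundCard API: `play_nowait` is a hypothetical name, and `writable_frames` and `write` stand in for the stream's writable size and the actual write call:

```python
def play_nowait(data, writable_frames, write):
    """Write as many frames as currently fit, without waiting.

    data:            sequence of frames to play
    writable_frames: number of frames the stream can accept right now
    write:           callable that takes the frames to be written
    Returns the number of frames written. The caller keeps the rest.
    """
    nwrite = min(len(data), writable_frames)
    if nwrite > 0:
        write(data[:nwrite])
    return nwrite


written = []
data = list(range(10))
played = play_nowait(data, writable_frames=4, write=written.extend)
remaining = data[played:]  # left for the caller to retry or drop
none_played = play_nowait(remaining, writable_frames=0, write=written.extend)
```

The caller can then decide whether to retry the remaining frames later or to drop them to keep latency low.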
