Comments (14)

igorski commented on July 26, 2024

That's an interesting conundrum.

So basically, the audio that comes from the input/microphone is perceived as audio from the past (due to the latency on certain devices). The audio that is synthesized internally is what we can consider realtime, and together the two have a mismatch in timing.

When the engine mixes in the input signal with the internally generated audio, the input is obviously lagging behind in time (your recording sounded quite extreme, but such is the nature of the fragmented Android ecosystem, where performance can be considerably different, so it's something to consider).

MWEngine can calculate the latency (as the AAudio driver provides such a facility), but now I'm thinking how it should correct the "position" of the recorded input signal when mixing it with the internally generated audio. The calculation to do so is quite simple (basically we "align" the input and internal audio by pushing the internal audio recording forwards by the latency), but this implies that we will need to be using twice the memory during recording as we need to keep the recorded input and internal audio separate until the final mixing stage... which is quite a penalty.
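To put numbers on that trade-off, here is a minimal Java sketch of the arithmetic. It is not MWEngine code: the class and method names are hypothetical, and it assumes 32-bit float samples, a 44.1 kHz sample rate and a three-minute mono take.

```java
// Sketch only: LatencyMath and its methods are illustrative names, not MWEngine API.
final class LatencyMath {

    /** Converts a latency in milliseconds to a whole number of samples. */
    static int millisecondsToSamples(float latencyMs, int sampleRate) {
        return Math.round(latencyMs / 1000f * sampleRate);
    }

    /** Bytes needed to hold one recording, assuming 32-bit float samples. */
    static long recordingBytes(int durationSeconds, int sampleRate, int channels) {
        return (long) durationSeconds * sampleRate * channels * Float.BYTES;
    }

    public static void main(String[] args) {
        int sampleRate = 44100; // assumed sample rate
        int offset = millisecondsToSamples(400f, sampleRate);
        long oneStream = recordingBytes(180, sampleRate, 1);

        System.out.println("alignment offset in samples: " + offset);
        System.out.println("one stream in bytes: " + oneStream);
        // Keeping input and output separate until the final mix doubles the footprint.
        System.out.println("two streams in bytes: " + 2 * oneStream);
    }
}
```

The offset itself is cheap to compute; the cost sits in the second buffer, which grows linearly with the length of the performance.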

ALTERNATIVELY the recording of the internal audio will be "pushed forward" (thus delayed to sync with the input signal) for the duration of the input latency. This however presents a problem when recording starts during audio playback (as in: the user can decide on the spot, while audio is playing, whether they want to record or not). Is that the case for your application? Or will your application have a single button that activates microphone recording and starts the internal audio playback at the same time?
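For illustration, delaying a stream by a fixed number of samples is usually done with a small ring buffer. The Java sketch below shows that general technique under the assumption of one sample per call on a mono signal. FixedDelayLine is a hypothetical name, and this is not how MWEngine implements it.

```java
// Sketch only: a generic fixed delay line, not taken from MWEngine.
final class FixedDelayLine {
    private final float[] buffer;
    private int index = 0;

    FixedDelayLine(int delayInSamples) {
        if (delayInSamples < 1) {
            throw new IllegalArgumentException("delay must be at least 1 sample");
        }
        // Java zero-fills arrays, so the first delayInSamples outputs are silence.
        buffer = new float[delayInSamples];
    }

    /** Stores the incoming sample and returns the one written delayInSamples calls ago. */
    float process(float in) {
        float out = buffer[index];
        buffer[index] = in;
        index = (index + 1) % buffer.length;
        return out;
    }

    public static void main(String[] args) {
        FixedDelayLine delay = new FixedDelayLine(3);
        for (float sample = 1f; sample <= 5f; sample++) {
            System.out.println(sample + " -> " + delay.process(sample));
        }
    }
}
```

Only the delay length has to be held in memory, which is why this route is cheaper than keeping two full recordings. The catch is the one described above: the delay has to be in place from the first sample that is recorded.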

from mwengine.

YogarajRamesh commented on July 26, 2024

Hi @igorski
Thanks for your response
My application is single-button based (it starts both recording and internal audio playback). In that case, how can we push the internal audio based on the latency calculation?

Thanks in advance

YogarajRamesh commented on July 26, 2024

Hi @igorski,
Any suggestions on this?

Thanks in advance

YogarajRamesh commented on July 26, 2024

Hi @igorski

Hope you are doing great. First of all, thank you for the time you put into this. By any chance, are there any updates?

Thanks in advance

igorski commented on July 26, 2024

Hi there @YogarajRamesh

I'm afraid this is a notoriously difficult thing to address. The most accurate way to calculate device latency is to use a loopback device, which isn't really something you can expect your users to have and use (there would also be requirements on how they conduct this test, which needs to be done very carefully)... Sadly, the number of devices running Android spans such a wide range of configurations that it's not really feasible to "guesstimate" an appropriate latency (iPhone users have quite the luxury here, as each iPhone of a specific version is the exact same device).

In the latest commit, I have added a "warmup" phase that aims to synchronize the input and output streams as best as possible to minimize latency and force the input stream to be operating in low latency mode, but the results will sadly be device dependent. I have seen improvements with a low range device though I'm afraid this is all that can be done. There must be a reason why Smule has several patents to their name :/

YogarajRamesh commented on July 26, 2024

Hi @igorski ,
Thanks for your reply. I tried the latest build but am still facing the same issue.
Is there any other way to work around this issue? For example, by adding a manual slider in the UI so the user can adjust the latency?

Thanks in advance

YogarajRamesh commented on July 26, 2024

Hi @igorski,

Any suggestions on this??

Thanks in advance

igorski commented on July 26, 2024

Hi @YogarajRamesh

This is a tough cookie to crack (is that even a saying?). Anyway, I have created a branch, duplex (also see this pull request), which you can give a spin. You can build the library for use in your app or give the updated example Activity within that branch a go. Basically:

There are new recording methods:

MWEngineInstance.startFullDuplexRecording( float roundtripLatencyInMs, String outputFileName );
MWEngineInstance.stopFullDuplexRecording();

Where roundtripLatencyInMs is a floating point value describing the latency between speaking into the microphone and hearing the audio back over the speaker, in milliseconds (so you can enter a value like 400 in case you measure the latency to be around 0.4 seconds). outputFileName speaks for itself as it is similar to all other recording methods.

When the "full duplex" recording is started, the engine:

  • will record the engine output (e.g. all synthesized sounds, playback of sequenced events)
  • will record the device input
  • will mute the input recording (to prevent feedback)
  • will record the output and input streams separately

Once recording is stopped, the engine will take the output and input streams and align them with the latency you provided when recording started. So if the latency was specified as 400 ms, the input stream is mixed into the output buffer 400 ms earlier, hopefully aligning a vocal performance with the sequenced output.
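As a rough illustration of that alignment step, here is a minimal Java sketch. It is not the engine's actual implementation: DuplexAligner and alignAndMix are hypothetical names, and it assumes both streams are mono float buffers at the same sample rate, with the latency already converted to samples.

```java
// Sketch only: illustrates the idea of the alignment, not MWEngine's internals.
final class DuplexAligner {

    /**
     * Returns a copy of the output with the input mixed in latencySamples earlier.
     * Reading the input at (i + latencySamples) pulls it forward on the timeline,
     * so the first latencySamples of the input are dropped.
     */
    static float[] alignAndMix(float[] output, float[] input, int latencySamples) {
        if (latencySamples < 0) {
            throw new IllegalArgumentException("latency cannot be negative");
        }
        float[] mixed = output.clone();
        for (int i = 0; i < mixed.length; i++) {
            int source = i + latencySamples;
            if (source >= input.length) {
                break; // no input left to mix in
            }
            mixed[i] += input[source]; // plain sum, no gain staging or limiting
        }
        return mixed;
    }

    public static void main(String[] args) {
        float[] output = { 1f, 0f, 0f, 0f, 0f, 0f }; // sequenced hit on the first sample
        float[] input  = { 0f, 0f, 1f, 0f, 0f, 0f }; // same hit, captured two samples late
        float[] mixed = alignAndMix(output, input, 2);
        System.out.println(java.util.Arrays.toString(mixed));
    }
}
```

With a latency of 2 samples, the late hit in the input lands on top of the hit in the output. A real mix would also need to guard against clipping, which this sketch leaves out.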

For the test app it's good to know that you must drag the latency slider before starting the recording (as the latency is provided to the record method when it starts). For your actual app you can follow an approach like the one in Rock Band's calibration menu: present the user with a simple activity where they can tune the latency themselves and hear the result (like "please sing in time with the following drum sequence"), or something to that effect.

I still need to do some backward compatibility checks with all other recording methods before merging the branch, but you can test if this can solve your problem.

YogarajRamesh commented on July 26, 2024

Hi @igorski
Thank you so much

It’s really working great, as expected. I have tested on multiple low-end and high-end phones, and by using the slider all the outputs are almost the same.
I have one question.
" Once recording is stopped, the engine will take the output and input streams and align them with the latency you provided when recording started. So if the latency was specified as 400 ms, the input stream is mixed into the output buffer 400 ms earlier, hopefully aligning a vocal performance with the sequenced output. "
Since the align phase takes place after the recording is stopped, can we set the latency value by checking the output (i.e. the recorded audio over the music, like a preview) and then align it?

Thanks in advance

igorski commented on July 26, 2024

Hi @YogarajRamesh

"Since the align phase takes place after the recording is stopped, can we set the latency value by checking the output (i.e. the recorded audio over the music, like a preview)?"

If I understand correctly: when recording is stopped, you want to preview both the recorded music and the recorded input side by side, so the user can drag the "position of the input recording" to align it with the output?

Well, that's going to present a few challenges, as it means we need to allocate memory for potentially two large recordings (depending on how long the performance lasts). MWEngine doesn't stream audio from storage (in the context of a live audio processing runtime, it's too much of a performance bottleneck).

What I'm thinking is that your application can do a one-time setup upon first install. The user sings along to a very short clip (maybe a four-bar loop of a constant drum pulse, where they say a short word in time with the pulse) and then matches their recorded input with the sequenced drum. The setup for that recording being:

  • present a start button; upon click the sequencer starts playback while at the same time you invoke startInputRecording (so we are only recording the device input, not full duplex with the output)
  • track the progress for four bars using the Notifier mechanism also seen in the example Activity (once four bars have elapsed, you stop the sequencer and invoke stopInputRecording to stop recording and save it to storage)
  • now read the written input recording from storage and load it into the SampleManager
  • present a new UI with a "play" button which starts the sequencer from the beginning and at the same time plays a SampleEvent (where the sample is the input recording) at 0 samples offset
  • the UI also has a slider to adjust the offset of the input recording (you want the slider's value to generate a number in milliseconds)
  • convert this value in milliseconds to an amount in samples (using BufferUtility for the conversion) and set it as the startOffset for the SampleEvent playing back the recording (note that when it's supplied as startOffset you must pass the number as a negative value, as we are pulling its playback "forward", i.e. it starts at an earlier point in time)
  • when the user is satisfied with the alignment, they click a save button and the value in milliseconds is stored in your device storage (you can then provide it to the new startFullDuplexRecording() method when recording begins for a performance)
  • this setup screen is not shown again (unless the user decides to reconfigure it through a settings menu or something)
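The slider handling in the steps above can be sketched in plain Java. The names here are hypothetical, and the conversion is written out by hand instead of calling BufferUtility, so treat it as an illustration of the sign convention and not as engine code.

```java
// Sketch only: CalibrationOffset is an illustrative helper, not part of MWEngine.
final class CalibrationOffset {

    /**
     * Maps a slider value in milliseconds to the negative sample offset
     * described above (negative because playback is pulled to an earlier point).
     */
    static int sliderToStartOffset(float sliderMs, int sampleRate) {
        float clampedMs = Math.max(0f, sliderMs); // ignore accidental negative values
        int samples = Math.round(clampedMs / 1000f * sampleRate);
        return -samples;
    }

    public static void main(String[] args) {
        int sampleRate = 48000; // assumed sample rate
        System.out.println(sliderToStartOffset(250f, sampleRate));
    }
}
```

The value to persist is the slider position in milliseconds, not the sample count, since that is the unit startFullDuplexRecording() expects for roundtripLatencyInMs.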

So this becomes a one time setup where all subsequent sessions for the user require no further setup (as the latency will not change for their device). This avoids making this a repetitive task upon each performance (also it will be more honest as people tend to overestimate their timing skills and will have differences in accuracy between performances).

YogarajRamesh commented on July 26, 2024

Hi @igorski

Thank you so much.
I am also trying to use the same approach to handle it.
Once again thank you for your suggestion.
And is startFullDuplexRecording() available in the main branch?

Thanks in advance

igorski commented on July 26, 2024

Hi @YogarajRamesh

I finished all tests (needed to ensure all existing recording methods remained working as before) and have merged the code into the main branch of the repository.

For reference, all we discussed with additional input has been added to the documentation on recording.

YogarajRamesh commented on July 26, 2024

Hi @igorski

Thank you so much

YogarajRamesh commented on July 26, 2024

Hi @igorski

Is there a way to manually turn the feedback on/off while using startFullDuplexRecording()?

  • will mute the input recording (to prevent feedback)

Thanks in advance
