fwk_voice's People

Contributors

brennangit, danielpieczko, ed-xmos, johnshaferxmos, keithm-xmos, lucianomartin, mbanth, shuchitak, uvvpavel, xmos-jmccarthy

fwk_voice's Issues

Implement GPIO interface

  • 4 General Purpose Output pins. These can be configured as simple digital I/O pins, Pulse Width Modulated (PWM) outputs and rate adjustable LED flashers.
  • 4 General Purpose Input pins. These can be used as simple logic inputs or event capture (edge detection).

AGC passing tests

Port and modify existing integration and component-level tests, design and implement any new tests and use them to show correct operation of the Automatic Gain Controller. The tests shall reside in a repository separate from sw_avona. Each test may require specific hardware or a hardware simulator for operation.

250k model no longer fits in SRAM

This branch from a commit on July 19 is the last one where the 250k model (barely) fits in SRAM. The very next commit (489ffe2) added support for the model being stored in the filesystem. As part of this change, several buffers were increased in size.

  • model runner stack went from 650 words to 2500 words
  • decode buffer size went from 30000 bytes to 35000 bytes
  • RTOS heap went from 120 * 1024 bytes to 256 * 1024 bytes

The heap is the main culprit, increasing by 136 * 1024 = 139264 bytes. Perhaps the decode buffer can also return to 30000 bytes. Need to investigate.
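To make the size of each increase concrete, here is a small sketch of the arithmetic for the three buffers listed above. It assumes a stack word is 4 bytes (a 32-bit word); the struct and function names are made up for illustration and are not from the repository.

```c
#include <stddef.h>

/* Assumption: one stack word is 4 bytes (32-bit words). */
#define BYTES_PER_WORD 4u

/* Hypothetical helper type: a buffer size before and after commit 489ffe2. */
typedef struct {
    size_t before_bytes;
    size_t after_bytes;
} buffer_change_t;

static size_t growth_bytes(buffer_change_t change)
{
    return change.after_bytes - change.before_bytes;
}

/* Model runner stack: 650 words -> 2500 words. */
size_t model_runner_stack_growth(void)
{
    buffer_change_t stack = { 650u * BYTES_PER_WORD, 2500u * BYTES_PER_WORD };
    return growth_bytes(stack);
}

/* Decode buffer: 30000 bytes -> 35000 bytes. */
size_t decode_buffer_growth(void)
{
    buffer_change_t decode = { 30000u, 35000u };
    return growth_bytes(decode);
}

/* RTOS heap: 120 * 1024 bytes -> 256 * 1024 bytes. */
size_t rtos_heap_growth(void)
{
    buffer_change_t heap = { 120u * 1024u, 256u * 1024u };
    return growth_bytes(heap);
}

/* Sum of the three increases listed in the issue. */
size_t total_growth(void)
{
    return model_runner_stack_growth() + decode_buffer_growth() + rtos_heap_growth();
}
```

Under these assumptions the heap accounts for 136 * 1024 bytes of the growth, the stack for 7400 bytes and the decode buffer for 5000 bytes, so the heap is roughly 90% of the total.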

IC functionally complete

Re-design and implement the Interference Canceller to provide necessary functionality. Implementation shall use C and be suitable for use under FreeRTOS or bare-metal. It should use lib_xs3_math and VPU optimisations where possible. Nothing in the implementation shall refer to or require specific hardware, e.g., an xCore.AI. Completion of this issue does not require all tests passing. It may involve re-design and implementation of unit tests at the discretion of the engineer to show that the re-worked IC functions operate as intended.

ADEC functionally complete

Re-design and implement the Automatic Delay Estimation Controller and surrounding components to provide necessary functionality. Implementation shall use C and be suitable for use under FreeRTOS or bare-metal. It should use lib_xs3_math and VPU optimisations where possible. Nothing in the implementation shall refer to or require specific hardware, e.g., an xCore.AI. Completion of this issue does not require all tests passing. It may involve re-design and implementation of unit tests at the discretion of the engineer to show that the re-worked ADEC functions operate as intended.

AGC documentation complete

Provide documentation that describes:

  • What the AGC does from a user's perspective,
  • Its mode(s) of operation, and
  • The purpose, parameters, return values and any constraints for each function in the component's API.

General component documentation shall use reStructuredText. Function documentation shall use Doxygen.

Keep examples to a minimum (we will add more later), and only include ones necessary to describe the purpose of the component and its mode(s) of operation.

Add ability to wakeup host MCU when a wakeword is detected

Wakeup signal will be a GPO pin.

We will need to buffer N seconds of audio, including M milliseconds before the wakeword. We do not have a specification for N but we know M must be at least 500 milliseconds.

N and M may be configurable. Whether a given choice fits in SRAM will be shown by the build report.
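As a rough guide to how much SRAM such a buffer needs, the sketch below computes the byte count for N seconds of audio. The audio format (sample rate, channel count, sample width) is not stated in this issue, so it is passed in as a parameter; all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the buffered audio; values are assumptions. */
typedef struct {
    uint32_t sample_rate_hz;    /* e.g. 16000 */
    uint32_t channels;          /* e.g. 1 */
    uint32_t bytes_per_sample;  /* e.g. 2 for 16-bit PCM */
} audio_format_t;

/* Bytes needed to hold duration_ms of audio in the given format. */
size_t audio_bytes_for_ms(const audio_format_t *fmt, uint32_t duration_ms)
{
    uint64_t samples = ((uint64_t)fmt->sample_rate_hz * duration_ms) / 1000u;
    return (size_t)(samples * fmt->channels * fmt->bytes_per_sample);
}

/* Total buffer size for N seconds of audio. The M ms of pre-roll is part of
 * the N seconds, so it must not exceed the total duration. */
size_t wakeword_buffer_bytes(const audio_format_t *fmt,
                             uint32_t total_seconds_n,
                             uint32_t preroll_ms_m)
{
    uint32_t total_ms = total_seconds_n * 1000u;
    assert(preroll_ms_m <= total_ms);
    return audio_bytes_for_ms(fmt, total_ms);
}
```

For example, assuming 16 kHz mono 16-bit audio, the minimum 500 ms of pre-roll alone is 16000 bytes, and N = 2 seconds would need 64000 bytes.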

Incorrect path to WW models in filesystem

In the file applications/avona/filesystem_support/create_fs.bat the 250k and 50k files need to go into a "ww" folder. So, these lines:

    cp "%WW_PATH%\models\common\WR_250k.en-US.alexa.bin" %temp%\fatmktmp\250kenUS.bin
    cp "%WW_PATH%\models\common\WS_50k.en-US.alexa.bin" %temp%\fatmktmp\50kenUS.bin

likely need to be changed to:

    cp "%WW_PATH%\models\common\WR_250k.en-US.alexa.bin" %temp%\fatmktmp\ww\250kenUS.bin
    cp "%WW_PATH%\models\common\WS_50k.en-US.alexa.bin" %temp%\fatmktmp\ww\50kenUS.bin

In addition, we no longer want to skip the creation of fat.fs when it exists already as this can hide issues.

AEC passing tests

Port and modify existing integration and component-level tests, design and implement any new tests and use them to show correct operation of the Acoustic Echo Canceller. The tests shall reside in a repository separate from sw_avona. Each test may require specific hardware or a hardware simulator for operation.

NS documentation complete

Provide documentation that describes:

  • What the NS does from a user's perspective,
  • Its mode(s) of operation, and
  • The purpose, parameters, return values and any constraints for each function in the component's API.

General component documentation shall use reStructuredText. Function documentation shall use Doxygen.

Keep examples to a minimum (we will add more later), and only include ones necessary to describe the purpose of the component and its mode(s) of operation.

Simplify CMake

When writing the CMakeLists.txt for the lib_agc module, I simplified it to the minimal set of commands, so it is much cleaner than the equivalent in lib_aec. Throughout sw_avona, we have been copying CMakeLists.txt files with a lot of boilerplate code and unnecessary repetition. This can all be simplified by removing unnecessary commands and setting properties and variables in a suitable hierarchy to avoid repetition (e.g., the executable suffix ".xe" should be set at a high level in the repo, not set individually in every CMakeLists.txt that produces such an executable).

VAD documentation complete

Provide documentation that describes:

  • What the VAD does from a user's perspective,
  • Its mode(s) of operation, and
  • The purpose, parameters, return values and any constraints for each function in the component's API.

General component documentation shall use reStructuredText. Function documentation shall use Doxygen.

Keep examples to a minimum (we will add more later), and only include ones necessary to describe the purpose of the component and its mode(s) of operation.

AGC functionality complete

Re-design and implement the Automatic Gain Controller to provide necessary functionality. Implementation shall use C and be suitable for use under FreeRTOS or bare-metal. It should use lib_xs3_math and VPU optimisations where possible. Nothing in the implementation shall refer to or require specific hardware, e.g., an xCore.AI. Completion of this issue does not require all tests passing. It may involve re-design and implementation of unit tests at the discretion of the engineer to show that the re-worked AGC functions operate as intended.

NS passing tests

Port and modify existing integration and component-level tests, design and implement any new tests and use them to show correct operation of the Noise Suppressor. The tests shall reside in a repository separate from sw_avona. Each test may require specific hardware or a hardware simulator for operation.

Implement USB audio test interface

Support the ability to route audio into the start of the pipeline from host.

Output the following synchronized audio signals:

  • Processed audio (ASR & comms)
  • Stereo reference audio
  • 2x microphones

IC documentation complete

Provide documentation that describes:

  • What the IC does from a user's perspective,
  • Its mode(s) of operation, and
  • The purpose, parameters, return values and any constraints for each function in the component's API.

General component documentation shall use reStructuredText. Function documentation shall use Doxygen.

Keep examples to a minimum (we will add more later), and only include ones necessary to describe the purpose of the component and its mode(s) of operation.

Implement SPI host interface mode

Consider an additional host interface option (SPI). Note that this is a non-real-time interface, but it may allow a more standard system-level architecture.

Some open issues related to ADEC testing

While testing sw_avona adec module using the ADEC tests ported from lib_audio_pipelines (https://github.com/xmos/lib_audio_pipelines/tree/develop/tests/test_delay_estimator_controller), I found some issues that I've described here.

I have 3 failing tests on sw_avona ADEC:

  • rapid_changes: false negatives
  • small_mic_increase: false positives
  • delay_at_start: false positives

While debugging these failures, I realised that 3610 lib_aec calculates inverse_X_energy normDenom in a slightly different way than the python model.
3610 lib_aec: https://github.com/xmos/lib_aec/blob/develop/lib_aec/src/aec_calc_inv_energy_params.xc#L59
python model: https://github.com/xmos/py_aec/blob/develop/py_aec/aec.py#L581

So in 3610 lib_aec there's an extra factor of 2 multiplied with X_energy, giving norm_denom = 2*X_energy + sigma_xx*gamma.
When I add this factor of 2 in sw_avona lib_aec, the ADEC tests pass. If I remove this factor of 2 from the 3610 lib_aec code, I see the 3 ADEC test failures in lib_audio_pipelines as well.
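The two variants of the denominator can be written side by side as below. This is only a floating-point illustration with made-up function names; the real lib_aec code is fixed-point and structured differently.

```c
#include <stdbool.h>

/* Hypothetical helper, not the lib_aec API.
 * double_x_energy = true  -> 3610 lib_aec behaviour: 2*X_energy + sigma_xx*gamma
 * double_x_energy = false -> Python model behaviour:   X_energy + sigma_xx*gamma */
float calc_norm_denom(float x_energy, float sigma_xx, float gamma,
                      bool double_x_energy)
{
    float energy_term = double_x_energy ? 2.0f * x_energy : x_energy;
    return energy_term + sigma_xx * gamma;
}

/* The inverse energy is the reciprocal of the denominator. The caller is
 * assumed to guarantee a non-zero denominator. */
float calc_inv_x_energy(float x_energy, float sigma_xx, float gamma,
                        bool double_x_energy)
{
    return 1.0f / calc_norm_denom(x_energy, sigma_xx, gamma, double_x_energy);
}
```

With the factor of 2 the denominator is larger, so the inverse energy is smaller for the same inputs.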

I'm not sure how this discrepancy with respect to the Python model got implemented in the first place. I don't recall adding an extra multiplication while implementing 3610 lib_aec, so maybe the Python model itself had the factor of 2 at some point?

When I run lib_audio_pipelines full keyword tests with and without this factor of 2 multiplication the results don't show any significant difference so both implementations seem okay.

While debugging these failing ADEC tests, here's what I found:

  • The failures happen because there's an early (frame 48) transition to DE mode in C code which doesn't happen on python.
  • The ERLE curve for C follows python (at least to begin with, before they diverge due to different shadow resets, copies etc.) but at a lower ERLE on C than python. This lower ERLE triggers DE on frame 48 in C code. This happens on both the sw_avona and 3610 AEC code and is most likely because of limited fixed point precision and not a bug in the C code.
  • Because of the 48th frame DE transition, the small_mic_increase and delay_at_start tests fail with an extra false positive.
  • The rapid_changes test is interesting. While the early DE transition has happened and AEC is in DE mode, an actual delay change happens in the stream where the mic signal becomes early with respect to the reference. This delay change happens too late, and before the AEC filter can converge to the new peak, we transition out of DE mode with the wrong measured delay. Since the actual delay is in fact mic-early, the filters never converge after this. This means that the initial shadow -> main filter copy never happens, which means ADEC doesn't run its logic anymore, since ADEC waits for the shadow -> main filter copy before monitoring AEC performance. As a result we get false negatives for this test case.
  • There's a watchdog in ADEC that is supposed to force a DE trigger when AEC performance is consistently bad, but the watchdog check itself is inside the has_shadow_copy_happened check, so it never gets triggered.
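The watchdog problem in the last point can be illustrated with a minimal sketch. All names, fields and thresholds here are hypothetical and do not match the actual ADEC code. The first function mirrors the structure described above, and the second shows one possible alternative where the watchdog sits outside the shadow-copy gate.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified ADEC state for illustration only. */
typedef struct {
    bool shadow_copy_happened;  /* set once the first shadow -> main copy occurs */
    uint32_t bad_frames;        /* consecutive frames with poor AEC performance */
    uint32_t watchdog_limit;    /* frames of poor performance before forcing DE */
} adec_sketch_t;

/* Structure described in the issue: the watchdog is nested under the
 * shadow-copy gate, so it cannot fire if the copy never happens. */
bool request_de_nested(adec_sketch_t *s, bool aec_is_bad)
{
    if (!s->shadow_copy_happened) {
        return false;  /* nothing below runs, including the watchdog */
    }
    s->bad_frames = aec_is_bad ? s->bad_frames + 1u : 0u;
    return s->bad_frames >= s->watchdog_limit;
}

/* One possible alternative: evaluate the watchdog before the gate. */
bool request_de_ungated(adec_sketch_t *s, bool aec_is_bad)
{
    s->bad_frames = aec_is_bad ? s->bad_frames + 1u : 0u;
    if (s->bad_frames >= s->watchdog_limit) {
        return true;   /* force delay estimation regardless of the copy */
    }
    if (!s->shadow_copy_happened) {
        return false;
    }
    /* Normal ADEC monitoring would go here. */
    return false;
}
```

In the rapid_changes scenario the shadow copy never happens, so the nested form never requests DE, while the ungated form does after watchdog_limit bad frames. Whether AEC performance can be judged reliably before the first copy is an open question that this sketch does not address.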

Conclusion:

  • I think getting ADEC to work with compare_filters logic needs more work. This is already an item on the acoustic team's backlog.
  • I've decided to introduce the factor of 2 multiplication in sw_avona lib_aec as well to get adec tests passing and also because it doesn't seem to degrade anything.
  • AEC performance parameters like ERLE, peak to average ratio etc. that are used to make decisions in compare_filters and ADEC algorithms are different in C vs python. So using the C code to tune these algorithms would perhaps make more sense. Also, debugging various reported issues in future on C rather than python would be useful.
  • In the pipeline example, I'm going to only demonstrate ADEC configured in initial delay estimation mode (same as 3610 default setting) since I'm not sure of robustness of ADEC in automatic DE control mode.

[Image: c_py_compare_plots]

AEC development beyond v0.1.0

I'm adding a list of lib_aec features that are not supported in 0.1.0 but will be added in the future:

  • L2 API level task distribution scheme.
  • Example demonstrating L2 API use.
  • Double precision C model for the AEC.
  • Decide if delay_estimator will be a separate module and move it out of lib_aec if needed.
  • Convert aec_unit_tests from xc to c.
  • Remove sh use from run_xcoreai.py in examples to make Windows compatible.

ADEC documentation complete

Provide documentation that describes:

  • What the ADEC and surrounding components do from a user's perspective,
  • Their mode(s) of operation, and
  • The purpose, parameters, return values and any constraints for each function in the component's API.

General component documentation shall use reStructuredText. Function documentation shall use Doxygen.

Keep examples to a minimum (we will add more later), and only include ones necessary to describe the purpose of the component and its mode(s) of operation.

IC passing tests

Port and modify existing integration and component-level tests, design and implement any new tests and use them to show correct operation of the Interference Canceller. The tests shall reside in a repository separate from sw_avona. Each test may require specific hardware or a hardware simulator for operation.

Some AGC loss control transitions to far-end speech don't appear possible

The current AGC implementation (and the Python model in lib_agc) does not appear to support two loss control transitions:

  • near-end only to far-end only
  • double-talk to far-end only

These require a transition to "silence" in between.

Relevant code is here. In particular, the "far-end speech only" branch has "do nothing", so the timers are not adjusted, and decrementing lc_t_near requires silence.

ADEC passing tests

Port and modify existing integration and component-level tests, design and implement any new tests and use them to show correct operation of the Automatic Delay Estimation Controller and surrounding components. The tests shall reside in a repository separate from sw_avona. Each test may require specific hardware or a hardware simulator for operation.

AEC functionally complete

Re-design and implement the Acoustic Echo Canceller to provide necessary functionality. Implementation shall use C and be suitable for use under FreeRTOS or bare-metal. It should use lib_xs3_math and VPU optimisations where possible. Nothing in the implementation shall refer to or require specific hardware, e.g., an xCore.AI. Completion of this issue does not require all tests passing. It may involve re-design and implementation of unit tests at the discretion of the engineer to show that the re-worked AEC functions operate as intended.

VAD passing tests

Port and modify existing integration and component-level tests, design and implement any new tests and use them to show correct operation of the Voice Activity Detector. The tests shall reside in a repository separate from sw_avona. Each test may require specific hardware or a hardware simulator for operation.

NS functionally complete

Re-design and implement the Noise Suppressor to provide necessary functionality. Implementation shall use C and be suitable for use under FreeRTOS or bare-metal. It should use lib_xs3_math and VPU optimisations where possible. Nothing in the implementation shall refer to or require specific hardware, e.g., an xCore.AI. Completion of this issue does not require all tests passing. It may involve re-design and implementation of unit tests at the discretion of the engineer to show that the re-worked NS functions operate as intended.

AEC documentation complete

Provide documentation that describes:

  • What the AEC does from a user's perspective,
  • Its mode(s) of operation, and
  • The purpose, parameters, return values and any constraints for each function in the component's API.

General component documentation shall use reStructuredText. Function documentation shall use Doxygen.

Keep examples to a minimum (we will add more later), and only include ones necessary to describe the purpose of the component and its mode(s) of operation.

VAD functionally complete

Re-design and implement the Voice Activity Detector and related components to provide necessary functionality. Implementation shall use C and be suitable for use under FreeRTOS or bare-metal. It should use lib_xs3_math and VPU optimisations where possible. Nothing in the implementation shall refer to or require specific hardware, e.g., an xCore.AI. Completion of this issue does not require all tests passing. It may involve re-design and implementation of unit tests at the discretion of the engineer to show that the re-worked VAD functions operate as intended.

Windows not fully supported

Building for Windows still has some issues that need to be cleaned up and the Windows build steps are not fully documented.

In addition, the filesystem creation process needs to support Windows.
