Comments (10)
I will add an example to the documentation later. In the meantime, instead of "linear", you can add the following fields to your JSON configuration file:
{
"array_type": "circular",
"microphone_positions": [[ 18.75, -32.475953, 0.0],
[ 37.5, 0.0, 0.0],
[ 18.75, 32.475953, 0.0],
[-18.75, 32.475953, 0.0],
[-37.5, 0.0, 0.0],
[-18.75, -32.475953, 0.0]],
...
}
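For readers who want to generate such a list rather than type it by hand, here is a minimal sketch. It assumes the six microphones sit evenly on a circle of radius 37.5 in the xy-plane, starting at -60 degrees, which is what the numbers above appear to describe. The function name and its parameters are hypothetical and are not part of the library.

```python
import math


def circular_array_positions(radius=37.5, num_mics=6, start_deg=-60.0):
    """Return [x, y, z] positions of microphones evenly spaced on a circle.

    Hypothetical helper: the circle is centred on the origin and lies in the
    xy-plane, and positions are Cartesian, in the same unit as `radius`.
    """
    step_deg = 360.0 / num_mics
    positions = []
    for k in range(num_mics):
        angle = math.radians(start_deg + k * step_deg)
        positions.append([radius * math.cos(angle), radius * math.sin(angle), 0.0])
    return positions


positions = circular_array_positions()
for row in positions:
    print([round(v, 6) for v in row])
```

The rounded rows can then be pasted into "microphone_positions" in the configuration file.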
from distant_speech_recognition.
Thanks for this example.
Just to make sure:
each row of that matrix is the position (x, y, z) of a single microphone of the array, measured from the centre of the circle, right?
Could you please also explain what the "Target" field in the configuration is? If I understood correctly, it holds the coordinates on which to focus the beam. If so, do you have any ready-to-use method to integrate the information coming from a DOA module?
Thanks again for your help
from distant_speech_recognition.
Sorry for my slow response. I did not notice you had questions. My answers are below.
Q1. each row of that matrix is the position (x, y, z) of a single microphone of the array, measured from the centre of the circle, right?
A1. That is correct.
Q2. Could you please also explain what the "Target" field in the configuration is? If I understood correctly, it holds the coordinates on which to focus the beam.
A2. Yes, you are right. It is a list of time stamps and position vectors.
The meaning of the position vector depends on the array geometry (linear, polar, planar or near-field).
For a linear array, it requires an azimuth value in radians.
For the polar and planar geometries, it should contain azimuth and polar angle values.
For the near-field case, it needs x, y and z coordinate values.
Q3. If so, do you have any ready-to-use method to integrate the information coming from a DOA module?
A3. No, there is not. Currently, the position of the target source has to be supplied in that format.
from distant_speech_recognition.
@kkumatani, in the case of the polar/circular geometry, could you clarify what the target position values should be? Azimuth and elevation in radians? In that case, should the third entry in the position vector be null?
from distant_speech_recognition.
@kkumatani by polar angle, do you mean elevation?
from distant_speech_recognition.
@kkumatani also, what are the units for distances, angles and timestamps? For angles I imagine it's radians. Given that the speed of sound is hard-coded to 343740, it looks like distances are in millimeters. Could you clarify all of these?
from distant_speech_recognition.
Thanks for asking, pfeatherstone.
Q1. Could you clarify what the target position values should be in the case of the circular geometry?
Under the far-field assumption with the circular array geometry, the return value would be (polar angle, azimuth). Actually, I found a bug: numpy.array([phi, theta]) must be numpy.array([theta, phi]) at line 590 of lib/pytdoa.py.
And yes, the third entry should be null.
Q2. By polar angle, do you mean elevation?
No, the polar angle is not the elevation. In spherical coordinates, the polar angle is measured from the z-axis, while the elevation is measured from the xy-plane.
Q3. Also, what are the units for distances, angles and timestamps? For angles I imagine it's radians. Given that the speed of sound is hard-coded to 343740, it looks like distances are in millimeters. Could you clarify all of these?
Yes. Distances are in millimeters and angles are in radians.
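The distinction between polar angle and elevation can be sketched in a few lines. This is only an illustration using the standard physics convention for spherical coordinates (polar angle measured from the z-axis, azimuth measured in the xy-plane from the x-axis). The function names are hypothetical and do not come from lib/pytdoa.py.

```python
import math


def elevation_to_polar(elevation_rad):
    """Convert an elevation (angle from the xy-plane) to a polar angle (angle from the z-axis)."""
    return math.pi / 2.0 - elevation_rad


def cartesian_to_polar_azimuth(x, y, z):
    """Return (polar angle, azimuth) in radians for a Cartesian source position.

    The coordinates may be in any length unit (the thread uses millimeters),
    because only the direction matters.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("direction is undefined at the origin")
    theta = math.acos(z / r)   # polar angle, measured from the z-axis
    phi = math.atan2(y, x)     # azimuth, measured from the x-axis in the xy-plane
    return theta, phi


# A source 1 m away along the y-axis, level with the array (elevation 0).
theta, phi = cartesian_to_polar_azimuth(0.0, 1000.0, 0.0)
```

A source in the plane of the array therefore has a polar angle of pi/2, not 0, which is the easiest way to tell the two conventions apart.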
from distant_speech_recognition.
@kkumatani So just to clarify again, you are using the physics convention for spherical coordinates. Mathmos use phi for the polar angle, not theta.
Also the function description of instantaneous_position in lib/pytdoa.py says it returns [azimuth, polar angle] in which case returning numpy.array([phi, theta]) using the physics convention of spherical coordinates is correct. Are you sure there is a bug? Or should the description say it is returning [polar angle, azimuth] ?
from distant_speech_recognition.
@kkumatani yet another clarification. Could you confirm that 'microphone_positions' is always in Cartesian coordinates, regardless of geometry?
from distant_speech_recognition.
- So just to clarify again, you are using the physics convention for spherical coordinates.
Yes, I am. Theta is the polar angle in radians and phi is the azimuth.
- Are you sure there is a bug?
Yes, I think so. The other functions are assumed to return (polar angle, azimuth). Only instantaneous_position(self, frame_no) returns the opposite order, so it must return [polar angle, azimuth].
- Could you confirm that 'microphone_positions' is always in Cartesian coordinates, regardless of geometry?
Yes, it always has to be in Cartesian coordinates.
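To show how these conventions fit together, here is a rough sketch of far-field time delays computed from Cartesian microphone positions in millimeters, a (polar angle, azimuth) pair in radians, and the speed of sound of 343740 mm/s quoted earlier in the thread. It is a textbook plane-wave calculation, not the library's own code. The function name is hypothetical, and the sign convention (negative delay for a microphone nearer the source) is an assumption that may differ from lib/pytdoa.py.

```python
import math

SPEED_OF_SOUND_MM_PER_S = 343740.0  # value quoted in the thread, in mm/s


def far_field_delays(mic_positions_mm, theta, phi, c=SPEED_OF_SOUND_MM_PER_S):
    """Return per-microphone delays in seconds relative to the array origin.

    theta is the polar angle (from the z-axis) and phi is the azimuth, both in
    radians. Assumed sign convention: a microphone closer to the source than
    the origin gets a negative delay.
    """
    # Unit vector pointing from the array origin towards the source.
    ux = math.sin(theta) * math.cos(phi)
    uy = math.sin(theta) * math.sin(phi)
    uz = math.cos(theta)
    return [-(x * ux + y * uy + z * uz) / c for x, y, z in mic_positions_mm]


mics = [[37.5, 0.0, 0.0], [-37.5, 0.0, 0.0], [0.0, 37.5, 0.0]]
# Source in the plane of the array, along the positive x-axis.
delays = far_field_delays(mics, theta=math.pi / 2.0, phi=0.0)
```

With the source on the positive x-axis, the two microphones on that axis get delays of equal magnitude and opposite sign, and the microphone on the y-axis gets a delay of approximately zero.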
from distant_speech_recognition.