Comments (3)
Hi!
I'd like to help with the feature extraction issue you're facing. To do so, I need a bit more info:
- Python Version: Which version are you using?
- Libraries: Could you list the libraries and their versions you're working with?
- Error Message: What error comes up with the smaller values?
- Context: Any other details about your setup might be helpful.
This will help me understand the problem better and find a solution for you.
Thanks!
from pyaudioanalysis.
Hi,
Windows 10
Python 3.10.0
customtkinter==5.2.1
dm-tree==0.1.8
dtaidistance==2.3.11
eyed3==0.9.7
fastdtw==0.3.4
fCWT==0.1.18
fqdn==1.5.1
google-auth-oauthlib==1.2.0
isoduration==20.11.0
jsonpointer==2.4
lesscpy==0.15.1
noisereduce==3.0.2
notebook==7.1.2
pandas==2.2.1
pipdeptree==2.16.1
pyAudioAnalysis==0.3.14
pydub==0.25.1
python-speech-features==0.6
resampy==0.4.3
tensorflow==2.16.1
tensorflow-estimator==2.15.0
tkinterdnd2==0.3.0
toml==0.10.2
uri-template==1.3.0
webcolors==1.13
wurlitzer==3.0.3
xlwt==1.3.0
Error when using
F, f_names = ShortTermFeatures.feature_extraction(x[0:1600], Fs, 160, 160, deltas=False)
---------------------------------------------------------------------------
File D:\projects\vrt\.venv\lib\site-packages\pyAudioAnalysis\ShortTermFeatures.py:662, in feature_extraction(signal, sampling_rate, window, step, deltas)
657 feature_vector[n_time_spectral_feats:mffc_feats_end, 0] = \
658 mfcc(fft_magnitude, fbank, n_mfcc_feats).copy()
660 # chroma features
661 chroma_names, chroma_feature_matrix = \
--> 662 chroma_features(fft_magnitude, sampling_rate, num_fft)
663 chroma_features_end = n_time_spectral_feats + n_mfcc_feats + \
664 n_chroma_feats - 1
665 feature_vector[mffc_feats_end:chroma_features_end] = \
666 chroma_feature_matrix
File D:\projects\vrt\.venv\lib\site-packages\pyAudioAnalysis\ShortTermFeatures.py:293, in chroma_features(signal, sampling_rate, num_fft)
291 I = np.nonzero(num_chroma > num_chroma.shape[0])[0][0]
292 C = np.zeros((num_chroma.shape[0],))
--> 293 C[num_chroma[0:I - 1]] = spec
294 C /= num_freqs_per_chroma
295 final_matrix = np.zeros((12, 1))
ValueError: shape mismatch: value array of shape (80,) could not be broadcast to indexing result of shape (27,)
I need a distinctive sound feature for my CNN-based project in order to build a model. The data frame size is 1600 samples (0.1 seconds).
In the plot above, you can see 7 MFCCs generated from 7 different wave files. All the wave files contain a similar sound.
The whole idea is to make the features as similar as possible for similar types of data, so that a good prediction score can be achieved.
Hello again,
I conducted a small experiment and was able to replicate the issue you described. It appears that there isn't sufficient information to compute chroma features. To address this and ensure the code functions (even if it means the chroma feature values are zeroes), I've implemented a fix. I plan to submit a pull request for this fix, pending the library author's approval.
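The two shapes in the traceback can be reproduced with plain arithmetic. The sketch below is not the library's code: it assumes the chroma mapping assigns FFT bin f the frequency (f + 1) * Fs / (2 * num_fft) and rounds 12 * log2(freq / 27.5) to a semitone index, and it assumes Fs = 16000 because 1600 samples were described as 0.1 seconds. The helper name chroma_bin_indices is hypothetical.

```python
import math


def chroma_bin_indices(num_fft, sampling_rate, base_freq=27.5):
    """Map each FFT bin to a semitone index relative to base_freq (assumed formula)."""
    indices = []
    for f in range(num_fft):
        # Frequency assigned to bin f under the assumed mapping
        freq = (f + 1) * sampling_rate / (2 * num_fft)
        indices.append(round(12.0 * math.log2(freq / base_freq)))
    return indices


window = 160            # samples per frame, as in the failing call
sampling_rate = 16000   # assumption: 1600 samples correspond to 0.1 s
num_fft = window // 2   # the traceback reports an 80-element spectrum

num_chroma = chroma_bin_indices(num_fft, sampling_rate)

# First FFT bin whose semitone index no longer fits in an array of length num_fft
first_over = next(i for i, c in enumerate(num_chroma) if c > num_fft)

# The failing line indexes with num_chroma[0:I - 1] but assigns the whole spectrum
target_len = len(num_chroma[0:first_over - 1])

print(f"spectrum length: {num_fft}")
print(f"first out-of-range bin: {first_over}")
print(f"indexing result length: {target_len}")
```

Under these assumptions the spectrum has 80 values while only 27 index positions are selected, which matches the (80,) versus (27,) mismatch reported above.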
For testing, I took the following approach (I recommend using fractions of the sampling rate, Fs, rather than sample counts, but the choice is yours. In my tests, I used an Fs of 44100):
from pyAudioAnalysis import ShortTermFeatures
from pyAudioAnalysis import audioBasicIO


def extract_features(frac_second, samples_features, Fs, x):
    # Window length in samples (may be fractional) and how many such windows fit
    samples_frac_second = frac_second * Fs
    samples_windows = samples_features // samples_frac_second
    # Window and step are equal, so consecutive frames do not overlap
    F, f_names = ShortTermFeatures.feature_extraction(x[:samples_features], Fs, frac_second*Fs, frac_second*Fs,
                                                      deltas=False)
    print(f"In {frac_second} there are {samples_frac_second} samples")
    print(f"In {samples_features} there are {samples_windows} windows")
    print(len(F[0]))     # number of frames actually extracted
    print(len(f_names))  # number of features per frame
    return F, f_names


def issue_396():
    # Read the test file and try progressively shorter windows
    [Fs, x] = audioBasicIO.read_audio_file('./audio/limbo_mono.wav')
    for frac_second in [0.1, 0.05, 0.025, 0.01, 0.0036, 0.0018]:
        print(f"Experiment with {frac_second} frac of second")
        F, f_names = extract_features(frac_second, 16000, Fs, x)


if __name__ == '__main__':
    issue_396()
Output generated:
Experiment with 0.1 frac of second
In 0.1 there are 4410.0 samples
In 16000 there are 3.0 windows
3
34
Experiment with 0.05 frac of second
In 0.05 there are 2205.0 samples
In 16000 there are 7.0 windows
7
34
Experiment with 0.025 frac of second
In 0.025 there are 1102.5 samples
In 16000 there are 14.0 windows
14
34
Experiment with 0.01 frac of second
In 0.01 there are 441.0 samples
In 16000 there are 36.0 windows
36
34
Experiment with 0.0036 frac of second
In 0.0036 there are 158.76 samples
In 16000 there are 100.0 windows
101
34
Experiment with 0.0018 frac of second
In 0.0018 there are 79.38 samples
In 16000 there are 201.0 windows
202
34
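In the last two experiments the printed window count (100.0 and 201.0) is one less than the number of frames actually returned (101 and 202). A minimal sketch of why, assuming the library truncates the window and step to integers and keeps sliding while a full window still fits in the signal; count_frames is a hypothetical helper, not part of pyAudioAnalysis:

```python
def count_frames(num_samples, window_seconds, sampling_rate):
    """Count non-overlapping frames, assuming the window length is truncated to an int."""
    window = int(window_seconds * sampling_rate)  # assumption: fractional samples are dropped
    step = window
    frames = 0
    position = 0
    # Assumed loop condition: a frame is taken while the full window fits in the signal
    while position + window - 1 < num_samples:
        frames += 1
        position += step
    return window, frames


sampling_rate = 44100
fractions = [0.1, 0.05, 0.025, 0.01, 0.0036, 0.0018]
results = {}
for frac_second in fractions:
    window, frames = count_frames(16000, frac_second, sampling_rate)
    results[frac_second] = (window, frames)
    print(f"{frac_second} s -> window of {window} samples, {frames} frames")
```

Under these assumptions the frame counts come out as 3, 7, 14, 36, 101 and 202, the same as the lengths printed in the output above. The extra frame appears because 158.76 and 79.38 samples are truncated to 158 and 79, so one more window fits into the 16000 samples.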
Fix: in the chroma_features function in ShortTermFeatures.py, change the else branch as follows:
    else:
        I = np.nonzero(num_chroma > num_chroma.shape[0])[0][0]
        C = np.zeros((num_chroma.shape[0],))
        if I > 1:
            # If I <= 1 there are no chroma features that can be extracted
            C[num_chroma[0:I - 1]] = spec[num_chroma[0:I - 1]]
        C /= num_freqs_per_chroma
    final_matrix = np.zeros((12, 1))
I'm submitting a pull request (https://github.com/Caparrini/pyAudioAnalysis), although I'm uncertain if it aligns with the expected behavior. I've uploaded it here for your convenience, should you prefer this over modifying your local library directly. Please choose whichever option suits you best.
Best regards,