audioclassification-paddlepaddle's Introduction

Hello, developers!

Core projects

Project type                 PyTorch version                    PaddlePaddle version                    Notes
Speech recognition           MASR                               PPASR
Voiceprint recognition       VoiceprintRecognition-Pytorch      VoiceprintRecognition-PaddlePaddle
Sound classification         AudioClassification-Pytorch        AudioClassification-PaddlePaddle
Speech emotion recognition   SpeechEmotionRecognition-Pytorch   SpeechEmotionRecognition-PaddlePaddle
Speech synthesis             VITS-Pytorch                       VITS-PaddlePaddle

Speech projects

  1. Speech recognition with PaddlePaddle dynamic graphs: PPASR
  2. Speech recognition with PyTorch: MASR
  3. Fine-tuning the Whisper model and accelerating inference: Whisper-Finetune
  4. Speech recognition with PaddlePaddle static graphs: PaddlePaddle-DeepSpeech
  5. Sound classification with PyTorch: AudioClassification-Pytorch
  6. Sound classification with PaddlePaddle: AudioClassification-PaddlePaddle
  7. Voiceprint recognition with PaddlePaddle: VoiceprintRecognition-PaddlePaddle
  8. Voiceprint recognition with PyTorch: VoiceprintRecognition-Pytorch
  9. Voiceprint recognition with TensorFlow: VoiceprintRecognition-Tensorflow
  10. Voiceprint recognition with Keras: VoiceprintRecognition-Keras
  11. Speech emotion recognition with PaddlePaddle: SpeechEmotionRecognition-PaddlePaddle
  12. Speech emotion recognition with PyTorch: SpeechEmotionRecognition-Pytorch
  13. VITS speech synthesis with PaddlePaddle: VITS-PaddlePaddle
  14. VITS speech synthesis with PyTorch: VITS-Pytorch

Vision projects

  1. Face recognition with PaddlePaddle: PaddlePaddle-MobileFaceNets
  2. Face recognition with PyTorch: Pytorch-MobileFaceNet
  3. SSD object detection model with PaddlePaddle: PaddlePaddle-SSD
  4. MTCNN face keypoint detection model with PyTorch: Pytorch-MTCNN
  5. MTCNN face keypoint detection model with PaddlePaddle: PaddlePaddle-MTCNN
  6. CRNN text recognition model with PaddlePaddle: PaddlePaddle-CRNN
  7. CrowdNet crowd density model with PaddlePaddle: PaddlePaddle-CrowdNet
  8. Age and gender recognition with MXNet: Age-Gender-MXNET
  9. Deploying image classification models on Android with TensorFlow Lite, Paddle Lite, MNN, and TNN: ClassificationForAndroid
  10. PP-YOLOE model with PaddlePaddle: PP-YOLOE
  11. Face detection, mask recognition, and keypoint detection models deployed on Android: FaceKeyPointsMask
  12. Semantic segmentation model deployed on Android to replace the background behind a person: ChangeHumanBackground
  13. Face recognition with TensorFlow: Tensorflow-FaceRecognition

Tutorial series

  1. PaddlePaddle V2 tutorial series: LearnPaddle
  2. PaddlePaddle Fluid tutorial series: LearnPaddle2

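Book source code

  1. Source code for 《PaddlePaddle从入门到实战》 (PaddlePaddle: From Beginner to Practice): PaddlePaddleCourse
  2. Source code for 《深度学习应用实战之PaddlePaddle》 (Deep Learning Applications in Practice with PaddlePaddle): BookSource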

audioclassification-paddlepaddle's People

Contributors

gt-acerzhang, yeyupiaoling

audioclassification-paddlepaddle's Issues

Error during training

The output is as follows:

-----------  Configuration Arguments -----------
batch_size: 32
gpus: 0
input_shape: (None, 1, 128, 128)
learning_rate: 0.001
num_classes: 10
num_epoch: 50
num_workers: 4
save_model: models/
test_list_path: dataset/test_list.txt
train_list_path: dataset/train_list.txt
------------------------------------------------
W1230 03:44:49.589563  1696 device_context.cc:404] Please NOTE: device: 0, GPU Compute Capability: 6.0, Driver API Version: 11.2, Runtime API Version: 10.2
W1230 03:44:49.593922  1696 device_context.cc:422] device: 0, cuDNN Version: 7.6.
-------------------------------------------------------------------------------
   Layer (type)         Input Shape          Output Shape         Param #    
===============================================================================
     Conv2D-1        [[1, 1, 128, 128]]    [1, 64, 64, 64]         3,136     
   BatchNorm2D-1     [[1, 64, 64, 64]]     [1, 64, 64, 64]          256      
      ReLU-1         [[1, 64, 64, 64]]     [1, 64, 64, 64]           0       
    MaxPool2D-1      [[1, 64, 64, 64]]     [1, 64, 32, 32]           0       
     Conv2D-2        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-2     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
      ReLU-2         [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-3        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-3     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
   BasicBlock-1      [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-4        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-4     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
      ReLU-3         [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-5        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-5     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
   BasicBlock-2      [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-6        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-6     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
      ReLU-4         [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-7        [[1, 64, 32, 32]]     [1, 64, 32, 32]        36,864     
   BatchNorm2D-7     [[1, 64, 32, 32]]     [1, 64, 32, 32]          256      
   BasicBlock-3      [[1, 64, 32, 32]]     [1, 64, 32, 32]           0       
     Conv2D-9        [[1, 64, 32, 32]]     [1, 128, 16, 16]       73,728     
   BatchNorm2D-9     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-5         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-10       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-10     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
     Conv2D-8        [[1, 64, 32, 32]]     [1, 128, 16, 16]        8,192     
   BatchNorm2D-8     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-4      [[1, 64, 32, 32]]     [1, 128, 16, 16]          0       
     Conv2D-11       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-11     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-6         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-12       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-12     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-5      [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-13       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-13     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-7         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-14       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-14     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-6      [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-15       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-15     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
      ReLU-8         [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-16       [[1, 128, 16, 16]]    [1, 128, 16, 16]       147,456    
  BatchNorm2D-16     [[1, 128, 16, 16]]    [1, 128, 16, 16]         512      
   BasicBlock-7      [[1, 128, 16, 16]]    [1, 128, 16, 16]          0       
     Conv2D-18       [[1, 128, 16, 16]]     [1, 256, 8, 8]        294,912    
  BatchNorm2D-18      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-9          [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-19        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-19      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
     Conv2D-17       [[1, 128, 16, 16]]     [1, 256, 8, 8]        32,768     
  BatchNorm2D-17      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-8      [[1, 128, 16, 16]]     [1, 256, 8, 8]           0       
     Conv2D-20        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-20      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-10         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-21        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-21      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-9       [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-22        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-22      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-11         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-23        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-23      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-10      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-24        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-24      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-12         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-25        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-25      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-11      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-26        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-26      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-13         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-27        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-27      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-12      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-28        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-28      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
      ReLU-14         [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-29        [[1, 256, 8, 8]]      [1, 256, 8, 8]        589,824    
  BatchNorm2D-29      [[1, 256, 8, 8]]      [1, 256, 8, 8]         1,024     
   BasicBlock-13      [[1, 256, 8, 8]]      [1, 256, 8, 8]           0       
     Conv2D-31        [[1, 256, 8, 8]]      [1, 512, 4, 4]       1,179,648   
  BatchNorm2D-31      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
      ReLU-15         [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-32        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-32      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
     Conv2D-30        [[1, 256, 8, 8]]      [1, 512, 4, 4]        131,072    
  BatchNorm2D-30      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
   BasicBlock-14      [[1, 256, 8, 8]]      [1, 512, 4, 4]           0       
     Conv2D-33        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-33      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
      ReLU-16         [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-34        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-34      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
   BasicBlock-15      [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-35        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-35      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
      ReLU-17         [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
     Conv2D-36        [[1, 512, 4, 4]]      [1, 512, 4, 4]       2,359,296   
  BatchNorm2D-36      [[1, 512, 4, 4]]      [1, 512, 4, 4]         2,048     
   BasicBlock-16      [[1, 512, 4, 4]]      [1, 512, 4, 4]           0       
AdaptiveAvgPool2D-1   [[1, 512, 4, 4]]      [1, 512, 1, 1]           0       
     Linear-1            [[1, 512]]            [1, 10]             5,130     
===============================================================================
Total params: 21,300,554
Trainable params: 21,266,506
Non-trainable params: 34,048
-------------------------------------------------------------------------------
Input size (MB): 0.06
Forward/backward pass size (MB): 28.00
Params size (MB): 81.26
Estimated Total Size (MB): 109.32
-------------------------------------------------------------------------------

Epoch 0: StepDecay set learning rate to 0.001.
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
ERROR:root:DataLoader reader thread raised an exception!
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.7/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 411, in _thread_loop
    batch = self._get_data()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 525, in _get_data
    batch.reraise()
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 168, in reraise
    raise self.exc_type(msg)
ValueError: DataLoader worker(2) caught ValueError with message:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/worker.py", line 320, in _worker_loop
    batch = fetcher.fetch(indices)
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/fetcher.py", line 99, in fetch
    data = [self.dataset[idx] for idx in batch_indices]
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/fetcher.py", line 99, in <listcomp>
    data = [self.dataset[idx] for idx in batch_indices]
  File "/content/AudioClassification-PaddlePaddle/reader.py", line 36, in __getitem__
    spec_mag = load_audio(audio_path, mode=self.model, spec_len=self.spec_len)
  File "/content/AudioClassification-PaddlePaddle/reader.py", line 14, in load_audio
    crop_start = random.randint(0, spec_mag.shape[1] - spec_len)
  File "/usr/lib/python3.7/random.py", line 222, in randint
    return self.randrange(a, b+1)
  File "/usr/lib/python3.7/random.py", line 200, in randrange
    raise ValueError("empty range for randrange() (%d,%d, %d)" % (istart, istop, width))
ValueError: empty range for randrange() (0,-15, -15)


Traceback (most recent call last):
  File "train.py", line 125, in <module>
    train(args)
  File "train.py", line 85, in train
    for batch_id, (spec_mag, label) in enumerate(train_loader()):
  File "/usr/local/lib/python3.7/dist-packages/paddle/fluid/dataloader/dataloader_iter.py", line 585, in __next__
    data = self._reader.read_next_var_list()
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception.
  [Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:166)

/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
/usr/local/lib/python3.7/dist-packages/librosa/core/audio.py:165: UserWarning: PySoundFile failed. Trying audioread instead.
  warnings.warn("PySoundFile failed. Trying audioread instead.")
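
The first traceback above ends in `random.randint(0, spec_mag.shape[1] - spec_len)` inside `load_audio`. Since `randint(a, b)` calls `randrange(a, b + 1)`, the reported `(0,-15, -15)` means `spec_mag.shape[1] - spec_len` was -16, i.e. at least one clip produced a spectrogram 16 frames shorter than `spec_len`. The following is a minimal sketch of one possible workaround, not the repository's own fix: zero-pad short spectrograms along the time axis before cropping. The helper name `crop_or_pad` and the default `spec_len=128` are assumptions made for illustration.

```python
import random

import numpy as np


def crop_or_pad(spec_mag, spec_len=128, rng=random):
    """Return a window of exactly `spec_len` frames from a 2-D spectrogram.

    `spec_mag` is assumed to have shape (n_bins, n_frames).
    Hypothetical helper; it is not part of the repository's reader.py.
    """
    n_frames = spec_mag.shape[1]
    if n_frames < spec_len:
        # Too short: zero-pad on the right of the time axis so that the
        # crop range below can never be empty.
        pad = spec_len - n_frames
        spec_mag = np.pad(spec_mag, ((0, 0), (0, pad)), mode="constant")
        n_frames = spec_len
    # n_frames - spec_len is now >= 0, so randint always has a valid range.
    crop_start = rng.randint(0, n_frames - spec_len)
    return spec_mag[:, crop_start:crop_start + spec_len]


# A clip that is 16 frames short, like the one implied by the traceback.
short_clip = np.ones((128, 112), dtype=np.float32)
long_clip = np.ones((128, 300), dtype=np.float32)
padded = crop_or_pad(short_clip)
cropped = crop_or_pad(long_clip)
```

An alternative under the same assumptions is to filter out clips that are too short when the data list is generated, which avoids feeding padded silence to the model.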

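Audio anomaly detection

How would I add an audio anomaly detection feature? For example, I have defined ten classes, and when a sound does not belong to any of them I want the output to be an "other" class. How can something like this be implemented? My idea is to compare the detected sound against the known classes and return a similarity score for each class; anything below a threshold is treated as "other". The question is how to get this score from your code.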
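
One rough way to realize the thresholding idea described above is to treat the classifier's softmax probability as the score and reject predictions whose top probability falls below a threshold. This is only a sketch under assumptions: it takes a vector of raw model outputs (logits) as input and does not use the repository's API; `classify_with_reject`, the label list, and the 0.6 threshold are hypothetical. Softmax confidence is also known to be an imperfect signal for unseen sounds, so the threshold would need tuning on held-out data.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def classify_with_reject(logits, labels, threshold=0.6, other="other"):
    """Return (label, score); label is `other` when the top score is too low.

    `logits` is a 1-D sequence with one raw score per known class.
    Hypothetical helper for illustration only.
    """
    probs = softmax(np.asarray(logits, dtype=np.float64))
    best = int(np.argmax(probs))
    score = float(probs[best])
    if score < threshold:
        return other, score
    return labels[best], score


labels = ["class_%d" % i for i in range(10)]

# One class clearly dominates: the prediction is accepted.
confident = classify_with_reject([10.0] + [0.0] * 9, labels)

# All classes score the same (probability 0.1 each): rejected as "other".
uncertain = classify_with_reject([0.0] * 10, labels)
```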
