audiolearning's Introduction

audio learning

features

  • automatically generate subtitles (SRT format) or plain text from audio data
  • cut audio data (WAV format, 1 channel) into small parts at the speaker's pauses
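One common way to find the pauses used for cutting is a short-time energy threshold. The following is a minimal sketch of that idea, not the project's own code: the names `find_pauses` and `cut_points`, the 20 ms frame size, the RMS threshold and the minimum pause length are all assumptions chosen for illustration.

```python
import math


def find_pauses(samples, framerate, frame_ms=20, threshold=500.0, min_pause_s=0.3):
    """Return (start_s, end_s) spans whose frame RMS stays below `threshold`.

    `samples` is a sequence of signed 16-bit PCM values from a mono WAV.
    All default values are illustrative guesses, not values from the project.
    """
    frame_len = max(1, framerate * frame_ms // 1000)
    n_frames = math.ceil(len(samples) / frame_len)
    pauses = []
    pause_start = None
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        t = i * frame_len / framerate  # start time of this frame in seconds
        if rms < threshold:
            if pause_start is None:
                pause_start = t  # a quiet run begins here
        elif pause_start is not None:
            if t - pause_start >= min_pause_s:
                pauses.append((pause_start, t))
            pause_start = None
    # Close a quiet run that lasts until the end of the audio.
    end = len(samples) / framerate
    if pause_start is not None and end - pause_start >= min_pause_s:
        pauses.append((pause_start, end))
    return pauses


def cut_points(pauses):
    """Use the midpoint of each pause as a cut point (one possible choice)."""
    return [(start + end) / 2 for start, end in pauses]
```

Cutting at the midpoint of a pause keeps a little silence on both sides of each part; a real recording would need the threshold tuned to its noise floor.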

notice

  • only 1-channel WAV files are supported
  • to generate subtitles/text for video data, the user needs to extract the audio data from the video first
  • the recognition rate depends on many factors, e.g. the quality of the video data
  • please apply for your own Baidu API key before use; contact me if you have any questions
  • ted80001.wav was generated from https://ia800204.us.archive.org/25/items/AomawaShields_2015U/AomawaShields_2015U.mp4
  • ted80001.srt was auto-generated from ted80001.wav
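Because only 1-channel WAV input is supported, it can help to check a file before processing it. The sketch below uses Python's standard `wave` module; the helper names `describe_wav` and `require_mono` are hypothetical and are not part of the project.

```python
import wave


def describe_wav(path):
    """Return (channels, framerate, duration_s) for a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        framerate = wf.getframerate()
        duration = wf.getnframes() / framerate
    return channels, framerate, duration


def require_mono(path):
    """Raise ValueError unless the file has exactly one channel."""
    channels, framerate, duration = describe_wav(path)
    if channels != 1:
        raise ValueError(f"{path}: expected 1 channel, found {channels}")
    return framerate, duration
```

For example, `require_mono("ted80001.wav")` would return the sample rate and duration of the sample file if it is mono, and raise otherwise.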

FYI

contact

[email protected]

changelog

Ver 0.0.1

  • changed the median filter algorithm
  • improved the algorithm for inserting suitable audio time points
  • no need to split the WAV file any more; a stream is used for the Baidu query
  • use ffmpeg for captions
  • other bug fixes and improvements
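The ffmpeg steps mentioned above (getting a mono WAV out of a video, and burning captions back in) might look like the sketch below. The exact commands the project uses are not shown in this README, so the option choices here are assumptions: `-vn`, `-ac`, `-ar`, `-acodec` and the `subtitles` video filter are standard ffmpeg options, the `subtitles` filter needs an ffmpeg build with libass, and paths containing special characters would need extra escaping. The function names are hypothetical.

```python
import subprocess


def extract_mono_wav_cmd(video_path, wav_path, framerate=16000):
    """Build an ffmpeg command that writes 1-channel 16-bit PCM WAV audio."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vn",                    # drop the video stream
        "-ac", "1",               # one audio channel
        "-ar", str(framerate),    # sample rate, e.g. 8000 or 16000
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        wav_path,
    ]


def burn_subtitles_cmd(video_path, srt_path, out_path):
    """Build an ffmpeg command that renders an SRT file onto the video."""
    return [
        "ffmpeg", "-y",
        "-i", video_path,
        "-vf", f"subtitles={srt_path}",
        out_path,
    ]


def run(cmd):
    """Run a command list, raising CalledProcessError if ffmpeg fails."""
    subprocess.run(cmd, check=True)
```

Building the argument list separately from running it keeps the commands easy to inspect or log before anything is executed.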

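speech learning

features

  • automatically generate subtitles for speech audio
  • automatically cut the audio into segments based on the speaker's pauses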

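notes

  • only 1-channel WAV files are supported
  • for automatic video subtitle generation, the user needs to extract a 1-channel WAV file from the video themselves
  • subtitles recognized from the audio file are written in SRT format
  • speech can also be converted to plain text
  • the recognition rate is acceptable; it depends on the noise in the audio file, and speeches and read-aloud recordings work better
  • Baidu speech recognition is used under the hood; please apply for your own API key to use it, and contact me if you have any questions
  • ted80001.wav comes from the video https://ia800204.us.archive.org/25/items/AomawaShields_2015U/AomawaShields_2015U.mp4
  • ted80001.srt was generated automatically from ted80001.wav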
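For readers unfamiliar with SRT, each entry is a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time line, the text, and a blank line. The sketch below shows how recognized segments could be serialized that way; `srt_timestamp`, `to_srt` and the `(start_s, end_s, text)` tuple layout are assumptions for illustration, not the project's actual functions.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Serialize (start_s, end_s, text) tuples as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Entries are separated by one blank line.
    return "\n".join(blocks)
```

Calling `to_srt([(0, 1.5, "hello")])` yields a single entry whose time line reads `00:00:00,000 --> 00:00:01,500`.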

FYI

detailed write-up on Zhihu: https://zhuanlan.zhihu.com/p/28347508
small audio clips cut from music: https://pan.baidu.com/s/1hrXxEJU
small audio clips cut from a speech: https://pan.baidu.com/s/1jIrC0F8#list/path=%2F

contact

[email protected]

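changelog

Ver 0.0.1

  • the median filter scipy.signal.medfilt was slow to compute, so the calculation method was changed
  • get_wave_statistic now takes a framerate (sample rate) parameter supporting 8000/16000, and adds handling that splits any silence longer than 17 s into multiple silent spans of 16.999 s
  • calculate_other_statistic_info now takes a framerate (sample rate) parameter supporting 8000/16000
  • reworked the loop that builds the array of time points spaced less than 17 s apart: each iteration now uses binary insertion, since the points are inserted into an already-sorted array; the old code called sort on every iteration, which practically never finished for videos longer than 1 hour
  • removed the step that cut the WAV into separate small files; the Baidu API is now accessed directly with a stream
  • changed the saved subtitle format so that ffmpeg can burn the subtitles directly into the video
  • speech_recognizai_baidu now accepts a stream and no longer reads a file
  • added comments
  • added ffmpeg commands for extracting audio and burning subtitles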
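Two of the changes above, splitting long silences into 16.999 s spans and inserting time points by binary search instead of re-sorting, can be sketched as follows. This is one reading of the changelog, not the project's code: `split_long_span` and `insert_sorted` are hypothetical names, and the rounding to milliseconds is an assumption made to keep the floating-point values tidy.

```python
import bisect

MAX_SPAN_S = 17.0   # limit mentioned in the changelog
CHUNK_S = 16.999    # span length mentioned in the changelog


def split_long_span(start, end, max_span=MAX_SPAN_S, chunk=CHUNK_S):
    """Return cut points inside (start, end) so that no piece exceeds max_span."""
    points = []
    t = start
    while end - t > max_span:
        t = round(t + chunk, 3)  # keep millisecond precision
        points.append(t)
    return points


def insert_sorted(time_points, new_points):
    """Insert each new point into an already-sorted list using binary search.

    Finding the position costs O(log n) per point, versus O(n log n) for
    calling sort() again after every insertion.
    """
    for p in new_points:
        i = bisect.bisect_left(time_points, p)
        if i == len(time_points) or time_points[i] != p:  # skip duplicates
            time_points.insert(i, p)
    return time_points
```

For a 40 s silence starting at 0, `split_long_span(0.0, 40.0)` gives cut points at 16.999 and 33.998, which `insert_sorted` can then merge into the existing sorted list of time points.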

©2017 alex All Rights Reserved.

contributors

goodskillprogramer, mengshixing
