googlespeech's Introduction

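Method 1. Use the Google Translate web page and save the speech as an MP3. No login is required. The drawback is that the speaking rate cannot be adjusted.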

googleTrans.js

googlespeech

Use Google Translate to record speech as MP3.

This can be used to learn English phrases or sentences. It can also be used to produce read-along material.

The dependencies are Node.js on Windows and Puppeteer.

Method 2. Use DeepMind's AI, the WaveNet API, to generate speech from text.

https://cloud.google.com/text-to-speech/docs/quickstart
https://cloud.google.com/text-to-speech/docs/create-audio
https://cloud.google.com/sdk/docs/quickstart-linux

gcloud init --console-only

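A service account must be created and the environment prepared.

gcloud iam service-accounts create speech # skip this step if the account already exists

The following step is not always necessary.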

gcloud projects add-iam-policy-binding [PROJECT_ID] --member "serviceAccount:[NAME]@[PROJECT_ID].iam.gserviceaccount.com" --role "roles/owner"

gcloud iam service-accounts keys create key.json --iam-account=[NAME]@[PROJECT_ID].iam.gserviceaccount.com

Environment variables

linux:
export GOOGLE_APPLICATION_CREDENTIALS=key.json
export GOOGLE_APPLICATION_CREDENTIALS="/home/user/Downloads/service-account-file.json"

windows:

  1. powershell
    $env:GOOGLE_APPLICATION_CREDENTIALS="C:\Users\username\Downloads\[FILE_NAME].json"

  2. cmd
    set GOOGLE_APPLICATION_CREDENTIALS=[PATH]

https://stackoverflow.com/questions/44184869/google-cloud-shell-is-using-project-cloud-devshell-dev-instead-of-my-actual-proj?rq=1

Use curl to generate the speech data and save it as a txt file.

curl \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
-H "Content-Type: application/json; charset=utf-8" \
--data "{
'input':{
'text':'Android is a mobile operating system developed by Google,
based on the Linux kernel and designed primarily for
touchscreen mobile devices such as smartphones and tablets.'
},
'voice':{
'languageCode':'en-US',
'name':'en-US-Wavenet-A',
'ssmlGender':'MALE'
},
'audioConfig':{
'audioEncoding':'MP3'
}
}" "https://texttospeech.googleapis.com/v1beta1/text:synthesize" > synthesize-text.txt
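
The request body in the curl command above can also be generated programmatically, which avoids shell quoting problems and yields strictly valid JSON. The following is a minimal Node.js sketch, not the method this repository uses. `buildSynthesizeRequest` is a hypothetical helper name, and deriving `languageCode` from the voice name is an assumption that holds for names shaped like `en-US-Wavenet-A`. The field names are the ones used in the curl example.

```javascript
// Hypothetical helper: builds the JSON body for the text:synthesize request.
// Field names mirror the curl example; nothing here touches the network.
function buildSynthesizeRequest(text, voiceName, ssmlGender) {
  if (typeof text !== "string" || text.length === 0) {
    throw new Error("text must be a non-empty string");
  }
  return JSON.stringify({
    input: { text },
    voice: {
      // Assumption: the language code is the first two dash-separated
      // parts of the voice name, e.g. "en-US" in "en-US-Wavenet-A".
      languageCode: voiceName.split("-").slice(0, 2).join("-"),
      name: voiceName,
      ssmlGender,
    },
    audioConfig: { audioEncoding: "MP3" },
  });
}

const body = buildSynthesizeRequest(
  "Android is a mobile operating system developed by Google.",
  "en-US-Wavenet-A",
  "MALE"
);
console.log(body);
```

The printed string can be saved to a file and passed to curl with `--data @<file>` in place of the inline body.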

Male and female voice types

en-US-Wavenet-A MALE
en-US-Wavenet-B MALE
en-US-Wavenet-C FEMALE
en-US-Wavenet-D MALE
en-US-Wavenet-E FEMALE
en-US-Wavenet-F FEMALE
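
Because the request carries both a voice name and an `ssmlGender`, it may help to keep the pairing in one place. The sketch below copies the list above into a lookup table. `VOICE_GENDERS` and `voicesForGender` are hypothetical names introduced only for illustration.

```javascript
// Voice names and genders copied from the list above.
const VOICE_GENDERS = {
  "en-US-Wavenet-A": "MALE",
  "en-US-Wavenet-B": "MALE",
  "en-US-Wavenet-C": "FEMALE",
  "en-US-Wavenet-D": "MALE",
  "en-US-Wavenet-E": "FEMALE",
  "en-US-Wavenet-F": "FEMALE",
};

// Hypothetical helper: returns the voice names that match a gender.
function voicesForGender(gender) {
  return Object.keys(VOICE_GENDERS).filter(
    (name) => VOICE_GENDERS[name] === gender
  );
}

console.log(voicesForGender("FEMALE")); // the C, E and F voices
```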

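Configure the speaking rate and the output audio format

'audioConfig':{
'audioEncoding':'MP3',
'speakingRate':1.0
}

speakingRate accepts values in the range [0.25, 4.0]; 1.0 is the normal speed.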

{
"audioEncoding": enum(AudioEncoding),
"speakingRate": number,
"pitch": number,
"volumeGainDb": number,
"sampleRateHertz": number
}
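
Since `speakingRate` is limited to the range [0.25, 4.0], a small check before sending the request can catch mistakes early. This is a sketch under that assumption only. `buildAudioConfig` is a hypothetical helper, and it covers just the two fields used in the example above.

```javascript
// Range for speakingRate as noted above: [0.25, 4.0].
const MIN_RATE = 0.25;
const MAX_RATE = 4.0;

// Hypothetical helper: returns an audioConfig object and rejects
// speaking rates outside the documented range.
function buildAudioConfig(speakingRate = 1.0, audioEncoding = "MP3") {
  if (
    !Number.isFinite(speakingRate) ||
    speakingRate < MIN_RATE ||
    speakingRate > MAX_RATE
  ) {
    throw new RangeError(
      `speakingRate must be between ${MIN_RATE} and ${MAX_RATE}`
    );
  }
  return { audioEncoding, speakingRate };
}

console.log(JSON.stringify(buildAudioConfig(0.8)));
// {"audioEncoding":"MP3","speakingRate":0.8}
```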

In bash, convert the generated txt file to an MP3 file from the command line.

sed 's|audioContent| |' < synthesize-text.txt > tmp-output.txt && \
tr -d '\n ":{}' < tmp-output.txt > tmp-output-2.txt && \
base64 tmp-output-2.txt --decode > synthesize-text-audio.mp3 && \
rm tmp-output*.txt
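
The same conversion can be sketched in Node.js by parsing the response as JSON instead of stripping characters with sed and tr. This is an alternative illustration, not the repository's method. `decodeAudioContent` is a hypothetical helper, and it assumes the saved response is a JSON object with a base64-encoded `audioContent` field.

```javascript
// Hypothetical helper: extracts the audio bytes from the saved response text.
// Assumes a JSON object with a base64 "audioContent" field.
function decodeAudioContent(responseText) {
  const response = JSON.parse(responseText);
  if (typeof response.audioContent !== "string") {
    // An error response from the API would land here.
    throw new Error("response has no audioContent field");
  }
  return Buffer.from(response.audioContent, "base64");
}

// Self-check with a tiny made-up payload: "SUQz" is base64 for "ID3".
const sample = JSON.stringify({ audioContent: "SUQz" });
console.log(decodeAudioContent(sample).toString("latin1")); // ID3
```

In a script, the contents of synthesize-text.txt could be read with `fs.readFileSync` and the returned Buffer written to synthesize-text-audio.mp3 with `fs.writeFileSync`. Parsing the JSON keeps the decoding independent of any other fields the response may contain.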
