
ibm-ecosystem-engineering / watson-speech


This collection demonstrates how to quickly embed Watson Speech in your own applications.

Home Page: https://techzone.ibm.com/collection/embedded-ai

License: Apache License 2.0

Java 0.27% Dockerfile 0.03% HTML 0.13% Jupyter Notebook 99.49% Python 0.08% Shell 0.01%
artificial-intelligence speech-to-text text-to-speech watson embeddable-ai ibm-watson-libraries-for-embed

watson-speech's Introduction

Self-Serve Assets for Embeddable AI using Watson Speech

Assets/Accelerators for Watson Speech (this repo) contains self-serve notebooks and documentation on how to create speech models using the Watson Speech library, how to serve Watson Speech models, and how to make inference requests from custom applications. With an IBM Cloud account, a full production sample can be deployed in roughly one hour.

Key Technologies:

  • IBM Watson Speech to Text Library for Embed transcribes written text from spoken audio. The service leverages machine learning to combine knowledge of grammar, language structure, and the composition of audio and voice signals to accurately transcribe the human voice. It continuously updates and refines its transcription as it receives more speech audio. The service is ideal for applications that need to extract high-quality speech transcripts for use cases such as call centers, customer care, agent assistance, and similar solutions. You can customize the Watson Speech to Text service to suit your language and application needs. Both services offer HTTP and WebSocket programming interfaces that make them suitable for any application that produces or accepts audio.

  • IBM Watson Text to Speech Library for Embed synthesizes natural-sounding speech from written text. The service streams the results back to the client with minimal delay. The service is appropriate for voice-driven and screenless applications, where audio is the preferred method of output. You can customize the Watson Text to Speech service to suit your language and application needs. Both services offer HTTP and WebSocket programming interfaces that make them suitable for any application that produces or accepts audio.
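As a rough illustration of the HTTP interfaces mentioned above, the sketch below builds (but does not send) one request for each service using only the Python standard library. The base URL `http://localhost:1080` and the `/speech-to-text/api/v1/recognize` and `/text-to-speech/api/v1/synthesize` paths are assumptions about a locally running runtime container; check them against your own deployment. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: a runtime container is reachable at this address.
BASE_URL = "http://localhost:1080"


def build_stt_request(audio, content_type="audio/wav", base_url=BASE_URL):
    """Build a POST request that submits raw audio bytes for transcription."""
    return urllib.request.Request(
        url=f"{base_url}/speech-to-text/api/v1/recognize",  # assumed path
        data=audio,
        headers={"Content-Type": content_type},
        method="POST",
    )


def build_tts_request(text, accept="audio/wav", base_url=BASE_URL):
    """Build a POST request that asks for synthesized audio of `text`."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/text-to-speech/api/v1/synthesize",  # assumed path
        data=body,
        headers={"Content-Type": "application/json", "Accept": accept},
        method="POST",
    )
```

To actually call a service, pass the request to `urllib.request.urlopen(...)` while the container is running and read the response body (JSON for transcription, audio bytes for synthesis).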

Outline

Resources

Contributions By

Created & Architected By

Kunal Sawarkar, Chief Data Scientist

Builders

Michael Spriggs, Principal Architect
Shivam Solanki, Senior Advisory Data Scientist
Kevin Huang, Sr. ML-Ops Engineer
Abhilasha Mangal, Senior Data Scientist
Himadri Talukder, Senior Software Engineer

Disclaimer

This framework is developed by Build Lab, IBM Ecosystem. Please note that this content is made available to foster Embeddable AI technology adoption and serve ecosystem partners. The content may include systems and methods pending patent with the USPTO and protected under US patent laws. SuperKnowa is not a product but a framework built on top of IBM watsonx along with other products such as Llama models from Meta and MLflow from Databricks. Using SuperKnowa implicitly requires agreeing to the terms and conditions of those products. This framework is made available on an as-is basis to accelerate enterprise GenAI application development. In case of any questions, please reach out to [email protected].

Copyright © 2023 IBM Corporation.

watson-speech's People

Contributors

abhilasha-mangal, ak-eyebee, chen-chris-topher, hitalukder, jamaya2001, kevinxhuang, krook, kunal-savvy, mjspriggs, shivam6693, timroster


watson-speech's Issues

Error when trying to build

I was following this guide on a CentOS 7 VM:
https://github.com/ibm-build-lab/Watson-Speech/tree/main/single-container-stt

I tried to build with:

docker build . -t stt-standalone

and

docker build --no-cache . -t stt-standalone

Both commands fail with the following error:

STEP 14: RUN ./prepareModels.sh
Serving HTTP on 127.0.0.1 port 3333 (http://127.0.0.1:3333/) ...
127.0.0.1 - - [08/Jun/2023 03:34:37] "GET /pool2/en-US_Multimedia.standard.2022-03-15.4a1f1a7e.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:37] "GET /pool2/en-US_Multimedia.low-latency.2022-03-15.f7dec0bc.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:39] "GET /pool2/fr-FR_Multimedia.standard.2022-03-15.5023420e.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:39] "GET /pool2/spkInfo_16k.2020-10-08.895a1741.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:40] "GET /pool2/fr-FR_Multimedia.low-latency.2022-03-15.cbf4c476.tar.pzstd HTTP/1.1" 200 -
{'modelSetInitElapsedTime': 4.065191984176636}
<2023-06-08 03:34:41,879 src/global.cc:28>	RD_INFO 	RAPID recognizer 5.4.0 (C) IBM Corp. 2015-2020 (git revision 4e52e03fe57718461388d29838b4d269bbd1fb91-modified                    )
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
childDiverter::allHeadersReceived rrr timenow=2023-06-08 03:35:01.524135: rrr 127.0.0.1 b'GET' - /v1/miniHealthCheck    headers: {"Host": ["localhost:1080"], "User-Agent": ["curl/7.61.1"], "Accept": ["*/*"]}
Model initialization complete
6f530c7dec33ba7def765188694bf9370fcca89e2f170d9ed04b975b40a3c4b8
STEP 15: FROM 9648b3eb6a9f6b4563efceb5154232b19ccc66059196e414920b7e02bf77bcb1 AS release
STEP 16: COPY --from=model_cache ${CONFIG_DIR}/cache/ ${CONFIG_DIR}/cache/
Error: error dry-running "COPY --from=model_cache ${CONFIG_DIR}/cache/ ${CONFIG_DIR}/cache/": no files found matching "/var/lib/containers/storage/overlay/40fcdc4c86fcce34368711f238f7f739e23a26be5c70281abd9b6039833e1d62/merged/cache": no such file or directory

Full message:

# docker build --no-cache . -t stt-standalone
STEP 1: FROM cp.icr.io/cp/ai/watson-stt-generic-models:1.0.0 AS catalog
56f1a98978a3adde14d2e8753424a51853b02584835942d4a8d36b5ec3599aad
STEP 2: FROM cp.icr.io/cp/ai/watson-stt-en-us-multimedia:1.0.0 AS en-us-multimedia
a781c0a8e98d3f7448223c63234bf500c20967fac9ff8d1070c54e3cbe32d52f
STEP 3: FROM cp.icr.io/cp/ai/watson-stt-fr-fr-multimedia:1.0.0 AS fr-fr-multimedia
3ce6be74903c7321bc1835890692982151623837f10a165e837c3c81cd87db53
STEP 4: FROM cp.icr.io/cp/ai/watson-stt-runtime:1.0.0 AS runtime
STEP 5: ENV LOCAL_DIR=chuck_var
614f885a2e1dddfe230f07edcb00eaed40e6803b6ad5d03a25c62ccb6024c7c9
STEP 6: ENV CONFIG_DIR=/opt/ibm/chuck.x86_64/var
e4d7b9e25a47ba524abdae169bd8cc1255e994e9e1ef66660959c22c929b172e
STEP 7: COPY --chown=watson:0 --from=catalog catalog.json ${CONFIG_DIR}/catalog.json
c1f6e1c6a2ddcd176ec9af2acc0efa20e2774ff56b3e68dd56458dd29d2bd66e
STEP 8: COPY --chown=watson:0 ./${LOCAL_DIR}/* ${CONFIG_DIR}/
9648b3eb6a9f6b4563efceb5154232b19ccc66059196e414920b7e02bf77bcb1
STEP 9: FROM 9648b3eb6a9f6b4563efceb5154232b19ccc66059196e414920b7e02bf77bcb1 AS model_cache
STEP 10: RUN sudo mkdir -p /models/pool2
6e91658aba12a2410a4c46e0edda9cdb0a8af310eafe215623025b45aa8df7fc
STEP 11: COPY --chown=watson:0 --from=en-us-multimedia model/* /models/pool2/
5a36edfb869397543c23dc9f3748d8e3fad0638e001bb676a1ca1416e378604c
STEP 12: COPY --chown=watson:0 --from=fr-fr-multimedia model/* /models/pool2/
5c7a343f0b172471b55879071ee6afc1fa29a46075c25e6e85978e600180f73d
STEP 13: COPY ./prepareModels.sh .
b400d9976412ec42fcb0e21af0e7810c3620cd680eed25c8fc1afe2df3b888d2
STEP 14: RUN ./prepareModels.sh
Serving HTTP on 127.0.0.1 port 3333 (http://127.0.0.1:3333/) ...
127.0.0.1 - - [08/Jun/2023 03:34:37] "GET /pool2/en-US_Multimedia.standard.2022-03-15.4a1f1a7e.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:37] "GET /pool2/en-US_Multimedia.low-latency.2022-03-15.f7dec0bc.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:39] "GET /pool2/fr-FR_Multimedia.standard.2022-03-15.5023420e.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:39] "GET /pool2/spkInfo_16k.2020-10-08.895a1741.tar.pzstd HTTP/1.1" 200 -
127.0.0.1 - - [08/Jun/2023 03:34:40] "GET /pool2/fr-FR_Multimedia.low-latency.2022-03-15.cbf4c476.tar.pzstd HTTP/1.1" 200 -
{'modelSetInitElapsedTime': 4.065191984176636}
<2023-06-08 03:34:41,879 src/global.cc:28>	RD_INFO 	RAPID recognizer 5.4.0 (C) IBM Corp. 2015-2020 (git revision 4e52e03fe57718461388d29838b4d269bbd1fb91-modified                    )
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
childDiverter::allHeadersReceived rrr timenow=2023-06-08 03:35:01.524135: rrr 127.0.0.1 b'GET' - /v1/miniHealthCheck    headers: {"Host": ["localhost:1080"], "User-Agent": ["curl/7.61.1"], "Accept": ["*/*"]}
Model initialization complete
6f530c7dec33ba7def765188694bf9370fcca89e2f170d9ed04b975b40a3c4b8
STEP 15: FROM 9648b3eb6a9f6b4563efceb5154232b19ccc66059196e414920b7e02bf77bcb1 AS release
STEP 16: COPY --from=model_cache ${CONFIG_DIR}/cache/ ${CONFIG_DIR}/cache/
Error: error dry-running "COPY --from=model_cache ${CONFIG_DIR}/cache/ ${CONFIG_DIR}/cache/": no files found matching "/var/lib/containers/storage/overlay/40fcdc4c86fcce34368711f238f7f739e23a26be5c70281abd9b6039833e1d62/merged/cache": no such file or directory
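One possible reading of this failure, not confirmed anywhere in the issue: the path in the error ends in `/merged/cache`, whereas with `CONFIG_DIR=/opt/ibm/chuck.x86_64/var` (STEP 6) the source would be expected to end in `/opt/ibm/chuck.x86_64/var/cache`. That would be consistent with `${CONFIG_DIR}` expanding to an empty string in the `COPY --from=model_cache` source. The `STEP n:` output format and the `/var/lib/containers/storage` path also suggest the build ran under Podman/Buildah rather than Docker itself. The small helper below (a hypothetical name, for illustration only) extracts the in-image path from such an error line so it can be compared with the expected one.

```python
import re


def resolved_copy_source(error_line):
    """Return the in-image path the builder searched for, or None.

    Works on error lines of the form shown in the log above:
    ... no files found matching "<host path>": no such file or directory
    """
    match = re.search(r'no files found matching "([^"]+)"', error_line)
    if match is None:
        return None
    host_path = match.group(1)
    # Overlay storage paths look like .../overlay/<id>/merged/<path in image>,
    # so everything after "/merged" is the path inside the image.
    _, sep, inner = host_path.partition("/merged")
    return inner if sep else host_path
```

Applied to the error line above, the helper returns `/cache`; comparing that with `/opt/ibm/chuck.x86_64/var/cache` makes the mismatch explicit.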
