
pycsspeechtts's Introduction

pycsspeechtts

Python (py) library to use Microsoft's Cognitive Services Speech (csspeech) Text to Speech (tts) API. The cryptic name is the combination of the abbreviations shown above.

Usage:

from pycsspeechtts import TTSTranslator
t = TTSTranslator("YOUR API KEY","westeurope")

data = t.speak(text='The default voice is using Microsoft Neural Voice. When using a neural voice, synthesized speech is nearly indistinguishable from the human recordings.')
with open("file1.wav", "wb") as f:
        f.write(data)

data = t.speak('en-gb', 'Male', 'George, Apollo', 'riff-16khz-16bit-mono-pcm', text='I am Max')
with open("file2.wav", "wb") as f:
        f.write(data)

You can also use a custom voice by specifying isCustom=True and providing a customEndpoint:

from pycsspeechtts import TTSTranslator
t = TTSTranslator("YOUR API KEY","westeurope", isCustom=True, customEndpoint=MyEndpoint)
data = t.speak(language='en-gb',gender='Male',voiceType="ArchieNeural",text="This is a test for custom voice")

See test.py for more samples. Refer to https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support to find the valid values for language, gender, voiceType, and output format.

pycsspeechtts's People

Contributors

andrejscak, iridris, jeroenterheerdt, rsegers, sanyatuning


pycsspeechtts's Issues

Release new package to PyPi

Hi @jeroenterheerdt,

Would it be possible to release a new package to PyPi? That would include the changes from #4 (they aren't in the package yet) and would fix this issue in Home Assistant: home-assistant/core#42809

As soon as you release 1.0.4, I'll ensure the Home Assistant integration gets updated to use it. Or, since you are an HA user yourself, feel free to do so :)

Thanks!

Always uses default voice

Hi, I have just started trying to use this by way of integrating with Home Assistant, which relies on this library. I have been unable to get it to change from the default voice for a language; I think this is down to the way the voice name tag is being set on line 44:

voice.set('name', 'Microsoft Server Speech Text to Speech Voice ('+name_lang(language)+', '+voiceType+')')

According to the docs, the name should be taken directly from the list of voices. The voiceType parameter itself is a bit confusing, since in the response when getting the list of voices, VoiceType is simply Neural or Standard.

But without changing the parameter names (a bit of a breaking change), and assuming voiceType carries just the name part, I think the line above should simply be:

voice.set('name', name_lang(language)+'-'+voiceType)
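For illustration, a minimal standalone sketch of the two name formats (the voice name below is just an example, and build_short_name stands in for whatever the library would do with name_lang, which maps 'en-gb' to 'en-GB'):

def build_short_name(language_tag, voice_type):
    # Proposed short format, matching the identifiers in the Microsoft voice list
    return language_tag + '-' + voice_type

print(build_short_name('en-GB', 'RyanNeural'))
# -> en-GB-RyanNeural
# versus the long format the library currently builds:
# Microsoft Server Speech Text to Speech Voice (en-GB, RyanNeural)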

remove sys.exit

Please remove sys.exit(1).
Instead, raise an exception, for example.
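A minimal sketch of what that could look like (the exception class and helper below are illustrative, not something the library currently defines; response is assumed to be a requests.Response):

class TTSRequestError(Exception):
    """Raised when the TTS service returns an unexpected response."""

def check_response(response):
    # Surface the failure to the caller instead of calling sys.exit(1)
    if response.status_code != 200:
        raise TTSRequestError(
            "TTS request failed with status {}: {}".format(
                response.status_code, response.text))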

Tag a new release

Could a new release be tagged? The latest tagged release is 1.0.6 from December 2020. Without a new release, the already-merged feature to support XML tags is not available to projects that incorporate this library (Home Assistant, for example). Thanks!

Implementation of this code

I am running core-2021.2.3 and supervisor-2021.02.9

I was wondering if this code was already implemented in those builds because I get the same error.

If it isn't implemented, is implementing this fix as simple as running the setup.py in my setup?

Error when using XML

I'm testing the new version (v1.0.7) with a development version of Home Assistant and am having trouble getting some XML to work. Including simple tags like <break time="2s" /> works as expected. However, when trying to use some of the style examples that Microsoft has, I'm getting an error.

Per Microsoft's documentation, this is a full example to configure the voice style:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"
       xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
    <voice name="en-US-JennyNeural">
        <mstts:express-as style="cheerful">
            That'd be just amazing!
        </mstts:express-as>
    </voice>
</speak>

It looks like your code is already setting the Speak and Voice tags, so I tried passing just the following text. But when I do, I get an error.

<mstts:express-as style="cheerful">That'd be just amazing!</mstts:express-as>
2022-11-07 19:22:09.633 ERROR (MainThread) [homeassistant.components.websocket_api.http.connection] [139668934221904] Error handling message: Unknown error (unknown_error) from 127.0.0.1 (Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:106.0) Gecko/20100101 Firefox/106.0)
Traceback (most recent call last):
  File "/home/steve/ha/home-assistant-core/homeassistant/components/websocket_api/decorators.py", line 27, in _handle_async_response
    await func(hass, connection, msg)
  File "/home/steve/ha/home-assistant-core/homeassistant/components/media_source/__init__.py", line 192, in websocket_resolve_media
    media = await async_resolve_media(hass, msg["media_content_id"])
  File "/home/steve/ha/home-assistant-core/homeassistant/components/media_source/__init__.py", line 155, in async_resolve_media
    return await item.async_resolve()
  File "/home/steve/ha/home-assistant-core/homeassistant/components/media_source/models.py", line 83, in async_resolve
    return await self.async_media_source().async_resolve_media(self)
  File "/home/steve/ha/home-assistant-core/homeassistant/components/tts/media_source.py", line 117, in async_resolve_media
    url = await manager.async_get_url_path(
  File "/home/steve/ha/home-assistant-core/homeassistant/components/tts/__init__.py", line 423, in async_get_url_path
    filename = await self._async_get_tts_audio(
  File "/home/steve/ha/home-assistant-core/homeassistant/components/tts/__init__.py", line 489, in _async_get_tts_audio
    extension, data = await provider.async_get_tts_audio(message, language, options)
  File "/home/steve/ha/home-assistant-core/homeassistant/components/tts/__init__.py", line 676, in async_get_tts_audio
    return await self.hass.async_add_executor_job(
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/steve/ha/home-assistant-core/homeassistant/components/microsoft/tts.py", line 188, in get_tts_audio
    data = trans.speak(
  File "/home/steve/ha/home-assistant-core/venv/lib/python3.10/site-packages/pycsspeechtts/pycsspeechtts.py", line 71, in speak
    voice.append(ElementTree.XML('<prosody>'+text+'</prosody>'))
  File "/usr/lib/python3.10/xml/etree/ElementTree.py", line 1342, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: unbound prefix: line 1, column 9

Microsoft's documentation on Style: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup#style
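For reference, here is a standalone sketch of why the parse fails and of one possible workaround; this is only a guess at where a fix could go, not a confirmed change to the library:

import xml.etree.ElementTree as ElementTree

fragment = '<mstts:express-as style="cheerful">That\'d be just amazing!</mstts:express-as>'

# What pycsspeechtts.py effectively does today; the mstts prefix is not declared
# anywhere in the fragment, so this raises "unbound prefix":
# ElementTree.XML('<prosody>' + fragment + '</prosody>')

# Declaring the namespace on the temporary wrapper lets the same fragment parse:
wrapped = ('<prosody xmlns:mstts="https://www.w3.org/2001/mstts">'
           + fragment + '</prosody>')
prosody = ElementTree.XML(wrapped)

# Note: ElementTree re-serializes the element with an auto-generated prefix (ns0)
# unless a prefix is registered via ElementTree.register_namespace.
print(ElementTree.tostring(prosody, encoding='unicode'))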

Please let me know what I can do to help troubleshoot this. Thanks!

Support for Azure TTS Multilingual Voices

Hello, this is not a problem, but a question or a request for support.

My problem is that multilingual voices have appeared in Azure TTS, and my question is whether this library can handle them. As I understand it, Home Assistant uses this component, and I am running into trouble when trying to use multilingual voices.

If it is not supported, I would ask whether it is possible to implement it. The quickest approach would be a secondary_language option that puts the whole text into this lang XML child:

Example:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
    <voice name="en-US-JennyMultilingualV2Neural">
        <lang xml:lang="de-DE">
            Wir freuen uns auf die Zusammenarbeit mit Ihnen!
        </lang>
    </voice>
</speak>

Of course, it could be made more sophisticated, such as mixing multiple languages within one text, but I would be very happy with this too :)
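A rough sketch (assumed names, not the library's actual code) of how such a secondary_language option could build the lang child with ElementTree:

import xml.etree.ElementTree as ElementTree

# The xml: prefix is predefined; using the qualified name serializes as xml:lang.
XML_LANG = '{http://www.w3.org/XML/1998/namespace}lang'

def append_lang(voice_element, secondary_language, text):
    # Wrap the whole text in a <lang xml:lang="..."> child of <voice>
    lang = ElementTree.SubElement(voice_element, 'lang')
    lang.set(XML_LANG, secondary_language)
    lang.text = text
    return lang

voice = ElementTree.Element('voice')
voice.set('name', 'en-US-JennyMultilingualV2Neural')
append_lang(voice, 'de-DE', 'Wir freuen uns auf die Zusammenarbeit mit Ihnen!')
print(ElementTree.tostring(voice, encoding='unicode'))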

Thank you in advance for your response.
Viktor

Support for SSML

I've been using a patched version of your library to accept SSML in the text, but it's kind of hard-coded:

diff -rNu pycsspeechtts_old/pycsspeechtts.py pycsspeechtts/pycsspeechtts.py
--- pycsspeechtts_old/pycsspeechtts.py  2020-12-29 02:06:51.014818639 +0100
+++ pycsspeechtts/pycsspeechtts.py      2020-12-29 01:51:28.205551472 +0100
@@ -49,7 +49,7 @@
         prosody.set('volume', volume)
         prosody.set('pitch', pitch)
         prosody.set('contour', contour)
-        prosody.text = text
+        prosody.append(ElementTree.fromstring(text))

         headers = {"Content-Type": "application/ssml+xml",
                    "X-Microsoft-OutputFormat": output,

Would you accept a PR for this? It should be backwards compatible in some ways, but the text must be valid XML, otherwise it will throw an exception. Or maybe add a boolean parameter (xml_text=True/False)? What do you think?
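A minimal sketch of that idea, with xml_text as the proposed flag (the helper below is illustrative, not part of the library):

import xml.etree.ElementTree as ElementTree

def set_prosody_content(prosody, text, xml_text=False):
    if xml_text:
        # Caller promises well-formed XML; invalid markup raises ParseError.
        prosody.append(ElementTree.fromstring(text))
    else:
        # Current behaviour: plain text, escaped on serialization.
        prosody.text = text
    return prosody

prosody = ElementTree.Element('prosody')
set_prosody_content(prosody, '<break time="2s" />', xml_text=True)
print(ElementTree.tostring(prosody, encoding='unicode'))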

Thanks!

Moving forward to new Speech Services

Hi Jeroen,

I tried to configure my Home Assistant for TTS via the Microsoft TTS component. I failed because I had configured an instance of the new Speech Services instead of the Bing Text to Speech service. I know the Cognitive Speech Services are still in preview, but I wanted to start the discussion on how to move forward.

Do we need to build a new package like this? Do we need a second component in HA?

Shall I update the HA docs to make the distinction between the new and old service more clear?

README.md is missing from the PyPI sdist archive

This missing file leads to a build failure:

>>> Jobs: 0 of 1 complete, 1 failed                 Load avg: 1.31, 0.93, 0.91
 * Package:    dev-python/pycsspeechtts-1.0.4
 * Repository: HomeAssistantRepository
 * Maintainer: [email protected]
 * USE:        abi_x86_64 amd64 elibc_glibc kernel_linux python_targets_python3_8 test userland_GNU
 * FEATURES:   network-sandbox preserve-libs sandbox userpriv usersandbox
>>> Unpacking source...
>>> Unpacking pycsspeechtts-1.0.4.tar.gz to /var/tmp/portage/dev-python/pycsspeechtts-1.0.4/work
>>> Source unpacked in /var/tmp/portage/dev-python/pycsspeechtts-1.0.4/work
>>> Preparing source in /var/tmp/portage/dev-python/pycsspeechtts-1.0.4/work/pycsspeechtts-1.0.4 ...
>>> Source prepared.
>>> Configuring source in /var/tmp/portage/dev-python/pycsspeechtts-1.0.4/work/pycsspeechtts-1.0.4 ...
>>> Source configured.
>>> Compiling source in /var/tmp/portage/dev-python/pycsspeechtts-1.0.4/work/pycsspeechtts-1.0.4 ...
 * python3_8: running distutils-r1_run_phase distutils-r1_python_compile
python3.8 setup.py build -j 10
Traceback (most recent call last):
  File "setup.py", line 2, in <module>
    with open("../README.md", "r") as fh:
FileNotFoundError: [Errno 2] No such file or directory: '../README.md'
 * ERROR: dev-python/pycsspeechtts-1.0.4::HomeAssistantRepository failed (compile phase):
 *   (no error message)
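One way to remove the hard dependency on ../README.md would be something along these lines (a sketch, not the project's actual setup.py; README.md would also need to be shipped in the sdist, for example via MANIFEST.in):

import os
from setuptools import setup, find_packages

# Read README.md relative to setup.py itself and tolerate its absence,
# so building from an sdist without the file does not crash.
here = os.path.abspath(os.path.dirname(__file__))
long_description = ""
readme_path = os.path.join(here, "README.md")
if os.path.exists(readme_path):
    with open(readme_path, "r", encoding="utf-8") as fh:
        long_description = fh.read()

setup(
    name="pycsspeechtts",
    version="1.0.4",
    packages=find_packages(),
    long_description=long_description,
    long_description_content_type="text/markdown",
)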

License file missing

The project contains no license file. The only license hint is MIT in setup.py.

It would be helpful if the project contained the correct license file and also included it in the tarball found on PyPI.

Custom voice usage

Hi there,
I'm trying my best to learn to code, but making a PR for this is a bit beyond me right now.

Would it be possible to create a true/false variable:

is_custom_voice: false

When the variable is true, an extra field is needed:

deploymentId=XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX

This would then be used to construct the SpeechUrlTemplate:

SpeechUrlTemplate = "https://{}.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId=XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX"

Note the voice. and the ?deploymentId=
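A sketch of how the endpoint could be chosen (function and parameter names are just illustrations of the idea, not the library's current API):

def build_speech_url(region, is_custom_voice=False, deployment_id=None):
    if is_custom_voice:
        # Custom voice lives under <region>.voice.speech.microsoft.com and
        # needs the deploymentId query parameter.
        return ("https://{}.voice.speech.microsoft.com/cognitiveservices/v1"
                "?deploymentId={}").format(region, deployment_id)
    return "https://{}.tts.speech.microsoft.com/cognitiveservices/v1".format(region)

print(build_speech_url("westeurope"))
print(build_speech_url("westeurope", True, "XXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX"))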

It's very niche, I know :)
