discordier / sam

This project forked from s-macke/sam

Software Automatic Mouth - Tiny Speech Synthesizer

Home Page: https://discordier.github.io/sam/

JavaScript 94.92% HTML 4.54% Makefile 0.48% Dockerfile 0.06%

speech-synthesis c64 reciter phonemes software-automatic-mouth

SAM Software Automatic Mouth

What is SAM?

This is a vanilla Javascript port of the Text-To-Speech (TTS) software SAM (Software Automatic Mouth) for the Commodore C64 published in the year 1982 by Don't Ask Software (now SoftVoice, Inc.).

It is based on the adaptation to C by Stefan Macke and the refactorings by Vidar Hokstad and 8BitPimp.

It includes a Text-To-Phoneme converter called reciter and a Phoneme-To-Speech routine for the final output.

It aims for a low memory footprint and small file size, which is why I want to avoid the Emscripten conversion by Stefan (which weighs about 414 KB).

For further details, refer to retrobits.net.

An analysis of S.A.M. in general can be found on the blog of Artyom Skrobov (@tyomitch), who also contributed some insightful PRs. See his post at https://habr.com/ru/post/500764/ (Russian) or the Google-translated version here.

Usage

Install the module via yarn (run this in a terminal): yarn add sam-js

Use it in your program:

import SamJs from 'sam-js';

let sam = new SamJs();

// Play "Hello world" over the speaker.
// This returns a Promise resolving after playback has finished.
sam.speak('Hello world');

// Generate a wave file containing "Hello world" and download it.
sam.download('Hello world');

// Render the passed text as an 8-bit wave buffer array (Uint8Array).
const buf8 = sam.buf8('Hello world');

// Render the passed text as a 32-bit wave buffer array (Float32Array).
const buf32 = sam.buf32('Hello world');

Typical voice values

DESCRIPTION          SPEED     PITCH     THROAT    MOUTH
Elf                   72        64        110       160
Little Robot          92        60        190       190
Stuffy Guy            82        72        110       105
Little Old Lady       82        32        145       145
Extra-Terrestrial    100        64        150       200
SAM                   72        64        128       128
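As a minimal sketch, the table above can be kept as a lookup of presets. The `VOICES` object and the `voiceOptions` helper are hypothetical names introduced here for illustration, and passing the result to the `SamJs` constructor under the option names `speed`, `pitch`, `throat` and `mouth` is an assumption that the text above does not confirm.

```javascript
// Presets copied from the "Typical voice values" table.
const VOICES = {
  'Elf':               { speed: 72,  pitch: 64, throat: 110, mouth: 160 },
  'Little Robot':      { speed: 92,  pitch: 60, throat: 190, mouth: 190 },
  'Stuffy Guy':        { speed: 82,  pitch: 72, throat: 110, mouth: 105 },
  'Little Old Lady':   { speed: 82,  pitch: 32, throat: 145, mouth: 145 },
  'Extra-Terrestrial': { speed: 100, pitch: 64, throat: 150, mouth: 200 },
  'SAM':               { speed: 72,  pitch: 64, throat: 128, mouth: 128 },
};

// Hypothetical helper: return a fresh copy of a named preset.
function voiceOptions(name) {
  const preset = VOICES[name];
  if (!preset) {
    throw new Error(`Unknown voice: ${name}`);
  }
  return { ...preset };
}

// Assumed usage (option names are an assumption, see the note above):
// const sam = new SamJs(voiceOptions('Little Robot'));
// sam.speak('Hello world');
```

Returning a copy rather than the stored object keeps the presets intact if a caller tweaks a single value afterwards.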

Original docs

I have bundled a copy of the original manual in this repository, see the manual file in the docs directory.

License

The software is a reverse-engineered version of commercial software published more than 30 years ago. The current copyright holder is SoftVoice, Inc. (www.text2speech.com).

Every attempt to contact the company has failed. The website was last updated in 2009. The status of the original software can therefore best be described as abandonware (http://en.wikipedia.org/wiki/Abandonware).

As long as this is the case, I cannot put my code under any specific open source software license. Use it at your own risk.

Contact

If you have questions, don't hesitate to ask me. If you have discovered something new about the code, please file an issue.

sam's Issues

Rebuild to NPM without loads of console.log

It would be really awesome if the package could simply be rebuilt with the build command run without development flags, so that it is actually usable in a server application. To use it now, I basically have to take the output, copy it into my own private repo, and remove the PrintOutput function and other logging nightmares so it's usable. Luckily the dist is pretty small. I won't explain how you can use this in a server; just know that the logging is pretty annoying.

Question about the "pitch" scale

Hello! I want to explore making SAM sing for a project. I'm wondering what the pitch system is based on. Is there any connection to the 12-TET system, or is there a chart that shows the frequencies? I would love to learn more about it.

Expose a toWavBuffer function

When a server returns the contents of sam.buf8(...), the result does not play automatically on the client.

It would be nice if there was a sam.toWavBuffer() function for this and similar use cases.

The code for it already exists in the RenderBuffer function found here, so an implementation of it could just be a refactor of that function:

const toWavBuffer = (audiobuffer: Uint8Array) => {
    // This function is basically a refactor of RenderBuffer

    const text2Uint8Array = (text: string) => {
        const buffer = new Uint8Array(text.length);
        text.split("").forEach((e, index) => {
            buffer[index] = e.charCodeAt(0);
        });
        return buffer;
    };

    // Calculate buffer size.
    const realbuffer = new Uint8Array(
            // ...
    );

    let pos = 0;
    const write = (buffer: Uint8Array) => {
        realbuffer.set(buffer, pos);
        pos += buffer.length;
    };

    //RIFF header
    write(text2Uint8Array("RIFF")); // chunkID
    write(Uint32ToUint8Array(audiobuffer.length + 12 + 16 + 8 - 8)); // ChunkSize
    write(text2Uint8Array("WAVE")); // riffType
    //format chunk
    write(text2Uint8Array("fmt "));
    write(Uint32ToUint8Array(16)); // ChunkSize
    write(Uint16ToUint8Array(1)); // wFormatTag - 1 = PCM
    write(Uint16ToUint8Array(1)); // channels
    write(Uint32ToUint8Array(22050)); // samplerate
    write(Uint32ToUint8Array(22050)); // bytes/second
    write(Uint16ToUint8Array(1)); // blockalign
    write(Uint16ToUint8Array(8)); // bits per sample
    //data chunk
    write(text2Uint8Array("data"));
    write(Uint32ToUint8Array(audiobuffer.length)); // buffer length
    write(audiobuffer);

    return realbuffer;
}

const DownloadBuffer = (audiobuffer: Uint8Array) => {
     // Wrap the raw samples in a WAV container before creating the blob.
     const blob = new Blob([toWavBuffer(audiobuffer)], { type: "audio/wav" });
     // Other code to download buffer
}

Any thoughts on it? I understand if you do not want to expose that in the API for the sake of simplicity.

I would be more than glad to open a PR.

Also thank you so much for making this library 🙏
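For illustration, here is a self-contained JavaScript sketch of the same idea that uses a DataView instead of the repository's Uint32ToUint8Array and Uint16ToUint8Array helpers. It is not the library's implementation. It assumes unsigned 8-bit mono samples at 22050 Hz, as in the header fields quoted above, and it writes the RIFF chunk size as 36 plus the data length, which is the value for a canonical 44-byte PCM header.

```javascript
// Wrap unsigned 8-bit mono PCM samples in a 44-byte RIFF/WAVE header.
function toWavBuffer(samples, sampleRate = 22050) {
  const headerSize = 44;
  const out = new Uint8Array(headerSize + samples.length);
  const view = new DataView(out.buffer);

  // Write an ASCII tag such as "RIFF" at the given byte offset.
  const writeText = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      out[offset + i] = text.charCodeAt(i);
    }
  };

  writeText(0, 'RIFF');
  // 36 = 4 ("WAVE") + 24 (fmt chunk) + 8 (data chunk header)
  view.setUint32(4, 36 + samples.length, true);
  writeText(8, 'WAVE');

  writeText(12, 'fmt ');
  view.setUint32(16, 16, true);          // size of the fmt chunk body
  view.setUint16(20, 1, true);           // format tag: 1 = PCM
  view.setUint16(22, 1, true);           // channels
  view.setUint32(24, sampleRate, true);  // samples per second
  view.setUint32(28, sampleRate, true);  // bytes per second (1 byte per sample)
  view.setUint16(32, 1, true);           // block align
  view.setUint16(34, 8, true);           // bits per sample

  writeText(36, 'data');
  view.setUint32(40, samples.length, true);
  out.set(samples, headerSize);
  return out;
}
```

All multi-byte fields are little-endian, which is why every DataView setter passes `true` as its last argument.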

Exported WAV files are low quality

It appears that the exported wave files (created with the this.download function) are only 8-bit. They sound much worse than the in-browser spoken output. I spent some time in the code modifying both the wave file header and the this.buf8 and this.buf32 functions, but I couldn't find a way to make it sound better.

Typo: Dipthong

"Dipthong" is actually spelled "diphthong". The typo is all over the code.

Not working

Sometimes when I enter a sentence, it won't work no matter how many times I press the button.

Comment: '*' in the phoneme isn't a wildcard

In the comments, it's noted that the character '*' in the second position of a phoneme, such as 'R*', is a "wildcard".

This is a bit misleading, as "wildcard" would indicate that "S*" should match "SH". This isn't the case.

The '*' in the second character position indicates the phoneme is only a single character long.

When matching phonemes, SAM first checks the two characters at the current buffer position against the two-character phonemes (that is, the ones without a '*' in the second position). If this fails, SAM then checks the single character at the buffer position against all the single-character phonemes (the ones that do have a '*' in the second position).
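The two-pass lookup described above can be sketched as follows. This is an illustration only: the `PHONEMES` list is a tiny made-up subset rather than SAM's real table, and `matchPhoneme` and `tokenize` are hypothetical names.

```javascript
// Tiny illustrative subset of a phoneme table. A '*' in the second
// position marks a phoneme that is only one character long.
const PHONEMES = ['S*', 'SH', 'R*', 'IY', 'H*', 'CH'];

const TWO_CHAR = PHONEMES.filter((p) => p[1] !== '*');
const ONE_CHAR = PHONEMES.filter((p) => p[1] === '*');

// Match one phoneme at `pos`, or return null if nothing matches.
function matchPhoneme(input, pos) {
  // Pass 1: try the two characters at the current position.
  const pair = input.slice(pos, pos + 2);
  if (TWO_CHAR.includes(pair)) {
    return { phoneme: pair, length: 2 };
  }
  // Pass 2: fall back to the single-character phonemes.
  const single = input[pos] + '*';
  if (ONE_CHAR.includes(single)) {
    return { phoneme: single, length: 1 };
  }
  return null;
}

// Split a phoneme string into table entries.
function tokenize(input) {
  const result = [];
  let pos = 0;
  while (pos < input.length) {
    const match = matchPhoneme(input, pos);
    if (!match) {
      throw new Error(`No phoneme matches at position ${pos}`);
    }
    result.push(match.phoneme);
    pos += match.length;
  }
  return result;
}
```

With this subset, `tokenize('SHR')` yields `SH` followed by `R*`, while `tokenize('SIY')` yields `S*` followed by `IY`, because `SI` is not a two-character phoneme and the lookup falls back to the single-character entry.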

Some strings cause an infinite loop

When certain strings are entered, an infinite loop occurs: nothing is spoken, the window hangs, and the developer console is flooded with output. Tested on the most current versions of both Firefox and Chrome.

Steps to replicate:

Go to https://discordier.github.io/sam/
Type SoftwareSoftwareSoftwareSoftware in the text box
Press "Say" button

Other strings that produce same results:
dhaaaxaxaxaxaxaxaxaxaxaxaxaxax
CurrentlyCurrentlyCurrentlyCurrently

It seems to have to do with the repetition of characters with no spaces. If you add spaces in between the words it works fine. (ex: Currently Currently Currently Currently)

Here is a sample of the console output:

How do I download the audio?

I don't know how to download the audio.

Step 1: you and a friend are in an abandoned psychiatric hospital

Any way to generate phonemes for lip sync animation?

I'm not seeing a way

Update required libraries to include libsdl1.2-dev in the compile section

It took me about 15 minutes before I finally figured out that libsdl1.2-dev was the package required to compile the code and get it to work.
Even if this doesn't get fixed, those who are going to do this will at least know what they are missing instead of getting confused.

SyntaxError: Cannot use import statement outside a module

Import doesn't work.

Help plz. I'm a complete zero in js.

Allowed USAGE

It is hard for me to verify the legitimacy of this, but it appears that at some point someone reached out to Joseph Katz regarding either this project or a similar one. I can't find contact info for Joseph, but it appears he is willing to grant individuals permission to use the software for non-commercial use. I've been doing heavy research into tracking him down, as I wanted to use it in something commercial, but I'm not sure what the answer would be on that.

In any case it might be worth looking into.

https://nvda.groups.io/g/nvda/topic/softvoice_information_request/5170758

Didn't work

I copied the code, yet it says: "[error] Line 1: Uncaught SyntaxError: Unexpected identifier 'add'".
Is something missing, or is this code not meant for JavaScript?

Best way to call this from C++ code?

The original C version of SAM is pretty buggy - this one seems to be far better maintained, but it's in JavaScript. Any idea how I can easily embed this in a C++ (MSVC) program?

How to get Sam's speech in WAV Data URL format?

I need this for one of my projects. I will be very grateful if you help!
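One possible approach, sketched here under assumptions: once the bytes of a complete WAV file are available (header plus samples, not just the raw sample array), they can be base64-encoded into a data URL. The `bytesToDataUrl` helper is a hypothetical name and not part of the library. It relies on the global `btoa`, which is available in browsers and in recent Node.js versions.

```javascript
// Encode a byte array as a base64 data URL.
// `bytes` is assumed to already hold a complete WAV file.
function bytesToDataUrl(bytes, mimeType = 'audio/wav') {
  let binary = '';
  // Build a binary string one byte at a time; this avoids the argument
  // limit that String.fromCharCode(...bytes) can hit on large buffers.
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}
```

The resulting string can be assigned to the `src` attribute of an audio element.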

The speech doesn't restart when you hit "say" twice

This isn't that bad, but when using longer messages and then changing them, you'd expect hitting "say" again to restart the speech. Instead, it starts speaking from the beginning while the previous playback continues, creating a jumbled mess. (Of course, refreshing the page fixes this, but I just wanted to point it out.)

Question about using SAM on the web

Hello!

I'd like to use SAM in a web game, so I have a question. Did I understand correctly that I can use the samjs.js file as a standalone script? Do I need any other scripts for SAM to work in the browser? I've experimented in Chrome, and TTS works with this one file, but maybe I'm missing something...

One more question: how can I enable the phonetic mode?

Thanks

Comment: "Frequency" data is actually a stride value

The values in the frequencyData table are referred to in the comments as frequency values, which isn't strictly true.

They're actually stride values. Per your comments in the rendering sections, the original version of SAM used a pre-calculated sine table instead of calling the sin() function.

By using a stride value, SAM is able to find the next sine value for each formant simply by adding the stride value to the table index (and, of course, wrapping at the end of the table).
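The stride idea can be sketched as a small table-driven oscillator. This is an illustration, not SAM's rendering code: the table size of 256, the amplitude of 127, and the names `sineTable` and `renderFormant` are all assumptions made for the example.

```javascript
// Pre-calculated sine table, one full period in TABLE_SIZE steps.
const TABLE_SIZE = 256;
const sineTable = Array.from({ length: TABLE_SIZE }, (_, i) =>
  Math.round(127 * Math.sin((2 * Math.PI * i) / TABLE_SIZE))
);

// Produce `count` samples by stepping through the table with `stride`.
function renderFormant(stride, count) {
  const samples = [];
  let phase = 0;
  for (let i = 0; i < count; i++) {
    samples.push(sineTable[phase]);
    // Advance by the stride and wrap at the end of the table.
    phase = (phase + stride) % TABLE_SIZE;
  }
  return samples;
}
```

A larger stride walks through the table in fewer steps, so the period is shorter and the resulting tone is higher. In that sense the stride is proportional to a frequency without being one.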