mai_ | Whisper to ChatGPT and Claude.ai is a browser extension that enables voice interaction with ChatGPT and Claude.ai in Chrome and other Chromium-based browsers (e.g. Edge). For speech-to-text transcription it can use Whisper, an AI model developed by OpenAI, or the transcription method built into the browser (webkitSpeechRecognition). The extension activates when you visit chat.openai.com or claude.ai.
- You can talk to the chat by speaking into the microphone, and its responses are read out loud.
- You can enable an option that reads the entire chat conversation out loud, or only the last response.
- You can highlight a text fragment in the chat thread and have that fragment read out loud.
- In the extension configuration, you can set voice parameters: the language in which you converse with the chat, the voice, voice pitch, reading speed, and the voice transcription method.
Check out this video demo to see the extension in action:
https://youtu.be/LN7LakWMjp8?si=nBo6j2vi9eocme6F
Below are a few screenshots that showcase the extension's features:
This is the settings menu where you can customize various aspects of the extension to suit your preferences.
https://chromewebstore.google.com/detail/mai-whisper-to-chatgpt-an/eikfokiiajomccicnkljhdkgeaoicmem
If you'd like to install the extension manually before it's available in the Chrome Web Store, follow these steps:
- Download the source code from this GitHub repository.
- Unzip the downloaded file to your preferred location.
- Open the Chrome browser and navigate to chrome://extensions/.
- Enable "Developer mode" by toggling the switch in the upper-right corner.
- Click on the "Load unpacked" button that appears.
- Select the unzipped folder of the extension's source code.
- The extension should now be installed and visible in your list of extensions.
Note: Since the extension is installed manually, it won't automatically update. To update, you'll need to download the latest version from this repository and repeat the above steps.
The extension offers two methods of voice transcription:
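- webkitSpeechRecognition - the default method, using the Chrome browser's API. No API key is required. In Chrome, recognition is handled by the browser's own speech service, which typically sends the audio to a server for processing rather than working fully offline. Note: This method does not add punctuation and is supported only in the Chrome browser.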
- Whisper - an AI model developed by OpenAI for speech-to-text transcription. It offers high-quality transcription with proper punctuation. It requires an OpenAI API key, and usage is billed according to OpenAI pricing. Transcription is performed on OpenAI servers, so the audio recording is sent to them.
For reading text aloud, the extension uses speechSynthesis, an API provided by Chrome and other Chromium-based browsers (e.g., Edge, Opera, Brave). This allows for speech synthesis (TTS) in offline mode, without data transmission. In the settings, you can choose the language and voice used for speech synthesis, as well as adjust other parameters, including voice pitch and reading speed.
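The voice parameters mentioned above correspond to properties of the browser's SpeechSynthesisUtterance object. The following is a minimal sketch, not the extension's actual code: the `VoiceSettings` shape and the `applyVoiceSettings` helper are hypothetical names chosen for illustration. It clamps pitch and rate to the ranges defined by the Web Speech API (0 to 2 for pitch, 0.1 to 10 for rate).

```typescript
// Hypothetical settings shape; the extension's real option names are not shown in this README.
interface VoiceSettings {
  lang: string;  // BCP 47 language tag, e.g. "en-US"
  pitch: number; // Web Speech API range: 0 to 2, default 1
  rate: number;  // Web Speech API range: 0.1 to 10, default 1
}

// Minimal structural type matching the SpeechSynthesisUtterance properties used here,
// so the helper can also be exercised outside a browser.
interface UtteranceLike {
  lang: string;
  pitch: number;
  rate: number;
}

// Restricts a number to the closed interval [min, max].
const clamp = (value: number, min: number, max: number): number =>
  Math.min(max, Math.max(min, value));

// Copies the settings onto the utterance, keeping pitch and rate within the allowed ranges.
function applyVoiceSettings<T extends UtteranceLike>(utterance: T, settings: VoiceSettings): T {
  utterance.lang = settings.lang;
  utterance.pitch = clamp(settings.pitch, 0, 2);
  utterance.rate = clamp(settings.rate, 0.1, 10);
  return utterance;
}
```

In a browser, this could be used as `speechSynthesis.speak(applyVoiceSettings(new SpeechSynthesisUtterance(text), settings))`, where `text` is the chat response to read aloud.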
The extension does not collect or transmit any personal data. All settings are stored locally in your browser (localStorage). However, if you decide to use the "Whisper" transcription method and provide your OpenAI API key, the extension will communicate with OpenAI servers, sending audio recordings for transcription, and the OpenAI servers will return the transcription text.
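To make concrete what is sent when Whisper is selected, here is a minimal sketch of a request to OpenAI's public audio transcription endpoint. It is an illustration under assumptions, not the extension's actual code: the helper names, the `recording.webm` file name, and the audio format are hypothetical. Only the audio, the model name, an optional language hint, and your API key (in the Authorization header) are included.

```typescript
// Endpoint and model name of OpenAI's public audio transcription API.
const WHISPER_URL = "https://api.openai.com/v1/audio/transcriptions";

// Hypothetical helper: builds the request description without sending anything.
function buildWhisperRequest(audio: Blob, apiKey: string, language?: string) {
  const body = new FormData();
  body.append("file", audio, "recording.webm"); // file name and format are assumptions
  body.append("model", "whisper-1");
  if (language) body.append("language", language); // optional ISO 639-1 hint, e.g. "en"
  return {
    url: WHISPER_URL,
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body,
  };
}

// Hypothetical helper: sends the recording and returns the transcribed text.
async function transcribe(audio: Blob, apiKey: string, language?: string): Promise<string> {
  const { url, ...init } = buildWhisperRequest(audio, apiKey, language);
  const response = await fetch(url, init);
  if (!response.ok) throw new Error(`Transcription failed: HTTP ${response.status}`);
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

Separating request construction from sending makes it easy to inspect exactly which fields would leave the browser before any network call is made.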
This is an open-source project, which I am making available on GitHub under the GNU Affero General Public License v3.0: https://www.gnu.org/licenses/agpl-3.0.en.html
The inspiration for this extension came from the project talk-to-chatgpt by C-Nedelcu. I am very grateful to him for making it available! However, I needed an extension with slightly different functionality, and I also wanted to add support for OpenAI's Whisper model because of its excellent transcription. As I worked on it, more and more ideas came up, so I wrote this project from scratch with a completely different architecture, which I hope will make it easy to add new features and support for other pages and AI chats in the near future.
This project arose from curiosity, passion, and the joy of programming, but it also required a significant amount of time and effort. If you like this extension and find it useful, I am very pleased. If you would like to show your support, I would be very grateful for a donation to the Reborn Foundation (www.bosajoga.pl), where I teach yoga and spine therapy classes.
Your donation will support the statutory activity of our foundation, enabling the continuation of yoga classes and spine therapy sessions. The goal of these sessions is to improve the quality of life of participants by promoting physical and mental health. Your support will help cover the necessary operational costs associated with running our activities, which is crucial for continuing the mission and will allow us to grow.
Thank you very much in advance for your support!