ai-voice-chat's People
Forkers
jeffara 0xgoenka mabry1985 anon2578 gladiopeace no-ov loganhart02 parthroy huangyingting paperwave ssilwal29 lcouds prassanna-ravishankar hokhyk holygeek00 edustack mm86133 megeek nuffins
ai-voice-chat's Issues
Deployment issue
Hi,
The AI chat bot looks very interesting.
I tried to install it, but unfortunately it isn't working. Could you share a project demo, or has it been tested on your machine?
Thanks,
Santhosh
Offline version
It would be nice if you made a fully local version, able to work with a local LLM inference server running under llama.cpp.
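A rough sketch of what that could look like: llama.cpp ships an HTTP server with an OpenAI-compatible chat endpoint, so if the app's LLM base URL were made configurable it could point at a local instance. The model filename below is just an example; adapt it to whatever GGUF model you have.

```shell
# Start llama.cpp's bundled server on a local GGUF model (example filename).
./llama-server -m mistral-7b-instruct-v0.2.Q4_K_M.gguf --port 8080 &
sleep 5

# Smoke-test the OpenAI-compatible endpoint with the kind of request
# a chat frontend would send:
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hello"}]}'
```

The app would then only need its LLM endpoint swapped from the hosted backend to http://localhost:8080; everything runs offline after the model is downloaded.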
The program successfully started, but there was no response when speaking to the UI.
docker-compose up error
request returned Internal Server Error for API route and version http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dai-voice-chat%22%3Atrue%7D%7D, check if the server supports the requested API version
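For anyone debugging this: the %2F%2F.%2Fpipe%2Fdocker_engine path in the route URL-decodes to //./pipe/docker_engine, i.e. the Windows named-pipe endpoint of Docker Desktop, and /v1.24/ is the Engine API version the client negotiated. A quick way to see whether the client and daemon disagree on API versions, and to pin the client down as a workaround:

```shell
# Compare the API versions the client and the daemon advertise:
docker version --format 'client API: {{.Client.APIVersion}}  server API: {{.Server.APIVersion}}'

# If the daemon's API is older than what the client requests, pin the
# client to the daemon's version for this invocation:
DOCKER_API_VERSION=1.24 docker compose up
```

Restarting Docker Desktop (so the daemon and named pipe come back up cleanly) is often enough to clear this error as well.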
Minimum requirements for smooth functioning
Hi, I tried running the voice chat on my home computer and ended up with this:
llm-1 | {"timestamp":"2023-12-29T10:38:56.854053Z","level":"WARN","fields":{"message":"Unable to use Flash Attention V2: GPU with CUDA capability 7 5 is not supported for Flash Attention V2\n"},"target":"text_generation_launcher"}
llm-1 | {"timestamp":"2023-12-29T10:38:56.868376Z","level":"WARN","fields":{"message":"Could not import Mistral model: Mistral model requires flash attn v2\n"},"target":"text_generation_launcher"}
llm-1 | {"timestamp":"2023-12-29T10:38:57.633639Z","level":"ERROR","fields":{"message":"Error when initializing model\nTraceback (most recent call last):\n File \"/opt/conda/bin/text-generation-server\", line 8, in <module>\n sys.exit(app())\n File \"/opt/conda/lib/python3.9/site-packages/typer/main.py\", line 311, in __call__\n return get_command(self)(*args, **kwargs)\n File \"/opt/conda/lib/python3.9/site-packages/click/core.py\", line 1157, in __call__\n return self.main(*args, **kwargs)\n File \"/opt/conda/lib/python3.9/site-packages/typer/core.py\", line 778, in main\n return _main(\n File \"/opt/conda/lib/python3.9/site-packages/typer/core.py\", line 216, in _main\n rv = self.invoke(ctx)\n File \"/opt/conda/lib/python3.9/site-packages/click/core.py\", line 1688, in invoke\n return _process_result(sub_ctx.command.invoke(sub_ctx))\n File \"/opt/conda/lib/python3.9/site-packages/click/core.py\", line 1434, in invoke\n return ctx.invoke(self.callback, **ctx.params)\n File \"/opt/conda/lib/python3.9/site-packages/click/core.py\", line 783, in invoke\n return __callback(*args, **kwargs)\n File \"/opt/conda/lib/python3.9/site-packages/typer/main.py\", line 683, in wrapper\n return callback(**use_params) # type: ignore\n File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py\", line 83, in serve\n server.serve(\n File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py\", line 207, in serve\n asyncio.run(\n File \"/opt/conda/lib/python3.9/asyncio/runners.py\", line 44, in run\n return loop.run_until_complete(main)\n File \"/opt/conda/lib/python3.9/asyncio/base_events.py\", line 634, in run_until_complete\n self.run_forever()\n File \"/opt/conda/lib/python3.9/asyncio/base_events.py\", line 601, in run_forever\n self._run_once()\n File \"/opt/conda/lib/python3.9/asyncio/base_events.py\", line 1905, in _run_once\n handle._run()\n File \"/opt/conda/lib/python3.9/asyncio/events.py\", line 80, in _run\n self._context.run(self._callback, *self._args)\n> File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py\", line 159, in serve_inner\n model = get_model(\n File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py\", line 259, in get_model\n raise NotImplementedError(\"Mistral model requires flash attention v2\")\nNotImplementedError: Mistral model requires flash attention v2\n"},"target":"text_generation_launcher"}
llm-1 | {"timestamp":"2023-12-29T10:38:57.962449Z","level":"ERROR","fields":{"message":"Shard complete standard error output:\n\nTraceback (most recent call last):\n\n File \"/opt/conda/bin/text-generation-server\", line 8, in <module>\n sys.exit(app())\n\n File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py\", line 83, in serve\n server.serve(\n\n File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py\", line 207, in serve\n asyncio.run(\n\n File \"/opt/conda/lib/python3.9/asyncio/runners.py\", line 44, in run\n return loop.run_until_complete(main)\n\n File \"/opt/conda/lib/python3.9/asyncio/base_events.py\", line 647, in run_until_complete\n return future.result()\n\n File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py\", line 159, in serve_inner\n model = get_model(\n\n File \"/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py\", line 259, in get_model\n raise NotImplementedError(\"Mistral model requires flash attention v2\")\n\nNotImplementedError: Mistral model requires flash attention v2\n"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
So I guess my GPU is too old. Hence the question: what kind of hardware do I need for this voice chat to run smoothly? What have you tested it on?
Any help will be appreciated :)
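For context: the warning in the log means the GPU reports CUDA compute capability 7.5 (Turing generation, e.g. RTX 20xx / GTX 16xx cards), while Flash Attention v2, which text-generation-inference needs for Mistral, requires compute capability 8.0 or newer (Ampere generation or later, e.g. RTX 30xx, A100). You can check your card's capability directly; the compute_cap query field requires a reasonably recent NVIDIA driver.

```shell
# Report the GPU model and its CUDA compute capability.
# Flash Attention v2 needs compute_cap >= 8.0; the log above shows 7.5.
nvidia-smi --query-gpu=name,compute_cap --format=csv
```

If the capability is below 8.0, the Mistral shard will abort exactly as in the traceback, so the practical options are an Ampere-or-newer GPU or a model/backend that does not require Flash Attention v2.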