
chatglm-web's Introduction

ChatGLM Web


Introduction

The default model has been changed to ChatGLM2-6B.

This is a ChatGLM web UI that you can deploy locally. It uses the ChatGLM-6B model to achieve conversation quality close to ChatGPT. The source code is forked from and modified upon Chanzhaoyu/chatgpt-web & WenJing95/chatgpt-web, together with the open-source model ChatGLM.

Compared with ChatGPT, ChatGLM Web has the following advantages:

  1. Standalone deployment: ChatGLM Web only needs a server that can run the ChatGLM-6B model, and you can use your own fine-tuned GLM model.
  2. Fully offline: ChatGLM Web depends only on the ChatGLM-6B model, so it can run in offline or intranet environments.

Roadmap

[✗] Support models such as ChatGLM and LLaMA

[✓] Catch up with the original repo's features (access control, images, message import/export, Prompt Store)

[✗] Knowledge-base Q&A via LangChain

[✗] More...

Quick Deployment

If you don't need to develop the project yourself and only want to deploy and use it, skip straight to "Start with the latest Docker image" (to be completed).

Development Environment Setup

Node

Node ^16 || ^18 is required (node >= 14 additionally needs a fetch polyfill installed). nvm can be used to manage multiple local node versions.

node -v
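
For example, with nvm (an illustrative sketch, assuming nvm is already installed):

nvm install 18
nvm use 18
node -v   # should now print v18.x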

PNPM

If you have not installed pnpm yet:

npm install pnpm -g

Python

Python 3.8 or later is required. Enter the /service folder and run:

pip install --no-cache-dir -r requirements.txt
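
A virtual environment keeps these dependencies isolated; a minimal sketch (optional, not required by the project):

cd service
python3 -m venv venv
source venv/bin/activate
pip install --no-cache-dir -r requirements.txt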

Running the Project in Development

Backend Service

Hardware requirements (from the official ChatGLM-6B repository):

Quantization level       Min GPU memory (inference)   Min GPU memory (parameter-efficient fine-tuning)
FP16 (no quantization)   13 GB                        14 GB
INT8                     8 GB                         9 GB
INT4                     6 GB                         7 GB
# To use the knowledge-base feature, run this before starting the API
python gen_data.py
# Enter the /service folder and run:
python main.py

The following optional arguments are also available:

  • device: device to run on, cpu or gpu
  • quantize: quantization level. Options: 16, 8, 4; default 16
  • host: HOST, default 0.0.0.0
  • port: PORT, default 3002

In other words, you can start it like this (if you change the port here, the frontend must be updated accordingly; using the default port is recommended):

python main.py --device='cuda:0' --quantize=16 --host='0.0.0.0' --port=3002
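
For reference, the effect of --device and --quantize on model loading looks roughly like the sketch below. It follows the call visible in the issue tracebacks further down (AutoModel.from_pretrained(...).half().quantize(...).cuda()) and the ChatGLM model card; the model path here is a placeholder, and main.py's exact logic may differ:

# A sketch of how the CLI flags map onto model loading; not the repo's exact main.py.
from transformers import AutoModel, AutoTokenizer

model_path = "THUDM/chatglm2-6b"  # placeholder; point to a local copy for offline use
device = "gpu"                    # corresponds to --device
quantize = 16                     # corresponds to --quantize (16, 8, or 4)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
if quantize in (8, 4):
    model = model.quantize(quantize)  # ChatGLM's built-in INT8/INT4 quantization
# half precision on GPU; full float32 on CPU
model = model.half().cuda() if device == "gpu" else model.float()
model = model.eval()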

Frontend Web UI

Run the following commands in the project root:

# The frontend's default port is 3000 and it talks to the backend on port 3002 by default; both can be changed in the .env and .vite.config.ts files
pnpm bootstrap
pnpm dev

Packaging as a Docker Container

-- To be updated

FAQ

Q: Why do my Git commits keep failing?

A: Commit messages are validated; please follow the Commit Guidelines.
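
If the guide follows the common Conventional Commits format (an assumption here; the linked guide is authoritative), a passing message looks like:

git commit -m "feat: support message export"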

Q: If I only use the frontend, where do I change the request endpoint?

A: The VITE_GLOB_API_URL field in the .env file in the project root.
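
For example (an illustrative value; the URL must match wherever your backend service is actually reachable):

# .env
VITE_GLOB_API_URL=http://localhost:3002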

Q: Every file lights up with errors on save?

A: In VS Code, install the project's recommended extensions, or install the ESLint extension manually.

Q: No typewriter (streaming) effect on the frontend?

A: One possible cause is an Nginx reverse proxy with buffering enabled: Nginx buffers a certain amount of data from the backend before sending it to the browser. Try adding proxy_buffering off; to the reverse-proxy configuration and then reload Nginx. The same idea applies to other web servers.
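
For example (an illustrative snippet; the location path and upstream address are assumptions that must match your deployment):

location / {
    proxy_pass http://127.0.0.1:3002;
    proxy_buffering off;
}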

Q: When building the Docker container, it fails with exec entrypoint.sh: no such file or directory?

A: The entrypoint.sh file must use LF line endings, not CRLF. If you have edited the file in an IDE that saves CRLF, the build can fail. Use the dos2unix tool to convert CRLF back to LF.
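
For example:

dos2unix entrypoint.sh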

Contributing

Please read the Contributing Guidelines before contributing.

Thanks to the original author Chanzhaoyu and everyone who has contributed, as well as the open-source model ChatGLM.

Sponsorship

If you find this project helpful, please give it a Star.

If circumstances allow, please support the original author Chanzhaoyu.

License

MIT © NCZkevin


chatglm-web's Issues

python main.py raises the following error; please help take a look

Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
File "/root/chat/lib/python3.11/site-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/urllib3/connection.py", line 454, in getresponse
httplib_response = super().getresponse()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/python3.11/lib/python3.11/http/client.py", line 1374, in getresponse
response.begin()
File "/usr/python3.11/lib/python3.11/http/client.py", line 318, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/usr/python3.11/lib/python3.11/http/client.py", line 287, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/chat/lib/python3.11/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/urllib3/connectionpool.py", line 844, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/urllib3/util/retry.py", line 470, in increment
raise reraise(type(error), error, _stacktrace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/urllib3/util/util.py", line 38, in reraise
raise value.with_traceback(tb)
File "/root/chat/lib/python3.11/site-packages/urllib3/connectionpool.py", line 790, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/urllib3/connectionpool.py", line 536, in _make_request
response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/urllib3/connection.py", line 454, in getresponse
httplib_response = super().getresponse()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/python3.11/lib/python3.11/http/client.py", line 1374, in getresponse
response.begin()
File "/usr/python3.11/lib/python3.11/http/client.py", line 318, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/usr/python3.11/lib/python3.11/http/client.py", line 287, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/chatglm-web/service/main.py", line 178, in
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 663, in from_pretrained
tokenizer_class = get_class_from_dynamic_module(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 388, in get_class_from_dynamic_module
final_module = get_cached_module_file(
^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 286, in get_cached_module_file
commit_hash = model_info(pretrained_model_name_or_path, revision=revision, token=use_auth_token).sha
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 1675, in model_info
r = get_session().get(path, headers=headers, timeout=timeout, params=params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/requests/sessions.py", line 602, in get
return self.request("GET", url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/chat/lib/python3.11/site-packages/requests/adapters.py", line 501, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

404 Not Found

Using the backend API from https://github.com/THUDM/ChatGLM-6B together with this project's frontend.
The backend log shows:
INFO: Started server process [7348]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:3002 (Press CTRL+C to quit)
INFO: 127.0.0.1:50608 - "POST /chat-process HTTP/1.1" 404 Not Found
INFO: 127.0.0.1:50609 - "POST /chat-process HTTP/1.1" 404 Not Found
The frontend error is:
2023/4/16 15:03:40: 你好

2023/4/16 15:03:40: Request failed with status code 404

How can this be solved?

Local knowledge base

How do I set up the local knowledge base? What do I need to do so that it answers questions based on my own custom knowledge base?

Help: what is this problem?

CUDA is installed and working.

Traceback (most recent call last):
File "D:\chatglm-web-main\service\main.py", line 186, in
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().quantize(quantize).cuda()
File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b\658202d88ac4bb782b99e99ac3adff58b4d0b813\modeling_chatglm.py", line 1434, in quantize
self.transformer = quantize(self.transformer, bits, empty_init=empty_init, **kwargs)
File "C:\Users\Administrator/.cache\huggingface\modules\transformers_modules\THUDM\chatglm-6b\658202d88ac4bb782b99e99ac3adff58b4d0b813\quantization.py", line 159, in quantize
weight_tensor=layer.attention.query_key_value.weight.to(torch.cuda.current_device()),
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda_init_.py", line 674, in current_device
lazy_init()
File "C:\Users\Administrator\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda_init
.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

chatglm-web feature suggestions and ideas

Suggestions are welcome. Current ideas for future work include:

  1. Catch up with some features of the original chatgpt-web repo, including access control and the Prompt Store
  2. Support side-by-side comparison against ChatGPT (dual-model)
  3. Data import and export

connect ECONNREFUSED

Frontend
➜ Local: http://localhost:3000/
➜ Network: http://192.168.1.9:3000/
➜ Network: http://172.28.80.1:3000/
➜ press h to show help
11:08:46 [vite] http proxy error at /chat-process:
Error: connect ECONNREFUSED ::1:3002
at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1494:16)
Backend
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 8/8 [00:15<00:00, 1.89s/it]
INFO: Started server process [17736]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:3002 (Press CTRL+C to quit)

WARNING - asyncio - socket.send() raised exception.

After my client stops receiving data, the server throws a flood of errors:
07/18/2023 16:14:17 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:17 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:17 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:18 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:18 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:18 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:18 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:18 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:18 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:18 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:19 - WARNING - asyncio - socket.send() raised exception.
07/18/2023 16:14:19 - WARNING - asyncio - socket.send() raised exception.

frontend root_path support?

  • I have read this README and the upstream project's README
  • There is no similar existing issue

Could you tell me whether the frontend can be mounted under a subpath?
For example, moving the home page from http://chatglm.nczkevin.com/ to http://nczkevin.com/chatglm/.
When simply reverse-proxying the frontend, the main problem is that main.js and src/client.js have path issues.

The service won't start after cloning

python main.py --device='cpu' --host='127.0.0.1' --port='7680'
Traceback (most recent call last):
File "/Users/weicheng/Documents/Dev/llm/chatglm-web/service/main.py", line 16, in
import knowledge
File "/Users/weicheng/Documents/Dev/llm/chatglm-web/service/knowledge.py", line 4, in
ix = storage.open_index()
File "/opt/homebrew/lib/python3.10/site-packages/whoosh/filedb/filestore.py", line 176, in open_index
return indexclass(self, schema=schema, indexname=indexname)
File "/opt/homebrew/lib/python3.10/site-packages/whoosh/index.py", line 421, in init
TOC.read(self.storage, self.indexname, schema=self._schema)
File "/opt/homebrew/lib/python3.10/site-packages/whoosh/index.py", line 618, in read
raise EmptyIndexError("Index %r does not exist in %r"
whoosh.index.EmptyIndexError: Index 'MAIN' does not exist in FileStorage('knowdata')
