thudm / codegeex4

CodeGeeX4-ALL-9B, a versatile model for all AI software development scenarios, including code completion, code interpreter, web search, function calling, repository-level Q&A and much more.

Home Page: https://codegeex.cn

License: Apache License 2.0

Python 73.48% Rust 26.52%

codegeex4's Issues

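Local mode is unusable; the model does not support Qwen

I am using local mode with CodeQwen as the model. The plugin reports that the connection succeeded, but it cannot actually be used. How can this be resolved?

[Feature] Request to publish codegeex4-all-9b on the ollama platform

ollama has supported function calling since 0.2.7. Please publish codegeex4-all-9b on the ollama platform.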

Hey there! 👋 Your project CodeGeeX4 caught my eye on the trending list. Incredible job! 💪 Could you share what the most rewarding part of developing this project was? Looking forward to your insights!

set plugin to use local api

Do you plan to make the VS/JetBrains plugins support a locally deployed CodeGeeX service?

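With an overly long prompt, is Code Review quality worse than GLM4-9B?

system: You are a review bot. Please give a code review covering code quality, readability, typos, maintainability, reusability, bugs, naming, and so on.
prompt:

- You are a review bot. Please give a code review covering code quality, readability, typos, maintainability, reusability, bugs, and naming,

- Do not review code style, unused variables, or import paths,

- Answer in Chinese or in code

- You must output JSON text in the following format:
[
 {
 filename:"app.tsx", 
 
 review:[{code: line number, content:"review comment, markdown allowed here", score:"severity (fatal, severe, moderate, minor)"}]
 }
]

- Each line of code is prefixed with its line number

- Only comment on code that differs between the versions before and after the modification

- The code data follows:

Code before modification:  
xxxx
 
Code after modification:
xxxxx

Parameters: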

temperature=0.2,
presence_penalty=1.2,
top_p=0.95
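
The prompt above says the code is prefixed with line numbers and lists three sampling parameters. A minimal sketch of how such a request might be assembled is shown here. `number_lines` and `build_review_request` are hypothetical helper names, the `messages` payload shape follows the common OpenAI-style chat convention (an assumption about the serving API, which the issue does not name), and the model name is a placeholder.

```python
import json

# English rendering of the system prompt quoted in the issue.
SYSTEM_PROMPT = (
    "You are a review bot. Please give a code review covering code quality, "
    "readability, typos, maintainability, reusability, bugs, naming, and so on."
)


def number_lines(code: str, start: int = 1) -> str:
    """Prefix every line with its line number, e.g. '3: return x'."""
    return "\n".join(
        f"{i}: {line}" for i, line in enumerate(code.splitlines(), start)
    )


def build_review_request(before: str, after: str,
                         model: str = "codegeex4-all-9b") -> dict:
    """Build a chat-style request body (assumed OpenAI-like schema)."""
    user_prompt = (
        "Code before modification:\n" + number_lines(before) + "\n\n"
        "Code after modification:\n" + number_lines(after)
    )
    return {
        "model": model,  # placeholder name, not taken from the issue
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        # Sampling parameters exactly as reported in the issue.
        "temperature": 0.2,
        "presence_penalty": 1.2,
        "top_p": 0.95,
    }


if __name__ == "__main__":
    request = build_review_request("a = 1\nprint(a)", "a = 2\nprint(a)")
    print(json.dumps(request, indent=2, ensure_ascii=False))
```

Keeping the numbered code in the user message and the reviewer role in the system message makes it easier to shorten the prompt when comparing models.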

Is inference with Llama-factory supported?

As the title says.

The ModelScope page is down

Question about CodeGeeX4

Hey! 👋 I came across your CodeGeeX4 project. It's fantastic! Keep up the good work! 💪 Could you send me more details on Telegram? Also, please review my work and follow me on GitHub @nectariferous. Thanks!

typo

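Could you provide a pre-training demo?

I want to pre-train on my own code base, using plain-text files. However, create_datasets currently raises an error:

train_dataset, eval_dataset = create_datasets(tokenizer, args)

I hope the maintainers can provide a demo, similar to starcoder2.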

CodeGeeX plugin reports an error after upgrading IDEA to 2024.2

Error message: Plugin 'CodeGeeX' (version '2.13.0-223') is not compatible with the current version of the IDE, because it requires build 241.* or older but the current build is IU-242.20224.300
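
The message compares the plugin's upper build bound (`241.*`) with the IDE build (`IU-242.20224.300`). A rough sketch of that comparison follows. It is a simplification that looks only at the leading branch number, and `branch_of` and `is_compatible` are hypothetical names, not part of any JetBrains API.

```python
import re


def branch_of(build: str) -> int:
    """Extract the branch number from strings like 'IU-242.20224.300' or '241.*'."""
    match = re.match(r"(?:[A-Z]+-)?(\d+)\.", build)
    if match is None:
        raise ValueError(f"unrecognized build string: {build!r}")
    return int(match.group(1))


def is_compatible(ide_build: str, until_build: str) -> bool:
    """True when the IDE branch is not newer than the plugin's upper bound.

    Simplified: only the branch number is compared, which is enough for a
    wildcard bound such as '241.*'.
    """
    return branch_of(ide_build) <= branch_of(until_build)


if __name__ == "__main__":
    # Values taken from the error message in the issue.
    print(is_compatible("IU-242.20224.300", "241.*"))
```

Under this reading the plugin build has to raise its upper bound before it can load in a 242 IDE.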

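VS Code fails to connect to ollama

ollama has successfully started the codegeex4 model, but when CodeGeeX tries to connect it keeps reporting: connection error, please check the model configuration

What is the minimum GPU memory required?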

How good is the model at generating SQL from text?

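The dropdown in the commercial-use application form on the Zhipu AI open platform does not list CodeGeeX4

Following the application link at the bottom of README_zh.md leads to the commercial-use application form on the official open platform, but the form has no CodeGeeX model option.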

Reproducing BigCodeBench Scores

Hi there,

We're trying to reproduce the scores reported on BigCodeBench using v0.1.7post2. As no chat template is provided in the HF tokenizer config, I slightly changed the code and used the default chat template. So far, I got 49.0 on Complete and 38.9 on Instruct. The reproduced Instruct score is a bit lower than the one you reported. I suspect you used a customized template during the evaluation. Could you share more details about your setup?

bigcode-project/bigcodebench#19

Cheers

ollama 403 err

Running on ollama returns a 403 error.

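Could a tokenizer.json file be provided for the tokenizer?

Could you provide a tokenizer.json file, as CodeQwen1.5-7B-Chat does?

In WebStorm, typing cannot be completed in certain scenarios; the typed content gets deleted.

Problem: while typing, the content is deleted, so it is impossible to keep typing.

Reference video:

333.mp4

Output is all zeros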

import torch  # tokenizer, model and device are created earlier; that part is not shown in the report

# The user message "你是谁" means "Who are you?"
inputs = tokenizer.apply_chat_template([{"role": "user", "content": "你是谁"}],
                                       add_generation_prompt=True, tokenize=True, return_tensors="pt",
                                       return_dict=True).to(device)
with torch.no_grad():
    print("inputs", inputs)
    outputs = model.generate(**inputs, max_length=512)
    print("outputs", outputs)
    # Drop the prompt tokens so that only the newly generated ids remain.
    outputs = outputs[:, inputs['input_ids'].shape[1]:]
    print("outputs_part", outputs)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

outputs_part tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0]], device='cuda:0')

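The VS Code plugin cannot complete arguments inside parentheses

If a function call already has both parentheses, completion does not work inside them.

With VS Code Remote, code completion and comment generation do not work; all other features work normally.

As the title says. Code completion and comment generation report the following error.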

2024-07-10 10:41:34.001 [info] Local mode state: false
2024-07-10 10:41:34.001 [info] CodeGeeX is now active
2024-07-10 10:41:36.106 [info] Registering commands...
2024-07-10 11:12:33.167 [error] AggregateError: 
	at internalConnectMultiple (node:net:1114:18)
	at internalConnectMultiple (node:net:1177:5)
	at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
	at listOnTimeout (node:internal/timers:575:11)
	at process.processTimers (node:internal/timers:514:7)
2024-07-10 11:14:15.027 [info] 10/7/2024, 11:14:15 am [Error] [Add Comment] Internal error occurs
2024-07-10 11:14:37.670 [info] 10/7/2024, 11:14:37 am [Error] [Add Comment] Internal error occurs

How can this problem be resolved?

Typo in a key on line 29 of function_call_demo/main.py

The key is supposed to be "type" rather than "typo".

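On the choice of embedding model

The demo currently uses an online service, the embedding-2 model, for embeddings. Our company's development environment has no internet access, so I would like to ask whether there are plans to open-source embedding-2.
Alternatively, would using a different open-source embedding model affect the results? The material I have found online about RAG is almost entirely about natural-language text, and I am not sure whether code and natural language behave the same way in vector conversion and vector retrieval.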
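
To make the question concrete: in a typical retrieval pipeline the embedding model is a pluggable function, and retrieval ranks code chunks by similarity to the query vector. The sketch below is not the demo's implementation. `toy_embed` is a hypothetical stand-in (a bag of lowercase tokens) used only so the example runs offline, and any locally hosted embedding model could be passed in its place.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in embedder: counts of lowercase alphanumeric tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {token: float(count) for token, count in Counter(tokens).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[str],
             embed: Callable[[str], Vector] = toy_embed,
             top_k: int = 1) -> List[Tuple[float, str]]:
    """Return the top_k chunks ranked by similarity to the query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


if __name__ == "__main__":
    code_chunks = [
        "def parse_config(path): ...",
        "def send_email(to, body): ...",
    ]
    print(retrieve("parse config file", code_chunks))
```

Because only `embed` changes when the model is swapped, different embedding models can be compared on the same set of code chunks and queries before committing to one.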

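The IDEA plugin does not support local mode

The demo shows the editor plugin connecting to model tools in local mode, but after installing the latest plugin in IDEA I could not find any page for enabling local mode.

In PyCharm, local mode cannot auto-complete code and reports "Server error Code 500 Internal server error"

PyCharm: 2024.1
Plugin: 2.13.0 - 223
ollama: 0.3
Model: ollama's default codegeex4 model

Problem: connecting to the model and chatting both work, but auto-completion in the editor window fails with "Server error: Code 500 Internal server error"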

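The WebStorm plugin makes larger projects hang indefinitely

Problem: when the CodeGeeX plugin is used in WebStorm, some large projects hang (shown in a screenshot). In my tests the hang lasts more than two or three hours.

Affected version: (shown in a screenshot)

With the plugin disabled, the project opens instantly. After several rounds of verification, I am sure the plugin is the cause.

Project details:

The project uses the umi/dumi libraries, which generate .umi cache files under src. The number of files is huge. Although WebStorm is configured to exclude this directory from indexing, the plugin still traverses, indexes and parses the files in it, which causes the hang.

Expectation

I hope this problem can be optimized.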

completions not working

Running codegeex4 locally with ollama
Completion request: curl -s http://localhost:62333/v1/completions -d @/tmp/p3.json|jq .

Request body


{
  "model": "codegeex4",
  "messages": [
    {
      "role": "user",
      "content": "###LANGUAGE:Python\n###MODE:BLOCK\n<|code_suffix|>\n<|code_prefix|>def parse_nested_parens(paren_string: str) -> List[int]:\n    \"\"\" Input to this function is a string represented multiple groups for nested parentheses separated by spaces.\n    For each of the group, output the deepest level of nesting of parentheses.\n    E.g. (()()) has maximum two levels of nesting while ((())) has three.\n\n    >>> parse_nested_parens('(()()) ((())) () ((())()())')\n   [2, 3, 1, 3]\n\"\"\"\n<|code_middle|>"
    }
  ],
  "temperature": 0.2,
  "top_p": 0.95,
  "max_tokens": 64,
  "presence_penalty": 1,
  "stream": false,
  "stop": []
}

The response contains "finish_reason": "load" and no generated text. What is the reason?
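
One thing worth checking, offered as a guess rather than a confirmed diagnosis: in the OpenAI-style convention, `/v1/completions` reads a `prompt` string, while `messages` belongs to `/v1/chat/completions`. The body above sends `messages` to `/v1/completions`, so the server may be seeing an empty prompt. The sketch below builds both payload shapes. `build_fim_prompt`, `completions_payload` and `chat_payload` are hypothetical helper names, and the fill-in-the-middle layout simply mirrors the string in the request body.

```python
import json


def build_fim_prompt(prefix: str, suffix: str = "\n",
                     language: str = "Python", mode: str = "BLOCK") -> str:
    """Assemble a fill-in-the-middle prompt in the layout used in the issue."""
    return (
        f"###LANGUAGE:{language}\n###MODE:{mode}\n"
        f"<|code_suffix|>{suffix}<|code_prefix|>{prefix}<|code_middle|>"
    )


def completions_payload(prompt: str, model: str = "codegeex4") -> dict:
    """Body for an OpenAI-style /v1/completions endpoint: uses 'prompt'."""
    return {
        "model": model,
        "prompt": prompt,
        "temperature": 0.2,
        "top_p": 0.95,
        "max_tokens": 64,
        "stream": False,
    }


def chat_payload(prompt: str, model: str = "codegeex4") -> dict:
    """Body for an OpenAI-style /v1/chat/completions endpoint: uses 'messages'."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
        "top_p": 0.95,
        "max_tokens": 64,
        "stream": False,
    }


if __name__ == "__main__":
    fim = build_fim_prompt("def add(a, b):\n")
    print(json.dumps(completions_payload(fim), indent=2))
```

Either payload can be written to the JSON file passed to curl; the point is that the field name should match the endpoint in the URL.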

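The VS Code plugin's Cmd+Down conflicts with existing shortcuts!

In the VS Code CodeGeeX plugin, cmd + ↓ is bound to taking the next suggestion, but in every situation it conflicts with the scroll-down shortcut of the editor and the console, so that shortcut stops working.

After vLLM loads the model, no inference runs and the GPU stays fully occupied. What is going on?

The code is as follows: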

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
max_model_len, tp_size = 131072, 1
model_name = "/models/codegeex4-all-9b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
llm = LLM(
    model=model_name,
    tensor_parallel_size=tp_size,
    max_model_len=max_model_len,
    trust_remote_code=True,
    enforce_eager=True,
)
stop_token_ids = [151329, 151336, 151338]
sampling_params = SamplingParams(temperature=0.95, max_tokens=1024, stop_token_ids=stop_token_ids)

vLLM output:

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO 07-11 15:51:44 llm_engine.py:169] Initializing an LLM engine (v0.5.1) with config: model='/models/codegeex4-all-9b', speculative_config=None, tokenizer='/models/codegeex4-all-9b', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=True, dtype=torch.bfloat16, max_seq_len=131072, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=True, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None), seed=0, served_model_name=/models/codegeex4-all-9b, use_v2_block_manager=False, enable_prefix_caching=False)
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
WARNING 07-11 15:51:44 tokenizer.py:126] Using a slow tokenizer. This might cause a significant slowdown. Consider using a fast tokenizer instead.
WARNING 07-11 15:51:44 utils.py:562] Using 'pin_memory=False' as WSL is detected. This may slow down the performance.
INFO 07-11 15:55:24 model_runner.py:255] Loading model weights took 17.5635 GB

It occupies 40 GB of GPU memory, and the result is the same with the xformers backend.
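
Two observations may help here, both hedged. First, the snippet as posted stops after building `SamplingParams` and never calls `llm.generate(...)`, so no inference is requested. Second, vLLM reserves GPU memory up front for weights plus KV cache, bounded by the `gpu_memory_utilization` argument (0.9 by default, to my knowledge), so usage well above the 17.5635 GB of weights is expected right after loading. The arithmetic can be sketched as follows. The 48 GiB card size is purely illustrative, since the issue does not state the GPU model.

```python
GIB = 1024 ** 3


def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone; bfloat16 uses 2 bytes per parameter."""
    return num_params * bytes_per_param / GIB


def params_from_weight_gib(weight_gib: float, bytes_per_param: int = 2) -> float:
    """Invert the estimate: parameter count implied by a logged weight size."""
    return weight_gib * GIB / bytes_per_param


def reserved_budget_gib(total_gpu_gib: float,
                        gpu_memory_utilization: float = 0.9) -> float:
    """Upper bound on what the engine may reserve (weights plus KV cache)."""
    return total_gpu_gib * gpu_memory_utilization


if __name__ == "__main__":
    implied = params_from_weight_gib(17.5635)   # size taken from the log
    print(f"implied parameters: {implied / 1e9:.2f}B")
    # 48 GiB is an assumed card size, used only for illustration.
    print(f"reservation budget on 48 GiB: {reserved_budget_gib(48):.1f} GiB")
```

If the reservation is the concern, lowering `gpu_memory_utilization` or `max_model_len` in the `LLM(...)` call should shrink it. How much it shrinks depends on the GPU and the vLLM version.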

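On Ubuntu, logging in to CodeGeeX from PyCharm opens the browser without entering the link

As the title says. This may be caused by my Chromium browser, so I am requesting a feature for copying the login link.

How can code completion take the existing project into account?

The completed code sometimes calls methods that do not exist at all. How can completion be tied to the methods that already exist in the project?

I got an error. How should I handle it?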

2024-07-05 18:43:44 - Invalid URL '': No scheme supplied. Perhaps you meant https://?
Traceback (most recent call last):
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\site-packages\chainlit\utils.py", line 44, in wrapper
    return await user_function(**params_values)
  File "run.py", line 150, in main
    for part in stream:
  File "J:\Tools\CodeGeeX4\CodeGeeX4\repodemo\llm\api\codegeex4.py", line 21, in codegeex4
    response = requests.post(url, json=data, headers=headers, verify=False, stream=True)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 115, in post
    return request("post", url, data=data, json=json, **kwargs)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 575, in request
    prep = self.prepare_request(req)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\sessions.py", line 486, in prepare_request
    p.prepare(
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\models.py", line 368, in prepare
    self.prepare_url(url, params)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python39\lib\site-packages\requests\models.py", line 439, in prepare_url
    raise MissingSchema(
requests.exceptions.MissingSchema: Invalid URL '': No scheme supplied. Perhaps you meant https://?

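I followed the tutorial from the official WeChat account and video channel: downloaded the project, installed the libraries, and started chainlit. After running it, this is what I got. The model download via the corresponding script has also finished.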
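
The traceback shows `requests.post` being called with an empty URL (`Invalid URL ''`), which suggests the model endpoint setting was never filled in. Where that setting lives in the demo is not shown in the report, so the sketch below only illustrates validating the value before the request is made. `require_http_url` is a hypothetical helper, not part of the project.

```python
from urllib.parse import urlparse


def require_http_url(url, name: str = "model API URL") -> str:
    """Return the cleaned URL, or raise a readable error if it is unusable."""
    cleaned = (url or "").strip()
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"{name} is missing or malformed: {cleaned!r}; "
            "expected something like http://host:port/path"
        )
    return cleaned


if __name__ == "__main__":
    # An empty value reproduces the situation from the traceback.
    try:
        require_http_url("")
    except ValueError as err:
        print(err)
```

Calling such a check at startup turns the deep `MissingSchema` traceback into a single message that names the missing setting.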

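However, a colleague using the same model-server configuration can access the local model from VS Code without problems, so I suspect that I have either configured the IDEA plugin incorrectly or am using it the wrong way.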

is function calling supported?

I have tested this model with the same system prompt as GLM4. The model emits only the parameters, not the function name. For example, get_weather is not generated here:

You  > what's the weather like in beijng?
A.I. > 
{"city_name": "beijing"}
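
To show what is missing from the reply, here is a small parsing sketch. It assumes a GLM4-style tool call in which the function name is on the first line and the JSON arguments follow. That format is an assumption, not something confirmed in this thread, and `parse_tool_call` is a hypothetical helper.

```python
import json
from typing import Any, Dict, Optional, Tuple


def parse_tool_call(output: str) -> Tuple[Optional[str], Optional[Dict[str, Any]]]:
    """Split a reply into (function_name, arguments).

    Assumed format: '<name>\n{json arguments}'. The name is None when the
    reply holds only a JSON object, as in the transcript above.
    """
    text = output.strip()
    name = None
    first, _, rest = text.partition("\n")
    if rest.strip().startswith("{"):
        name, text = first.strip(), rest.strip()
    try:
        args = json.loads(text)
    except json.JSONDecodeError:
        return None, None
    if not isinstance(args, dict):
        return None, None
    return name, args


if __name__ == "__main__":
    print(parse_tool_call('{"city_name": "beijing"}'))
    print(parse_tool_call('get_weather\n{"city_name": "beijing"}'))
```

A caller can treat a `None` name together with valid arguments as the failure described in this issue and retry or fall back accordingly.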