
aiXcoder-7B Code Large Language Model

🏠 Official website | 🛠 VS Code Plugin | 🛠 JetBrains Plugin | 🤗 Model Weights | WeChat | WeChat Official Account

Welcome to the official repository of the aiXcoder-7B Code Large Language Model. This model is designed to understand and generate code across multiple programming languages, offering state-of-the-art performance in code completion, comprehension, generation, and other programming tasks.

Table of Contents

  1. Model Introduction
  2. Quickstart
  3. Data for aiXcoder 7B
  4. Training
  5. Details of Experimental Results
  6. License
  7. Acknowledgments

Model Introduction

As the capabilities of large code models are gradually being uncovered, aiXcoder has consistently considered how to make these models more useful in real development scenarios. To this end, we have open-sourced aiXcoder 7B Base, which has undergone extensive training on 1.2T unique tokens, with pre-training tasks and contextual information designed specifically for real-world code generation contexts.

aiXcoder 7B Base stands out as the most effective model for code completion among models of similar parameter size, and it also surpasses mainstream models such as CodeLlama 34B and StarCoder2 15B in average performance on multilingual NL2Code benchmarks.

In our ongoing exploration to apply large code models, the release of aiXcoder 7B Base represents a significant milestone. The current version of aiXcoder 7B Base is a foundational model that focuses on improving the efficiency and accuracy of code completion and code generation tasks, aiming to provide robust support for developers in these scenarios. It is important to note that this version has not undergone specific instruct-tuning, which means it might not yet offer optimal performance for specialized higher-level tasks such as test case generation and code debugging.

However, we have plans for further development of the aiXcoder model series already in motion. In the near future, we aim to release new versions of the model that have been meticulously instruct-tuned for a wider range of programming tasks, including but not limited to test case generation and code debugging. Through these instruct-tuned models, we anticipate offering developers more comprehensive and deeper programming support, helping them to maximize efficiency at every stage of software development.

table_1

aiXcoder 7B surpasses mainstream models on NL2Code benchmarks. aiXcoder-7B is an enhancement of aiXcoder-7B-Base, fine-tuned for one epoch on one hundred thousand data entries similar to Evol-Instruct.



table_3

aiXcoder 7B Base surpasses mainstream models in code completion scenarios.



Quickstart

Environment Requirements

Option 1: Build Env

To run the model inference code, you'll need the following environment setup:

  • Python 3.8 or higher
  • PyTorch 2.1.0 or higher
  • sentencepiece 0.2.0 or higher
  • transformers 4.34.1 or higher (if running inference with the transformers library)

Please ensure all dependencies are installed using the following commands:

conda create -n aixcoder-7b python=3.11
conda activate aixcoder-7b
git clone [email protected]:aixcoder-plugin/aiXcoder-7b.git
cd aiXcoder-7b
pip install -r requirements.txt

requirements.txt lists all the necessary libraries and their versions.

To achieve faster inference speeds, especially for large models, we recommend installing flash attention. Flash attention is an optimized attention mechanism that significantly reduces computation time for transformer-based models without sacrificing accuracy.

Before proceeding, ensure your environment meets the CUDA requirements as flash attention leverages GPU acceleration. Follow these steps to install flash attention:

git clone [email protected]:Dao-AILab/flash-attention.git
cd flash-attention
MAX_JOBS=8 python setup.py install

Option 2: Docker

For a consistent and isolated environment, we recommend running the model inference code using Docker. Here's how to set up and use Docker for our model:

  1. Install Docker: If you haven't already, install Docker on your machine.

  2. Pull the Docker Image: Pull the Docker image from Docker Hub.

docker pull pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel

  3. Run the Container: Once the image is pulled, you can run the model inside a Docker container.
docker run --gpus all -it -v /dev/shm:/dev/shm --name aix_instance pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel /bin/bash
pip install sentencepiece
git clone [email protected]:aixcoder-plugin/aiXcoder-7b.git
cd aiXcoder-7b

This command starts a container named aix_instance from the pytorch image. You can interact with the model inside this container.

To achieve faster inference speeds, especially for large models, we recommend installing flash attention.

git clone [email protected]:Dao-AILab/flash-attention.git
cd flash-attention
MAX_JOBS=8 python setup.py install

  4. Model Inference: Within the Docker container, you can run the model inference code as described in the Inference Example section.

Using Docker provides a clean, controlled environment that minimizes issues related to software versions and dependencies.

Model Weights

You can download the model weights from Hugging Face (the examples below use the aiXcoder/aixcoder-7b-base repository) or from ModelScope.
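
For example, the weights can be fetched ahead of time with the huggingface_hub library; this is a minimal sketch, and the local directory name is an arbitrary assumption:

from huggingface_hub import snapshot_download

# Minimal sketch: pre-download the weights. The repo id matches the one used in
# the inference examples below; the local directory name is an arbitrary choice.
snapshot_download(
    repo_id="aiXcoder/aixcoder-7b-base",
    local_dir="./aixcoder-7b-base",
)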

Inference Example

Command Line Execution

For a quick start, you can run the model inference directly from the command line:

torchrun --nproc_per_node 1 sess_megatron.py --model_dir "path/to/model_weights_dir"

Replace "path/to/model_weights_dir" with the actual path to your downloaded model weights.

Or run inference with Hugging Face's transformers:

python sess_huggingface.py

Python Script Execution

Alternatively, you can invoke the model programmatically within your Python scripts. This method provides more flexibility for integrating the model into your applications or workflows. Here's a simple example of how to do it:

from sess_megatron import TestInference

infer = TestInference()
res = infer.run_infer(
    # for FIM style input, code_string stands for prefix context
    code_string="""# 快速排序算法""", 
    # for FIM style input, later_code stands for suffix context
    later_code="\n",
    # file_path should be a path from project to file
    file_path="test.py",
    # max num for generated tokens
    max_new_tokens=256,
)
print(res)

"""output:

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    less = [i for i in arr[1:] if i <= pivot]
    greater = [i for i in arr[1:] if i > pivot]
    return quick_sort(less) + [pivot] + quick_sort(greater)


# 测试
arr = [3, 2, 1, 4, 5]
print(quick_sort(arr))  # [1, 2, 3, 4, 5]
"""
import torch
import sys
from hf_mini.utils import input_wrapper
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" # the device to load the model onto

tokenizer = AutoTokenizer.from_pretrained("aiXcoder/aixcoder-7b-base")
model = AutoModelForCausalLM.from_pretrained("aiXcoder/aixcoder-7b-base", torch_dtype=torch.bfloat16)


text = input_wrapper(
    # for FIM style input, code_string stands for prefix context
    code_string="# 快速排序算法",
    # for FIM style input, later_code stands for suffix context
    later_code="\n# 测试\narr = [3, 2, 1, 4, 5]\nprint(quick_sort(arr))  # [1, 2, 3, 4, 5]",
    # file_path should be a path from project to file
    path="test.py"
)

if len(text) == 0:
    sys.exit()

inputs = tokenizer(text, return_tensors="pt", return_token_type_ids=False)

inputs = inputs.to(device)
model.to(device)

outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))



"""output:
def quick_sort(arr):
    # 如果数组长度小于等于1,直接返回
    if len(arr) <= 1:
        return arr
    # 选择数组的第一个元素作为基准
    pivot = arr[0]
    # 初始化左右指针
    left, right = 1, len(arr) - 1
    # 循环直到左指针小于右指针
    while left < right:
        # 从右到左找到第一个小于基准的元素,与左指针元素交换
        if arr[right] < pivot:
            arr[left], arr[right] = arr[right], arr[left]
            left += 1
        # 从左到右找到第一个大于等于基准的元素,与右指针元素交换
        if arr[left] >= pivot:
            right -= 1
    # 将基准元素与左指针元素交换
    arr[left], arr[0] = arr[0], arr[left]
    # 对左半部分进行递归排序
    quick_sort(arr[:left])
    # 对右半部分进行递归排序
    quick_sort(arr[left + 1:])
    return arr</s>
"""

Quantization with bitsandbytes

You can also install bitsandbytes via pip install bitsandbytes and simply add a quantization configuration to run int8 or int4 inference (if you need to further reduce the temporary memory used at runtime, installing FlashAttention is recommended):

import sys
import torch
from hf_mini.utils import input_wrapper
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig    

# to use 4bit use `load_in_4bit=True` instead
bnb_config = BitsAndBytesConfig(load_in_8bit=True) 

device = "cuda" # the device to load the model onto

tokenizer = AutoTokenizer.from_pretrained("aiXcoder/aixcoder-7b-base")
model = AutoModelForCausalLM.from_pretrained("aiXcoder/aixcoder-7b-base", quantization_config=bnb_config, device_map=device, attn_implementation='flash_attention_2')

text = input_wrapper(
    code_string="# 快速排序算法",
    later_code="\n",
    path="test.py"
)

if len(text) == 0:
    sys.exit()

inputs = tokenizer(text, return_tensors="pt", return_token_type_ids=False)

inputs = inputs.to(device)    

outputs = model.generate(**inputs, max_new_tokens=256)
print(f"Model memory footprint: {model.get_memory_footprint() / 2**20:.2f} MB")
print(f"Torch max memory allocated: {torch.cuda.max_memory_allocated() / 2**20:.2f} MB")

"""
load_in_4bit=True:
    - Model memory footprint: 5656.52 MB
    - Torch max memory allocated: 6448.89 MB

load_in_8bit=True:
    - Model memory footprint: 9008.52 MB
    - Torch max memory allocated: 10061.51 MB
"""

Fine-tuning example

If you want to fine-tune the model on your own code, you can quickly get started with training using Hugging Face's PEFT tools. Before doing so, install the necessary libraries with pip install -r requirements_peft.txt.

Then, execute the training command:

accelerate launch finetune.py \
        --model_id "aiXcoder/aixcoder-7b-base" \
        --dataset_name "bigcode/the-stack-smol" \
        --subset "data/rust" \
        --dataset_text_field "content" \
        --split "train" \
        --max_seq_length 1024 \
        --max_steps 10000 \
        --micro_batch_size 1 \
        --gradient_accumulation_steps 8 \
        --learning_rate 5e-6 \
        --warmup_steps 20 \
        --fim_rate 0.5 \
        --num_proc "$(nproc)"

In the fine-tuning script, we have constructed a simple random FIM (Fill-In-the-Middle) training task that trains the model's completion and generation capabilities on your own data. Note that aiXcoder-7b-base uses structured FIM during pre-training, which constructs a complete code block as the MIDDLE; creating such training data requires syntactic parsing, which developers may need to implement themselves. A sketch of that idea follows.
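
As an illustration only, a structured-FIM split for Python sources could be derived with the standard ast module, picking one complete statement node as the MIDDLE span. This is a hedged sketch of the idea, not aiXcoder's actual data pipeline:

import ast
import random

def structured_fim_split(source):
    """Return a (prefix, middle, suffix) split where MIDDLE is a complete AST node.

    Illustrative only: offsets assume ASCII source, since col_offset is a byte offset.
    """
    tree = ast.parse(source)
    # Keep statement nodes that carry end positions (available in Python 3.8+).
    nodes = [n for n in ast.walk(tree)
             if isinstance(n, ast.stmt) and getattr(n, "end_lineno", None)]
    if not nodes:
        return None
    node = random.choice(nodes)
    lines = source.splitlines(keepends=True)
    start = sum(len(l) for l in lines[:node.lineno - 1]) + node.col_offset
    end = sum(len(l) for l in lines[:node.end_lineno - 1]) + node.end_col_offset
    return source[:start], source[start:end], source[end:]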

Data for aiXcoder 7B

The data for aiXcoder is divided into a core dataset and an extended dataset. The core dataset comprises the programming languages commonly used in development, as well as natural languages closely related to code. The core dataset's programming languages mainly include nearly a hundred mainstream languages such as C++, Python, Java, and JavaScript, while the natural language component primarily consists of StackOverflow Q&As, technical blogs, code documentation, and computer science papers. The extended data mainly consists of filtered open-source code datasets, high-quality English natural language datasets, and high-quality Chinese natural language datasets.

The aiXcoder core dataset is mainly used to enhance the performance of the large code model in the aforementioned programming languages, undergoing a rigorous filtering and selection process. Specifically, this process includes the following steps: 1) Selection of raw data; 2) Comprehensive ranking and selection of projects; 3) Code deduplication and the removal of automatically generated code using methods such as MinHashes (Broder, 2000); 4) Identification and handling of personal sensitive information; 5) Cleaning of commented code; 6) Syntactic analysis to filter incorrect or anomalous code files; 7) Static analysis to detect and eliminate 163 types of high-risk bugs and 197 types of defects in mainstream programming languages such as Java, C++, Python, and JavaScript.

  1. Raw Data Selection
    • Exclude projects under copyleft licenses.
    • Deduplicate projects gathered from various code hosting platforms and open-source datasets
  2. Project-Level Comprehensive Ranking
    • Calculate project metrics, including the number of Stars, Git Commit counts, and the quantity of Test files.
    • Exclude the lowest 10% of data based on a comprehensive score.
  3. Code File-Level Filtering
    • Remove automatically generated code.
    • Employ near-deduplication for redundancy removal.
  4. Sensitive Information Removal
    • Use named entity recognition models to identify and delete sensitive information such as names, IP addresses, account passwords, and URLs.
  5. Commented Code
    • Randomly delete large sections of commented code.
  6. Syntax Analysis
    • Delete code with syntax parsing errors or syntactical errors in the top fifty languages.
  7. Static Analysis
    • Utilize static analysis tools to scan for and locate 161 types of Bugs affecting code reliability and maintainability, as well as 197 types of vulnerabilities impacting code security.
# "__init__" method should not return a value

# Noncompliant: a TypeError will be raised
class MyClass(object):
    def __init__(self):
        self.message = 'HelloWorld'
        return self  

# Compliant solution
class MyClass(object):
    def __init__(self):
        self.message = 'HelloWorld'

The code above illustrates a Python bug pattern in which the __init__ method should not return a value.
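
For the near-deduplication step (3) above, a common approach is MinHash-based similarity search in the spirit of Broder (2000). The sketch below uses the datasketch library, word shingling, and a 0.85 threshold purely as illustrative assumptions, not as the pipeline actually used:

from datasketch import MinHash, MinHashLSH

def minhash_of(text, num_perm=128):
    m = MinHash(num_perm=num_perm)
    for token in set(text.split()):
        m.update(token.encode("utf-8"))
    return m

def near_duplicates(files, threshold=0.85):
    """files: dict mapping file path -> source text. Returns path -> similar paths."""
    lsh = MinHashLSH(threshold=threshold, num_perm=128)
    signatures = {path: minhash_of(content) for path, content in files.items()}
    for path, sig in signatures.items():
        lsh.insert(path, sig)
    # For each file, report other files whose estimated Jaccard similarity
    # exceeds the threshold.
    return {path: [p for p in lsh.query(sig) if p != path]
            for path, sig in signatures.items()}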

Training

Training Hyperparameters

Tokenizer:

  • Byte-level Byte Pair Encoding (BPE)
  • Vocabulary size of 49,152

Model Structure:

  • RoPE (Rotary Positional Embedding) for relative position encoding
  • SwiGLU as the intermediate layer
  • Grouped Query Attention

Training Parameters:

  • Structured FIM (Fill in the middle) training tasks make up 70% of the training, while autoregressive training tasks account for 30%
  • Pretraining sequence length of 32,768

Batch processing method

After preprocessing, our code data is organized by project, and the ordering of files within a project reflects a mix of rules and randomness. Specifically, we attempt to cluster similar or dependent code files together using methods such as calling graphs, K-Means clustering, file path similarity, and TF-IDF distance, to help the model better understand the relationships between code files. However, the file ordering also incorporates randomness, since in real programming scenarios projects are incomplete and similar or dependent code files may not all have been written yet. A sketch of such an ordering step is shown below.
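
As an illustration of the file-ordering idea only, files within a project could be grouped by TF-IDF similarity with K-Means and then emitted cluster by cluster with local randomness. The scikit-learn usage here is an assumption and not the actual procedure:

import random
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def order_project_files(files, n_clusters=4):
    """files: dict mapping file path -> source text. Returns an ordered list of paths."""
    paths = list(files)
    tfidf = TfidfVectorizer().fit_transform([files[p] for p in paths])
    n_clusters = min(n_clusters, len(paths))
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(tfidf)
    ordered = []
    for cluster in random.sample(range(n_clusters), n_clusters):  # random cluster order
        members = [p for p, label in zip(paths, labels) if label == cluster]
        random.shuffle(members)  # local randomness within a cluster
        ordered.extend(members)
    return ordered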

By ensuring that the project's code files are globally random yet locally similar or dependent, we flatten them into one long sequence and organize batches in a Transformer-XL style. Even though a single batch already reaches a sequence length of 32,768 during pre-training, this method still allows the effective visible sequence length to be extended even further, as sketched below.
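
A minimal sketch of the packing step, under the assumption that ordered files are simply tokenized, concatenated, and sliced into fixed 32,768-token windows; the actual Transformer-XL style batching with cross-window memory is more involved:

from typing import Callable, Iterable, List

def pack_into_windows(ordered_files: Iterable[str],
                      tokenize: Callable[[str], List[int]],
                      seq_len: int = 32768) -> List[List[int]]:
    # Concatenate all files of a project into one token stream.
    stream = []
    for text in ordered_files:
        stream.extend(tokenize(text))
    # Slice into fixed-length windows; a real pipeline would pad the remainder
    # or carry it over into the next batch instead of dropping it.
    return [stream[i:i + seq_len]
            for i in range(0, len(stream) - seq_len + 1, seq_len)]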

Pre-training Tasks

Unlike other natural-language or code large models, aiXcoder considers the structural characteristics of code itself, aiming to have the model predict complete code nodes. In simple terms, the aiXcoder 7B training tasks combine fill-in-the-middle (FIM, Bavarian et al., 2022) with parser-generator techniques. When constructing training data, we parse the code into an abstract syntax tree (AST) and randomly select a complete node to construct a FIM task. The rationale behind this approach is twofold: first, the input data should be relatively complete, with both the preceding and subsequent parts at the same hierarchical level; second, the model's predictions should also be more complete, with the generated code having a full hierarchical structure.

for i in range(20):
    if i % 5 == 0:
        print("Hello World")

table_0

Given that simple code can be parsed into an abstract syntax tree (AST), we will construct structured Fill In the Middle (FIM) training tasks based on the nodes of the AST.



Suppose we select the IF node in the above AST, then we will construct training samples from the IF node and its subtree. The following two examples are equivalent:

# fill in the middle, SPM mode
"<s>▁<AIX-SPAN-PRE>▁<AIX-SPAN-POST>        print(\"Hello World\")\n▁<AIX-SPAN-MIDDLE># the file path is: test.py\n# the code file is written by Python\nfor i in range(20):\n    if i % 5 == 0:<\s>"

# fill in the middle, PSM mode
"<s>▁<AIX-SPAN-PRE># the file path is: test.py\n# the code file is written by Python\nfor i in range(20):\n    if ▁<AIX-SPAN-POST>        print(\"Hello World\")\n▁<AIX-SPAN-MIDDLE>i % 5 == 0:<\s>"

Details of Experimental Results

NL2Code Benchmarks

Table 1 shows the performance of the aiXcoder-7B Base model on standalone method generation benchmarks. Our model achieves the current best results among large-scale pre-trained base models with up to hundreds of billions of parameters.

table_1

Code Completion (Fill in the Middle)

Different from the standalone nl2code task in Table 1, in real-world programming scenarios, we need to consider the code completion capability in the context of the cursor position. Generally, various open-source large language models for code incorporate the Fill in the Middle (FIM) mode during pre-training to enhance the model's ability to generate more accurate results when considering the code context. Therefore, we will use FIM as the default code completion method to evaluate the performance of each model in real-world programming scenarios.

Currently, the mainstream evaluation dataset for context-aware code completion is the single-line evaluation method proposed by SantaCoder (Ben Allal et al., 2023). This evaluation dataset extracts single lines of code from HumanEval or MultiPL-E and, given the complete preceding and following context, evaluates the Exact Match metric of the model's generated results, as sketched below.
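
A minimal sketch of that single-line exact-match protocol; the generate_fn callable and whitespace handling here are illustrative assumptions, not the benchmark's harness:

def evaluate_single_line(samples, generate_fn):
    """samples: iterable of (prefix, target_line, suffix); generate_fn fills the middle."""
    samples = list(samples)
    hits = 0
    for prefix, target_line, suffix in samples:
        completion = generate_fn(prefix, suffix)
        # Score only the first generated line against the reference line.
        first_line = completion.splitlines()[0] if completion else ""
        hits += first_line.strip() == target_line.strip()  # Exact Match criterion
    return hits / len(samples) if samples else 0.0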

table_2

To further evaluate the code completion capabilities of large language models for code in a more fine-grained manner, aiXcoder has built an evaluation dataset that is larger in size, more diverse in the code being tested, longer in the context length of the code being tested, and closer to real-world development projects. This evaluation dataset will also be open-sourced on GitHub simultaneously. During the evaluation process, we ensure that different large language models for code use the same maximum sequence length of 16K and evaluate the generation performance in different scenarios, such as generating complete method blocks, conditional blocks, loop processing blocks, exception handling blocks, and a total of thirteen cases.

Table 3 shows the average generation performance of different models in different languages. The final evaluation results are the average of all completion scenarios and evaluation samples. The aiXcoder 7B Base model achieves the best performance across major programming languages and various evaluation criteria, indicating that aiXcoder 7B Base has the best basic code completion capability among all open-source models of the same scale and is the most suitable base model for providing code completion capabilities in real-world programming scenarios.

table_3

For each evaluation result in Table 3, there are more detailed evaluation dimensions. Tables 4 to 7 show the details of the multi-dimensional evaluation of different models in different languages:

  • Method signature indicates the model's capability to generate method signatures based on context.
  • Method body represents the model's ability to generate a complete method body based on context, including the function signature.
  • Single line refers to the completion of single lines of code.
  • Method with comment denotes generating a corresponding function body based on context, including function signatures and comments.
  • Empty indicates the model's ability to predict that nothing needs to be generated when the context is already complete.
  • Method body top, mid, bottom show the code generation performance respectively in the upper part of the function body, the middle part, and the lower part.
  • If, for, while, try, and switch statements represent the quality of generating conditional blocks, loop blocks, exception-handling blocks, and switch-branch blocks.

table_4

table_5

table_6

table_7

Cross-file Code Evaluation

Another important capability of large language models for code is the ability to understand code context across files, as developers often need to consider information from other files within the current project when writing code. Therefore, we adopted the CrossCodeEval (Ding et al., 2023) evaluation dataset to assess the model's ability to extract cross-file contextual information.

In Table 8, we fix the context length for all models at 16K and format the input using the PSM pattern in FIM. After the model completes inference, all output results are decoded using Greedy Search. First, as a baseline, we evaluate the generation capabilities of various large code models in a single-file scenario.

Then, using BM25 as the similarity metric, we search for the three most similar code blocks within the project and use them as prompts to reassess the model's generation performance. Finally, "w/Ref." indicates that we assume the correct reference code is known, and then search for the three most similar code snippets within the project as prompts to re-evaluate the model's generation performance. A sketch of the retrieval step is shown below.
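
A sketch of the BM25 retrieval step; the rank_bm25 package and whitespace tokenization are assumptions standing in for whatever retriever the evaluation actually used:

from rank_bm25 import BM25Okapi

def top3_similar_blocks(query, candidate_blocks):
    """Return the three candidate code blocks most similar to the query under BM25."""
    tokenized_corpus = [block.split() for block in candidate_blocks]
    bm25 = BM25Okapi(tokenized_corpus)
    # get_top_n returns the n highest-scoring documents for the tokenized query.
    return bm25.get_top_n(query.split(), candidate_blocks, n=3)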

Ultimately, the aiXcoder-7B model performs very well in all languages, demonstrating our model's ability to extract contextual information, especially cross-file contextual information.

table_8

License

The source code in this repository is licensed under the Apache-2.0 License - see the LICENSE file for details. The model weights are licensed under the Model License for academic research use; for commercial use, please apply by sending an email to [email protected].

Acknowledgments

We would like to thank all contributors to the open-source projects and datasets that made this work possible.

Thank you for your interest in our Code Large Language Model. We look forward to your contributions and feedback!

aixcoder-7b's People

Contributors

dingjz, eltociear, geeeekexplorer, horatiojsy, near500


aixcoder-7b's Issues

Can I configure the JetBrains plugin to point to a locally self-hosted aiXcoder model?

I have deployed an aiXcoder model locally, but when using the aiXcoder plugin, I noticed that it requires me to log in. After a successful login, I couldn't find an option to configure the plugin to point to my self-hosted aiXcoder model. Is there a way to set the plugin to use a locally deployed model? Or are there any plans to introduce this configuration option in the future? Many thanks!

HumanEval performance testing

If I want to use aixcoder-7b for a pure generation task rather than FIM-style input, for example a code Q&A web page, how should the parameters be set?
For example, when testing aixcoder-7b on generation tasks such as HumanEval Python, how should later_code, file_path, the decoding method, and the inference parameters be set?

[Question] Training about aiXcoder-7B

Congratulations on this wonderful work! I noticed that the Evol-Instruct method is utilized in aiXcoder-7B training. There are some differences between the traditional implementation of Evol-Instruct and aiXcoder's prompts modified based on FIM. Is there any specific implementation strategy or example for it? Thanks!

For enabling scripting option via this code

We can record all the commands executed during an activity using the script functionality in Linux, and this code generator can then be used to create a script that performs that activity.

-- All the parameters used during the activity can be requested in one go in the form of a batch file.

ChatGPT was used to elaborate:

In our current workflow, documenting and replicating complex activities in Linux environments is time-consuming and prone to errors. Manually recording every command executed during a task is tedious and often leads to inconsistencies. Moreover, recreating these activities requires manual intervention and may result in deviations from the original process. This inefficiency not only hampers productivity but also poses risks to the reliability and stability of our operations.

Objective:
The primary objective of this project is to develop a tool that automates the recording of commands executed during activities in Linux environments. This tool will capture the command sequence and generate a script that can be used to replicate the activity accurately. Furthermore, the tool will incorporate functionality to create batch files that prompt users for parameters, simplifying the execution of tasks with varying inputs.

Solution Overview:
The proposed solution consists of two main components:

Command Recording Module: This module will intercept and record all commands executed within a designated session or timeframe. It will capture the command sequence along with relevant metadata such as timestamps and user identifiers. The recorded data will be stored in a structured format for further processing. --> This part is already done using the "script" command in Linux.

Script Generation Module: Upon completion of an activity, the recorded commands will be processed by the script generation module to create a reproducible script. This script will encapsulate the sequence of commands required to perform the activity, ensuring consistency and accuracy in subsequent executions. Additionally, the module will provide an option to generate batch files that prompt users for input parameters, enhancing flexibility and usability.

Benefits:

  • Time Savings: By automating the process of recording and scripting activities, we can significantly reduce the time required to document and replicate tasks.
  • Accuracy and Consistency: The generated scripts ensure that activities are performed consistently, reducing the risk of errors and deviations.
  • Usability: Batch file generation simplifies the execution of tasks by prompting users for input parameters, making it easier to adapt scripts to different scenarios.
  • Knowledge Sharing and Collaboration: Standardized scripts enable seamless sharing of best practices and facilitate collaboration among team members.
  • Audit Trail: The recorded command history provides a detailed audit trail of activities, enhancing accountability and compliance.

Is multi-GPU deployment supported?

As a beginner, I ran into an OOM (out-of-memory) problem during deployment. How can I deploy the model across multiple GPUs? Could you give me an example? I would be extremely grateful!

[New Feature] Add some instructions for Hugging Face based methods

Since the previous version did not explain how the weights should be loaded, most runs used the default loading method (downloading from ModelScope). In this version, instructions will be added to help those who want to run the model with the Hugging Face framework and load local weights.

Could you provide the pre-training script?

For continued pre-training on company data, could you provide the pre-training script, including lr, max_seq_len, and other details that need attention?

How should code completion be used?

I looked at your sess_huggingface.py file. I want to build a code completion demo, but the responses contain too much content, and I wonder whether this is related to the parameters passed to input_wrapper.
An example call is as follows:
"code_string": "The programming language I am using is Java. I only need you to complete the possible code that may be written at the end of my code, without providing any extra explanation or description. Please directly complete the code for your response field. If it cannot be processed, return an empty string. The current code is public static void main(String[] args) {\n SpringApplication application \u003d new SpringApplication(TranCoderApplication.class);\n application.addInitializers(SpringBeanUtil::setApplicationContext);\n spring."
I expect the API to return run(args);
But the API actually returns:

the file path is: test.py

the code file is written by Python

The programming language I am using is Java. I only need you to complete the possible code that may be written at the end of my code, without providing any extra explanation or description. Please directly complete the code for your response field. If it cannot be processed, return an empty string. The current code is public static void main(String[] args) {
SpringApplication application = new SpringApplication(TranCoderApplication.class);
application.addInitializers(SpringBeanUtil::setApplicationContext);
spring.application.run(args);
}

Assistant:

public static void main(String[] args) {
SpringApplication application = new SpringApplication(TranCoderApplication.class);
application.addInitializers(SpringBeanUtil::setApplicationContext);
spring.application.run(args);
}

How should the prompt for ordinary Q&A mode be wrapped?

In the source code, input_wrapper wraps code-generation prompts nicely.

1. How should the prompt be wrapped for ordinary Q&A mode, i.e. ChatGPT-style question answering?
For example:
Q: Please generate a quick sort algorithm in C#
A: xxxx

2. I also found an issue: the end of every generated completion always carries an extra trailing token (screenshot omitted).
