
awesome-llm-and-aigc's Introduction



🔥🔥🔥 This repository lists awesome public projects on Large Language Models, Vision Foundation Models, and AI-Generated Content, along with related datasets and applications.



  • Frameworks

    • Official Version

      • Large Language Model
        • GPT-1 : "Improving Language Understanding by Generative Pre-Training". (2018).

        • GPT-2 : "Language Models are Unsupervised Multitask Learners". (OpenAI blog, 2019). Better language models and their implications.

        • GPT-3 : "Language Models are Few-Shot Learners". (arXiv 2020).

        • InstructGPT : "Training language models to follow instructions with human feedback". (arXiv 2022). "Aligning language models to follow instructions". (OpenAI blog, 2022).

        • ChatGPT: Optimizing Language Models for Dialogue.

        • GPT-4: GPT-4 is OpenAI’s most advanced system, producing safer and more useful responses.

        • Whisper : Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. "Robust Speech Recognition via Large-Scale Weak Supervision". (arXiv 2022).

        • LLaMA : Inference code for LLaMA models. "LLaMA: Open and Efficient Foundation Language Models". (arXiv 2023).

        • LangChain : 🦜️🔗 LangChain.⚡ Building applications with LLMs through composability ⚡

        • Auto-GPT : Auto-GPT: An Autonomous GPT-4 Experiment. Auto-GPT is an experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains together LLM "thoughts", to autonomously achieve whatever goal you set. As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI.

        • GPT-Engineer : Specify what you want it to build, the AI asks for clarification, and then builds it. GPT Engineer is made to be easy to adapt, extend, and make your agent learn how you want your code to look. It generates an entire codebase based on a prompt.

        • StableLM : StableLM: Stability AI Language Models.

        • JARVIS : JARVIS, a system to connect LLMs with ML community. "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace". (arXiv 2023).

        • MiniGPT-4 : MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models.

        • minGPT : A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training.

        • nanoGPT : The simplest, fastest repository for training/finetuning medium-sized GPTs.

        • Claude : Claude is a next-generation AI assistant based on Anthropic’s research into training helpful, honest, and harmless AI systems.

        • MicroGPT : A simple and effective autonomous agent compatible with GPT-3.5-Turbo and GPT-4. MicroGPT aims to be as compact and reliable as possible.

        • Dolly : Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform. "Hello Dolly: Democratizing the magic of ChatGPT with open models".

        • LMFlow : An extensible, convenient, and efficient toolbox for finetuning large machine learning models, designed to be user-friendly, speedy and reliable, and accessible to the entire community. Large Language Model for All.

        • Open-Assistant : OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

        • Colossal-AI : Making big AI models cheaper, easier, and scalable. "Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training". (arXiv 2021).

        • Lit-LLaMA : ⚡ Lit-LLaMA. Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

        • GPT-4-LLM : "Instruction Tuning with GPT-4". (arXiv 2023).

        • StarCoder : 💫 StarCoder is a language model (LM) trained on source code and natural language text. Its training data incorporates more than 80 different programming languages as well as text extracted from GitHub issues and commits and from notebooks.

        • Stanford Alpaca : Stanford Alpaca: An Instruction-following LLaMA Model.

        • feizc/Visual-LLaMA : Open LLaMA Eyes to See the World. This project aims to optimize the LLaMA model for visual information understanding, like GPT-4, and to further explore the potential of large language models.

        • Lightning-AI/lightning-colossalai : Efficient Large-Scale Distributed Training with Colossal-AI and Lightning AI.

        • GPT4All : GPT4All: An ecosystem of open-source on-edge large language models. GPT4All is an ecosystem to train and deploy powerful and customized large language models that run locally on consumer-grade CPUs.

        • ChatALL : Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFlytek Spark, ERNIE and more, and discover the best answers.

        • 1595901624/gpt-aggregated-edition : Aggregates multiple platforms, including the official ChatGPT, free ChatGPT, ERNIE Bot (文心一言), Poe, and chatchat, and supports importing custom platforms.

        • FreedomIntelligence/LLMZoo : ⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡ Tech Report

        • shm007g/LLaMA-Cult-and-More : News about 🦙 Cult and other AIGC models.

        • X-PLUG/mPLUG-Owl : mPLUG-Owl🦉: Modularization Empowers Large Language Models with Multimodality.

        • i-Code : The ambition of the i-Code project is to build integrative and composable multimodal Artificial Intelligence. The "i" stands for integrative multimodal learning. "CoDi: Any-to-Any Generation via Composable Diffusion". (arXiv 2023).

        • WorkGPT : WorkGPT is an agent framework in a similar fashion to AutoGPT or LangChain.

        • h2oGPT : h2oGPT is a large language model (LLM) fine-tuning framework and chatbot UI with document(s) question-answer capabilities. "h2oGPT: Democratizing Large Language Models". (arXiv 2023).

        • FlagAI|悟道·天鹰(Aquila) : FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models. Its goal is to support training, fine-tuning, and deployment of large-scale models on various downstream tasks with multi-modality.

        • ChatGLM-6B : ChatGLM-6B: An Open Bilingual Dialogue Language Model. ChatGLM-6B is an open-source dialogue language model that supports both Chinese and English, based on the General Language Model (GLM) architecture, with 6.2 billion parameters. "GLM: General Language Model Pretraining with Autoregressive Blank Infilling". (ACL 2022). "GLM-130B: An Open Bilingual Pre-trained Model". (ICLR 2023).

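        • ChatGLM2-6B : ChatGLM2-6B: An Open Bilingual Chat LLM. ChatGLM2-6B is the second generation of the open-source Chinese-English bilingual chat model ChatGLM-6B. While retaining the first generation's strengths, such as fluent dialogue and a low deployment threshold, ChatGLM2-6B introduces stronger performance, more efficient inference, and a more open license.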

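        • Chinese LLaMA and Alpaca : Chinese LLaMA & Alpaca LLMs with local CPU/GPU deployment. This project open-sources Chinese LLaMA models and instruction-tuned Alpaca models. Building on the original LLaMA, these models extend the vocabulary with Chinese tokens and are further pre-trained on Chinese data, improving basic Chinese semantic understanding. The Chinese Alpaca models are additionally fine-tuned on Chinese instruction data, markedly improving their ability to understand and follow instructions. "Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca". (arXiv 2023).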

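        • MOSS : An open-source tool-augmented conversational language model from Fudan University. MOSS supports both Chinese and English and a variety of plugins. The moss-moon series models have 16 billion parameters and can run on a single A100/A800 or two 3090 GPUs at FP16 precision, or on a single 3090 GPU at INT4/8 precision. The MOSS base language model was pre-trained on roughly 700 billion Chinese, English, and code words, and then gained multi-turn dialogue and plugin-use abilities through conversational instruction fine-tuning, plugin-augmented learning, and human preference training.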

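        • CPM-Bee : CPM-Bee is a fully open-source, commercially usable Chinese-English base model with ten billion parameters, and the second milestone of CPM-Live training.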

        • PandaLM : PandaLM: Reproducible and Automated Language Model Assessment.

        • SpeechGPT : "SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities". (arXiv 2023).

        • GPT2-Chinese : Chinese version of GPT2 training code, using BERT tokenizer.

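        • 百度-文心大模型 (Baidu ERNIE) : Baidu's new-generation knowledge-enhanced large language model and a new member of the ERNIE model family. It can converse with people, answer questions, assist with writing, and help people obtain information, knowledge, and inspiration efficiently.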

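        • 阿里云-通义千问 (Alibaba Cloud Tongyi Qianwen) : Tongyi Qianwen is an ultra-large-scale language model from Alibaba Cloud. Its capabilities include multi-turn dialogue, copywriting, logical reasoning, multimodal understanding, and multilingual support. It can hold multi-turn interactions with people, incorporates multimodal knowledge understanding, and can write copy such as continuing a novel or drafting an email.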

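        • 商汤科技-日日新SenseNova (SenseTime SenseNova) : SenseNova is the large-model system announced by SenseTime. It includes the natural language processing model SenseChat (商量), the text-to-image model 秒画, and the digital-human video generation platform SenseAvatar (如影).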

      • Vision Foundation Model
        • Visual ChatGPT : Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. "Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models". (arXiv 2023).

        • InternImage : "InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions". (CVPR 2023).

        • GLIP : "Grounded Language-Image Pre-training". (CVPR 2022).

        • GLIPv2 : "GLIPv2: Unifying Localization and Vision-Language Understanding". (arXiv 2022).

        • DINO : "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection". (ICLR 2023).

        • DINOv2 : "DINOv2: Learning Robust Visual Features without Supervision". (arXiv 2023).

        • Grounding DINO : "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection". (arXiv 2023). Zhihu post by 三分钟热度: "A Ten-Minute Guide to Grounding DINO: Detecting Arbitrary Objects from Text Prompts".

        • SAM : The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model. "Segment Anything". (arXiv 2023).

        • Grounded-SAM : Marrying Grounding DINO with Segment Anything & Stable Diffusion & Tag2Text & BLIP & Whisper & ChatBot - Automatically Detect, Segment and Generate Anything with Image, Text, and Audio Inputs. We plan to create a very interesting demo by combining Grounding DINO and Segment Anything, which aims to detect and segment anything with text inputs!

        • SEEM : We introduce SEEM that can Segment Everything Everywhere with Multi-modal prompts all at once. SEEM allows users to easily segment an image using prompts of different types including visual prompts (points, marks, boxes, scribbles and image segments) and language prompts (text and audio), etc. It can also work with any combinations of prompts or generalize to custom prompts! "Segment Everything Everywhere All at Once". (arXiv 2023).

        • SAM3D : "SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model". (arXiv 2023).

        • ImageBind : "ImageBind: One Embedding Space To Bind Them All". (CVPR 2023).

        • Track-Anything : Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI. "Track Anything: Segment Anything Meets Videos". (arXiv 2023).

        • qianqianwang68/omnimotion : "Tracking Everything Everywhere All at Once". (arXiv 2023).

        • LLaVA : 🌋 LLaVA: Large Language and Vision Assistant. Visual instruction tuning towards large language and vision models with GPT-4 level capabilities. "Visual Instruction Tuning". (arXiv 2023).

        • M3I-Pretraining : "Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information". (arXiv 2022).

        • BEVFormer : BEVFormer: a Cutting-edge Baseline for Camera-based Detection. "BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers". (arXiv 2022).

        • Uni-Perceiver : "Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks". (CVPR 2022).

        • AnyLabeling : 🌟 AnyLabeling 🌟. Effortless data labeling with AI support from YOLO and Segment Anything!

        • X-AnyLabeling : 💫 X-AnyLabeling 💫. Effortless data labeling with AI support from Segment Anything and other awesome models!

        • Label Anything : OpenMMLab PlayGround: Semi-Automated Annotation with Label-Studio and SAM.

        • RevCol : "Reversible Column Networks". (arXiv 2023).

        • Macaw-LLM : Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration.

      • AI Generated Content
        • Stable Diffusion : Stable Diffusion is a latent text-to-image diffusion model. Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway and builds upon our previous work "High-Resolution Image Synthesis with Latent Diffusion Models". (CVPR 2022).

        • Stable Diffusion Version 2 : This repository contains Stable Diffusion models trained from scratch and will be continuously updated with new checkpoints. "High-Resolution Image Synthesis with Latent Diffusion Models". (CVPR 2022).

        • StableStudio : StableStudio by Stability AI. 👋 Welcome to the community repository for StableStudio, the open-source version of DreamStudio.

        • DragGAN : "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold". (SIGGRAPH 2023).

        • AudioGPT : AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head.

        • PandasAI : Pandas AI is a Python library that adds generative artificial intelligence capabilities to Pandas, the popular data analysis and manipulation tool. It is designed to be used in conjunction with Pandas, and is not a replacement for it.

        • mosaicml/diffusion : Stable Diffusion Training with MosaicML. This repo contains code used to train your own Stable Diffusion model on your own data.

        • ControlNet : Let us control diffusion models! "Adding Conditional Control to Text-to-Image Diffusion Models". (arXiv 2023).

        • VisorGPT : Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT. "VisorGPT: Learning Visual Prior via Generative Pre-Training". (arXiv 2023).

        • Midjourney : Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.

        • DreamStudio : Effortless image generation for creators with big dreams.

        • Firefly : Adobe Firefly: Experiment, imagine, and make an infinite range of creations with Firefly, a family of creative generative AI models coming to Adobe products.

        • Jasper : Meet Jasper. On-brand AI content wherever you create.

        • : Whatever you want to ask, our chat has the answers.

        • : Leverage the AI-powered platform to ideate, create, distribute, and measure your content and prove your content marketing ROI.

        • ChatPPT : ChatPPT is here: generate a PPT in one click from a command.

    • Cpp Implementation

      • llama.cpp : Inference of LLaMA model in pure C/C++.

      • skeskinen/llama-lite : Embeddings focused small version of Llama NLP model.

      • whisper.cpp : High-performance inference of OpenAI's Whisper automatic speech recognition (ASR) model.

      • Const-me/Whisper : High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model.

      • wangzhaode/ChatGLM-MNN : Pure C++, Easy Deploy ChatGLM-6B.

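      • ztxz16/fastllm : A large-model library implemented in pure C++ with no third-party dependencies and with CUDA acceleration support. It currently supports the Chinese LLMs ChatGLM-6B and MOSS, and can run ChatGLM-6B smoothly on Android devices.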

    • Rust Implementation

    • Zig Implementation

  • Awesome List

  • Paper Overview

    • daochenzha/data-centric-AI : A curated, but incomplete, list of data-centric AI resources. "Data-centric Artificial Intelligence: A Survey". (arXiv 2023).

    • KSESEU/LLMPapers : Collection of papers and related works for Large Language Models (ChatGPT, GPT-3, Codex etc.).

  • Learning Resources



Open API




  • IDE


    • Cursor : An editor made for programming with AI 🤖. Long term, our plan is to build Cursor into the world's most productive development environment.
  • Wechat


  • Translator


  • Local Knowledge Base


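    • imClumsyPanda/langchain-ChatGLM : langchain-ChatGLM, local knowledge based ChatGLM with langchain. Question answering with ChatGLM over a local knowledge base; an implementation of applications for ChatGLM and other large language models built on local knowledge bases.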

    • yanqiangmiffy/Chinese-LangChain : Chinese-LangChain: a Chinese langchain project that uses ChatGLM-6b and langchain for local knowledge-base retrieval and intelligent answer generation. Also known as 小必应, Q.Talk, 强聊, and QiangTalk.

  • Question Answering System


    • THUDM/WebGLM : WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023). "WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences". (arXiv 2023).
  • Academic Field


    • GPTZero : The World's #1 AI Detector with over 1 Million Users. Detect ChatGPT, GPT3, GPT4, Bard, and other AI models.

    • BurhanUlTayyab/GPTZero : An open-source implementation of GPTZero. GPTZero is an AI model with some mathematical formulation to determine if a particular text fed to it is written by AI or a human being.

    • BurhanUlTayyab/DetectGPT : An open-source Pytorch implementation of DetectGPT. DetectGPT is an amazing method to determine whether a piece of text is written by large language models (like ChatGPT, GPT3, GPT2, BLOOM etc). However, we couldn't find any open-source implementation of it. Therefore this is the implementation of the paper. "DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature". (arXiv 2023).

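    • binary-husky/chatgpt_academic : ChatGPT Academic Optimization. A ChatGPT extension for research work, specially optimized for polishing academic papers. It supports custom shortcut buttons, markdown table display, dual display of TeX formulas, and full code display, and adds local Python project analysis and self-analysis features.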

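    • kaixindelele/ChatPaper : Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for paper summarization, polishing, reviewing, and responding to reviews. 💥💥💥 The free ChatPaper web version, serving researchers worldwide, is now online. 💥💥💥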

    • WangRongsheng/ChatGenTitle : 🌟 ChatGenTitle: a paper-title generation model fine-tuned from LLaMA on information from a million arXiv papers.

    • nishiwen1214/ChatReviewer : ChatReviewer: use ChatGPT to review papers; ChatResponse: use ChatGPT to respond to reviewers. 💥💥💥 The first web version of ChatReviewer is now available.

    • Shiling42/web-simulator-by-GPT4 : Online Interactive Physical Simulation Generated by GPT-4.

    • imartinez/privateGPT : Interact privately with your documents using the power of GPT, 100% privately, no data leaks. Built with LangChain and GPT4All and LlamaCpp.

  • Medical Field


  • Legal Field


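    • LaWGPT : 🎉 Repo for LaWGPT, Chinese-Llama tuned with Chinese Legal knowledge. LaWGPT is a series of open-source large language models based on Chinese legal knowledge. Building on general Chinese base models (such as Chinese-LLaMA and ChatGLM), the series extends the vocabulary with legal-domain terms and pre-trains on a large-scale Chinese legal corpus, strengthening the models' basic semantic understanding in the legal domain. On top of this, legal-domain dialogue question-answering datasets and judicial examination datasets are constructed for instruction fine-tuning, improving the models' ability to understand and act on legal content.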
  • Financial Field


  • Math Field


  • Device Deployment


    • MLC LLM : Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

    • Lamini : Lamini: The LLM engine for rapidly customizing models 🦙.

  • GUI



awesome-llm-and-aigc's People


codingonion
