LLM Foundry

This repository contains code for training, finetuning, evaluating, and deploying LLMs for inference with Composer and the MosaicML platform. The codebase is designed to be easy to use, efficient, and flexible, and to enable rapid experimentation with the latest techniques.

You'll find in this repo:

  • llmfoundry/ - source code for models, datasets, callbacks, utilities, etc.
  • scripts/ - scripts to run LLM workloads
    • data_prep/ - convert text data from original sources to StreamingDataset format
    • train/ - train or finetune HuggingFace and MPT models from 125M - 70B parameters
      • train/benchmarking - profile training throughput and MFU
    • inference/ - convert models to HuggingFace or ONNX format, and generate responses
      • inference/benchmarking - profile inference latency and throughput
    • eval/ - evaluate LLMs on academic (or custom) in-context-learning tasks
  • mcli/ - launch any of these workloads using MCLI and the MosaicML platform

MPT

Mosaic Pretrained Transformers (MPT) are GPT-style models with some special features -- Flash Attention for efficiency, ALiBi for context length extrapolation, and stability improvements to mitigate loss spikes. As part of MosaicML's Foundation series, we have open-sourced several MPT models:

Model               Context Length  Download                                            Demo  Commercial use?
MPT-30B             8192            https://huggingface.co/mosaicml/mpt-30b             -     Yes
MPT-30B-Instruct    8192            https://huggingface.co/mosaicml/mpt-30b-instruct    -     Yes
MPT-30B-Chat        8192            https://huggingface.co/mosaicml/mpt-30b-chat        Demo  No
MPT-7B              2048            https://huggingface.co/mosaicml/mpt-7b              -     Yes
MPT-7B-Instruct     2048            https://huggingface.co/mosaicml/mpt-7b-instruct     -     Yes
MPT-7B-Chat         2048            https://huggingface.co/mosaicml/mpt-7b-chat         Demo  No
MPT-7B-StoryWriter  65536           https://huggingface.co/mosaicml/mpt-7b-storywriter  -     Yes

To try out these models locally, follow the instructions in scripts/inference/README.md to prompt HF models using our hf_generate.py or hf_chat.py scripts.

MPT Community

We've been overwhelmed by all the amazing work the community has put into MPT! Here are links to a few of these projects:

  • ReplitLM: replit-code-v1-3b is a 2.7B Causal Language Model focused on Code Completion. The model has been trained on a subset of the Stack Dedup v1.2 dataset covering 20 languages such as Java, Python, and C++
  • LLaVa-MPT: Visual instruction tuning to get MPT multimodal capabilities
  • ggml: Optimized MPT version for efficient inference on consumer hardware
  • GPT4All: locally running chat system, now with MPT support!
  • Q8MPT-Chat: 8-bit optimized MPT for CPU by our friends at Intel

Something missing? Contribute with a PR!

Hardware and Software Requirements

This codebase has been tested with PyTorch 1.13.1 and PyTorch 2.0.1 on systems with NVIDIA A100s and H100s. It may also work on systems with other devices, such as consumer NVIDIA cards and AMD cards, but we are not actively testing these systems. If LLM Foundry works or fails for you on another system, please let us know in a GitHub issue and we will update the support matrix!

Device          Torch Version  CUDA Version  Status
A100-40GB/80GB  1.13.1         11.7          ✅ Supported
A100-40GB/80GB  2.0.1          11.7, 11.8    ✅ Supported
H100-80GB       1.13.1         11.7          ❌ Not Supported
H100-80GB       2.0.1          11.8          ✅ Supported
A10-24GB        1.13.1         11.7          🚧 In Progress
A10-24GB        2.0.1          11.7, 11.8    🚧 In Progress
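
If you want to check a machine against the support matrix from a script, the table can be encoded as a small lookup. This is only an illustrative sketch: the names `SUPPORT_MATRIX` and `support_status` are hypothetical and not part of LLM Foundry, and any combination missing from the table is reported as "untested".

```python
# Support matrix copied from the table above:
# (device, torch version) -> (tested CUDA versions, status).
SUPPORT_MATRIX = {
    ("A100-40GB/80GB", "1.13.1"): (("11.7",), "supported"),
    ("A100-40GB/80GB", "2.0.1"): (("11.7", "11.8"), "supported"),
    ("H100-80GB", "1.13.1"): (("11.7",), "not supported"),
    ("H100-80GB", "2.0.1"): (("11.8",), "supported"),
    ("A10-24GB", "1.13.1"): (("11.7",), "in progress"),
    ("A10-24GB", "2.0.1"): (("11.7", "11.8"), "in progress"),
}


def support_status(device: str, torch_version: str, cuda_version: str) -> str:
    """Return the status listed in the table, or "untested" if the combination is absent."""
    entry = SUPPORT_MATRIX.get((device, torch_version))
    if entry is None:
        return "untested"
    cuda_versions, status = entry
    if cuda_version not in cuda_versions:
        return "untested"
    return status


if __name__ == "__main__":
    print(support_status("H100-80GB", "2.0.1", "11.8"))
```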

MosaicML Docker Images

We highly recommend using our prebuilt Docker images. You can find them here: https://hub.docker.com/orgs/mosaicml/repositories.

The mosaicml/pytorch images are pinned to specific PyTorch and CUDA versions, and are stable and rarely updated.

The mosaicml/llm-foundry images are built with new tags upon every commit to the main branch. You can select a specific commit hash such as mosaicml/llm-foundry:1.13.1_cu117-f678575 or take the latest one using mosaicml/llm-foundry:1.13.1_cu117-latest.

Please note: The mosaicml/llm-foundry images do not come with the llm-foundry package preinstalled, just the dependencies. You will still need to pip install llm-foundry, either from PyPI or from source.

Docker Image                                          Torch Version  CUDA Version  LLM Foundry dependencies installed?
mosaicml/pytorch:1.13.1_cu117-python3.10-ubuntu20.04  1.13.1         11.7          No
mosaicml/pytorch:2.0.1_cu118-python3.10-ubuntu20.04   2.0.1          11.8          No
mosaicml/llm-foundry:1.13.1_cu117-latest              1.13.1         11.7          Yes
mosaicml/llm-foundry:2.0.1_cu118-latest               2.0.1          11.8          Yes
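
The mosaicml/llm-foundry tags shown above appear to follow the pattern `<torch version>_cu<CUDA version without the dot>-<commit hash or latest>`. The sketch below builds a tag from those parts. The pattern is inferred only from the examples in this README, the helper name `llm_foundry_image` is hypothetical, and not every combination is guaranteed to exist on Docker Hub, so check the repository listing before relying on a generated tag.

```python
from typing import Optional


def llm_foundry_image(torch_version: str, cuda_version: str, commit: Optional[str] = None) -> str:
    """Build an image reference such as mosaicml/llm-foundry:1.13.1_cu117-latest.

    Assumption: the tag layout is inferred from the examples in this README.
    """
    cuda_tag = "cu" + cuda_version.replace(".", "")  # "11.7" -> "cu117"
    suffix = commit if commit else "latest"          # pin to a commit hash, or track latest
    return f"mosaicml/llm-foundry:{torch_version}_{cuda_tag}-{suffix}"


if __name__ == "__main__":
    print(llm_foundry_image("1.13.1", "11.7", commit="f678575"))
    print(llm_foundry_image("2.0.1", "11.8"))
```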

Installation

This assumes you already have PyTorch and CMake installed.
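
One way to check these two prerequisites before installing is a short standard-library script. This is a minimal sketch, not something shipped with LLM Foundry: the function name `check_prerequisites` is hypothetical, and it only checks that the `torch` package can be found and that a `cmake` executable is on your PATH, not that their versions are compatible.

```python
import importlib.util
import shutil


def check_prerequisites() -> dict:
    """Report whether PyTorch is importable and a cmake executable is on PATH."""
    return {
        # find_spec locates the package without importing it
        "torch": importlib.util.find_spec("torch") is not None,
        # shutil.which returns None when the executable is not found
        "cmake": shutil.which("cmake") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```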

To get started, clone this repo and install the requirements:

git clone https://github.com/mosaicml/llm-foundry.git
cd llm-foundry

# Optional: we highly recommend creating and using a virtual environment
python -m venv llmfoundry-venv
source llmfoundry-venv/bin/activate

pip install -e ".[gpu]"  # or pip install -e . if no NVIDIA GPU

Quickstart

Here is an end-to-end workflow for preparing a subset of the C4 dataset, training an MPT-125M model for 10 batches, converting the model to HuggingFace format, evaluating the model on the Winograd challenge, and generating responses to prompts.

If you have a write-enabled HuggingFace auth token, you can optionally upload your model to the Hub! Just export your token like this:

export HUGGING_FACE_HUB_TOKEN=your-auth-token

and uncomment the line containing --hf_repo_for_upload ....

(Remember, this is a quickstart just to demonstrate the tools. To get good quality, the LLM must be trained for longer than 10 batches 😄)
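
The data preparation command below passes `--concat_tokens 2048` and `--eos_text '<|endoftext|>'`. The general idea behind these flags is to tokenize each document, append an end-of-sequence token, and pack the resulting stream into fixed-length samples. The sketch below illustrates that idea on toy token IDs. It is a simplified illustration, not the script's actual implementation: the function name `concat_and_chunk` is hypothetical, and dropping the trailing partial sample is an assumption of this sketch.

```python
from typing import Iterable, Iterator, List


def concat_and_chunk(docs: Iterable[List[int]], eos_id: int, max_length: int) -> Iterator[List[int]]:
    """Concatenate tokenized documents, separated by EOS, into samples of max_length tokens."""
    buffer: List[int] = []
    for tokens in docs:
        buffer.extend(tokens)
        buffer.append(eos_id)  # mark the document boundary
        while len(buffer) >= max_length:
            yield buffer[:max_length]
            buffer = buffer[max_length:]
    # Tokens left in the buffer do not fill a whole sample and are dropped in this sketch.


if __name__ == "__main__":
    toy_docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
    for sample in concat_and_chunk(toy_docs, eos_id=0, max_length=4):
        print(sample)
```

With `max_length=4`, the second sample spans two documents, which is why the EOS token matters: it is the only marker of where one document ends and the next begins.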

cd scripts

# Convert C4 dataset to StreamingDataset format
python data_prep/convert_dataset_hf.py \
  --dataset c4 --data_subset en \
  --out_root my-copy-c4 --splits train_small val_small \
  --concat_tokens 2048 --tokenizer EleutherAI/gpt-neox-20b --eos_text '<|endoftext|>'

# Train an MPT-125m model for 10 batches
composer train/train.py \
  train/yamls/pretrain/mpt-125m.yaml \
  data_local=my-copy-c4 \
  train_loader.dataset.split=train_small \
  eval_loader.dataset.split=val_small \
  max_duration=10ba \
  eval_interval=0 \
  save_folder=mpt-125m

# Convert the model to HuggingFace format
python inference/convert_composer_to_hf.py \
  --composer_path mpt-125m/ep0-ba10-rank0.pt \
  --hf_output_path mpt-125m-hf \
  --output_precision bf16 \
  # --hf_repo_for_upload user-org/repo-name

# Evaluate the model on Winograd
python eval/eval.py \
  eval/yamls/hf_eval.yaml \
  icl_tasks=eval/yamls/winograd.yaml \
  model_name_or_path=mpt-125m-hf

# Generate responses to prompts
python inference/hf_generate.py \
  --name_or_path mpt-125m-hf \
  --max_new_tokens 256 \
  --prompts \
    "The answer to life, the universe, and happiness is" \
    "Here's a quick recipe for baking chocolate chip cookies: Start by"

Note: the composer command used above to train the model refers to the Composer library's distributed launcher.
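
The checkpoint path passed to convert_composer_to_hf.py follows from the training overrides: `max_duration=10ba` stops training after 10 batches, and the checkpoint is written to `save_folder` with the epoch, batch, and rank in its name. The sketch below reproduces that path. It assumes the `ep{epoch}-ba{batch}-rank{rank}.pt` naming seen in the quickstart and that training ends within the first epoch (epoch 0). The helper `expected_checkpoint` is hypothetical, and it handles only durations given in batches (`ba`).

```python
import re


def expected_checkpoint(save_folder: str, max_duration: str, rank: int = 0, epoch: int = 0) -> str:
    """Return the checkpoint path expected after training for a batch-based duration.

    Assumptions: the filename pattern matches the quickstart, and the run
    finishes before the first epoch completes.
    """
    match = re.fullmatch(r"(\d+)ba", max_duration)
    if match is None:
        raise ValueError(f"this sketch only handles durations like '10ba', got {max_duration!r}")
    batches = int(match.group(1))
    return f"{save_folder}/ep{epoch}-ba{batches}-rank{rank}.pt"


if __name__ == "__main__":
    print(expected_checkpoint("mpt-125m", "10ba"))
```

If you change `max_duration` or `save_folder` in the training command, update `--composer_path` in the conversion step to match.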

Contact Us

If you run into any problems with the code, please file a GitHub issue directly in this repo.

If you want to train LLMs on the MosaicML platform, reach out to us at [email protected]!
