
FMS Acceleration

This monorepo collects libraries of packages that accelerate fine-tuning / training of large models, intended to be part of the fms-hf-tuning suite.

This package is in BETA under extensive development. Expect breaking changes!

Plugins

| Plugin | Description | Depends | License | Status |
| --- | --- | --- | --- | --- |
| framework | This acceleration framework for integration with huggingface trainers | | | Beta |
| accelerated-peft | For PEFT-training, e.g., 4bit QLoRA. | Huggingface, AutoGPTQ | Apache 2.0, MIT | Beta |
| TBA | Unsloth-inspired. Fused LoRA and triton kernels (e.g., fast cross-entropy, rms, rope) | Xformers | Apache 2.0 with exclusions | Under Development |
| TBA | MegaBlocks-inspired triton kernels and accelerations for Mixture-of-Expert models | | Apache 2.0 | Under Development |

Usage with FMS HF Tuning

This is intended to be a collection of many acceleration routines (including accelerated peft and other techniques). The concrete example below shows how to accelerate your tuning experience with tuning/sft_trainer.py from fms-hf-tuning.

Example: Accelerated GPTQ-LoRA Training

Below are instructions for accelerated peft fine-tuning, in particular GPTQ-LoRA tuning with the AutoGPTQ triton_v2 kernel; this state-of-the-art kernel was contributed by jeromeku in March 2024:

  1. Check out fms-hf-tuning and install the framework library:
    pip install -e .[fms-accel]
    
    or alternatively install the framework directly:
    pip install git+https://github.com/foundation-model-stack/fms-acceleration.git#subdirectory=plugins/framework
    
  2. The above installs the command line utility fms_acceleration.cli, which can then be used to install plugins. Use list to view available plugins; this list updates as more plugins get developed:
    $ python -m fms_acceleration.cli list
    
    Choose from the list of plugin shortnames, and do:
    * 'python -m fms_acceleration.cli install <pip-install-flags> PLUGIN_NAME'.
    
    List of PLUGIN_NAME [PLUGIN_SHORTNAME]:
    
    1. fms_acceleration_peft [peft]
    
    Then install the plugin. For GPTQ-LoRA tuning with triton v2, we install the fms-acceleration-peft plugin:
    python -m fms_acceleration.cli install fms_acceleration_peft
    
    The above is the equivalent of:
    pip install git+https://github.com/foundation-model-stack/fms-acceleration.git#subdirectory=plugins/accelerated-peft
    
  3. Prepare a YAML configuration for the acceleration framework plugins (a minimal sketch is shown after this list)
  4. Run sft_trainer.py with the following arguments:
    • --acceleration_framework_config_file pointing to the framework configuration YAML.

    • arguments required for correct operation (e.g., if using accelerated peft, then peft_method is required).

    • More info on defaults.yaml and scenarios.yaml can be found here.

      # when using sample-configurations, arguments can be referred from
      # defaults.yaml and scenarios.yaml
      python sft_trainer.py \
          --acceleration_framework_config_file sample-configurations/acceleration_framework.yaml \
          ... # arguments from defaults.yaml
          ...  # arguments from scenarios.yaml
      
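For reference, a minimal acceleration framework configuration for GPTQ-LoRA might look like the sketch below. The authoritative schema lives in the sample-configurations shipped with the repository, so treat the exact field names here as illustrative:

    # illustrative sketch only -- consult the shipped sample-configurations
    # for the authoritative schema
    plugins:
      peft:
        quantization:
          auto_gptq:
            kernel: triton_v2
            from_quantized: true

Passing such a file via --acceleration_framework_config_file is what activates the accelerated-peft plugin at training time.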

Over time, more plugins will be added, so please check back here for the latest accelerations!

CUDA Dependencies

This repo requires CUDA to compute the kernels, and it is convenient to use the NVIDIA PyTorch Containers, which already come with CUDA installed. We have tested with the following versions:

  • pytorch:24.03-py3
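For example, assuming Docker and the NVIDIA Container Toolkit are available on the host, the tested container can be launched along these lines (the image tag comes from the list above; the flags are illustrative):

    docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:24.03-py3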

Benchmarks

The benchmarks can be reproduced with the provided scripts.

See the CSV files in the repository for the various results.

Code Architecture

For a deeper dive into the details, see framework/README.md.

Maintainers

IBM Research, Singapore

