Function Vectors in Large Language Models

This repository contains data and code for the paper: Function Vectors in Large Language Models.

Setup

We recommend using conda as a package manager. The environment used for this project can be found in the fv_environment.yml file. To install, you can run:

conda env create -f fv_environment.yml
conda activate fv

Demo Notebook

Check out notebooks/fv_demo.ipynb for a Jupyter notebook that demonstrates how to create a function vector and use it in different contexts.

Data

The datasets used in our project can be found in the dataset_files folder.

Code

Our main evaluation scripts are in the src directory, with sample script wrappers in src/eval_scripts.

The rest of the core code is split across several utility files:

  • eval_utils.py contains code for evaluating function vectors in a variety of contexts.
  • extract_utils.py contains functions for extracting function vectors and other relevant model activations.
  • intervention_utils.py contains the main functionality for intervening with function vectors during inference.
  • model_utils.py contains helper functions for loading models and tokenizers from Hugging Face.
  • prompt_utils.py contains data loading and prompt creation functionality.
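
To make the extraction and intervention steps concrete, here is a minimal, library-free sketch of the idea as we understand it from the paper: average each attention head's output over a set of in-context prompts, sum the averages of a chosen set of heads into a single vector, and add that vector to a hidden state. The function names, the toy numbers, and the dictionary layout are all hypothetical and chosen for illustration; they are not the API of extract_utils.py or intervention_utils.py, and the sketch skips how heads are selected.

```python
from statistics import fmean


def mean_head_outputs(head_outputs):
    """Average each head's output vector over prompts.

    head_outputs maps a hypothetical (layer, head) key to a list of
    per-prompt output vectors, each a list of floats of equal length.
    """
    return {
        head: [fmean(dim) for dim in zip(*vectors)]
        for head, vectors in head_outputs.items()
    }


def build_function_vector(mean_outputs, selected_heads):
    """Sum the mean outputs of the selected heads into one vector."""
    vectors = [mean_outputs[head] for head in selected_heads]
    return [sum(dim) for dim in zip(*vectors)]


def add_function_vector(hidden_state, function_vector, scale=1.0):
    """Return the hidden state with the (scaled) function vector added."""
    return [h + scale * v for h, v in zip(hidden_state, function_vector)]


# Toy activations: three heads, two prompts, two hidden dimensions.
head_outputs = {
    (0, 1): [[1.0, 0.0], [3.0, 2.0]],
    (2, 5): [[0.0, 4.0], [2.0, 0.0]],
    (3, 0): [[9.0, 9.0], [9.0, 9.0]],
}

means = mean_head_outputs(head_outputs)

# Assumption: heads (0, 1) and (2, 5) were already chosen by some selection step.
fv = build_function_vector(means, selected_heads=[(0, 1), (2, 5)])

# Add the vector to a toy hidden state, as an intervention would at one layer.
patched = add_function_vector([0.5, -1.0], fv)
```

In a real model the per-head outputs would come from forward passes over task prompts and the addition would happen inside the network at a chosen layer; the notebook and the utility files above are the place to look for the actual implementation.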

Citing our work

The preprint can be cited as follows:

@article{todd2023function,
    title={Function Vectors in Large Language Models}, 
    author={Eric Todd and Millicent L. Li and Arnab Sen Sharma and Aaron Mueller and Byron C. Wallace and David Bau},
    year={2023},
    eprint={2310.15213},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
