
liah-lie_in_a_haystack's Introduction

🤥 LIAH - a Lie-in-haystack

As context lengths for LLMs grow longer, it is increasingly difficult to test whether fine-tuned models attend to all depths of the context.

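Needle in a haystack is a popular approach. However, LLMs can sometimes answer the question about the needle without retrieving the needle itself. Tests have shown that a "Lie" works well in this context 😊. Because the lie contradicts common knowledge, a model that repeats it has most likely read it from the context.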

Lost in the Middle - Paper

lie: "Picasso painted the Mona Lisa"

retrieve: "Who painted the Mona Lisa?"
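The lie and retrieval question above can be turned into a test prompt by planting the lie at a chosen depth of some filler text. The following is a minimal sketch of that idea, not liah's actual implementation: `insert_lie`, `build_prompt`, the filler sentences, and the prompt wording are all hypothetical names and values chosen for illustration.

```python
def insert_lie(haystack: str, lie: str, depth: float) -> str:
    """Insert `lie` at a sentence boundary roughly `depth` (0.0 to 1.0) into `haystack`."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    # Naive sentence split on periods; good enough for a sketch.
    sentences = [s.strip() for s in haystack.split(".") if s.strip()]
    position = round(depth * len(sentences))
    sentences.insert(position, lie.rstrip("."))
    return ". ".join(sentences) + "."


def build_prompt(context: str, question: str) -> str:
    """Hypothetical prompt template: context first, then the retrieval question."""
    return f"{context}\n\nAnswer using only the text above.\nQuestion: {question}\nAnswer:"


# Hypothetical filler text standing in for a real haystack.
haystack = "The sky was clear. The market opened early. A dog barked twice. Rain came at noon."
lie = "Picasso painted the Mona Lisa."

context = insert_lie(haystack, lie, depth=0.5)
prompt = build_prompt(context, "Who painted the Mona Lisa?")
print(prompt)
```

A model that answers "Picasso" has most likely read the planted sentence, since the answer contradicts what it would otherwise say.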

Installation

pip install liah

Example Usage

# Set OPENAI_API_KEY in the environment with your token
# if you need OpenAI models for the final evaluation.
from liah import Liah
from vllm import LLM, SamplingParams

# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=4096)
# Needs 4 A100 40GB GPUs.
llm = LLM(model="meta-llama/Llama-2-70b-hf", tensor_parallel_size=4, max_model_len=1500)

# Create Liah.
liah = Liah(
    model_name="Your Model",
    max_context_length=2000,
    context_length_interval=10,
    test_mode=True,
)

# Get a sample for each depth and context length.
for i, sample in enumerate(liah.getSample()):
    # Test the sample text with your model.
    output = llm.generate([sample["prompt"]], sampling_params)[0]
    # Update liah with the response.
    liah.update(sample, output.outputs[0].text)

# Path of the plot file produced by Liah.
plotFilePath = liah.evaluate()
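To give a feel for what an evaluation over depths might compute, here is a rough sketch that scores responses with a plain substring check and averages them per depth. This is an assumption for illustration only: the comment above suggests liah can use OpenAI models for the final evaluation, so its real scoring is likely different. `retrieved_lie`, `retrieval_rate_by_depth`, and the sample responses are hypothetical.

```python
from collections import defaultdict


def retrieved_lie(response: str, lie_answer: str = "Picasso") -> bool:
    """Return True when the response mentions the planted (false) answer."""
    return lie_answer.lower() in response.lower()


def retrieval_rate_by_depth(results):
    """Map each depth to the fraction of responses that repeated the lie.

    `results` is an iterable of (depth, response_text) pairs.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for depth, response in results:
        totals[depth] += 1
        if retrieved_lie(response):
            hits[depth] += 1
    return {depth: hits[depth] / totals[depth] for depth in sorted(totals)}


# Hypothetical model responses at three depths.
results = [
    (0.0, "Picasso painted it."),
    (0.5, "Leonardo da Vinci painted the Mona Lisa."),
    (0.5, "According to the text, Picasso."),
    (1.0, "picasso"),
]
rates = retrieval_rate_by_depth(results)
print(rates)
```

A substring check is cheap but brittle (a response such as "not Picasso" would count as a hit), which is one reason a model-based judge can be preferable.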

Sample plot

[Image: sample plot]

Contribute

Install pre-commit:

pip install pre-commit

Then, in the repository (only once):

pre-commit install

Before committing (optional):

pre-commit run --all-files

liah-lie_in_a_haystack's People

Contributors

melvinebenezer

liah-lie_in_a_haystack's Issues

Context length error when min and max are the same

The generated context lengths contained multiple identical values when the min and max were the same.

For example, with min_context_length=1000 and max_context_length=1000, it would produce 10 context lengths of 1k each, which eventually creates 10 * 10 samples (one for each depth).
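One way to avoid the duplicates described above is to deduplicate the generated lengths and short-circuit when min and max coincide. The sketch below is not the fix applied in liah. `context_lengths` and its `count` parameter are hypothetical and only illustrate the arithmetic.

```python
def context_lengths(min_len: int, max_len: int, count: int = 10) -> list[int]:
    """Return up to `count` evenly spaced context lengths with no duplicates."""
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    if count < 1:
        raise ValueError("count must be at least 1")
    # A single point needs no spacing; this also avoids dividing by zero below.
    if count == 1 or min_len == max_len:
        return [max_len]
    step = (max_len - min_len) / (count - 1)
    lengths = [round(min_len + i * step) for i in range(count)]
    # Rounding can still collide on narrow ranges, so deduplicate.
    return sorted(set(lengths))


depths = [i / 10 for i in range(10)]  # hypothetical: 10 depths per context length

same = context_lengths(1000, 1000)
samples = [(length, depth) for length in same for depth in depths]
print(same, len(samples))

print(context_lengths(1000, 2000, count=3))
```

With equal min and max this yields a single context length, so the number of samples is one per depth rather than ten per depth.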
