fms-lm-eval-service's Introduction

Kubernetes Wrapper for LM-Evaluation-Harness (lm-eval)

This project is a Kubernetes wrapper for the lm-evaluation-harness, aimed at facilitating the deployment and management of language model evaluations within Kubernetes/OpenShift environments. It's currently under active development and resides in this repository as a work-in-progress.

Overview

The Kubernetes wrapper for lm-evaluation-harness (hereafter referred to as lm-eval-aas) extends the basic functionality of the original tool by integrating with Kubernetes APIs for state management and deploying as a Custom Resource (CR). This integration allows for more scalable and flexible deployment options suitable for various computational and storage needs.

Architecture Decision Record (ADR)

We have an ADR available for this project, which outlines the rationale behind major architectural decisions. You can view and comment on the ADR here: View ADR.

Key Features (None Currently Implemented)

  • Kubernetes API Integration: Unlike previous demonstrations that managed state with a NoSQL database or local disk storage, lm-eval-aas uses the Kubernetes API for state management, enhancing the robustness and scalability of the application.

  • Deployment as a Custom Resource (CR): The tool is designed to be deployed as a CR within a Kubernetes cluster, allowing for better integration with existing cluster management practices and tools.

  • Support for Persistent Volume Claims (PVCs): lm-eval-aas can mount PVCs to access custom data sets. This is particularly useful for evaluations that require large or specific data sets not typically stored within the cluster, and for customers with sensitive data handling requirements.
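
To make the features above more concrete, here is a minimal sketch of what an evaluation job expressed as a custom resource could look like, including an optional PVC reference for custom data sets. The resource kind LMEvalJob comes from this project's issue tracker; everything else (the API group `example.com`, the version `v1alpha1`, and the `model`, `tasks`, and `dataVolume` fields) is a hypothetical placeholder, not the project's actual schema.

```python
import json

# Hypothetical API group and version; the project does not state them here.
GROUP = "example.com"
VERSION = "v1alpha1"


def build_lm_eval_job(name, model, tasks, pvc_name=None):
    """Return a dict shaped like a Kubernetes custom resource manifest."""
    spec = {"model": model, "tasks": list(tasks)}
    if pvc_name is not None:
        # Hypothetical field: reference an existing PVC holding custom data sets.
        spec["dataVolume"] = {"persistentVolumeClaim": {"claimName": pvc_name}}
    return {
        "apiVersion": f"{GROUP}/{VERSION}",
        "kind": "LMEvalJob",
        "metadata": {"name": name},
        "spec": spec,
    }


manifest = build_lm_eval_job(
    "demo-eval", "hf", ["arc_easy"], pvc_name="custom-datasets"
)
print(json.dumps(manifest, indent=2))
```

Because the job is an ordinary Kubernetes object, its state would live in the cluster's API server rather than in a separate database, which is the design choice the first bullet describes.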

Development Status

This project is currently in a work-in-progress state. Contributions and feedback are welcome. Please refer to the issue tracker in this repository to report bugs or suggest enhancements.

Getting Started

As the project is still under development, detailed instructions on deploying and using lm-eval-aas will be provided as the features are finalized and the project reaches a stable release.

Diagrams

Mermaid chart links:

Flow Diagram

graph LR;
    A[Client] --> B{Request};
    B -->|GET| C[Kubernetes Deployment];
    C -->|Process Request| D1[Pod 1];
    C -->|Process Request| D2[Flask App];
    F --> |Response| C;
    C -->|Process Request| D3[Pod N];
    D1 -->C;
    D3 -->C;
    C -->|Response| B;
    B -->|Response| A;

    subgraph Pod2
        D2((Flask App))
        D2 --> E1{Parse Arguments};
        E1 -->|Command Line Invocation| E2[lm-eval];
        E2 --> E3[TGIS];
        E2 --> E4[BAM];
        E2 --> E5[RHOAI Inference];
        E2 --> E6[...Other];
        E3 --> |Inference| E2;
        E4 --> |Inference| E2;
        E5 --> |Inference| E2;
        E6 --> |Inference| E2;
        D2 --> F[Calculate Metrics];
        E2 --> F;
        
    end
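
The "Parse Arguments" and "Command Line Invocation" steps in the flow diagram above can be sketched as a function that turns a request payload into an lm-eval argument list. This is a sketch under assumptions, not the project's implementation: the payload keys (`model`, `model_args`, `tasks`, `num_fewshot`, `output_path`) are hypothetical, and the flag names follow the lm-evaluation-harness command line as commonly documented, so they should be checked against the installed version.

```python
import shlex


def build_lm_eval_command(params):
    """Translate a request payload (dict) into an lm_eval argument list."""
    cmd = ["lm_eval", "--model", params["model"]]
    model_args = params.get("model_args")
    if model_args:
        # The CLI takes model arguments as comma-separated key=value pairs.
        pairs = ",".join(f"{key}={value}" for key, value in model_args.items())
        cmd += ["--model_args", pairs]
    cmd += ["--tasks", ",".join(params["tasks"])]
    if "num_fewshot" in params:
        cmd += ["--num_fewshot", str(params["num_fewshot"])]
    if "output_path" in params:
        cmd += ["--output_path", params["output_path"]]
    return cmd


# Hypothetical request payload, as the Flask app might receive it.
request = {
    "model": "hf",
    "model_args": {"pretrained": "gpt2"},
    "tasks": ["arc_easy", "hellaswag"],
    "num_fewshot": 0,
    "output_path": "/opt/results/output.json",
}
command = build_lm_eval_command(request)
print(shlex.join(command))
```

Returning a list rather than a single string means the command can be passed to `subprocess.run(command)` without a shell, so values from the request are never interpreted as shell syntax.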

Architecture Diagram

graph LR;

subgraph "Docker Container" 
    %% style rounded
    A[Install lm-eval and Flask]
    B[Expose REST Interface to lm-eval cli via Flask]
    C[Run lm-eval via cli with transpiled parameters]
end;

subgraph "Kubernetes/OpenShift Deployment" 
    %% style rounded
    D[Ingress]
    E[Load Balancer]
    F[Pod 1]
    G[Pod 2]
    H[...]
end;

subgraph "Log and Output Storage" 
    %% style rounded
    I[Log Stream]
    J[output.json Storage -PVC/COS/etc.]
end;

subgraph "User" 
    %% style rounded
    K[UI]
    L[Client]
end;

A --> B;
B --> C;

D --> E;
E -->|Ticket ID|D;
E --> F;
F -->|Ticket ID|E;
E --> G;
E --> H;

F --> I;
G --> I;
H --> I;

F --> J;
G --> J;
H --> J;

K --> L;

L --> |GET| D;
D --> |Ticket ID| L;
K --> |Ticket ID| I;
K --> |Ticket ID| J;
L --> |Ticket ID| K;
I --> |Logs| K;
J --> |Results| K;

fms-lm-eval-service's People

Contributors

yhwang, rawkintrevo

fms-lm-eval-service's Issues

Update the Flask app to use custom resource: `LMEvalJob`

Currently, the Flask app runs the lm-eval job by spawning subprocesses within the same Pod. Since the CRD and controller are ready, we need to update the implementation as follows:

  • submit_job API: create an LMEvalJob CR
  • poll_job API: check the corresponding CR's status
  • job_results API: retrieve the results from the corresponding CR
  • list_jobs API: get a list of LMEvalJob CRs
  • cancel_job API: update the LMEvalJob's status field to initiate the cancellation flow
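
The mapping in the list above can be sketched as five small functions over a store of custom resources. To keep the example runnable without a cluster, `InMemoryCRStore` is a hypothetical stand-in for the Kubernetes API; a real implementation would presumably call a Kubernetes client instead (for example, the official Python client's `CustomObjectsApi`). The status field names and state values (`state`, `results`, `"New"`, `"Cancelled"`) are assumptions for illustration, not the controller's actual schema.

```python
import uuid


class InMemoryCRStore:
    """Hypothetical stand-in for the Kubernetes API (no cluster needed)."""

    def __init__(self):
        self._objects = {}

    def create(self, body):
        self._objects[body["metadata"]["name"]] = body
        return body

    def get(self, name):
        return self._objects[name]

    def list(self):
        return list(self._objects.values())


def submit_job(store, spec):
    """Create an LMEvalJob CR and return its name as the job ID."""
    name = f"lmeval-{uuid.uuid4().hex[:8]}"
    store.create({
        "kind": "LMEvalJob",
        "metadata": {"name": name},
        "spec": dict(spec),
        "status": {"state": "New"},  # assumed initial state
    })
    return name


def poll_job(store, name):
    """Read the job state from the CR's status."""
    return store.get(name)["status"]["state"]


def job_results(store, name):
    """Return the results recorded on the CR, or None if not yet available."""
    return store.get(name)["status"].get("results")


def list_jobs(store):
    """List the names of all LMEvalJob CRs."""
    return [obj["metadata"]["name"] for obj in store.list()]


def cancel_job(store, name):
    """Update the status field; a controller would then run the cancellation."""
    store.get(name)["status"]["state"] = "Cancelled"


store = InMemoryCRStore()
job_id = submit_job(store, {"model": "hf", "tasks": ["arc_easy"]})
print(job_id, poll_job(store, job_id), list_jobs(store))
```

With this shape the Flask handlers hold no job state of their own: each call reads or writes the CR, and the controller is the only component that actually runs or stops an evaluation.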
