Code Monkey home page Code Monkey logo

genai-llm-rag-pattern's People

Contributors

bryonbaker avatar butler54 avatar caldeirav avatar dependabot[bot] avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

genai-llm-rag-pattern's Issues

Automation of adding a vLLM server runtime to OpenShift AI

Motivation:

  1. To allow use of the openAI api format
  2. To be able to use older GPUs e.g. v100s

DoD:

  • Configured as code in github repo
  • Model server tested on reference model
  • Model server works with llama + V100 GPUs
  • Documented how to use / reconfigure

Autoscaling and managing GPUs on OSD, ROSA, and ARO

Clusters which are lifecycle managed via OCM such as ROSA, ARO and OCD uses machine pools and autoscaling managed from the OCM console.

instascale does allow the use of a pool managed from OCM.

This requires a workflow to setup the pool. (This may be partially done by hand initially; potentially in the future via an ansible job or similar).

Example pipeline for Caikit format conversion

As an AI developer, I want a simple process to pull model X from huggingface and convert it to Caikit format.

Most likely by a data pipeline.

  • Secrets management required for HF
  • Parameterize with model on HF so it is reusable.

Definition of Done:

  • UaT process shows model can be served from Caikit using fastapi
  • Documentation in repo for website.

Build workflow / demo story which includes asynchronous use of MCAD

Today many of our demos use MCAD - but are blocking e.g. a user needs to keep their workbench running, and needs to progress the steps to shut down the ray cluster / MCAD job.

Provide a workflow where the MCAD batch job (e.g. fine tuning) is done asynchronously.

Document how to use the artifacts (e.g. jupyter notebooks) both inline and in docs website.

Remove traceloop SDK or port to on-prem deployment

My understanding is traceloop is a SaaS platform. Today the default deployment is not configured and I believe should be a no-op. TO BE TESTED.

Either:

  1. Configure for in cluster.
  2. Deploy something else.

Build RAG application images in OpenShift

Open question as to whether S2I or Tekton can be used to build images

  • Ensure 'bootstrapped' cluster will trigger at least manual build (to populate images)
  • Use internal registry

Automate the deployment of a default datascience project for use in the demo

Preconfigure a data science project or projects* including:

  1. Creation of a minio bucket
  2. Configuration of data connection for minio
  3. Setup a pytorch workbench
  4. Setup workbench with secrets for huggingface credential management.
  5. Setup a model server
  6. Pre-load jupyter content into the workbench, if possible

Open questions:

  1. Can you point to a specific sub-directory in a repo for jupyter content (e.g. allowing for mono-repo setups)

Cleanup repository root README.md file

The repository root README.md file has had a number of violations of both the superlinter and mdformat.

This is a problem as all "README.md" files had to be exlcluded in order for running superlinter to be consistent.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.