- Deploy the LLM using vLLM & KServe
- Create Chat UI using embedder & KServe endpoint
- Simple chat application with history

Enhancements

- Remove jumphost.sh & apps.sh
- Add OpenSearch COS monitoring
- Use the Embeddings service as a KServe endpoint
- Multicloud:
  - Juju cloud config
  - Jumphost create command
- NVIDIA NIM as LLM & Embeddings
Skip this section if you already have a jumphost or decide to use your local machine.

For AWS, go to the aws-jumphost folder and use Terraform:

```bash
cd ./aws-jumphost
terraform init
terraform apply
```
Export the public IP of the jumphost and SSH in using the defined key:

```bash
export JUMPHOST_IP=$(terraform output -raw instance_public_ip)
export PATH_TO_KEY=...
ssh -i $PATH_TO_KEY ubuntu@$JUMPHOST_IP
```
Install the common packages on the jumphost from the project root directory:

```bash
cd ./..
bash jumphost-common.sh
```
Juju cloud credentials need to be configured separately for each cloud; more info can be found here.

For AWS, fill in the aws cloud config file based on the template with your AWS IAM user credentials and add them to the cloud:

```bash
juju add-credential aws -f ./aws-jumphost/aws-credentials.yaml
```
Bootstrap the Juju controller:

```bash
juju bootstrap aws/eu-west-1 aws-controller --bootstrap-constraints 'cores=2 mem=4G'
```
Deploy the Kubernetes cluster with Juju and MicroK8s:

```bash
juju add-model mk8s aws
juju deploy ./k8s/k8s-bundle.yaml --model mk8s
juju ssh -m mk8s microk8s/leader -- sudo microk8s status
```
We are using hostpath storage to remove the dependency on external cloud storage. The root disk is 100GB to accommodate both the Kubernetes hostpath storage and Docker image caching.
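If you want to confirm that hostpath storage is active before moving on, here is a quick check, assuming the bundle enables the MicroK8s hostpath-storage add-on (which typically registers a `microk8s-hostpath` StorageClass):

```bash
# List StorageClasses on the cluster; the hostpath class should be marked (default)
juju ssh -m mk8s microk8s/leader -- sudo microk8s kubectl get storageclass
```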
Wait until all Pods are created, then enable the additional MicroK8s add-ons:

```bash
juju ssh -m mk8s microk8s/leader -- sudo microk8s enable gpu ingress metallb:10.64.140.43-10.64.140.49
juju expose microk8s
```
Save the kubeconfig to the default kubectl location; if you are not using a jumphost, consider a different path:

```bash
juju ssh -m mk8s microk8s/leader -- sudo microk8s config > ~/.kube/config
```
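A quick sanity check that kubectl now talks to the cluster:

```bash
# All nodes should report Ready once MicroK8s has settled
kubectl get nodes -o wide
```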
Taint the GPU nodes with PreferNoSchedule:

```bash
kubectl get nodes -l "nvidia.com/gpu.present=true" -o jsonpath='{.items[*].metadata.name}' \
  | xargs -I{} kubectl taint nodes {} node-preference=gpu:PreferNoSchedule --overwrite
```
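To verify the taint was applied:

```bash
# Each GPU node should list node-preference=gpu:PreferNoSchedule under Taints
kubectl describe nodes -l "nvidia.com/gpu.present=true" | grep -i taints
```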
Optionally, install the Volcano scheduler if you need more advanced scheduling policies for your workloads:

```bash
kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/master/installer/volcano-development.yaml
```
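To check that the scheduler came up (the development manifest installs into the `volcano-system` namespace):

```bash
# The admission, controller, and scheduler Pods should all reach Running
kubectl get pods -n volcano-system
```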
Follow these steps once to set up access and expose the MetalLB IPs to your local machine.

First, add your public key to the Kubernetes leader node. I will use my Launchpad ID; you can also add your public key directly to ~/.ssh/authorized_keys on the remote host.

```bash
juju ssh -m mk8s microk8s/leader -- ssh-import-id barteus
```
Next, expose the IPs to your computer via sshuttle. Use a new terminal window on your local computer. You will need root access on your machine, because sshuttle adds entries to your IP tables.

On the jumphost run:

```bash
MK8S_LEADER_IP=$(juju status -m mk8s microk8s/leader --format json | jq -r '.machines[] | .["dns-name"]')
echo $MK8S_LEADER_IP
```

On your local computer, in a new terminal, run:

```bash
sshuttle -r ubuntu@$MK8S_LEADER_IP 10.0.0.0/8 172.31.0.0/16
```
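sshuttle forwards TCP traffic only, so test the tunnel with curl rather than ping, for example against the first MetalLB address (assuming a service such as the ingress has already claimed it):

```bash
# Any HTTP response (even a 404) confirms the tunnel routes traffic into the cluster
curl -sv http://10.64.140.43/ -o /dev/null
```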
Add the deployed K8s as a cloud:

```bash
juju add-k8s mk8s --cluster-name=microk8s-cluster --client --controller aws-controller
```
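The new cloud should now be visible on the controller:

```bash
# mk8s should appear in the list of clouds known to aws-controller
juju clouds --controller aws-controller
```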
Deploy the Observability stack:

```bash
juju add-model cos mk8s
juju deploy cos-lite --model cos \
  --trust \
  --overlay ./cos/offers-overlay.yaml \
  --overlay ./cos/storage-small-overlay.yaml
```
To access COS, go to the section "Access the UIs".
Add self-monitoring for the deployed Kubernetes cluster:

```bash
juju consume aws-controller:admin/cos.alertmanager-karma-dashboard cos-alertmanager -m mk8s
juju consume aws-controller:admin/cos.grafana-dashboards cos-grafana -m mk8s
juju consume aws-controller:admin/cos.loki-logging cos-loki -m mk8s
juju consume aws-controller:admin/cos.prometheus-receive-remote-write cos-prometheus -m mk8s

juju deploy grafana-agent grafana-agent-cos --channel latest/stable -m mk8s
juju relate grafana-agent-cos:cos-agent microk8s:cos-agent -m mk8s
#juju relate grafana-agent-cos:cos-agent microk8s-gpu:cos-agent -m mk8s
juju relate cos-loki:logging grafana-agent-cos:logging-consumer -m mk8s
juju relate cos-prometheus:receive-remote-write grafana-agent-cos:send-remote-write -m mk8s
juju relate cos-grafana:grafana-dashboard grafana-agent-cos:grafana-dashboards-provider -m mk8s
```
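Once the relations settle, grafana-agent-cos should report active/idle:

```bash
# Show grafana-agent-cos together with its established relations
juju status -m mk8s grafana-agent-cos --relations
```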
Get the IP of the COS entrypoint. In the catalog you can find links to the other services.

```bash
juju run -m cos traefik/0 show-proxied-endpoints --format=yaml \
  | yq '."traefik/0".results."proxied-endpoints"' \
  | jq
```
The Grafana admin user details can be extracted using a Juju action:

```bash
echo Grafana access
juju run grafana/leader get-admin-password --model cos
```
### Deploy Kubeflow and MLflow

Deploy Kubeflow and MLflow, and integrate them with COS:

```bash
juju add-model kubeflow mk8s
juju deploy -m kubeflow --debug ./ckf/bundle.yaml \
  --overlay ./ckf/authentication-overlay.yaml \
  --overlay ./ckf/cos-integration.yaml \
  --overlay ./ckf/mlflow-integration.yaml \
  --trust
```

Access the Kubeflow UI:

```bash
echo Kubeflow access
echo IP: $(kubectl -n kubeflow get svc istio-ingressgateway-workload -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo User: $(juju config dex-auth static-username)
echo Password: $(juju config dex-auth static-password)
```
Create a new model and set cloudinit-userdata for it, then deploy OpenSearch:

```bash
juju add-model os aws
juju model-config --model os --file=./opensearch/cloudinit-userdata.yaml
juju deploy -m os ./opensearch/bundle.yaml
```
Ignore the error about replicas, or edit the bundle to deploy OpenSearch in HA mode.
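While you wait, you can watch the model until all units report active/idle:

```bash
# Refresh the status view every 5 seconds
juju status -m os --watch 5s
```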
When the deployment is GREEN, get the access information:

```bash
juju run -m os opensearch/leader get-password > ./opensearch/os-creds.yaml
export OS_IP=$(juju status -m os opensearch/leader --format json | jq -r '.machines[] | .["dns-name"]')
export OS_PORT=9200
export OS_USERNAME=$(cat ./opensearch/os-creds.yaml | yq ".username")
export OS_PASSWORD=$(cat ./opensearch/os-creds.yaml | yq ".password")
echo Endpoints: $OS_IP:$OS_PORT
echo Username: $OS_USERNAME
echo Password: $OS_PASSWORD
cat ./opensearch/os-creds.yaml | yq ".ca-chain" | tee ./opensearch/os-cert.yaml
echo Certificate saved under ./opensearch/os-cert.yaml
```

Note: yq's `-C` (colored output) flag is deliberately omitted here, since ANSI color codes would corrupt the exported variables and the saved certificate.
Check connectivity using curl:

```bash
curl -k --cacert ./opensearch/os-cert.yaml -XGET https://$OS_USERNAME:$OS_PASSWORD@$OS_IP:$OS_PORT/
```
Create the OpenSearch secret and PodDefaults in the "admin" user namespace in Kubeflow. This requires that you have logged into Kubeflow at least once before running the script below.

```bash
sh ./opensearch/os-pod-default.sh
```
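To confirm the script created the objects (assuming the default "admin" user namespace):

```bash
# The OpenSearch secret and the PodDefaults should both be listed
kubectl get secrets,poddefaults -n admin
```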
Go to Kubeflow and create a Kubeflow Notebook with all PodDefaults enabled.

In the Kubeflow Notebook, run setup-bucket.ipynb to create a bucket and upload all files from the documents folder to it.

Run the Ingestion pipeline notebook, or create a Kubeflow pipeline using the ingestion-pipelines.yaml file.
TBD
TBD
Remove in the AWS cloud console:

- machines
- security groups

Remove the configuration on the jumphost:

```bash
rm -Rf ~/.local/share/juju/
```
Without removing the completed Pods, the PVCs will not be removed, even if stated in the pipeline definition. Run this only when no pipeline run is executing:

```bash
kubectl delete po -n admin -l workflows.argoproj.io/completed=true
```
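Before running the delete above, you can confirm that nothing is still in flight (assuming Kubeflow Pipelines' Argo Workflows backend):

```bash
# Any workflow still in the Running phase means a pipeline run is in progress
kubectl get workflows -n admin
```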