opendatahub-io / caikit-tgis-serving
License: Apache License 2.0
When deploying an LLM using the new Caikit+TGIS architecture introduced with #107, the TGIS container (i.e., transformer-container) fails to start if the cluster has FIPS cryptography enabled.
These are the 2 errors I got in the container logs:
There was a problem when trying to write in your cache folder (/.cache/huggingface/hub). You should set the environment variable TRANSFORMERS_CACHE to a writable directory.
fips.c(145): OpenSSL internal error, assertion failed: FATAL FIPS SELFTEST FAILURE
Note: TRANSFORMERS_CACHE is actually set in the ServingRuntime.
This was found on an OpenShift 4.13.18 cluster with RHODS 2.1.2 (aka 1.32.2) and KServe 0.11 installed.
Caikit has released new versions of the libraries. We need to update the dependencies and validate that the image is correct.
The KServe install failed with the following error:
[perfci@f23-h33-000-6018r ~]$ oc -n redhat-ods-operator describe subs/rhods-operator
...
Conditions:
Message: constraints not satisfiable: no operators found from catalog rhods-catalog in namespace openshift-marketplace referenced by subscription rhods-operator, subscription rhods-operator exists
Reason: ConstraintsNotSatisfiable
Status: True
Type: ResolutionFailed
Last Transition Time: 2023-08-22T19:06:09Z
Message: targeted catalogsource openshift-marketplace/rhods-catalog missing
Reason: UnhealthyCatalogSourceFound
Status: True
Type: CatalogSourcesUnhealthy
Last Updated: 2023-08-22T19:06:09Z
Based on [this Slack thread](https://redhat-internal.slack.com/archives/C05742W6F7T/p1692610378820169), the knative-serving configuration needs to be updated to use the mTLS: true option in the SMCP.
With this issue, we need to update:
Remaining demos are:
Add those to documentation.
Users may need to set particular TGI(S) parameters when using the Caikit+TGIS runtime on KServe. An example is the model timeout parameter, which may need to be tweaked based on the model size.
We should document the procedure in our docs; a sketch of one possible approach follows below.
Additionally, in a future UI effort, this option should be exposed in the user interface.
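A minimal sketch of one way a runtime parameter could be injected today, assuming the demo ServingRuntime is named caikit-tgis-runtime; the container index and the EXAMPLE_TGIS_PARAM variable are purely illustrative, not confirmed TGIS settings:

# Illustrative only: add an env var to one of the ServingRuntime's containers.
# Point the container index at the TGIS container and use the real parameter name for your TGIS release.
oc patch servingruntime caikit-tgis-runtime -n <namespace> --type=json -p '[
  {"op": "add", "path": "/spec/containers/0/env", "value": [{"name": "EXAMPLE_TGIS_PARAM", "value": "600"}]}
]'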
As part of my automated scale test, I observe that the InferenceService sometimes reports as Loaded, but calls to the gRPC endpoint return errors.
Examples:
<command>
set -o pipefail;
i=0;
GRPCURL_DATA=$(cat "subprojects/llm-load-test/openorca-subset-006.json" | jq .dataset[$i].input )
grpcurl -insecure -d "$GRPCURL_DATA" -H "mm-model-id: flan-t5-small-caikit" u0-m7-predictor-watsonx-serving-scale-test-u0.apps.psap-watsonx-dgxa100.perf.lab.eng.bos.redhat.com:443 caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict
</command>
<stderr> ERROR:
<stderr> Code: Unavailable
<stderr> Message: connections to all backends failing; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused
<command>
set -o pipefail;
set -e;
dest=/mnt/logs/016__watsonx_serving__validate_model_all/u0-m6/answers.json
queries=/mnt/logs/016__watsonx_serving__validate_model_all/u0-m6/questions.json
rm -f "$dest" "$queries"
for i in $(seq 10); do
GRPCURL_DATA=$(cat "subprojects/llm-load-test/openorca-subset-006.json" | jq .dataset[$i].input )
echo $GRPCURL_DATA >> "$queries"
grpcurl -insecure -d "$GRPCURL_DATA" -H "mm-model-id: flan-t5-small-caikit" u0-m6-predictor-watsonx-serving-scale-test-u0.apps.psap-watsonx-dgxa100.perf.lab.eng.bos.redhat.com:443 caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict >> "$dest"
echo "Call $i/10 passed"
done
</command>
<stdout> Call 1/10 passed
<stdout> Call 2/10 passed
<stdout> Call 3/10 passed
<stdout> Call 4/10 passed
<stdout> Call 5/10 passed
<stdout> Call 6/10 passed
<stdout> Call 7/10 passed
<stdout> Call 8/10 passed
<stdout> Call 9/10 passed
<stderr> ERROR:
<stderr> Code: Unavailable
<stderr> Message: error reading from server: EOF
Versions
NAME DISPLAY VERSION REPLACES PHASE
jaeger-operator.v1.47.1-5 Red Hat OpenShift distributed tracing platform 1.47.1-5 jaeger-operator.v1.47.0-2-0.1696814090.p Succeeded
kiali-operator.v1.65.9 Kiali Operator 1.65.9 kiali-operator.v1.65.8 Succeeded
rhods-operator.2.3.0 Red Hat OpenShift Data Science 2.3.0 rhods-operator.2.2.0 Succeeded
serverless-operator.v1.30.1 Red Hat OpenShift Serverless 1.30.1 serverless-operator.v1.30.0 Succeeded
servicemeshoperator.v2.4.4 Red Hat OpenShift Service Mesh 2.4.4-0 servicemeshoperator.v2.4.3 Succeeded
quay.io/opendatahub/text-generation-inference@sha256:0e3d00961fed95a8f8b12ed7ce50305acbbfe37ee33d37e81ba9e7ed71c73b69
quay.io/opendatahub/caikit-tgis-serving@sha256:ed920d21a4ba24643c725a96b762b114b50f580e6fee198f7ccd0bc73a95a6ab
See caikit/caikit#654, version bump in #210
"As a preview of this effort, we are pleased to announce that there is a new operator for Istio on Red Hat OpenShift for developer preview and early feedback. This new operator - temporarily called the “Sail Operator” (more on this below) will be the foundation for OpenShift Service Mesh 3."
https://cloud.redhat.com/blog/introducing-a-new-operator-for-istio-on-openshift
In caikit>=0.15.0, there's a new entrypoint, python -m caikit.runtime, that allows the server to run the HTTP server, the gRPC server, or both in parallel. This should be a simple update to the entrypoint in start-serving.sh.
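A minimal sketch of what the updated entrypoint in start-serving.sh could look like (the real script's environment setup is omitted):

#!/bin/bash
# Sketch only: start the unified Caikit runtime, which can serve gRPC, HTTP, or both depending on config
exec python -m caikit.runtime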
Issue to track the attached PR that enhances the quickstart script.
Add github action to build/test the image, possibly using the docker-compose smoke test proposed in #112
The previous Caikit+TGIS image has to be split into separate containers, but they will reside in the same SR/pod.
Add more ServingRuntime examples with:
Anything I'm forgetting, @Xaenalt ?
Currently the caikit install instructions for the t5/flan demo have you use specific namespaces. But namespaces are cluster-scoped resources, so their names must be unique: two users in the same cluster cannot both create a namespace named minio.
It would be good if the instructions let you specify the namespaces ahead of time (via bash env vars) for minio and the other components, and then used those vars, for example myminio and mydemo.
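A sketch of how the demo could be parameterized (the variable names are just examples):

# Example only: pick the namespace names once, then reuse them in every command of the demo
export MINIO_NS=${MINIO_NS:-myminio}
export DEMO_NS=${DEMO_NS:-mydemo}
oc new-project "$MINIO_NS"
oc new-project "$DEMO_NS"
# ...then pass "-n $MINIO_NS" / "-n $DEMO_NS" to the existing oc apply commands instead of hard-coded names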
While using the ServingRuntime definition from https://github.com/opendatahub-io/caikit-tgis-serving/pull/131/files#diff-94e62eddc4f3b075ea6c7d9eb86d45728d2c9ebb3c00ae43fd81863ccb6c01f9, which relies on REST calls (HTTP port 8080), I'm facing issues getting the model's answers.
The query returns an empty response. These are 2 examples of REST calls I tried:
curl -d '{"model_id": "<model_name>","inputs": "At what temperature does water boil?"}' -insecure <ksvc_url>:8080/api/v1/task/text-generation
curl --json '{"model_id": "<model_name>","inputs": "At what temperature does water boil?"}' -insecure <ksvc_url>:8080/api/v1/task/text-generation
I also tried getting the cluster CA secret and including it in the curl call, like this:
oc get secret -n openshift-ingress router-certs-default -o json | jq '.data."tls.crt"' | sed 's/"//g' | base64 -d > <filename>.crt
curl --json '{"model_id": "<model_name>","inputs": "At what temperature does water boil?"}' -insecure <ksvc_url>:8080/api/v1/task/text-generation --cacert <filename>.crt
Is there anything wrong with the way I'm performing the call? Please note that the same ServingRuntime, set to use the gRPC port, works just fine.
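One detail worth double-checking: curl has no single-dash -insecure option (it is parsed as a bundle of short flags), so the calls above may not be skipping certificate validation at all. A sketch with the long-form flag:

# curl needs --insecure (or -k); a single-dash -insecure is not the certificate-bypass flag
curl --insecure --json '{"model_id": "<model_name>", "inputs": "At what temperature does water boil?"}' \
  <ksvc_url>:8080/api/v1/task/text-generation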
After following the instructions to deploy and access Metrics on RHODS 1.32 v2 RC7 (brew.registry.redhat.io/rh-osbs/iib:568805), according to
https://github.com/opendatahub-io/caikit-tgis-serving/blob/main/demo/kserve/metrics.md,
which involves applying 2 configmaps into a test namespace (for example TEST_NS=watsonx),
the configmaps were created in the test namespace:
$▶ oc describe configmap/cluster-monitoring-config -n ${TEST_NS}
Name: cluster-monitoring-config
Namespace: watsonx
Labels: <none>
Annotations: <none>
Data
====
config.yaml:
----
enableUserWorkload: true
BinaryData
====
Events: <none>
$▶ oc describe configmap/user-workload-monitoring-config -n ${TEST_NS}
Name: user-workload-monitoring-config
Namespace: watsonx
Labels: <none>
Annotations: <none>
Data
====
config.yaml:
----
prometheus:
logLevel: debug
retention: 15d #Change as needed
BinaryData
====
Events: <none>
But the expected metrics for caikit, tgi, or istio were not observed.
Looking at the default namespace openshift-monitoring, we can see that the original configmap was not changed.
Apparently, the expected metrics do show up if the default configmaps (cluster-monitoring-config in openshift-monitoring and the one in openshift-user-workload-monitoring) are updated instead.
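A hedged sketch of applying the provided manifests into the namespaces the platform actually reads them from (assuming uwm-cm-enable.yaml defines cluster-monitoring-config and uwm-cm-conf.yaml defines user-workload-monitoring-config, as their names suggest):

# cluster-monitoring-config is only read from openshift-monitoring;
# user-workload-monitoring-config only from openshift-user-workload-monitoring
oc apply -f custom-manifests/metrics/uwm-cm-enable.yaml -n openshift-monitoring
oc apply -f custom-manifests/metrics/uwm-cm-conf.yaml -n openshift-user-workload-monitoring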
Related feature: opendatahub-io/caikit#3
At the moment, the scripts only support RHODS / preview RHODS.
However, the new ODH operator v2.1 is out, so we need to enhance the scripts to support OpenDataHub.
The script for ODH supports ODH operator 1.9, which uses the v1alpha1 API. From ODH 1.10, it will use the v1 API like RHODS, so we need to remove this part.
https://github.com/opendatahub-io/caikit-tgis-serving/pull/84/files#r1331736041
The SMMR update part of the scripts is not needed anymore, because reconciliation to control the SMMR was added to odh-model-controller.
grpcurl has an --insecure parameter that bypasses certificate validation. I'm now trying to make this work with the grpc library in Python. The Python implementation does not allow bypassing certificate validation for TLS encryption (grpcurl is written in Go, for which the bypass is implemented, therefore it works there).
So to get it to work, you have to export the SSL certificate and use it when defining the channel, like this:
import grpc

# Use the exported server certificate as the channel's root CA
with open('certificate.pem', 'rb') as f:
    creds = grpc.ssl_channel_credentials(f.read())
server_address = inference_server_url
channel = grpc.secure_channel(server_address, creds)
This works on some servers, but not on others, where you get this error:
_MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:52.87.25.239:443: Peer name caikit-example-isvc-predictor-kserve-demo.apps.aisrhods-dell.bj30.p1.openshiftapps.com is not in peer certificate"
debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:52.87.25.239:443: Peer name caikit-example-isvc-predictor-kserve-demo.apps.aisrhods-dell.bj30.p1.openshiftapps.com is not in peer certificate {grpc_status:14, created_time:"2023-10-09T16:09:55.378527928+00:00"}"
>
The self-signed certificate format is identical in both cases (only the CN or Organization changes, obviously), and the installation of the Caikit+TGIS stack is identical as far as we can tell.
So to solve the issue it's either:
(grpcurl is not an option.)
After Caikit-TGIS image splitting, the goal is:
Based on watsonx requirements, we should make at least these metrics available:
However, users won't find metrics with the same names, and some of them need to be computed by combination. Examples:
tgi_batch_inference_count - tgi_batch_inference_success
sum(container_memory_working_set_bytes{pod='<isvc_predictor_pod_name>',namespace='<isvc_namespace>',container=''}) BY (pod, namespace), plus adding the time-period syntax
Moreover, there are additional metrics that deserve to be documented, like tgi_request_generated_tokens_count.
Not all service mesh resources are being removed by the uninstall script. See https://docs.openshift.com/container-platform/4.13/service_mesh/v2x/removing-ossm.html for the list of resources to remove.
With a fresh cluster, the scripts/doc are not working because the SMCP is not running properly, with this message:
- lastTransitionTime: '2023-10-20T12:54:10Z'
message: >-
Dependency "Jaeger CRD" is missing: error: no matches for kind "Jaeger"
in version "jaegertracing.io/v1"
reason: DependencyMissingError
status: 'False'
type: Reconciled
- lastTransitionTime: '2023-10-20T12:54:10Z'
message: >-
Dependency "Jaeger CRD" is missing: error: no matches for kind "Jaeger"
in version "jaegertracing.io/v1"
reason: DependencyMissingError
status: 'False'
type: Ready
I am not sure if it is a ServiceMesh issue or not, but it blocked the KServe installation, so we need to add Jaeger as a prerequisite. By the way, we removed Jaeger based on this confirmation message:
The Kiali and Jaeger Tracing operators are not required to be installed
@bartoszmajsak do you have any idea?
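For reference, a hedged sketch of installing the Jaeger operator as a prerequisite; the package name, channel, and namespace below are assumptions and should be checked against the operator catalog:

# Assumption: the Red Hat distributed tracing (Jaeger) operator is installed cluster-wide before the SMCP
oc apply -f - <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: jaeger-product
  namespace: openshift-operators
spec:
  channel: stable
  name: jaeger-product
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF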
At the moment, there is no readiness probe. The gRPC health probe would be a good way to check readiness:
https://github.com/grpc-ecosystem/grpc-health-probe/
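A rough sketch of how it could be wired, assuming the Caikit runtime registers the standard grpc.health.v1 service and that 8085 is its gRPC port (adjust to the port configured in the ServingRuntime):

# Assumption: gRPC port 8085 and the standard health service are exposed by the caikit runtime
grpc_health_probe -addr=localhost:8085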
Developing and testing of the caikit-tgis-serving component on an Apple laptop (Intel and ARM chipsets) does not seem to be supported by this project.
We need to find a way to allow developers using Apple hardware to make meaningful contributions to the project. To that end, we should:
An initial pass at the problem has been discussed in #171.
There is a synchronization issue at the launch of the Pod with the current images:
The Pod is Ready:
flan-t5-small-gpu-predictor-00001-deployment-6768c548d8-8btqc 4/4 Running 0 41s
The model is Loaded in the inference service:
modelStatus:
  copies:
    failedCopies: 0
    totalCopies: 1
  states:
    activeModelState: Loaded
    targetModelState: Loaded
HOST=...
METHOD=caikit.runtime.Nlp.NlpService/TextGenerationTaskPredict
while true; do
GRPCURL_DATA=$(echo "{'max_new_tokens': 25, 'min_new_tokens': 25, 'text': 'At what temperature does liquid Nitrogen boil?'}" | sed "s/'/\"/g")
grpcurl -insecure -d "$GRPCURL_DATA" -H "mm-model-id: flan-t5-small-caikit" $HOST $METHOD
sleep 1
done
ERROR:
Code: Internal
Message: Unhandled exception during prediction
ERROR:
Code: Internal
Message: Unhandled exception during prediction
ERROR:
Code: Internal
Message: Unhandled exception during prediction
ERROR:
Code: Internal
Message: Unhandled exception during prediction
ERROR:
Code: Internal
Message: Unhandled exception during prediction
ERROR:
Code: Internal
Message: Unhandled exception during prediction
ERROR:
Code: Internal
Message: Unhandled exception during prediction
{
"generated_text": "74 degrees F.C., a temperature of 74 degrees F.C., a temperature of ",
"generated_tokens": "25",
"finish_reason": "MAX_TOKENS",
"producer_id": {
"name": "Text Generation",
"version": "0.1.0"
},
"input_token_count": "10"
}
In the transformer-container logs, we can see this error:
{"channel": "GP-SERVICR-I", "exception": null, "level": "warning", "log_code": "<RUN49049070W>", "message": "<_InactiveRpcError of RPC that terminated with:
\tstatus = StatusCode.UNAVAILABLE
\tdetails = \"failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused\"
\tdebug_error_string = \"UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused {created_time:\"2023-10-24T11:48:51.016344787+00:00\", grpc_status:14}\"
>", "model_id": "flan-t5-small-caikit", "num_indent": 0, "stack_trace": "Traceback (most recent call last):
File \"/caikit/lib/python3.9/site-packages/caikit/runtime/servicers/global_predict_servicer.py\", line 283, in _handle_predict_exceptions
yield
File \"/caikit/lib/python3.9/site-packages/caikit/runtime/servicers/global_predict_servicer.py\", line 260, in predict_model
response = work.do()
File \"/caikit/lib/python3.9/site-packages/caikit/runtime/work_management/abortable_action.py\", line 118, in do
return self.__work_thread.get_or_throw()
File \"/caikit/lib/python3.9/site-packages/caikit/core/toolkit/destroyable_thread.py\", line 188, in get_or_throw
raise self.__runnable_exception
File \"/caikit/lib/python3.9/site-packages/caikit/core/toolkit/destroyable_thread.py\", line 124, in run
self.__runnable_result = self.runnable_func(
File \"/caikit/lib/python3.9/site-packages/caikit_nlp/modules/text_generation/text_generation_tgis.py\", line 237, in run
return self.tgis_generation_client.unary_generate(
File \"/caikit/lib/python3.9/site-packages/caikit_nlp/toolkit/text_generation/tgis_utils.py\", line 315, in unary_generate
batch_response = self.tgis_client.Generate(request)
File \"/caikit/lib64/python3.9/site-packages/grpc/_channel.py\", line 1161, in __call__
return _end_unary_response_blocking(state, call, False, None)
File \"/caikit/lib64/python3.9/site-packages/grpc/_channel.py\", line 1004, in _end_unary_response_blocking
raise _InactiveRpcError(state) # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
\tstatus = StatusCode.UNAVAILABLE
\tdetails = \"failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused\"
\tdebug_error_string = \"UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8033: Failed to connect to remote host: Connection refused {created_time:\"2023-10-24T11:48:51.016344787+00:00\", grpc_status:14}\"
>
", "thread_id": 140123215742720, "timestamp": "2023-10-24T11:48:51.017178"}
quay.io/opendatahub/text-generation-inference@sha256:0e3d00961fed95a8f8b12ed7ce50305acbbfe37ee33d37e81ba9e7ed71c73b69
quay.io/opendatahub/caikit-tgis-serving@sha256:adb8d1153b900e304fbcc934189c68cffea035d4b82848446c72c3d5554ee0ca
caikit_tgit_config.yaml.log
inference_service.yaml.log
serving_runtime.yaml.log
The new documentation should have sections that answer the following questions:
In https://github.com/opendatahub-io/caikit-tgis-serving/blob/main/demo/kserve/metrics.md, it is mentioned to use -n $TEST_NS when applying:
custom-manifests/metrics/uwm-cm-enable.yaml and custom-manifests/metrics/uwm-cm-conf.yaml
However, since the namespace is already defined in the yamls, this fails with:
$▶ oc apply -f ./custom-manifests/metrics/uwm-cm-enable.yaml -n $TEST_NS
error: the namespace from the provided object "openshift-monitoring" does not match the namespace "kserve-demo".
You must pass '--namespace=openshift-monitoring' to perform this operation.
Please update the yamls and remove the namespace item so that the above command works.
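Until the yamls are updated, a workaround is to pass the namespace each object already declares, as the error message suggests:

# Workaround sketch: apply into the namespace declared in the manifest (and similarly for uwm-cm-conf.yaml)
oc apply -f ./custom-manifests/metrics/uwm-cm-enable.yaml -n openshift-monitoring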
Related feature: opendatahub-io/caikit#3
Currently, the July 28th caikit-nlp (the one used in pr-25) is not able to load models, failing with errors of the following form:
{"channel": "TXT_GEN", "exception": null, "level": "error", "log_code": "<NLP51672289E>", "message": "exception raised: ValueError('value check failed: Cannot run model /opt/models/flan-t5-small-caikit/artifacts with TGIS locally since it has no base artifacts')", "num_indent": 0, "thread_id": 139931863193344, "timestamp": "2023-07-28T18:55:16.932746"}
{"channel": "MODEL-LOADER", "exception": null, "level": "error", "log_code": "<RUN62912924E>", "message": "load failed when processing path: /opt/models/flan-t5-small-caikit with error: ValueError('value check failed: Cannot run model /opt/models/flan-t5-small-caikit/artifacts with TGIS locally since it has no base artifacts')", "model_id": "flan-t5-small-caikit", "num_indent": 0, "thread_id": 139931863193344, "timestamp": "2023-07-28T18:55:16.933255"}
This seems to have something to do with the Caikit TGIS local backend.
In order to solve typical issues, it would be a good idea to start documenting this in FAQ style.
This doc will help users solve their issues by themselves.
Using the latest version of https://github.com/opendatahub-io/caikit-tgis-serving/blob/main/demo/kserve/scripts/generate-wildcard-certs.sh, the script fails with the following error:
+ openssl req -x509 -newkey rsa:2048 -sha256 -days 3560 -nodes -subj '/CN=<value>' -extensions san -config <configpath> -CA <cert_path> -CAkey <key_path> -keyout <keyout_path> -out <out_path>
req: Unrecognized flag CA
req: Use -help for summary.
The cause was an outdated OpenSSL package installed on the system. It could happen to anyone using the script, hence it would be good to explicitly state the minimum requirements.
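One concrete check worth adding to the docs: the -CA/-CAkey options of openssl req appear only in OpenSSL 3.x, so verifying the installed release up front avoids this failure:

# Verify the installed OpenSSL release before running the script; 3.x is needed for "req -CA"
openssl version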
When trying to load a model in a Pod running with a memory limit too low, the out-of-memory error message is swallowed by TGIS and hard to troubleshoot (in addition to Caikit swallowing the TGIS error):
2023-09-26T09:40:45.259993Z INFO text_generation_launcher: Starting shard 0
Shard 0: supports_causal_lm = False, supports_seq2seq_lm = True
2023-09-26T09:40:55.279072Z INFO text_generation_launcher: Waiting for shard 0 to be ready...
2023-09-26T09:40:57.571196Z ERROR text_generation_launcher: Shard 0 failed to start:
2023-09-26T09:40:57.571219Z INFO text_generation_launcher: Shutting down shards
{"channel": "TGISPROC", "exception": null, "level": "error", "log_code": "<MTS11752287E>", "message": "exception raised: RuntimeError('TGIS failed to boot up with the model. See logs for details')", "num_indent": 0, "thread_id": 140590947739392, "timestamp": "2023-09-26T09:40:59.288074"}
While troubleshooting it, I observed that even the TGIS return code does not reflect the OOM error, although my attempts confirmed that not giving enough memory was the cause of the load failure:
sh-4.4$ text-generation-launcher --num-shard 1 --model-name /mnt/models/flan-t5-large/artifacts/ --port 3000;
2023-09-26T11:42:33.150862Z INFO text_generation_launcher: Launcher args: Args { model_name: "/mnt/models/flan-t5-large/artifacts/", revision: None, deployment_framework: "hf_transformers", dtype: None, dtype_str: Some("float16"), num_shard: Some(1), max_concurrent_requests: 150, max_sequence_length: 4096, max_new_tokens: 1024, max_batch_size: 256, max_batch_weight: Some(47458400), max_prefill_weight: None, max_waiting_tokens: 24, port: 3000, grpc_port: 8033, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, json_output: false, tls_cert_path: None, tls_key_path: None, tls_client_ca_cert_path: None, output_special_tokens: false, cuda_process_memory_fraction: 1.0 }
2023-09-26T11:42:33.151097Z INFO text_generation_launcher: Starting shard 0
Shard 0: supports_causal_lm = False, supports_seq2seq_lm = True
2023-09-26T11:42:43.180572Z INFO text_generation_launcher: Waiting for shard 0 to be ready...
2023-09-26T11:42:50.384697Z ERROR text_generation_launcher: Shard 0 failed to start:
2023-09-26T11:42:50.384723Z INFO text_generation_launcher: Shutting down shards
sh-4.4$ echo $?
1
The HuggingFace PR was merged and the odh caikit-nlp repo was updated. As a result, the caikit-tgis-serving runtime needs a new image built with the latest source and library.
In order to query a model using the Caikit+TGIS runtime, we must pass the model_id parameter in the HTTP payload (or the mm-model-id header for gRPC).
However, it can have any value: as long as the endpoint is correct, the model responds.
In the following screenshot you can see 3 calls:
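Independent of the screenshot, a minimal reproduction sketch (placeholders as in the other examples; the model_id value is deliberately bogus):

# Sketch: the call still returns a generated answer even though model_id is not the deployed model's name
curl --insecure --json '{"model_id": "any-arbitrary-value", "inputs": "At what temperature does water boil?"}' \
  <ksvc_url>:8080/api/v1/task/text-generation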
Some resources of Knative are not being removed by the uninstall script. See https://docs.openshift.com/container-platform/4.8/serverless/install/removing-openshift-serverless.html#serverless-deleting-crds_removing-openshift-serverless for more details.
When I create a ServingRuntime+InferenceService with some incorrect parameters, Caikit cannot load the model.
{"channel": "MODEL-LOADER", "exception": null, "level": "error", "log_code": "<RUN62912924E>", "message": "load failed when processing path: /mnt/models/flan-t5-small-caikit with error: RuntimeError('TGIS failed to boot up with the model. See logs for details')", "model_id": "flan-t5-small-caikit", "num_indent": 0, "thread_id": 140660900353792, "timestamp": "2023-09-21T19:39:45.781105"}
This part is expected. However, the InferenceService still shows the model as Loaded, which is unexpected:
modelStatus:
  copies:
    failedCopies: 0
    totalCopies: 1
  states:
    activeModelState: Loaded
    targetModelState: Loaded
  transitionStatus: UpToDate
From the requirements doc:
For example, the upstream version will be updated and need to incorporate in RHODS as appropriate without impacting deployed models. A new RHODS release must not break model serving functionality.
Caikit standalone image/SR needs to be created.
Several steps required:
The current version of caikit-tgis-serving exposes some gRPC function arguments (and services) in random order. Two examples:
The critical part of it (the random argument order) is already solved in caikit/caikit-nlp#237 on main. A PR is open to fix the service method order.
The caikit-nlp git ref should be updated before publishing the next release of caikit-tgis-serving.
Per a comment in openshift-ci troubleshooting, it's possible our build/push workflow isn't quite correct: Thread.
It's not 100% clear to me, but it seems like the mirror job also builds, or at least waits for the build; the comment from the team there makes it unclear.
It's also a good chance to review the entirety of our test workflows with the openshift-ci team, and use the 'request consultation' option to do an overall review of our openshift-ci jobs for better maintainability in the future.
OpenDataHub Operator v2.1 changed the API.
The DataScienceCluster resource at [this address](https://github.com/opendatahub-io/caikit-tgis-serving/blob/main/demo/kserve/custom-manifests/opendatahub/kserve-dsc.yaml) doesn't work with the latest RC update (true/false vs Managed/Removed).
Due to this change, the manifests need to be updated.
#107 added a caikit image which relies on a separate tgis container. This also includes an example setup of the ServingRuntime / InferenceService manifest that can be deployed and tested, but the documentation is missing and/or outdated.