einstack / glide
An open, blazing-fast, simple model gateway for rapid development of production GenAI apps
Home Page: https://docs.einstack.ai/glide/
License: Apache License 2.0
Build out the unified response based on GEP 0002.
Define language model pools based on the configuration passed.
Investigate whether the log/slog package added in Go 1.21 is a good option to rely on for logging, functionality-wise.
If not, let's configure zap.
Requirements:
Receiving the following error when deploying the config shown after the stack trace:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x10509150c]
goroutine 1 [running]:
glide/pkg.NewGateway(0x1400052e0c0)
/Users/max/code/Glide/pkg/gateway.go:39 +0x3c
glide/pkg/cmd.NewCLI.func1(0x14000690600, {0x1050c419c?, 0x7?, 0x1050c173f?})
/Users/max/code/Glide/pkg/cmd/cli.go:26 +0x54
github.com/spf13/cobra.(*Command).execute(0x14000690600, {0x14000122220, 0x2, 0x2})
/Users/max/go/pkg/mod/github.com/spf13/[email protected]/command.go:983 +0x840
github.com/spf13/cobra.(*Command).ExecuteC(0x14000690600)
/Users/max/go/pkg/mod/github.com/spf13/[email protected]/command.go:1115 +0x344
github.com/spf13/cobra.(*Command).Execute(0x14000070728?)
/Users/max/go/pkg/mod/github.com/spf13/[email protected]/command.go:1039 +0x1c
main.main()
/Users/max/code/Glide/main.go:25 +0x20
exit status 2
make: *** [run] Error 1
telemetry:
  logging:
    level: info
    encoding: console # console, json
  # other configs
api:
  http:
    listen_addr: 0.0.0.0:7685
    max_body_size: "2Mi"
    tls:
      ca_path:
      cert_path:
  # other configs
routes:
  language:
    - id: openai-pool
      strategy: priority # round-robin, weighted-round-robin, priority, least-latency, priority, etc.
      models:
        - id: primary
          openai: # cohere, azureopenai, gemini, other providers we support
            model: gpt-3.5-turbo
            api_key: "sk-"
            default_params:
              temperature: 0.1
        - id: secondary
          cohere:
            model: command-light
            apiKey: ""
            default_params: # set the default request params
              temperature: 0.1
    - id: latency-critical-pool
      strategy: least-latency
      models:
        - id: primary
          timeout_ms: 200
          openai:
            model: gpt-3.5-turbo
            api_key: "sk-"
        - id: secondary
          timeout_ms: 200
          cohere:
            api_key: ""
    - id: cohere-openai-ab-test
      strategy: weighted-round-robin
      models:
        - id: openai
          weight: 30
          openai:
            api_key: "sk-"
        - id: cohere
          weight: 70
          cohere:
            api_key: ""
API keys redacted.
Create a Helm chart to make deploying Glide into a Kubernetes cluster easier.
Max has API key/access on Azure
Configure a GitHub Actions pipeline to build a new version of Glide on merges into the main branch.
GEP: EinStack/geps#7
The field under 'id' needs to be updated from 'openai' to 'provider'
routers:
  language:
    - id: test-router
      models:
        - id: openai
          openai:
            api_key: "sk"
        - id: cohere
          openai:
            api_key: ""
Probably need to move provider config details to a YAML or other config setup
Initialize a CLI and add one command that runs the Glide server with a health API.
We need to find a way to track model latency that is independent of the response size.
Options:
Reference:
Reference:
Transforms the request body to match the structure required by the AI provider.
It also ensures the values for each parameter are within the minimum and maximum
constraints defined in the provider's configuration. If a required parameter is missing,
it assigns the default value from the provider's configuration.
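The behaviour described above can be sketched roughly as follows. The `paramSpec` type and `transformParams` function are hypothetical names, parameters are simplified to numeric values, and two choices are assumptions rather than anything the description states: out-of-range values are clamped (rather than rejected), and parameters the provider does not know are dropped.

```go
package main

import "fmt"

// paramSpec describes one provider parameter. The shape is hypothetical.
type paramSpec struct {
	Min, Max, Default float64
	Required          bool
}

// transformParams builds the provider-side parameters from the incoming ones.
// Values are clamped into [Min, Max]; missing required parameters receive
// the provider default; parameters without a spec are dropped (assumption).
func transformParams(in map[string]float64, specs map[string]paramSpec) map[string]float64 {
	out := make(map[string]float64, len(specs))
	for name, spec := range specs {
		v, ok := in[name]
		if !ok {
			if spec.Required {
				out[name] = spec.Default
			}
			continue
		}
		if v < spec.Min {
			v = spec.Min
		}
		if v > spec.Max {
			v = spec.Max
		}
		out[name] = v
	}
	return out
}

func main() {
	// Illustrative limits only; not taken from any real provider config.
	specs := map[string]paramSpec{
		"temperature": {Min: 0, Max: 2, Default: 1, Required: true},
		"max_tokens":  {Min: 1, Max: 4096, Default: 256, Required: true},
	}
	in := map[string]float64{"temperature": 3.5, "unknown": 1}
	out := transformParams(in, specs)
	fmt.Println(out["temperature"], out["max_tokens"], len(out))
}
```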
Simplify representation
Investigate why goleak reports that netpoll leaves pending goroutines in:
https://github.com/modelgateway/Glide/blob/38c56a2cf0c650ec56f7476453a14a0ec1ed2a10/leak_test.go#L7-L6
and re-enable the test.
A bug report in Netpoll's GitHub: cloudwego/netpoll#305
Update Unified Request schema to accept different prompts for different models
Integrate with Codecov to track and visualize codebase coverage
Set up documentation for Glide and deploy/host it somewhere.
A popular option in the Python community. I have been using it for my Hyx project.
Reference:
A popular option in the CNCF community.
Reference:
Example of Real Docs:
A popular option in general. It gives a good-looking UI, but the effort needed to get it working and maintain it is unclear.
Reference:
Examples of Real Docs:
Popular Options are:
We can go with Apache 2.0 for code snippets + CC-BY-4.0 for the main content.
Allow setting up TLS for the Glide HTTP API, so it's possible to:
Add a new set of APIs to support embeddings:
Max has API key
I want to test building via docker
Implement fallback to another healthy provider on provider failures.
Create a GEP to explain how that will work (GEP0005).
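A minimal sketch of what ordered fallback could look like, pending the design in the GEP. The `provider` interface, `stubProvider`, and `chatWithFallback` are hypothetical names for illustration, and treating any error as a reason to try the next provider is an assumption; health tracking is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// provider is a hypothetical minimal interface for a model backend.
type provider interface {
	Name() string
	Chat(prompt string) (string, error)
}

// stubProvider stands in for a real backend and can be forced to fail.
type stubProvider struct {
	name string
	fail bool
}

func (s stubProvider) Name() string { return s.name }

func (s stubProvider) Chat(prompt string) (string, error) {
	if s.fail {
		return "", errors.New(s.name + ": unavailable")
	}
	return s.name + " says: " + prompt, nil
}

// chatWithFallback tries providers in order and returns the first success.
// If every provider fails, the collected errors are returned together.
func chatWithFallback(providers []provider, prompt string) (string, error) {
	if len(providers) == 0 {
		return "", errors.New("no providers configured")
	}
	var errs []error
	for _, p := range providers {
		resp, err := p.Chat(prompt)
		if err == nil {
			return resp, nil
		}
		errs = append(errs, err)
	}
	return "", fmt.Errorf("all providers failed: %w", errors.Join(errs...))
}

func main() {
	pool := []provider{
		stubProvider{name: "openai", fail: true},
		stubProvider{name: "cohere"},
	}
	resp, err := chatWithFallback(pool, "hi")
	fmt.Println(resp, err)
}
```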
Create a Grafana dashboard to monitor Glide and health of models it uses
Let Glide be configured via a YAML file.
Requirements:
Reloading the service on config file change is outside of this task (#28)
Implement a mock provider called FaultyProvider that would let us do resiliency testing of Glide by emulating very hostile conditions like:
README.md is the face of the project. Let's fix the broken links and update the copy to make a good impression on people visiting the repo.
In order to collect metrics on different aspects of the system (e.g. how many times OpenAI failed/accepted requests), we need to set up OpenTelemetry.
We could also use OpenTelemetry to implement distributed tracing, so that Glide integrates seamlessly with the business services that communicate with it and are instrumented with OTel. This should also give us better debugging capabilities.
Requirements:
Ignore NilAway false alarms and enable the check in the CI
Set up a new GitHub Actions pipeline on each PR that includes: