ai-dial-sdk's Introduction

AI DIAL Python SDK

Overview

A framework for creating applications and model adapters for AI DIAL.

Applications and model adapters implemented with this framework are compatible with the AI DIAL API, which is based on the Azure OpenAI API.

Usage

Install the library using pip:

pip install aidial-sdk

Echo application example

The echo application example replies to the user by repeating their last message:

# Save this as app.py
import uvicorn

from aidial_sdk import DIALApp
from aidial_sdk.chat_completion import ChatCompletion, Request, Response


# ChatCompletion is an abstract class for applications and model adapters
class EchoApplication(ChatCompletion):
    async def chat_completion(
        self, request: Request, response: Response
    ) -> None:
        # Get last message (the newest) from the history
        last_user_message = request.messages[-1]

        # Generate response with a single choice
        with response.create_single_choice() as choice:
            # Fill the content of the response with the last user's content
            choice.append_content(last_user_message.content or "")


# DIALApp extends FastAPI to provide a user-friendly interface for routing requests to your applications
app = DIALApp()
app.add_chat_completion("echo", EchoApplication())

# Run the app
if __name__ == "__main__":
    uvicorn.run(app, port=5000)

Run

python3 app.py

Check

Send the following request:

curl http://127.0.0.1:5000/openai/deployments/echo/chat/completions \
  -H "Content-Type: application/json" \
  -H "Api-Key: DIAL_API_KEY" \
  -d '{
    "messages": [{"role": "user", "content": "Repeat me!"}]
  }'

You will receive a JSON response like this:

{
    "choices":[
        {
            "index": 0,
            "finish_reason": "stop",
            "message": {
                "role": "assistant",
                "content": "Repeat me!"
            }
        }
    ],
    "usage": null,
    "id": "d08cfda2-d7c8-476f-8b95-424195fcdafe",
    "created": 1695298034,
    "object": "chat.completion"
}
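
Since the API follows the Azure OpenAI chat completions contract, a streaming variant of the same request should also work (a sketch; the response arrives as server-sent event chunks):

curl http://127.0.0.1:5000/openai/deployments/echo/chat/completions \
  -H "Content-Type: application/json" \
  -H "Api-Key: DIAL_API_KEY" \
  -d '{
    "messages": [{"role": "user", "content": "Repeat me!"}],
    "stream": true
  }'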

Developer environment

This project uses Python>=3.8 and Poetry>=1.6.1 as the dependency manager.

Check out Poetry's documentation on how to install it on your system before proceeding.

To install requirements:

poetry install

This will install all requirements for running the package, linting, formatting and tests.

IDE configuration

The recommended IDE is VSCode. Open the project in VSCode and install the recommended extensions.

VSCode is configured to use Black, a PEP 8-compliant formatter.

Alternatively, you can use PyCharm.

Set up the Black formatter for PyCharm manually, or install PyCharm>=2023.2, which has built-in Black support.

Environment Variables

Variable     | Default | Description
-------------|---------|-------------------
DIAL_SDK_LOG | WARNING | DIAL SDK log level
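
For example, to run the echo application with debug-level SDK logging (assuming standard Python log level names):

DIAL_SDK_LOG=DEBUG python3 app.py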

Lint

Run the linting before committing:

make lint

To auto-fix formatting issues run:

make format

Test

Run unit tests locally for all available Python versions:

make test

Run unit tests for a specific Python version:

make test PYTHON=3.11

Clean

To remove the virtual environment and build artifacts run:

make clean

Build

To build the package run:

make build

Publish

To publish the package to PyPI run:

make publish

ai-dial-sdk's Issues

Allow custom chat/completions endpoints

  1. Currently it's impossible to declare a chat/completions route other than via the DIALApp.add_chat_completion method.

This leads to the following server returning 404 when the my-deployment chat completions endpoint is called:

app = DIALApp()

@app.post("/openai/deployments/my-deployment/chat/completions")
async def chat_completion(deployment_id: str, request: Request):
    pass

The same goes for the /rate, /tokenize and /truncate_prompt endpoints.

  2. It's impossible to implement a /chat/completions endpoint for an arbitrary deployment id, meaning that the list of deployment names must be known beforehand. This is not very convenient. A sketch of the desired usage is shown below.
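
A hypothetical sketch of what this asks to make possible: a single handler serving any deployment id via a path parameter (not supported by DIALApp today):

@app.post("/openai/deployments/{deployment_id}/chat/completions")
async def chat_completion(deployment_id: str, request: Request):
    ...  # dispatch on the arbitrary deployment_id here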

Extend chat completion request class with the recently supported fields

As per the documentation, the following fields are currently missing in the SDK:

  • seed
  • logprobs
  • top_logprobs
  • response_format

Note that it also solves the issue recently introduced in langchain-openai==0.1.17, where logprobs defaults to False instead of None, so that the following code:

from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    openai_api_version="2023-12-01-preview",
    azure_deployment="gemini-1.5-flash-001", # or any other vertexai/bedrock model
)
llm.invoke("2+3=?")

fails with the error:

BadRequestError: Error code: 400 - {'error': {'message': 'Your request contained invalid structure on path logprobs. extra fields not permitted', 'type': 'invalid_request_error'}}

Thus, the issue is partially caused by langchain-ai/langchain#23691 and partially by the need to sync the DIAL SDK with the Azure OpenAI API.

Add default health check

Add the ability to optionally activate a /health endpoint that returns status_code=200 and the body {"status": "ok"}
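
A minimal sketch of the requested behavior, assuming it is registered as a plain FastAPI route on DIALApp (the opt-in mechanism is left out):

@app.get("/health")
async def health():
    # Returns 200 with the requested body
    return {"status": "ok"}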

Share code as open source

As agreed, we want to share the internal toolset for the DIAL platform with the public.

The codebase should be prepared, and the code history should be squashed.

TextIO-compatible interface for append_content

It would be nice to have an io.TextIO-compatible interface for Choice.append_content and Stage.append_content to support interoperability with existing Python libraries.

Examples of the use-cases:

  1. Print:
print("Hello", file=choice.content_stream)
  2. Progress with tqdm:
for item in tqdm(items, file=stage.content_stream):
    process(item)
  3. Stage logging:
logging_handler = logging.StreamHandler(stream=stage.content_stream)
  4. Writers for some format:
csv_writer = csv.writer(choice.content_stream)
csv_writer.writerows(data)

And any other library which accepts file-like object as argument.
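
For reference, a minimal sketch of such an adapter built on top of the existing append_content (the ContentStream name and wiring are assumptions, not the SDK's API):

import io

class ContentStream(io.TextIOBase):
    # Hypothetical adapter exposing append_content as a writable text stream
    def __init__(self, choice):
        self._choice = choice

    def writable(self) -> bool:
        return True

    def write(self, s: str) -> int:
        self._choice.append_content(s)
        return len(s)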

Support stages in request messages

Currently, the request message datatype doesn't include a field for stages:

class CustomContent(ExtraForbidModel):
    attachments: Optional[List[Attachment]] = None
    state: Optional[Any] = None

This is in line with the DIAL API (as of 6 Jun 2024), which states that only the chat/completions response may contain stages; request messages may not.

However, this breaks a typical pattern of chat completion usage.

Suppose this is a program one uses with a regular GPT-4 chat completions endpoint:

messages = []
while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})
    response = client.chat.completions.create(messages=messages)
    new_message = response.choices[0].message.dict()
    messages.append(new_message)

Then one decides to switch to a DIAL application (written using the DIAL SDK) that also produces stages.
This code will break inside the application, unless one explicitly removes the stages:

messages = []
while True:
    user = input("User: ")
    messages.append({"role": "user", "content": user})
    response = client.chat.completions.create(messages=messages)
    new_message = response.choices[0].message.dict()
+    if "stages" in new_message and "custom_content" in new_message["stages"]:
+        del new_message["custom_content"]["stages"]
    messages.append(new_message)

This subverts the claim that the DIAL API is backward compatible with the OpenAI API.

A way to fail the stage without failing the request

Need an easy way to fail a stage without failing the whole request.
The application logic may have alternative ways to do something; if one stage of the request fails, it does not always mean that the whole request should fail too.

Currently you have to write code like this to fail the stage with an expected error, handle the alternative approach, and still have the request fail on an unexpected error:

class MyFailStageException(Exception):
    pass

...

try:
    with choice.create_stage("stage1") as stage1:
        has_error = do_something()
        if has_error:
            raise MyFailStageException()
except MyFailStageException:
    pass

with choice.create_stage("stage2") as stage2:
    do_something_alternative()

It would be easier if you could do something like stage.fail() without raising an exception that fails the whole request:

	with choice.create_stage("stage1") as stage1:
		has_error = do_something()
		if has_error:
			stage1.fail()

	# request execution continues here
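
One way this could work, as a rough sketch (all internal names here are hypothetical): fail() raises a private marker exception which the stage's __exit__ swallows after closing the stage as failed, so nothing propagates to the request.

class _StageFailed(Exception):
    pass  # hypothetical private marker exception

class Stage:
    def __enter__(self):
        return self

    def fail(self) -> None:
        raise _StageFailed()

    def __exit__(self, exc_type, exc, tb):
        if exc_type is _StageFailed:
            self._mark_failed()  # hypothetical: close the stage as failed
            return True  # swallow the marker; the request continues
        return False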

Lifespan Events don't work with propagation_auth_headers=True argument

If I set the propagation_auth_headers argument in the DIALApp constructor and add one of the FastAPI lifespan events, the application fails on startup.

Code:

app = DIALApp('...', propagation_auth_headers=True)  # lifespan=lifespan)
app.add_chat_completion("echo", EchoApplication())

@app.on_event("startup")
async def startup_event():
    print('!!! Startup event !!!')

# Run the built app
if __name__ == "__main__":
    uvicorn.run(app, port=5000, host="0.0.0.0", lifespan='on')

Error message:

INFO: Started server process [4112]
INFO: Waiting for application startup.
ERROR: Exception in 'lifespan' protocol
Traceback (most recent call last):
  File "...\venv\lib\site-packages\uvicorn\lifespan\on.py", line 86, in main
    await app(scope, self.receive, self.send)
  File "...\venv\lib\site-packages\uvicorn\middleware\proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "...\venv\lib\site-packages\fastapi\applications.py", line 1106, in __call__
    await super().__call__(scope, receive, send)
  File "...\venv\lib\site-packages\starlette\applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "...\venv\lib\site-packages\starlette\middleware\errors.py", line 149, in __call__
    await self.app(scope, receive, send)
  File "...\venv\lib\site-packages\aidial_sdk\header_propagator.py", line 26, in __call__
    for header in scope["headers"]:
KeyError: 'headers'
ERROR: Application startup failed. Exiting.

Process finished with exit code 3
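
The root cause is that ASGI lifespan scopes carry no "headers" key; only HTTP scopes do. A likely fix, sketched against a hypothetical shape of the header propagation middleware:

async def __call__(self, scope, receive, send):
    # Lifespan (and websocket) scopes have no "headers": pass them through
    if scope["type"] != "http":
        return await self.app(scope, receive, send)
    for header in scope["headers"]:
        ...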

Use SecretStr type for api_key and jwt in Request

The Pydantic library has a SecretStr type for storing secrets. A SecretStr value is rendered as '**********' if it is accidentally printed to logs.
https://docs.pydantic.dev/1.10/usage/types/#secret-types

It would be great to use SecretStr instead of StrictStr for the api_key and jwt fields of aidial_sdk.chat_completion.Request.

This would also help with langchain interoperability; langchain already uses SecretStr for its api_key parameter:
https://github.com/langchain-ai/langchain/blob/acc8fb3ead6092685947a56d83b745cfda70c970/libs/partners/openai/langchain_openai/llms/azure.py#L48
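
For illustration, how SecretStr behaves in Pydantic (a standalone sketch, not the SDK's actual Request model):

from pydantic import BaseModel, SecretStr

class ExampleRequest(BaseModel):
    api_key: SecretStr

req = ExampleRequest(api_key="sk-very-secret")
print(req.api_key)                     # **********
print(req.api_key.get_secret_value())  # sk-very-secret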

Strictly define the scope of the public interface of the library

Currently, pretty much every single module in the DIAL SDK is publicly available.

E.g. the utils/*.py modules could potentially be used by a library user; however, we do not really expect this to happen.

Our expectations should be expressed explicitly in the library code.

The modules/classes/methods which we consider private to the library (i.e. pertaining to its implementation details) should be prefixed with an underscore, so that they are hidden from the public interface.

Telemetry: support conditional loading of instrumentors

None of the OpenTelemetry instrumentors declare dependencies on the libraries they instrument.

E.g. HTTPXClientInstrumentor doesn't depend on the httpx library.

So if a DIAL SDK client doesn't have httpx as a dependency and enables telemetry, it fails with an import error during telemetry initialization:

  File "./.venv/lib/python3.11/site-packages/opentelemetry/instrumentation/httpx/__init__.py", line 167, in <module>
    import httpx
ModuleNotFoundError: No module named 'httpx'

It happens because the instrumentor is loaded unconditionally:

from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor

def init_telemetry(
    app: FastAPI,
    config: TelemetryConfig,
):
    ###
    HTTPXClientInstrumentor().instrument()
    ###

Instead, we should load it lazily:

def init_telemetry(
    app: FastAPI,
    config: TelemetryConfig,
):
    ###
    try:
        from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor

        HTTPXClientInstrumentor().instrument()
    except ImportError:
        pass  # the httpx lib is not installed, no need to load the instrumentor
    ###

The same applies to the header propagation logic.

Wrong error format

Actual error response body:

{
  "detail": {
    "error": {
      "message": "Error during processing the request",
      "type": "runtime_error",
      "param": null,
      "code": null
    }
  }
}

Expected:

{
  "error": {
    "message": "Error during processing the request",
    "type": "runtime_error",
    "param": null,
    "code": null
  }
}

Support tools

Add tool-related fields to the request/response schemas.
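
For context, the kind of fields meant here, mirroring the OpenAI schema (a sketch, not the SDK's final design):

from typing import Any, Dict, List, Literal, Optional, Union
from pydantic import BaseModel

class Function(BaseModel):
    name: str
    description: Optional[str] = None
    parameters: Optional[Dict[str, Any]] = None  # JSON Schema of the arguments

class Tool(BaseModel):
    type: Literal["function"]
    function: Function

class ChatCompletionRequest(BaseModel):
    # ...existing fields...
    tools: Optional[List[Tool]] = None
    tool_choice: Optional[Union[str, Dict[str, Any]]] = None  # "auto" | "none" | {...}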

Support header propagation for httpx

Header propagation doesn't work for openai>=1.0 because it uses the httpx lib.

Note

Header propagation does work for openai<1.0, because it uses the aiohttp lib, which is supported by the SDK.
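
One plausible approach, sketched with httpx's request event hooks (the contextvar holding the key would have to be set by the SDK per incoming request; all names here are hypothetical):

import contextvars
import httpx

# Hypothetical contextvar populated by the SDK for each incoming request
_current_api_key: contextvars.ContextVar = contextvars.ContextVar(
    "api_key", default=None
)

async def _propagate_api_key(request: httpx.Request) -> None:
    api_key = _current_api_key.get()
    if api_key:
        request.headers["Api-Key"] = api_key

client = httpx.AsyncClient(event_hooks={"request": [_propagate_api_key]})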

Support rate response API

Add support for the rate response API.

Scenario: a user asks the application a question, and the application responds with a message. The user may then react to the quality of the response with a thumbs up/down.

Processing flow

  • Accept a request with deploymentId in the request path and a JSON body including responseId (the ID of the response produced by the application) and rate (the user's reaction to the response quality). The response id is an arbitrary string; rate is a boolean (true if the user likes the response, false otherwise).
  • Find the registered chat completion handler by deploymentId. If the handler is not found, return 404.
  • Validate the request: the JSON structure should be valid and contain the required fields. Return 404 if the validation fails.
  • Run the handler to process the request.
  • Respond: the API should return 200 (OK). A request sketch is shown below.

Note: the chat completion handler has a default implementation of rate response: it does nothing.
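
A request sketch against the echo deployment from the README example (the /rate path matches the endpoint list mentioned earlier; the exact body schema is an assumption based on this issue):

curl http://127.0.0.1:5000/openai/deployments/echo/rate \
  -H "Content-Type: application/json" \
  -H "Api-Key: DIAL_API_KEY" \
  -d '{"responseId": "d08cfda2-d7c8-476f-8b95-424195fcdafe", "rate": true}'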

Telemetry: use correct version for opentelemetry-exporter-prometheus

Adding aidial-sdk[telemetry] as a dependency leads to the warning:

Warning: The file chosen for install of opentelemetry-exporter-prometheus 1.12.0rc1 (opentelemetry_exporter_prometheus-1.12.0rc1-py3-none-any.whl) is yanked. Reason for being yanked: Version is deprecated.

See https://pypi.org/project/opentelemetry-exporter-prometheus/#history

A non-deprecated version of opentelemetry-exporter-prometheus should be used; most likely 0.41b0, the same as for the other instrumentations.

Add_attachment does not support Attachment object

The Choice.add_attachment and Stage.add_attachment methods do not accept an instance of the Attachment class.
You have to write code like this:

choice.add_attachment(
    type=attachment.type,
    title=attachment.title,
    data=attachment.data,
    url=attachment.url,
    reference_url=attachment.reference_url,
    reference_type=attachment.reference_type,
)

which looks odd.

I expect the following code to work:

choice.add_attachment(attachment)
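
A rough sketch of how the method could accept both calling styles (a hypothetical signature, not the SDK's current one):

from typing import Optional

def add_attachment(self, attachment: Optional[Attachment] = None, **fields) -> None:
    # Hypothetical: accept a ready-made Attachment or build one from kwargs
    if attachment is None:
        attachment = Attachment(**fields)
    self._add_attachment(attachment)  # hypothetical internal method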
