Code Monkey home page Code Monkey logo

simpleai's Introduction

SimpleAI

A self-hosted alternative to the not-so-open AI API. It mostly focus on replicating the main endpoints for LLM:

  • Text completion (/completions/)
  • Chat (/chat/completions/) (See example here)
  • Edits (/edits/) (See example here)
  • Embeddings (/embeddings/) (See example here)

While it is not an exact replication of the original endpoints, it should be relatively to switch from one to the other, allowing you to experiment with competing approaches quickly and easily.

Overview

Why this project?

Well first of all it's a fun little project, and perhaps a better use of my time than watching some random dog videos on Reddit or YouTube. I also believe it can be a great way to:

  • experiment with new models and not be too dependent on a specific API provider,
  • create benchmarks to decide which approach works best for you,
  • handle some specific use cases where you cannot fully rely on an external service, without the need of re-writing everything

Installation

On a machine with Python 3.9+:

pip install simple_ai_server

Setup

Start by creating a configuration file to declare your models:

simple_ai init

It should create models.toml, where you declare your different models (see how below). Then start the server with:

simple_ai serve [--host 127.0.0.1] [--port 8080]

You can then see the docs and try it there.

Integrating and declaring a model

Model integration

Models are queried through gRPC, in order to separate the API itself from the model inference, and to support several languages beyond Python through this protocol.

To expose for instance an embedding model in Python, you simply have to import a few things, and implements the .embed() method of your EmbeddingModel class:

import logging
from dataclasses import dataclass

from simple_ai.api.grpc.embedding.server import serve, LanguageModelServicer

@dataclass(unsafe_hash=True)
class EmbeddingModel:
    def embed(self, 
        inputs: list=[],
    ) -> list:
        # TODO : implements the embed method
        return [[]]

if __name__ == '__main__':   
    model_servicer = LanguageModelServicer(model=EmbeddingModel())
    serve(address='[::]:50051', model_servicer=model_servicer)

For a completion task, follow the same logic, but import from simple_ai.api.grpc.completion.server instead, and implements a complete method.

Declaring a model

To add a model, you first need to deploy a gRPC service (using the provided .proto file and / or the tools provided in src/api/). Once your model is live, you only have to add it to the models.toml configuration file. For instance, let's say you've locally deployed a llama.cpp model available on port 50051, just add:

[llama-7B-4b]
    [llama-7B-4b.metadata]
        owned_by    = 'Meta / ggerganov'
        permission  = []
        description = 'C++ implementation of LlaMA model, 7B parameters, 4-bit quantization'
    [llama-7B-4b.network]
        url = 'localhost:50051'
        type = 'gRPC'

You can see see and try of the provided examples in examples/ directory (might require GPU).

Usage

Thanks to the Swagger UI, you can see and try the different endpoints here:

Example query with cUrl

Or you can directly use the API with the tool of your choice.

curl -X 'POST' \
  'http://127.0.0.1:8080/edits/' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "model": "alpaca-lora-7B",
  "instruction": "Make this message nicer and more formal",
  "input": "This meeting was useless and should have been a bloody email",
  "top_p": 1,
  "n": 1,
  "temperature": 1,
  "max_tokens": 256
}'

It's also compatible with OpenAI python client:

import openai

# Put anything you want in `API key`
openai.api_key = 'Free the models'

# Point to your own url
openai.api_base = "http://127.0.0.1:8080"

# Do your usual things, for instance a completion query:
print(openai.Model.list())
completion = openai.Completion.create(model="llama-7B", prompt="Hello everyone this is")

Contribute

This is very much work in progress and far from being perfect, so let me know if you want to help. PR, issues, documentation, cool logo, all the usual candidates are welcome.

Development Environment

In order for the following steps to work it is required to have make and poetry installed on your system.

To install the development environment run:

make install-dev 

This will install all dev dependencies as well as configure your pre-commit helpers.

License

MIT License

Copyright (c) 2023 Louis HENAULT

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

simpleai's People

Contributors

lhenault avatar nintorac avatar guangyusong avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.