Comments (11)
Just some feedback regarding documentation and the newbie experience on the project. I installed it today and had three difficulties:
- How do I add new models to the web server? When I use something other than the default model I get a 404:
  underlying_embeddings: Embeddings = InfinityEmbeddings(model=model, infinity_api_url="http://0.0.0.0:7997/v1")
- What is the default port when launching via infinity_emb? The main page only mentions 8080 at the very end, but I figured it is indeed 7997.
- How do I launch it via Python? Running what I found in the docs
  from infinity_emb import create_server
  fastapi_app = create_server()
  exits with status 0 immediately. Maybe something is missing.
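On the 404 from the first point: the server exposes an OpenAI-compatible embeddings API, and the "model" field of the request has to match the model the server was actually started with. A minimal sketch of the request body (the endpoint path and model name are assumptions taken from the readme's docker example, not verified against every version):

```python
import json

# OpenAI-compatible request body for POST /v1/embeddings (path assumed).
# A 404 "model not found" usually means this "model" value differs from
# the --model-name-or-path the server was launched with.
payload = {
    "model": "BAAI/bge-small-en-v1.5",  # must match the served model
    "input": ["Embed this sentence via Infinity."],
}
body = json.dumps(payload)
print(body)
```

Checking the server's /models endpoint first shows exactly which model names it will accept.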
from infinity.
I guess 7997 is fine, as long as it is mentioned more prominently in the docs.
A uvicorn server should not exit immediately, so I think something must be wrong in what I did.
from infinity.
Thanks for the creative suggestion. For now, there is no plan to create a video from it.
As for the Readme.md: in what way is it not clear to you how to use infinity? What do you think should be improved?
from infinity.
@michaelfeil, some examples of how to load a model would be nice. Examples in general are nice in a readme.
from infinity.
@Jawn78 Knowing all components inside out doesn't make it easy for me to see which examples are missing.
Are startup commands missing? More commands? Should I host the Swagger UI on github.io?
from infinity.
Love your feedback!
Port: Should I switch the default port to 8080? Or is 7997 fine, and I should just make it a bit clearer in the docs?
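Either way, the port can be made explicit at launch. A sketch using the same flags as the docker command elsewhere in this thread (flag names assumed to match the pip-installed CLI; verify with infinity_emb --help):

```shell
# Make the port explicit so the default-port question doesn't arise
# (flags as in the readme's docker example).
infinity_emb --model-name-or-path BAAI/bge-small-en-v1.5 --port 7997
```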
Python: The create_server util is really just a replacement for what the CLI does. It will launch a uvicorn server. I would recommend the following. Install with all extras:
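If create_server() only builds and returns the FastAPI app rather than serving it (which would explain the immediate exit with status 0), it still needs an ASGI server to run it. A sketch, assuming you save the two lines from the docs in a file app.py (the file name and module path are my assumption):

```shell
# app.py contains:
#   from infinity_emb import create_server
#   fastapi_app = create_server()
# Serve it explicitly with uvicorn on the port discussed in this thread:
uvicorn app:fastapi_app --host 0.0.0.0 --port 7997
```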
pip install "infinity_emb[all]==0.0.x"
import asyncio
import numpy as np
from infinity_emb import AsyncEmbeddingEngine

sentences = ["Embed this sentence via Infinity.", "Paris is in France."]
engine = AsyncEmbeddingEngine(engine="torch")

async def main():
    async with engine:  # the context manager starts and stops the engine
        embeddings, usage = await engine.embed(sentences)
    return np.array(embeddings)

asyncio.run(main())
You can find possible usage here:
https://github.com/michaelfeil/infinity/blob/main/libs/infinity_emb/tests/unit_test/test_engine.py
from infinity.
Added some changes to the docs.
from infinity.
Unfortunately I also have trouble installing / running infinity on my M1 MacBook Pro.
I was assuming that I could access the Swagger documentation at http://localhost:7997, but that does not seem to be the case. Do I misunderstand something?
from infinity.
@michaelwechner Sounds like you're doing the correct thing. What is the issue you're seeing?
from infinity.
The container got created (I can see it with "docker ps -a"), but when I send a request to http://localhost:7997/ it does not seem to be running at all.
from infinity.
Are you forwarding port 7997 from docker, as in the readme?
docker run -it --gpus all -p 7997:7997 michaelf34/infinity:0.0.26 --model-name-or-path BAAI/bge-small-en-v1.5 --port 7997
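Once the container is up, a quick smoke test from the host can rule out a port-mapping problem. Note that FastAPI apps usually serve the Swagger UI at /docs, not at the root path, which may explain the empty response at http://localhost:7997 (paths here are assumptions based on FastAPI defaults):

```shell
# Expect JSON from the models endpoint if the server is reachable:
curl http://localhost:7997/models
# Swagger UI is typically at /docs rather than the root path:
curl -I http://localhost:7997/docs
```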
from infinity.