Comments (6)
Hello, I followed the command you gave to create the container and would like to call the local model (codellama). The command I used to create the container:

docker run -itd -e OLLAMA_ORIGINS='chrome-extension://*' -p 11434:11434 --name ollama ollama/ollama

I used `-itd` because I want to use the local codellama model from the plugin. I then went into the container and ran `ollama create codellama ModelFile`, so that I could later reach the local model externally via curl on port 11434, but there is still no response when I put the full URL into the plugin.

I also tried running `ollama serve` inside the container, but it fails with:

127.0.0.1:11434: bind: address already in use
from lumos.
Thank you very much for your itemized response to my question. I now understand why starting the container and then running `ollama serve` causes the port conflict. However, calling `curl .../api/create` from outside the container seems to have the same effect as `ollama create` inside the container, and I can reach port 11434 normally this way, yet there is still no response in the plugin. Next I'll try changing the Dockerfile and rebuilding the image.

Also, may I ask why so many plugins are based on the Ollama interface? Why not use OpenAI-style interfaces like vLLM or FastChat? Wouldn't they be more versatile? Again, thanks for the pointers!
@su-zelong, I suspect the issue is related to CORS security, which means the `OLLAMA_ORIGINS` environment variable needs to be set when running the Docker container. The default instructions for running the container do not specify this.

Run this command to run the container and set the `OLLAMA_ORIGINS` environment variable:

docker run -e OLLAMA_ORIGINS="chrome-extension://*" -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

If you have Docker Desktop, you can see that the `OLLAMA_ORIGINS` environment variable is set.
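Without Docker Desktop, the same check can be done from the CLI. A sketch (it assumes the container is named `ollama` as in the command above, and is a harmless no-op where Docker isn't installed):

```shell
# Check that OLLAMA_ORIGINS made it into the running container's environment.
# Guarded so this does nothing destructive when Docker or the container is absent.
if command -v docker >/dev/null 2>&1; then
  docker exec ollama env 2>/dev/null | grep OLLAMA_ORIGINS \
    || echo "container 'ollama' not running"
else
  echo "docker not found; skipping check"
fi
```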
Now, update the host address in `OLLAMA_BASE_URL` to `0.0.0.0` and rebuild the extension (`npm run build`):

const OLLAMA_BASE_URL = "http://0.0.0.0:11434";

After following these steps, the Chrome extension successfully calls the Ollama API served from the Docker container. I'll update the README with these instructions.
> I try to use `ollama serve` in the container but it comes up with
> 127.0.0.1:11434: bind: address already in use
You won't be able to run `ollama serve` from inside the container because the container is already configured to run `ollama serve` at startup, which means the Ollama host (e.g. `127.0.0.1`) is already used (bound) by another process. See the Dockerfile entrypoint configuration: https://github.com/jmorganca/ollama/blob/main/Dockerfile#L28C1-L29
> `-itd` because I want to deploy the local codellama in the plugin, then I go into the container and `ollama create codellama ModelFile`
Once the custom model is created, it doesn't seem like there's a way to restart the `ollama serve` process (see this issue: ollama/ollama#1266). If you want to create a custom `Modelfile` and have the custom model served from the API in the container, you may have to create your own Dockerfile and `COPY` your custom `Modelfile`s into the image.
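As a sketch of that custom-Dockerfile approach (the model contents, file names, and image tag below are assumptions for illustration, not the project's instructions):

```shell
# Build a custom image that carries the Modelfile, so the model definition
# is available inside the container from the start.
mkdir -p ollama-custom

# Hypothetical Modelfile: base model and system prompt are placeholders.
cat > ollama-custom/Modelfile <<'EOF'
FROM llama2
SYSTEM You are Mario from Super Mario Bros.
EOF

# Hypothetical Dockerfile: extend the official image and COPY the Modelfile in.
cat > ollama-custom/Dockerfile <<'EOF'
FROM ollama/ollama
COPY Modelfile /root/Modelfile
EOF

# Build only where Docker is available. Note you would still have to run
# `ollama create` (or call the create API) once the container is up, since
# the image's entrypoint starts `ollama serve` rather than creating models.
if command -v docker >/dev/null 2>&1; then
  docker build -t ollama-custom ollama-custom
fi
```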
Alternatively, you can create the custom model via the API after the container is running. This approach worked for me; I was able to create a custom model and call it from outside the container. High-level steps:

- Run container: `docker run`
- Create custom model (from outside container): `curl http://0.0.0.0:11434/api/create -d '{ "name": "Mario", "modelfile": "FROM llama2\nSYSTEM You are mario from Super Mario Bros." }'`
- Call custom model (from outside container): `curl http://0.0.0.0:11434/api/chat -d '{ "model": "Mario", "messages": [ { "role": "user", "content": "why is the sky blue?" } ] }'`

At this point, the Chrome extension should be able to call the custom model from outside the container.
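One pitfall with the `/api/create` call above: the whole `Modelfile` is passed as a single JSON string with literal `\n` separators, so a quoting mistake easily produces invalid JSON that the server rejects. A quick local sanity check of the payload before sending it (assumes `python3` is available):

```shell
# Validate the /api/create payload locally before curl-ing it at the server.
payload='{ "name": "Mario", "modelfile": "FROM llama2\nSYSTEM You are mario from Super Mario Bros." }'
printf '%s' "$payload" | python3 -c 'import json, sys; json.load(sys.stdin); print("payload is valid JSON")'
```

`printf '%s'` is used instead of `echo` so the shell does not expand the `\n` escape, which must reach the JSON parser as a literal two-character escape sequence.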
> I also have normal access to 11434 under this method but still no response in the plugin.
Do you see any errors in the developer console for the `background.js` script? And to be clear, after updating the hostname and model in `background.ts` and rebuilding the extension (`npm run build`), the extension should be refreshed (or uninstalled and reinstalled).
> Also, may I ask why so many plugins are based on the Ollama interface? Why not use OpenAI-style interfaces like vLLM or FastChat? Wouldn't they be more versatile?
I can't answer why so many plugins are based on the Ollama interface. However, there are open issues for making Ollama compatible with OpenAI's interfaces (ollama/ollama#305, ollama/ollama#1316).
@su-zelong, please open a new issue if you're still unable to connect to a custom model running in the Docker container. Thanks.