
Comments (6)

su-zelong commented on July 29, 2024

Hello, I followed the command you gave to create the container, and I would like to call the local model (codellama).
The command I used to create the container:

docker run -itd -e OLLAMA_ORIGINS='chrome-extension://*' -p 11434:11434 --name ollama ollama/ollama

I used -itd because I want to deploy the local codellama model for the plugin. I then go into the container and run

ollama create codellama -f Modelfile

so that I can later access the local model externally via curl on port 11434, but there is still no response when I put the full URL in the plugin.

I tried running ollama serve inside the container, but it fails with

127.0.0.1:11434: bind: address already in use


su-zelong commented on July 29, 2024

@andrewnguonly

Thank you very much for your itemized response to my question. I understand why starting the container and then running ollama serve causes the port conflict; however, from outside the container, curl ... /api/create seems to have the same effect as running ollama create inside the container. I can also access port 11434 normally with this method, but there is still no response in the plugin. Next I'll try changing the Dockerfile and re-pulling the image. Also, may I ask why so many plugins are built on the Ollama interface? Why not use OpenAI-style interfaces like vLLM or FastChat? Wouldn't they be more versatile? Again, thanks for the pointers!


andrewnguonly commented on July 29, 2024

@su-zelong, I suspect the issue is related to CORS security, which means the OLLAMA_ORIGINS environment variable needs to be set when running the Docker container. The default instructions for running the container do not specify this.

Run this command to start the container with the OLLAMA_ORIGINS environment variable set:

docker run -e OLLAMA_ORIGINS="chrome-extension://*" -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
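
To verify the CORS configuration from the host, you can send a hand-rolled preflight request (the extension ID below is a placeholder; with the wildcard origin above, any chrome-extension:// value should be accepted):

curl -i -X OPTIONS http://localhost:11434/api/chat \
  -H "Origin: chrome-extension://<your-extension-id>" \
  -H "Access-Control-Request-Method: POST"

A 2xx response that includes an Access-Control-Allow-Origin header indicates the origin is allowed; if the header is missing, the browser will block the extension's requests.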

If you have Docker Desktop, you can see that the OLLAMA_ORIGINS environment variable is set.

[screenshot: Docker Desktop showing the OLLAMA_ORIGINS environment variable set on the ollama container]

Now, update the host address in OLLAMA_BASE_URL to 0.0.0.0 and rebuild the extension (npm run build):

const OLLAMA_BASE_URL = "http://0.0.0.0:11434";
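
As a quick sanity check before rebuilding, you can confirm the server is reachable at that address from the host:

curl http://0.0.0.0:11434/

This should respond with "Ollama is running".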

After following these steps, the Chrome extension successfully calls the Ollama API served from the Docker container. I'll update the README with these instructions.


andrewnguonly commented on July 29, 2024

@su-zelong

I tried running ollama serve inside the container, but it fails with

127.0.0.1:11434: bind: address already in use

You won't be able to run ollama serve from inside the container because the container is already configured to run ollama serve at startup, which means the Ollama address (e.g. 127.0.0.1:11434) is already bound by another process. See the Dockerfile entry point configuration: https://github.com/jmorganca/ollama/blob/main/Dockerfile#L28C1-L29
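
For reference, the relevant lines of the image's Dockerfile look roughly like this:

ENTRYPOINT ["/bin/ollama"]
CMD ["serve"]

so every container started from the image runs ollama serve unless the entry point is overridden.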

I used -itd because I want to deploy the local codellama model for the plugin. I then go into the container and run
ollama create codellama -f Modelfile

Once the custom model is created, it doesn't seem like there's a way to restart the ollama serve process (see this issue: ollama/ollama#1266). If you want to create a custom Modelfile and have the custom model served from the API in the container, you may have to create your own Dockerfile and COPY your custom Modelfiles into the image.
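
Here's a rough, untested sketch of that approach. The file names (Modelfile, entrypoint.sh) and the fixed sleep are my own assumptions, not something from the Ollama docs:

# Dockerfile
FROM ollama/ollama
COPY Modelfile /root/Modelfile
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

# entrypoint.sh
#!/bin/sh
ollama serve &                               # start the server in the background
sleep 5                                      # crude wait for the server to come up
ollama create codellama -f /root/Modelfile   # register the custom model
wait                                         # keep the serve process in the foreground

A retry loop polling the API would be more robust than the fixed sleep, but this illustrates the idea.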

Alternatively, you can create the custom model via the API after the container is running. This approach worked for me: I was able to create a custom model and call it from outside the container. High-level steps:

  1. Run the container: docker run (with OLLAMA_ORIGINS set, as shown above)
  2. Create custom model (from outside container)
    curl http://0.0.0.0:11434/api/create -d '{
        "name": "Mario",
        "modelfile": "FROM llama2\nSYSTEM You are mario from Super Mario Bros."
    }'
    
  3. Call custom model (from outside container)
    curl http://0.0.0.0:11434/api/chat -d '{
        "model": "Mario",
        "messages": [
            {
                "role": "user",
                "content": "why is the sky blue?"
            }
        ]
    }'
    

At this point, the Chrome extension should be able to call the custom model from outside the container.
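
If the extension still gets no response, it's worth confirming that the custom model actually registered. Listing the local models from outside the container should include it:

curl http://0.0.0.0:11434/api/tags

The "models" array in the response should contain an entry for "Mario".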


andrewnguonly commented on July 29, 2024

@su-zelong

I can also access port 11434 normally with this method, but there is still no response in the plugin.

Do you see any errors in the developer console for the background.js script? And to be clear, after updating the hostname and model in background.ts and rebuilding the extension (npm run build), the extension should be refreshed (or uninstalled and reinstalled).
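
For clarity, the rebuild-and-reload cycle looks something like this (the chrome://extensions steps are manual):

npm run build
# then open chrome://extensions, enable Developer mode,
# reload the extension (or remove it and re-load the unpacked build),
# and inspect the background service worker console for errors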

Also, may I ask why so many plugins are built on the Ollama interface? Why not use OpenAI-style interfaces like vLLM or FastChat? Wouldn't they be more versatile?

I can't answer why so many plugins are based on the Ollama interface. However, there are open issues for making Ollama compatible with OpenAI's interfaces (ollama/ollama#305, ollama/ollama#1316).


andrewnguonly commented on July 29, 2024

@su-zelong, please open a new issue if you're still unable to connect to a custom model running in the Docker container. Thanks.

