
mirror's Introduction

Mirror

Hackable, AI-powered mirror on your laptop.

bike.gif

Mirror is a web app that constantly watches the realtime video feed from the webcam and responds with comments.

  1. 100% Local and Private: Try all kinds of ideas. Don't worry, everything happens on your laptop with NO Internet connection.
  2. FREE: Since the AI model is running 100% on your machine, you can keep it running forever and experiment with different things.
  3. Hackable: Simply by changing the prompt (or tweaking the code), you can easily repurpose Mirror to do different things.

How it works

Watch the video of Mirror in action.

  1. When you launch the app, the browser will ask you for webcam permission.
  2. When you allow the webcam, it will start streaming the video to the AI (BakLLaVA, running on llama.cpp).
  3. The AI will analyze the image and stream the response, which the frontend prints in realtime (see the client sketch below).
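
To make steps 2 and 3 concrete, here is a minimal client sketch, assuming the llama.cpp example server is running on its default port 8080 and supports the /completion endpoint's image_data / [img-N] convention (as it did at the time of writing), and that the requests package is available. The helper name describe_frame and the prompt template are illustrative, not the exact code in app.py.

import base64
import json

import requests

def describe_frame(png_path, prompt="Describe a person in the image"):
    # Encode one saved webcam frame and stream the model's reply chunk by chunk.
    with open(png_path, "rb") as f:
        img_b64 = base64.b64encode(f.read()).decode("utf-8")
    payload = {
        # "[img-12]" marks where the server splices in the image_data entry with id 12
        "prompt": f"USER: [img-12]\n{prompt}\nASSISTANT:",
        "image_data": [{"data": img_b64, "id": 12}],
        "n_predict": 128,
        "stream": True,
    }
    with requests.post("http://127.0.0.1:8080/completion", json=payload, stream=True) as r:
        for line in r.iter_lines():
            if line.startswith(b"data: "):  # the server streams server-sent-event lines
                chunk = json.loads(line[len(b"data: "):])
                yield chunk.get("content", "")

Each yielded string is a fragment of the response, which the frontend appends to the page as it arrives.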

Usage

When you launch the web UI, it will immediately start streaming responses from the AI based on the prompt: "Describe a person in the image".

You can edit this field to have Mirror stream whatever you want.

editing.gif

Some example prompts you can try:

  1. What is this object I am holding?
  2. What is the person doing?
  3. Describe some notable events in the image.
  4. How many people are in this picture?
  5. Let me know if you see anything weird.

Install

[RECOMMENDED] 1 Click Install

Try the 1 click install using Pinokio: https://pinokio.computer/item?uri=https://github.com/cocktailpeanut/mirror

Make sure to use the latest version of Pinokio (0.1.49 or above).

install.gif

Mirror has a lot of moving parts, so if you don't use the 1 Click Installer, it may take a lot of work:

  1. Orchestrating multiple backends (the llama.cpp server and the Gradio web UI server)
  2. Installing prerequisites, such as cmake, Visual Studio (Windows), ffmpeg, etc.

If you want to install manually, go to the following section.

Manual Install

Note that everything in this section is essentially what the 1 Click Installer does automatically, and it works on Mac, Windows, and Linux. So if you get stuck trying to run Mirror manually, try the 1 click install.

1. Clone this repository

git clone https://github.com/cocktailpeanut/mirror

2. Clone llama.cpp

git clone https://github.com/ggerganov/llama.cpp

3. Download AI Model

Download the BakLLaVA model files ggml-model-q4_k.gguf and mmproj-model-f16.gguf into the llama.cpp/models folder. (These are the files the launch commands below expect.)
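
If you prefer to script this step, the sketch below fetches both files. The Hugging Face repository name (mys/ggml_bakllava-1) is an assumption based on the upstream Realtime Bakllava project; verify the repository before relying on it.

import urllib.request

BASE = "https://huggingface.co/mys/ggml_bakllava-1/resolve/main/"  # assumed repo
for name in ("ggml-model-q4_k.gguf", "mmproj-model-f16.gguf"):
    print("downloading", name)
    urllib.request.urlretrieve(BASE + name, "llama.cpp/models/" + name)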

4. Build llama.cpp

cd llama.cpp
mkdir build
cd build
cmake ..
cmake --build . --config Release

5. Install requirements

Create a venv and install the requirements (run this from the mirror repository root):

python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate
pip install -r requirements.txt

6. Install prerequisites

Install FFMPEG: https://ffmpeg.org/download.html

7. Launch the llama.cpp server

First start the llama.cpp server:

Windows

cd llama.cpp\build\bin
Release\server.exe -m ..\..\models\ggml-model-q4_k.gguf --mmproj ..\..\models\mmproj-model-f16.gguf -ngl 1

Mac & Linux

cd llama.cpp/build/bin
./server -m ../../models/ggml-model-q4_k.gguf --mmproj ../../models/mmproj-model-f16.gguf -ngl 1
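
Before moving on, you can sanity-check that the server is answering. A minimal probe, assuming the server's default port 8080 (adjust if you passed --port) and the requests package from the app's requirements:

import requests

r = requests.post("http://127.0.0.1:8080/completion",
                  json={"prompt": "Hello", "n_predict": 8})
print(r.json()["content"])  # any generated text means the server is alive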

8. Launch the web UI

First activate the environment:

source venv/bin/activate

Then run app.py:

python app.py
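
For orientation, here is a condensed, illustrative sketch of the shape of app.py; it is not the exact source. The Gradio parameter names follow the 3.x API (source="webcam"); Gradio 4.x renamed this to sources=["webcam"]. The model call is stubbed out where the real app streams from the llama.cpp server.

import gradio as gr
import imageio

def run(frame, prompt):
    # Gradio may fire this callback before the webcam delivers its first frame
    if frame is None:
        return ""
    imageio.imsave("temp.png", frame)
    # ...forward temp.png and the prompt to the llama.cpp server here...
    return "(model response)"

gr.Interface(
    fn=run,
    inputs=[gr.Image(source="webcam", streaming=True),
            gr.Textbox(value="Describe a person in the image")],
    outputs="text",
    live=True,  # re-run automatically as new frames arrive
).launch()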

Credits

  1. The backend code was inspired by and adapted from Realtime Bakllava.
  2. llama.cpp provides the LLM server.
  3. BakLLaVA is the multimodal AI model.
  4. The web UI was built with Gradio.

mirror's People

Contributors: cocktailpeanut

mirror's Issues

How fast is the recognition response?

I want to use this to build a realtime recognition tool,

but I don't know what the response speed is like. I'm hoping for millisecond-level responses.

ValueError: Image is not numeric, but NoneType.

While running

python3 app.py

got error:

Running on local URL:  http://127.0.0.1:7862
Running on public URL: https://033ac8d375c94a3ecd.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
Traceback (most recent call last):
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/queueing.py", line 456, in call_prediction
    output = await route_utils.call_process_api(
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1522, in process_api
    result = await self.call_function(
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1156, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/utils.py", line 515, in async_iteration
    return await iterator.__anext__()
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/utils.py", line 508, in __anext__
    return await anyio.to_thread.run_sync(
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/utils.py", line 491, in run_sync_iterator_async
    return next(iterator)
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/utils.py", line 662, in gen_wrapper
    yield from f(*args, **kwargs)
  File "/workspace/mirror/app.py", line 20, in run
    imageio.imsave('temp.png', frame)
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/imageio/v2.py", line 389, in imwrite
    raise ValueError("Image is not numeric, but {}.".format(imt.__name__))
ValueError: Image is not numeric, but NoneType.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/queueing.py", line 501, in process_events
    response = await self.call_prediction(awake_events, batch)
  File "/workspace/llama.cpp/venv/lib/python3.10/site-packages/gradio/queueing.py", line 465, in call_prediction
    raise Exception(str(error) if show_error else None) from error
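
The traceback points at app.py line 20: Gradio can invoke the callback before the webcam has produced a frame, so frame arrives as None and imageio.imsave rejects it. A guard like the following would skip those early ticks (an illustrative sketch, not the project's official fix):

import imageio

def run(frame, prompt):
    # The webcam is not ready yet: skip this tick instead of crashing on imsave.
    if frame is None:
        yield ""
        return
    imageio.imsave('temp.png', frame)
    # ...continue with the model call as before...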

PermissionError

Hello everyone,
I get the following error.

PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\mirror\cache\TEMP\tmpls714zfm'
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "D:\mirror\bin\miniconda\lib\shutil.py", line 816, in move
    os.rename(src, real_dst)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\mirror\cache\TEMP\tmp09puxx2y' -> 'D:\mirror\cache\TEMP\gradio\f00584179e8e318c202d73d9ce75bc457ac415ac\webcam.png'

During handling of the above exception, another exception occurred
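
WinError 32 means another process (often an antivirus scanner, or the webcam pipeline still writing) holds the temp file at the moment Gradio tries to move it. A brief retry loop is a common workaround on Windows; this sketch is illustrative, not a patch from the project:

import shutil
import time

def move_with_retry(src, dst, attempts=10, delay=0.05):
    # Retry shutil.move briefly; the file is usually released within milliseconds.
    for _ in range(attempts):
        try:
            shutil.move(src, dst)
            return
        except PermissionError:
            time.sleep(delay)
    raise PermissionError(f"could not move {src} after {attempts} attempts")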

Pass through previous response(s) to cut down on repetition in responses

Using the Washington Sq Park gif as an example: you could save time and improve legibility by passing Mirror's previous response(s) back in, so as to avoid the repetitive flowery language about the fountain, or anything else unchanging about the scene. That way it would just comment on what changed from capture to capture (see the sketch below).
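
A minimal sketch of the idea, assuming the frontend keeps the last reply around; the function name and wording are illustrative:

def build_prompt(user_prompt, last_reply):
    # Fold the previous reply into the next prompt so the model can focus on
    # what changed between frames instead of re-describing the whole scene.
    if not last_reply:
        return user_prompt
    return (f'Your previous comment on this scene was: "{last_reply}"\n'
            f"{user_prompt} Mention only what has changed since then.")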

Windows 11, doesn't install: cmake issues

d:\downloads\pinokio\api\mirror.git\llama.cpp>mkdir build && cd build && cmake .. && cmake --build . --config Release && cd ..
-- Building for: MinGW Makefiles
CMake Error: CMake was unable to find a build program corresponding to "MinGW Makefiles". CMAKE_MAKE_PROGRAM is not set. You probably need to select a different build tool.
CMake Error: CMake was unable to find a build program corresponding to "MinGW Makefiles". CMAKE_MAKE_PROGRAM is not set. You probably need to select a different build tool.
CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!

Given that I installed via Pinokio, I would have hoped it would install all dependencies.

Windows 11 - Image Flicker, No Connection

Pinokio install on Windows 11, RTX 3090

2023-11-12.09-50-41.mp4

Console keeps throwing these 👇

Traceback (most recent call last):
  File "C:\pinokio\bin\miniconda\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "C:\pinokio\bin\miniconda\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
    self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

When I hit enter in the prompt box, it doesn't seem to transmit to the language model either.

why set -ngl 1

Is there a specific reason for setting the GPU layers to 1?
