Memory Cache should use a GPU, when one is available, for inference in order to speed up queries and deriving insights from documents.
What I tried so far
I spent a few days last week exploring the differences between the primordial privateGPT version and the latest one. One of the major differences is that the newer project includes support for GPU inference for llama and gpt4all. The challenge I ran into with the newer version is that moving from the older groovy .ggml model (no longer supported, since privateGPT now uses the .gguf format) to llama doesn't produce the same results when ingesting the same local file store and querying it.
This might be a matter of how RAG is implemented, something about how I set things up on my local machine, or a function of model choice.
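To make the llama side of that comparison concrete, here is a minimal sketch (not the actual privateGPT code) of how the newer stack can offload a .gguf model to the GPU through langchain's LlamaCpp wrapper. The model path, layer count, and context size are placeholders I'd adapt to my own setup.

```python
# Minimal sketch: load a local .gguf model with GPU offload via llama-cpp-python,
# wrapped by langchain. Not privateGPT's exact code; paths/values are placeholders.
from langchain.llms import LlamaCpp  # on newer langchain: langchain_community.llms

llm = LlamaCpp(
    model_path="models/mistral-7b-instruct.Q4_K_M.gguf",  # any local .gguf file
    n_gpu_layers=-1,  # offload all layers; requires a CUDA (or Metal) build of llama-cpp-python
    n_ctx=2048,       # context window
    verbose=True,     # logs whether layers were actually offloaded to the GPU
)

print(llm.invoke("What is the meaning of a life well-lived?"))
```

If the llama-cpp-python wheel was built without CUDA support, this still runs but silently falls back to CPU, which is why the verbose output is worth checking.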
I've lazily tried to see whether this can be resolved through dependency changes, but I haven't had luck getting to a version that runs and supports .ggml and GPU acceleration together. From what I can tell, Nomic introduced GPU support in gpt4all 2.4 (the latest is 2.5+), but it's unclear whether there's a way to get this working cleanly with minimal changes to how my fork of privateGPT uses langchain to import the gpt4all package. It's also unclear to me whether this works on Ubuntu or only through the Vulkan APIs; I need to do some additional investigation.
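Before changing any of the privateGPT/langchain code, it may be worth probing whether the installed gpt4all Python bindings can see a GPU at all. A rough sketch, assuming gpt4all 2.x (the Vulkan-backed releases) and a local .gguf model; the filename here is just an example:

```python
# Probe GPU support in the gpt4all Python bindings directly, outside of langchain.
# Assumes gpt4all >= 2.x; the model filename is a placeholder, not my actual model.
from gpt4all import GPT4All

try:
    model = GPT4All("mistral-7b-openorca.Q4_0.gguf", device="gpu")  # raises if no usable GPU backend
    print(model.generate("What is the meaning of a life well-lived?", max_tokens=128))
except Exception as exc:
    print("GPU-backed gpt4all failed (CPU-only build or unsupported device?):", exc)
```

If this works, the remaining question is how to thread an equivalent device setting through the langchain wrapper my fork uses.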
I did get CUDA installed and verified, using the sample projects provided by Nvidia, that my GPU is properly detected and working.
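For a quicker sanity check from Python (the Nvidia samples remain the authoritative one), something like the following works, assuming nvidia-smi is on the PATH; the torch check only applies if PyTorch happens to be installed:

```python
# Sanity check that the driver and CUDA see the GPU from Python.
import subprocess

# nvidia-smi lists the GPU, driver, and CUDA version if the driver is healthy.
print(subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout)

try:
    import torch
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))
except ImportError:
    print("PyTorch not installed; skipping the torch CUDA check.")
```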
What's next
Testing
I've been using a highly subjective test to evaluate:
Prompt: "What is the meaning of a life well-lived?"
Primordial privateGPT+groovy, augmented with my local files, consistently answers this question with some combination of "technology and community." No other combination of model and project has replicated that consistently.
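To make this check repeatable across model/project combinations, a small script along the lines of the primordial privateGPT query path could run the same prompt against the same ingested store. This is a hedged sketch, not the actual privateGPT code: the persist directory, embedding model, and model paths are assumptions about my local setup.

```python
# Hedged sketch of a repeatable version of the test: query the same ingested
# Chroma store with the same prompt, swapping in whichever LLM is under test.
# Paths and model names are placeholders, not privateGPT's exact configuration.
from langchain.llms import GPT4All
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

PROMPT = "What is the meaning of a life well-lived?"

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
db = Chroma(persist_directory="db", embedding_function=embeddings)

# Baseline: the groovy model from primordial privateGPT; swap in the GPU-backed
# LlamaCpp or gpt4all instance from the sketches above to compare answers.
llm = GPT4All(model="models/ggml-gpt4all-j-v1.3-groovy.bin", verbose=False)

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=db.as_retriever(search_kwargs={"k": 4}),
)

print(qa({"query": PROMPT})["result"])
```

Running this against each model/project combination would at least make the "technology and community" comparison mechanical, even if the evaluation itself stays subjective.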