ravn-tech / hypertag
HyperTag - Intuitive Knowledge Management WebApp & CLI for Humans using Deep Learning & Tags
License: Other
eval CLIP embeddings
Text search currently happens only in vector space and therefore ignores exact query token matches, even though those are a high-signal feature.
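One possible way to bring exact token matches back into the ranking is to blend the vector-space similarity with a simple token-overlap score. The sketch below is only an illustration: `hybrid_score`, `token_overlap` and the weight `alpha` are hypothetical names and values, not part of HyperTag, and the embeddings are assumed to be plain lists of floats.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def token_overlap(query, text):
    """Fraction of distinct query tokens that appear verbatim in the text."""
    query_tokens = set(re.findall(r"\w+", query.lower()))
    text_tokens = set(re.findall(r"\w+", text.lower()))
    if not query_tokens:
        return 0.0
    return len(query_tokens & text_tokens) / len(query_tokens)


def hybrid_score(query, query_vec, text, text_vec, alpha=0.7):
    """Weighted blend of vector similarity and exact token matches.

    alpha is an assumed weight: 1.0 means pure vector search,
    0.0 means pure token matching.
    """
    semantic = cosine(query_vec, text_vec)
    lexical = token_overlap(query, text)
    return alpha * semantic + (1 - alpha) * lexical
```

Documents can then be sorted by `hybrid_score` instead of by cosine similarity alone. The right value for `alpha` would have to be found by testing against real queries.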
Depends on #32
Create a dedicated directory called "Search Texts". All directory names created in "Search Texts" are interpreted as search queries for text documents and populated with the matching results.
Right now the whole HyperTagFS directory gets rebuilt on every tag-changing operation. Instead, make only partial updates.
Create a dedicated directory called "Search Images". All directory names created in "Search Images" are interpreted as search queries for image files and populated with the matching results.
This will tell the daemon to automatically watch the imported directory for new files and renames.
Needs fast and reliable IPC to work out
Candidates:
First basic version: Partition the video into e.g. 16 uniformly spaced (by time) sections and take a screenshot of each. Embed each screenshot and use the average as the video embedding.
Advanced: Partition the video with higher granularity and extract frames, e.g. every 5 seconds or a fixed high number (100+). Compute an embedding for every extracted frame. Compute the distances between consecutive frames in embedding space to infer semantically coherent video sections (runs of similar frames whose distance stays below a threshold). Embed each section as the average of its frames. The resulting list of average frame embeddings should be a pretty good representation of the video and comes with section start and end metadata.
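Both variants can be sketched on top of precomputed frame embeddings. The code below is a minimal illustration, not HyperTag's implementation: frame extraction and the embedding model are left out, the function names are hypothetical, sampling the midpoint of each section is an assumption, and the threshold of 0.3 is an arbitrary placeholder.

```python
import math


def uniform_timestamps(duration_s, n=16):
    """Midpoints (in seconds) of n equal-length sections of the video."""
    step = duration_s / n
    return [step * (i + 0.5) for i in range(n)]


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def segment_embeddings(frame_vecs, timestamps, threshold=0.3):
    """Group consecutive similar frames into sections.

    A new section starts whenever the distance between two consecutive
    frames exceeds the (assumed) threshold. Each section carries its
    start and end timestamp and the average embedding of its frames.
    """
    sections = []
    start = 0
    for i in range(1, len(frame_vecs) + 1):
        at_end = i == len(frame_vecs)
        if at_end or cosine_distance(frame_vecs[i - 1], frame_vecs[i]) > threshold:
            sections.append({
                "start": timestamps[start],
                "end": timestamps[i - 1],
                "embedding": mean_vector(frame_vecs[start:i]),
            })
            start = i
    return sections
```

For the basic version, `mean_vector` applied to the embeddings of the frames at `uniform_timestamps(duration_s)` gives the single video embedding. For the advanced version, `segment_embeddings` returns one entry per coherent section.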
Add hash and size columns to files table.
On add: compute hash and size -> Ignore duplicates.
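A rough sketch of the duplicate check using Python's standard `sqlite3` and `hashlib` modules. The table layout, the choice of SHA-256 and the helper names `create_schema`, `file_digest` and `add_file` are assumptions made for illustration; the actual HyperTag schema may differ.

```python
import hashlib
import os
import sqlite3


def create_schema(conn):
    """Hypothetical files table with the proposed hash and size columns."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "id INTEGER PRIMARY KEY, path TEXT NOT NULL, "
        "size INTEGER NOT NULL, hash TEXT NOT NULL)"
    )


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_file(conn, path):
    """Insert the file unless one with the same hash and size already exists.

    Returns True if the file was added, False if it was skipped as a duplicate.
    """
    size = os.path.getsize(path)
    digest = file_digest(path)
    duplicate = conn.execute(
        "SELECT 1 FROM files WHERE hash = ? AND size = ?", (digest, size)
    ).fetchone()
    if duplicate:
        return False
    conn.execute(
        "INSERT INTO files (path, size, hash) VALUES (?, ?, ?)",
        (path, size, digest),
    )
    return True
```

Comparing the size first is cheap, so a real implementation could skip hashing entirely when no stored file has the same size.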
Right now text documents are represented as a single average embedding of all their sentences. Increase granularity / signal by vectorizing individual pages.
Related to #25
hypertag tag new_year_resolution.txt with year=2021
This will make HyperTag accessible for a broader audience
Add new tables:
Allow to search for image files using both text and images as queries
Use transactions (only commit once at the end)
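With Python's `sqlite3` module, a connection used as a context manager commits once when the block finishes and rolls back if an exception is raised. A minimal sketch, assuming a hypothetical `tags` table with a unique `name` column and a hypothetical helper `add_tags_bulk`:

```python
import sqlite3


def add_tags_bulk(conn, names):
    """Insert many tags inside one transaction.

    The `with conn:` block commits once at the end on success and
    rolls back every insert of the batch if any of them fails.
    """
    with conn:
        conn.executemany(
            "INSERT INTO tags (name) VALUES (?)",
            [(name,) for name in names],
        )
```

Committing once per batch avoids a disk sync per row, which is usually the dominant cost of bulk inserts in SQLite.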
Semantic search comes with fairly big dependencies that some users may not be able or willing to download.
Currently things stop working if no CUDA GPU is available. This is bad. Make CUDA optional (allow CPU only usage). Looks like CLIP does not work without CUDA...
If CLIP performs as well as DistilBERT, there is no need for DistilBERT anymore.
Test basic functions that are unlikely to change behavior:
Vectorize all text documents and let the user search them.
Just eyeballing: the GloVe model (average_word_embeddings_glove.6B.300d) seems to perform better than DistilBERT (stsb-distilbert-base). Add some small benchmark tests with common and diverse papers and queries.
Models:
https://docs.google.com/spreadsheets/d/14QplCdTCDwEmTqrn1LH4yrbKvdogK4oQvYO1K1aPR5M/edit#gid=0
Match semantically very similar words. For example, if files are tagged with "science" and "research" is queried, it should match. Definitely add a toggle to turn this feature off, as some users may find it confusing.
Related to #9
Watchdog looks like what we need: https://pythonhosted.org/watchdog/quickstart.html
Powered by CLIP
Add new columns auto_index_images, auto_index_texts to auto imports table
When a new file is added, automatically infer tags from semantically similar existing files tags.
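One simple way to do this is a nearest-neighbour vote: look up the most similar indexed files and keep the tags that enough of them share. The sketch below assumes embeddings are plain lists of floats; `infer_tags`, `k` and `min_votes` are hypothetical names and defaults, not HyperTag's API.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def infer_tags(new_vec, indexed, k=3, min_votes=2):
    """Suggest tags for a new file from its k most similar indexed files.

    indexed is a list of (embedding, tags) pairs. A tag is suggested
    when at least min_votes of the k nearest files carry it.
    """
    ranked = sorted(indexed, key=lambda item: cosine(new_vec, item[0]), reverse=True)
    votes = Counter(tag for _, tags in ranked[:k] for tag in set(tags))
    return sorted(tag for tag, count in votes.items() if count >= min_votes)
```

A brute-force sort like this is fine for small collections. Larger ones would need an approximate nearest-neighbour index.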
Depends on #24
This will make it possible to sync the hypertag.db across different machines / devices, while still working with relative file paths.
Tesseract:
Even better: Find a solid GPU accelerated OCR implementation:
Fuzzy String Matching: https://github.com/seatgeek/fuzzywuzzy
Related to #9
Print all children of a tag
$ hypertag merge A into B
Moves all file associations from A to B
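A sketch of what the merge could look like at the database level. The `tags` and `tags_files` tables and the function `merge_tags` are assumptions for illustration, and the real schema may differ. The sketch leaves tag A in place without files, since the description only mentions moving the associations.

```python
import sqlite3


def merge_tags(conn, source, target):
    """Move every file association from tag `source` to tag `target`.

    Assumes hypothetical tables tags(id, name) and
    tags_files(tag_id, file_id) with a unique (tag_id, file_id) pair.
    Files already tagged with `target` are not duplicated.
    """
    with conn:  # one transaction: commit on success, roll back on error
        src = conn.execute("SELECT id FROM tags WHERE name = ?", (source,)).fetchone()
        dst = conn.execute("SELECT id FROM tags WHERE name = ?", (target,)).fetchone()
        if src is None or dst is None:
            raise ValueError("both tags must exist")
        conn.execute(
            "INSERT OR IGNORE INTO tags_files (tag_id, file_id) "
            "SELECT ?, file_id FROM tags_files WHERE tag_id = ?",
            (dst[0], src[0]),
        )
        conn.execute("DELETE FROM tags_files WHERE tag_id = ?", (src[0],))
```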
Use a spatial index data structure (tree or hash based) -> https://github.com/nmslib/hnswlib/