Comments (4)
simsearch: it turns out to be a good basis, but its scoring system is very poor; that is definitely something that can be easily improved
using Ivshti/simsearch-rs@492d1ec, I ran a quick test that searches for The Office and Game of Thrones in 10k documents
simsearch perf, WASM
10k dataset of cinemeta items
two searches were performed for the "search" timing
in conclusion, at 11-23ms per pair of searches, we are OK to do a separate search on each keystroke
Firefox
indexing: 486ms - timer ended
searching: 13ms - timer ended
indexing: 419ms - timer ended
searching: 11ms - timer ended
Chrome
indexing: 536.856689453125ms
searching: 20.677978515625ms
indexing: 471.500244140625ms
searching: 22.561767578125ms
from stremio-core.
A new API can be created, where the SimSearch is created with a vector of items plus index and rank closures.
It should own that vector of items and use each item's position in the vector as its ID.
This will also eliminate the need for a forward map, since that is only used when deleting. As a result, indexing should get faster (2x?).
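The API described above could be sketched roughly as follows. This is not the simsearch-rs API: `ImmutableIndex`, `tokenize`, and the closure signatures are hypothetical names chosen for illustration, and the matching here is exact-token only, whereas simsearch does fuzzy matching. The point is only to show the index owning the items, using vector positions as IDs, and keeping a reverse map with no forward map.

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Hypothetical helper: lowercase alphanumeric tokens.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_lowercase())
        .collect()
}

/// Hypothetical immutable index: owns the items and uses each item's
/// position in the vector as its document ID.
pub struct ImmutableIndex<T, R> {
    items: Vec<T>,
    /// Reverse map only: token -> positions of the items containing it.
    /// No forward map is kept, because nothing is ever deleted.
    reverse: HashMap<String, Vec<usize>>,
    rank: R,
}

impl<T, R> ImmutableIndex<T, R>
where
    R: Fn(&T, &[String]) -> f64,
{
    /// `index` extracts the searchable tokens of an item;
    /// `rank` scores an item against the query tokens.
    pub fn new<I>(items: Vec<T>, index: I, rank: R) -> Self
    where
        I: Fn(&T) -> Vec<String>,
    {
        let mut reverse: HashMap<String, Vec<usize>> = HashMap::new();
        for (id, item) in items.iter().enumerate() {
            for token in index(item) {
                let ids = reverse.entry(token).or_default();
                // IDs arrive in ascending order, so checking the last
                // entry is enough to avoid duplicates.
                if ids.last() != Some(&id) {
                    ids.push(id);
                }
            }
        }
        Self { items, reverse, rank }
    }

    /// Returns (id, item) pairs, best score first; ties keep ID order.
    pub fn search(&self, query: &str) -> Vec<(usize, &T)> {
        let tokens = tokenize(query);
        let mut ids: Vec<usize> = tokens
            .iter()
            .filter_map(|t| self.reverse.get(t))
            .flatten()
            .copied()
            .collect();
        ids.sort_unstable();
        ids.dedup();
        let mut scored: Vec<(f64, usize)> = ids
            .into_iter()
            .map(|id| ((self.rank)(&self.items[id], tokens.as_slice()), id))
            .collect();
        scored.sort_by(|a, b| {
            b.0.partial_cmp(&a.0)
                .unwrap_or(Ordering::Equal)
                .then(a.1.cmp(&b.1))
        });
        scored.into_iter().map(|(_, id)| (id, &self.items[id])).collect()
    }
}
```

A caller would pass something like `|s: &String| tokenize(s)` as the index closure and a rank closure that counts how many query tokens the item contains.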
Searching actors could be implemented by making a separate instance (a searchIndex catalog) for actors. Opening an actor would then point to a particular catalog with an extra param (byActor=...); the UI should handle that by opening it in Discover.
Starting from simsearch-rs, we can make this usable for our purpose:
- Make an immutable index API version
- Inverse frequency (to implement tf-idf) by storing each doc's score for a token; may use a tree so that we have token -> doc -> score; this would also allow popularity boosting
- New scoring: TF-IDF and maybe n-gram
- Automated tester that enters popular titles partially, fully, and misspelled
- Proper stop words and/or stemmer
- Optional prefix search (autocomplete) for the last term
- Cinemeta will use a made-up “mixed” type for its catalog, named “movies and series”, so that movies and series are searched and ranked together
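Two of the items above, the token -> doc -> score tree and the optional prefix search for the last term, fit together naturally, so here is a minimal sketch of both. It assumes a smoothed idf of ln(1 + N/df) and a length-normalised term frequency; neither choice comes from simsearch-rs, and `TfIdfIndex`, `tokenize`, and `prefix_last` are hypothetical names. Stop words, stemming, fuzzy matching, and popularity boosting are left out.

```rust
use std::cmp::Ordering;
use std::collections::{BTreeMap, HashMap};

/// Hypothetical helper: lowercase alphanumeric tokens.
pub fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|s| !s.is_empty())
        .map(|s| s.to_lowercase())
        .collect()
}

/// Hypothetical index: token -> (doc id -> term frequency), kept in a
/// sorted tree so that a prefix lookup is a contiguous range scan.
pub struct TfIdfIndex {
    postings: BTreeMap<String, BTreeMap<usize, f64>>,
    doc_count: usize,
}

impl TfIdfIndex {
    pub fn new(docs: &[&str]) -> Self {
        let mut postings: BTreeMap<String, BTreeMap<usize, f64>> = BTreeMap::new();
        for (id, doc) in docs.iter().enumerate() {
            let tokens = tokenize(doc);
            let len = tokens.len() as f64;
            for token in tokens {
                // Term frequency normalised by document length.
                *postings.entry(token).or_default().entry(id).or_insert(0.0) += 1.0 / len;
            }
        }
        Self { postings, doc_count: docs.len() }
    }

    /// Adds tf * idf for every doc in one posting list.
    fn accumulate(&self, docs: &BTreeMap<usize, f64>, scores: &mut HashMap<usize, f64>) {
        // Assumed smoothing: idf = ln(1 + N / df).
        let idf = (1.0 + self.doc_count as f64 / docs.len() as f64).ln();
        for (&doc, &tf) in docs {
            *scores.entry(doc).or_insert(0.0) += tf * idf;
        }
    }

    /// Returns (doc id, score) pairs, best first. With `prefix_last`,
    /// the last query term matches every indexed token it is a prefix of.
    pub fn search(&self, query: &str, prefix_last: bool) -> Vec<(usize, f64)> {
        let tokens = tokenize(query);
        let mut scores: HashMap<usize, f64> = HashMap::new();
        for (i, token) in tokens.iter().enumerate() {
            let is_last = i + 1 == tokens.len();
            if prefix_last && is_last {
                // Keys are sorted, so all tokens sharing the prefix are adjacent.
                let matches = self
                    .postings
                    .range(token.clone()..)
                    .take_while(|(key, _)| key.starts_with(token.as_str()));
                for (_, docs) in matches {
                    self.accumulate(docs, &mut scores);
                }
            } else if let Some(docs) = self.postings.get(token) {
                self.accumulate(docs, &mut scores);
            }
        }
        let mut ranked: Vec<(usize, f64)> = scores.into_iter().collect();
        ranked.sort_by(|a, b| {
            b.1.partial_cmp(&a.1)
                .unwrap_or(Ordering::Equal)
                .then(a.0.cmp(&b.0))
        });
        ranked
    }
}
```

With documents such as "The Office", "Game of Thrones", "The Game", and "Office Space", the query "game of thrones" ranks the full title above "The Game", because the rarer tokens "of" and "thrones" carry a higher idf than "game". A popularity boost could be folded in by multiplying the accumulated score by a per-document factor.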
Some classic movies are outside the top 10k by moviedb popularity, so in conclusion: we must index a larger set, or select by a different factor, or use a combo $or query:
- top 15-20k by stremio
- OR db.metadata.count({'popularities.stremio': {$gt: 0.15}}); that's 18135 items as of 17.08.2019
- OR an $or query where the other condition is some combo between imdbRating and trakt popularity
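
To make the last option concrete, here is a sketch of the selection rule as a plain predicate. Only the 0.15 stremio threshold and the popularities.stremio and imdbRating field names come from the comment above; the `MetaItem` and `Thresholds` types, the trakt field, the other two thresholds, and the choice to combine imdbRating and trakt popularity with AND are all assumptions made for illustration.

```rust
/// Hypothetical subset of a metadata item, holding only the fields
/// relevant to the decision of whether to index it.
pub struct MetaItem {
    pub stremio_popularity: Option<f64>, // popularities.stremio
    pub trakt_popularity: Option<f64>,   // assumed field
    pub imdb_rating: Option<f64>,        // imdbRating
}

/// Cut-offs for the selection rule.
pub struct Thresholds {
    pub stremio: f64,     // 0.15 in the query above
    pub imdb_rating: f64, // hypothetical value, chosen by the caller
    pub trakt: f64,       // hypothetical value, chosen by the caller
}

/// Mirrors an `$or` query: popular on stremio, OR well rated on IMDb
/// and reasonably popular on trakt (one possible "combo").
pub fn should_index(item: &MetaItem, t: &Thresholds) -> bool {
    let popular = item.stremio_popularity.map_or(false, |p| p > t.stremio);
    let classic = item.imdb_rating.map_or(false, |r| r >= t.imdb_rating)
        && item.trakt_popularity.map_or(false, |p| p >= t.trakt);
    popular || classic
}
```

Items missing a field simply fail the corresponding condition, so a classic movie with no stremio popularity can still qualify through the second branch.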