
sponsorblockserver's Introduction

SponsorBlock Server

SponsorBlock is a crowdsourced browser extension that skips over sponsored segments of YouTube videos. Anyone can submit the start and end times of a sponsored segment. Once one person submits this information, everyone else with the extension will skip right over that segment.

This is the server backend for it.

Server

This uses a Postgres or SQLite database to hold all the timing data.

To make sure that this project doesn't die, I have made the database publicly downloadable at https://sponsor.ajay.app/database. You can download a backup, or get archive.org to take one, if you wish. The database is under this license unless you get explicit permission from me.

Hopefully this project can be combined with projects like this and use this data to create a neural network to predict when sponsored segments happen. That project is sadly abandoned now, so I have decided to attempt to revive this idea.

Client

The client web browser extension is available here: https://github.com/ajayyy/SponsorBlock

Build Yourself

This is a node.js server, so clone this repo and run npm install to install all dependencies.

Make sure to put the database files in the ./databases folder if you want to use a pre-existing database. Otherwise, a fresh database will be created.

Rename config.json.example to config.json and fill the parameters inside. Make sure to remove the comments as comments are not supported in JSON.

Ensure all the tests pass with npm test

Run the server with npm start.

Developing

If you want to make changes, run npm run dev to automatically reload the server and run tests whenever a file is saved.

API Docs

Available here

License

This is licensed under AGPL-3.0-only.

sponsorblockserver's People

Contributors

ajayyy, andrewzlee, bershanskiy, choromanski, dainius14, dependabot[bot], detachhead, florianzahn, fosefx, haidang666, joe-dowd, joedowdcap, logandark, mchangrh, mini-bomba, mruy, ndevtk, opl-, pdonias, peterdavehello, sashaxser, tag-epic, thignus, whizzzkid, zegnat

sponsorblockserver's Issues

Idea: Don't accept new users' submissions

  • New users' submissions are hidden by default with a new column (not shadowHidden)
  • Every submission, check if there are submissions by other approved users that are 95% similar to their previous submissions (start out with 1, make this variable)
    • Unhide all user submissions making them "approved"

#423 should be improved to check if they are an approved submitter instead of just a submitter.

Moved from ajayyy/SponsorBlock#422

VIP segment time shift

How about adding functionality that shifts the startTime and endTime of a video's existing segments, given a timestamp and a duration in seconds?

Use case:
Videos or livestreams that got edited after the segments got submitted.
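
A minimal sketch of what such a shift could look like. The helper name shiftSegments and its parameters are hypothetical, and it assumes that only segments starting at or after the given timestamp should move; a segment that straddles the edit point is left alone here, which a real implementation would have to decide on.

```javascript
// Hypothetical helper, not part of the existing server code.
// Shifts every segment that starts at or after `fromTime` by `offsetSeconds`
// (a negative offset moves segments earlier). Returns new objects.
function shiftSegments(segments, fromTime, offsetSeconds) {
  return segments.map((seg) => {
    // Segments that begin before the edit point are returned unchanged.
    if (seg.startTime < fromTime) return seg;
    return {
      ...seg,
      // Clamp at 0 so a large negative offset cannot produce negative times.
      startTime: Math.max(0, seg.startTime + offsetSeconds),
      endTime: Math.max(0, seg.endTime + offsetSeconds),
    };
  });
}
```

For a video that had 30 seconds inserted at the 50 second mark, the call would be shiftSegments(segments, 50, 30).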

Restrictions on what characters a username can contain.

The API doesn't currently restrict which characters a username can contain.
For example userId a5681b9905d953ff3bc6dfebda7c1fb1072b5c18e1fd3e201c33f107c9fe73b9
The username should be restricted to a set of characters.

Also it looks like the videoId has no constraints either (and is also a non-hashed text input) - this one should be easier to define as it's already defined by youtube.
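
As a rough sketch of the kind of checks meant here: the username rule below (1 to 64 printable ASCII characters) is purely an assumed policy for illustration, and the video ID pattern reflects the 11-character URL-safe form that YouTube IDs take in practice rather than a documented guarantee. Both function names are hypothetical.

```javascript
// Assumed policy for illustration only: 1-64 printable ASCII characters.
const USERNAME_PATTERN = /^[\x20-\x7E]{1,64}$/;

// YouTube video IDs are, in practice, 11 characters from A-Z, a-z, 0-9, "_" and "-".
const YOUTUBE_ID_PATTERN = /^[A-Za-z0-9_-]{11}$/;

// Hypothetical validators; the server does not currently expose these.
function isValidUsername(name) {
  return typeof name === "string" && USERNAME_PATTERN.test(name);
}

function isValidYouTubeVideoID(id) {
  return typeof id === "string" && YOUTUBE_ID_PATTERN.test(id);
}
```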

Potential Bug: (mysql integration) ATTACH ? as privateDB

re: 986c9dc

databases.js line 41
getTopUsers.js line 47->49

The private db is mounted commonly with a set alias to facilitate joins.
This may not work with the two mounted mysql databases (I guess unless a common db name is used.)
Investigation needed into how mysql (and the mysql node.js library) handles database names defined in the connection object and how they link to inter-database joins (I say investigation as I'm no expert, feel free to chime in).

Also: Although not strictly needed - mysql specific tests could be implemented (though low priority unless mysql integration is considered for production usage (at the moment it's for third party use and maintained as a courtesy to me)).

TypeScript?

How about porting backend to TypeScript? Since FE is already written in TypeScript I don't think this needs any explanation why this is a good idea :) Could try to do it as part of Hacktoberfest

Preserve user's privacy with k-anonymity

As I understand, currently, the user submits a video ID (e.g. dQw4w9WgXcQ), and gets back a single JSON-object.

Here I propose to add a new endpoint, e.g. /api/anonymousGetVideoSponsorTimes that takes the first n (e.g. 5) characters of a hash (sha1 is fine) and returns a list of possible results, like in the example below.

This approach is used in Troy Hunt's Have I Been Pwned API; see https://www.troyhunt.com/ive-just-launched-pwned-passwords-version-2/ and https://blog.cloudflare.com/validating-leaked-passwords-with-k-anonymity/ for example.

input: { hash_prefix: <sha1sum("dQw4w9WgXcQ").substr(0,5)> }
(e.g. { hash_prefix: '3dd08' })

output:

[
 {
  videoID: 'dQw4w9WgXcQ',
  sponsorTimes: array [float],
  UUIDs: array [string] //The ID for this sponsor time, used to submit votes
 },
 {
  videoID: 'ah20943fdhj7',
  sponsorTimes: array [float],
  UUIDs: array [string] //The ID for this sponsor time, used to submit votes
 },
// ...
]

Since YouTube IDs are sparse, and furthermore SponsorBlock only has a small part of IDs indexed, each query will return only a small number of results, if any. If the result length were to get out of hand in the future, it would be easy to increase the number of input characters required.

For performance reasons, the database should grow a new column, sha1sum. A pseudo-SQL query for such a request might look like this:

SELECT * FROM sponsorTimes WHERE sha1sum LIKE '3dd08%'

Which hashing algorithm is used is not very important, as the user will only send a fraction of the hash to the server. SHA1 has a reasonable length, I'd say. (let's avoid MD5, though ;-) )
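
The client side of this proposal could be sketched as follows, using Node's built-in crypto module. The function names hashPrefix and pickOwnVideo are made up for illustration, and the response shape is assumed to match the example output above.

```javascript
const { createHash } = require("node:crypto");

// Compute the first `length` hex characters of the SHA-1 of a video ID.
// Only this prefix would be sent to the server.
function hashPrefix(videoID, length = 5) {
  return createHash("sha1").update(videoID).digest("hex").slice(0, length);
}

// The server returns every video whose hash shares the prefix; the client
// keeps only the entry for the video it actually asked about.
function pickOwnVideo(results, videoID) {
  return results.find((entry) => entry.videoID === videoID) ?? null;
}
```

The server never learns which of the returned videos the user was watching, only that it is one of those sharing the prefix.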

[Feature request] Archive down voted segments

Move downvoted segments (-2) to an archive table after X days to improve performance.
This should make the indexes on the sponsorTimes table smaller, which will reduce memory and CPU consumption.

Support repeated category URL parameter

For the skipSegment route, currently we have to use categories=["sponsor", "interaction"] to check for multiple categories, using json in the categories parameter.

This looks… weird. Instead, it would be much more pleasant if we could do category=sponsor&category=interaction. Many HTTP libraries support this way of passing multiple values.
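
One way the server could accept both forms is sketched below with the standard URLSearchParams API. The function name parseCategories is hypothetical, and the fallback to ["sponsor"] is an assumption based on the existing default for a single category.

```javascript
// Hypothetical parser: prefers repeated category= parameters, falls back to
// the existing JSON-encoded categories= parameter, then to a default.
function parseCategories(queryString) {
  const params = new URLSearchParams(queryString);

  // getAll returns every value of a repeated parameter, in order.
  const repeated = params.getAll("category");
  if (repeated.length > 0) return repeated;

  const json = params.get("categories");
  if (json !== null) {
    const parsed = JSON.parse(json); // throws SyntaxError on malformed JSON
    if (Array.isArray(parsed) && parsed.every((c) => typeof c === "string")) {
      return parsed;
    }
    throw new TypeError("categories must be a JSON array of strings");
  }

  return ["sponsor"]; // assumed default
}
```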

HTTP 504

Hi there, I think there might be an issue with the server as of right now. Not necessarily the software, but I'm getting 504s when submitting timestamps:
[screenshot]

This appears after pressing submit in the SponsorBlock popup in the YouTube video player.

Support for alternative sites

Add a new variable, "host" or "service", which would be an identifying code for sites other than YouTube.

For example:

{
  host: string, // Defaults to youtube, See [services list] // <---
  videoID: string,
  category: string, // Optional, defaults to "sponsor". See [the category list]
  categories: string[] // Optional. Use instead of "category" if you want multiple categories. Will look like ["sponsor","intro"]
}

This would allow the main extension to be extended later to support LBRY, BitChute, etc.

Split votes into upvotes and downvotes

For old submissions, they can start with the current votes in the upvotes column and 0 in downvotes.

Things to consider:

Should the votes column remain for backwards compatibility with third-party systems using the public database?

Server issues: Leaderboard down

Hi, I've been using sponsorblock for a while, and the latest version has issues with Youtube silently adding a new block on Sponsorblock server on my end. Could this be resolved somehow?

I tried a fresh install, and still sponsorships play!

EDIT: Nevermind, The leaderboard is just temporarily and I do not like it, What happened really? Is it youtube like suggested or something else?

Cache similar segments in redis

This would allow the weighted randomness to still occur.

This cache should be cleared whenever a submission occurs. Votes won't affect the cache as that will still be pulled from the database.
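
The invalidation rule can be sketched like this. An in-memory Map stands in for redis so the example stays self-contained; the class name SegmentCache and the loader callback are hypothetical, and a real version would be asynchronous.

```javascript
// Stand-in for a redis-backed cache of similar-segment groups per video.
class SegmentCache {
  constructor(loadFromDatabase) {
    this.loadFromDatabase = loadFromDatabase; // called only on a cache miss
    this.entries = new Map();
  }

  // Return the cached groups for a video, loading them on first access.
  get(videoID) {
    if (!this.entries.has(videoID)) {
      this.entries.set(videoID, this.loadFromDatabase(videoID));
    }
    return this.entries.get(videoID);
  }

  // Call this whenever a submission for the video is accepted.
  invalidate(videoID) {
    this.entries.delete(videoID);
  }
}
```

Because only the grouping is cached, the weighted random choice within each group can still be made per request using fresh vote counts.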

Bulk submissions for trusted users

It would be nice if there was a feature in place where "trusted users" (Either defined by manually assigned VIP status or based on reputation and registration time) are allowed to submit segments in broader chunks.

Example:

"Trusted users" should be able to specify, when submitting a segment, that it applies to the whole currently playing playlist. This is especially useful for Let's Plays where the YouTuber uses an intro and endcard throughout a few hundred videos, which would be painfully slow to submit otherwise.

Internal server error when submitting Endcard section

Request was done with the Chrome extension.

{
  "videoID": "Bw4-t3yf7Qk",
  "userID": "<redacted>",
  "segments": [
    {
      "segment": [
        672,
        694.97
      ],
      "UUID": null,
      "category": "outro"
    }
  ]
}

And with fiddler, this is all I got:

HTTP/1.1 500 Internal Server Error
Server: nginx/1.14.0 (Ubuntu)
Date: Sun, 18 Oct 2020 04:55:49 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 21
Connection: keep-alive
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
ETag: W/"15-/6VXivhc2MKdLfIkLcUE47K6aH0"

Internal Server Error

Debug JSON: https://gist.github.com/NickAcPT/254e254f7544930d2ae8a923148918a5

Segment types

As discussed earlier, there are multiple segment types besides sponsor spots. This issue tracks the API changes necessary to implement them.

There are many things that would ideally be changed, but also new features are implementable without those changes. The minimal set of changes is highlighted with Minimal change.

All renamings imply the creation of new API endpoints without removal of the old ones.

Database

  • NO Rename table sponsorTimes into segmentTimes
  • Minimal change Add type ENUM column with values (to be discussed):
    • null -- type unknown and for backwards compatibility
    • "intro" -- for video intros, usually appear before channel logo (not all channels have this)
    • "sponsor" -- for currently tracked sponsors
    • "merch" -- channel's custom merch (not an outside sponsor)
    • "social" -- We are on Twitter/Facebook/etc.
    • "buttons" -- Comment!/Like!/Subscribe!
    • "patreon" -- Only for patreon.com announcements that do not contain

Some of the categories might overlap ("sponsor" and "merch"; "social" and "buttons" and "patreon"), but this is intentional and enables finer granularity.

API

  • Minimal change To current GET /api/postVideoSponsorTimes add type which is a string (one of recognized types).
  • Create GET /api/videoSegmentTimes to be used instead of GET /api/getVideoSponsorTimes
    • It sends the same data, but with type.
  • Create POST /api/videoSegmentTimes to be used instead of GET /api/postVideoSponsorTimes
    • Use actual JSON instead of URL parameters, allow submitting multiple segments (for the same video). No mechanism for segment deletion yet. Schema:
      • userID -- same as now
      • videoID -- string, same as now
      • segments -- array of Objects like so:
        • startTime -- float, in seconds
        • endTime -- float, in seconds
        • type -- string, one of known types
  • Update GET /api/postVideoSponsorTimes to insert type = "sponsor"
  • Ensure GET /api/getVideoSponsorTimes returns only segments with type = "sponsor"
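
To make the proposed POST schema concrete, here is a rough validation sketch. The function name validateSubmission is hypothetical, and the list of known types simply copies the values proposed above, which are still marked as to be discussed.

```javascript
// Types proposed in this issue; not final.
const KNOWN_TYPES = ["intro", "sponsor", "merch", "social", "buttons", "patreon"];

// Returns a list of human-readable problems; an empty list means the body is valid.
function validateSubmission(body) {
  if (body === null || typeof body !== "object") {
    return ["body must be a JSON object"];
  }
  const errors = [];
  if (typeof body.userID !== "string" || body.userID === "") {
    errors.push("userID must be a non-empty string");
  }
  if (typeof body.videoID !== "string" || body.videoID === "") {
    errors.push("videoID must be a non-empty string");
  }
  if (!Array.isArray(body.segments) || body.segments.length === 0) {
    errors.push("segments must be a non-empty array");
    return errors;
  }
  body.segments.forEach((seg, i) => {
    const timesOk =
      seg !== null && typeof seg === "object" &&
      Number.isFinite(seg.startTime) && Number.isFinite(seg.endTime) &&
      seg.startTime >= 0 && seg.startTime < seg.endTime;
    if (!timesOk) errors.push(`segments[${i}]: need 0 <= startTime < endTime`);
    if (!seg || !KNOWN_TYPES.includes(seg.type)) {
      errors.push(`segments[${i}]: type must be one of the known types`);
    }
  });
  return errors;
}
```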

NOTE: BEFORE MERGING:

You need to update the table to insert the type column. Since SQLite does not have ENUMs, it can be TEXT type.

ALTER TABLE sponsorTimes ADD "type" TEXT;
UPDATE sponsorTimes SET "type" = 'sponsor';

Endpoint to fetch all segments (not just the best ones)

This is a feature request.

It would be nice if there was an optional parameter to getVideoSponsorTimes to always return the highest voted "similar sponsor."
For use in programs that expect up-to-date and consistent data without downloading the updated db every time.

Similar segments revamp

  • Switch overlapping segments check to overlapping + 60% similar
  • Search all categories at once (no need to search categories individually)

Extra Idea:

  • Always search all categories, but only return segments for the passed categories. This will ensure highly voted self promotion are not sent to someone who only is fetching "sponsor"

Moved from ajayyy/SponsorBlock#421
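
The issue does not define how "60% similar" is measured. One possible reading, sketched below, is the length of the overlap divided by the total span covered by both segments; the function names and that definition are assumptions, not the project's actual check.

```javascript
// Overlap length divided by the combined span of both segments (0 to 1).
function overlapRatio(a, b) {
  const intersection =
    Math.min(a.endTime, b.endTime) - Math.max(a.startTime, b.startTime);
  if (intersection <= 0) return 0; // the segments do not overlap at all
  const union =
    Math.max(a.endTime, b.endTime) - Math.min(a.startTime, b.startTime);
  return intersection / union;
}

// Two segments count as similar when they overlap and the ratio meets the threshold.
function areSimilar(a, b, threshold = 0.6) {
  return overlapRatio(a, b) >= threshold;
}
```

Under this definition, two segments that merely touch or overlap by a few seconds are no longer grouped together, which is the point of tightening the check.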
