
context-mod's Introduction

ContextMod (CM) is an event-based Reddit moderation bot built on top of snoowrap and written in TypeScript.

It is designed to fill the gaps in AutoModerator's handling of more complex behavior, with a focus on moderation based on user history.

Examples of what ContextMod can do:

  • On a new submission, check if the user has also posted the same link in N other subreddits within a timeframe/# of posts
  • On a new submission or comment, check if the user has had any activity (submission/comment) in a set of N subreddits within a timeframe/# of posts

In either case, ContextMod can then perform any action a moderator can (comment, report, remove, lock, etc.) against that user, comment, or submission.

Feature Highlights for Moderators:

Feature highlights for Developers and Hosting (Operators):

How It Works

Each subreddit using the CM bot configures its behavior via its own wiki page.

When a monitored Activity (new comment/submission, new modqueue item, etc.) is detected, the bot runs through a list of Checks to determine what to do with the Activity from that Event. Each Check consists of:

Kind

Is this check for a submission or comment?

Rules

A list of Rules to run against the Activity. Triggered Rules can cause the whole Check to trigger and run its Actions.

Actions

A list of Actions that describe what the bot should do with the Activity or Author of the activity (comment, remove, approve, etc.). The bot will run all Actions in this list.


The Checks for a subreddit are split up into Submission Checks and Comment Checks based on their kind. Each list of checks is run independently based on when events happen (submission or comment).

When an Event occurs, all Checks of that type are run in the order they are listed in the configuration. When one Check is triggered (an Action is performed), the remaining Checks are not run.
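
The ordering behavior described above can be sketched in TypeScript. This is a minimal illustration, not ContextMod's actual implementation: the `Activity` and `Check` shapes and the `runChecks` function are hypothetical names, and the sketch assumes a Check triggers only when all of its Rules trigger, which the description above does not specify.

```typescript
// Hypothetical shapes for illustration only; not ContextMod's real types.
type Kind = "submission" | "comment";

interface Activity {
  kind: Kind;
  author: string;
  body: string;
}

interface Check {
  name: string;
  kind: Kind;
  // Each rule returns true when it is triggered by the activity.
  rules: Array<(activity: Activity) => boolean>;
  // Each action returns a description of what it would do.
  actions: Array<(activity: Activity) => string>;
}

// Runs the checks matching the activity's kind in configuration order and
// stops at the first check that triggers. All actions of that check run.
// Assumption: a check triggers only when every one of its rules triggers.
function runChecks(checks: Check[], activity: Activity): string[] {
  for (const check of checks) {
    if (check.kind !== activity.kind) continue;
    const triggered = check.rules.every((rule) => rule(activity));
    if (triggered) {
      return check.actions.map((action) => action(activity));
    }
  }
  return []; // no check triggered, so nothing is done
}

const exampleChecks: Check[] = [
  {
    name: "link spam",
    kind: "comment",
    rules: [(a) => a.body.indexOf("http") !== -1],
    actions: [(a) => `report ${a.author}`, (a) => `remove comment by ${a.author}`],
  },
  {
    name: "any comment",
    kind: "comment",
    rules: [() => true],
    actions: [(a) => `approve comment by ${a.author}`],
  },
];

// The first check triggers, so the second check is never evaluated.
const exampleResult = runChecks(exampleChecks, {
  kind: "comment",
  author: "someUser",
  body: "see http://example.com",
});
```

Returning from inside the loop is what keeps later Checks from running once an earlier one has triggered.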


Learn more about the CM lifecycle and core concepts in the docs.

Getting Started

Operators

This guide is for users who want to run their own bot on a ContextMod instance.

See the Operator's Getting Started Guide

Moderators

This guide is for reddit moderators who want to configure an existing CM bot to run on their subreddit.

See the Moderator's Getting Started Guide

Configuration and Documentation

ContextMod's configuration can be written in YAML (like AutoModerator) or JSON5. Its schema conforms to JSON Schema Draft 7. Additionally, many operator settings can be passed via the command line or environment variables.

Check the full docs for in-depth explanations of all concepts and examples.

Web UI and Screenshots

Dashboard

CM comes equipped with a dashboard designed for use by both moderators and bot operators.

  • Authentication via Reddit OAuth -- only accessible if you are the bot operator or a moderator of a subreddit the bot moderates
  • Connect to multiple ContextMod instances (specified in configuration)
  • Monitor API usage/rates
  • Monitoring and administration per subreddit:
    • Start/stop/pause various bot components
    • View statistics on bot usage (# of events, checks run, actions performed) and cache usage
    • View various parts of your subreddit's configuration and manually update configuration
    • View real-time logs of what the bot is doing on your subreddit
    • Run bot on any permalink

Subreddit View

Bot Setup/Authentication

A bot OAuth helper allows operators to define OAuth credentials/permissions and then generate unique, one-time invite links that allow moderators to authenticate their own bots without operator assistance. Learn more about using the OAuth helper.

Operator view/invite link generation:

Oauth View

Moderator view/invite and authorization:

Invite View

A similar helper and invitation experience is available for adding subreddits to an existing bot.

Subreddit Invite View

Configuration Editor

A built-in editor using monaco-editor makes editing configurations easy:

  • Automatic JSON or YAML syntax validation and formatting
  • Automatic Schema (subreddit or operator) validation
  • All properties are annotated via hover popups
  • Unauthenticated view via yourdomain.com/config
  • Authenticated view loads subreddit configurations by simple link found on the subreddit dashboard
  • Switch schemas to edit either subreddit or operator configurations

Configuration View

  • Overall stats (active bots/subreddits, api calls, per second/hour/minute activity ingest)
  • Over time graphs for events, per subreddit, and for individual rules/check/actions

Grafana Dashboard

License

MIT

context-mod's People

Contributors

foxxmd, mhfdoge, rysie, wchristian

context-mod's Issues

Add footer notice to content actions

If the bot performs an action that submits content (comment, report) it should contain a footer saying:

  • It's a bot
  • Link to the bot subreddit
  • Do not reply/DM directly to the bot for issues -- contact subreddit mods if sub issue, bot subreddit if technical issue

Run bot on permalink should open a popup

Either a modal or a window popup should open with the results. Currently there isn't really any user feedback for submitting a link apart from the link disappearing.

Toolbox usernote integration

Would like to integrate with toolbox usernotes to enable mod interaction tracking:

Integration requirements

  • Opt-in using subreddit configuration
  • Read (decode, uncompress) notes and store in cache
  • Write (encode, compress) notes and invalidate cache

Implementation

  • Add additional criteria to author filter/rule. Usernote criteria should be based on warning type, text, and count of notes (and/or some combination of all of the above)
  • Add new Action to add usernote to Author

Easy cloud deployment options

Implement cloud templates so new users can deploy a working bot without having to figure out infrastructure.

  • Heroku
  • EC2
  • Digital Ocean

Refactor logger labelling to be less cumbersome and more descriptive of context

Right now would need to pass all prefix content down from config builder all the way to action/rule if I want to see the context for some log statement on write. Need to refactor this so a logger can be created from a pool or something with prefix content already built. Basically make it more stateless to create a logger.

allow nuking a user

It'd be nice to not only be able to check a permalink but a whole user. This would be great for new spam as we could add the new spam rule and then run it against that whole account removing all their posts.

Remove node-canvas from docker

Current dockerfile dependencies stem from when CM was using node-canvas for image comparison. It has been replaced by sharp, which only depends on libvips and may already be built into alpine...need to test what can be removed.

Announce bot on reddit

TODO before public announcement:

Places to post announcement:

Implement media-based count rule

Reddit returns good information about Submissions with media links (youtube, vimeo, etc.)

Taking a page out of toolbox's playbook:

  • aggregate an author's submission history on secure_media.oembed.author_url
  • use large default window (100 or 200 submissions)
  • allow trigger on threshold of author_url count (variant with useSubmissionAsReference)
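
The aggregation described above could look roughly like the following TypeScript sketch. Only the `secure_media.oembed.author_url` path comes from the issue text; `SubmissionLike`, `countByAuthorUrl`, and `mediaRuleTriggered` are hypothetical names, and the optional `reference` parameter is one possible reading of the `useSubmissionAsReference` variant.

```typescript
// Minimal shape of the fields used here; real submission objects carry much more.
interface SubmissionLike {
  secure_media?: { oembed?: { author_url?: string } } | null;
}

// Counts how often each media author_url appears in the author's history.
// Submissions without oembed media data are skipped.
function countByAuthorUrl(history: SubmissionLike[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const submission of history) {
    const url = submission.secure_media?.oembed?.author_url;
    if (url === undefined) continue;
    counts.set(url, (counts.get(url) ?? 0) + 1);
  }
  return counts;
}

// Without a reference: triggers when any author_url reaches the threshold.
// With a reference (assumed meaning of useSubmissionAsReference): triggers
// only when the reference submission's own author_url reaches the threshold.
function mediaRuleTriggered(
  history: SubmissionLike[],
  threshold: number,
  reference?: SubmissionLike
): boolean {
  const counts = countByAuthorUrl(history);
  if (reference !== undefined) {
    const refUrl = reference.secure_media?.oembed?.author_url;
    return refUrl !== undefined && (counts.get(refUrl) ?? 0) >= threshold;
  }
  let triggered = false;
  counts.forEach((count) => {
    if (count >= threshold) triggered = true;
  });
  return triggered;
}
```

The caller would apply the window before counting, for example by passing `history.slice(0, 100)`.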

allow filtering web log

It'd be great to allow both filtering by a string/regex and enabling or disabling the types of messages being output.

It'd be nice to enable/disable Reddit API Stats: Initial 582 | Current 582 | Used ~0 | Events ~0.11/s in the web logs. Same for this line Run Stats: Checks 1 | Rules => Total: 1 Unique: 1 Cached: 0 Rolling Avg: ~null/s | Actions 0.

Stuck while trying to deploy to heroku

Used the heroku link and then provided it the needed envs.

The site shows a heroku error and when trying to connect via heroku run bash it never connects.
It's just forever showing Running bash on ⬢ context-mod... / connecting, run.4890 (Hobby).

Any ideas if this is a heroku issue or something with the deploy?

Add api documentation

Once api is stable document endpoints and provide samples for authentication/usage

Implement rule to check for temporal patterns in an author's history

Originally suggested as being able to check for a hiatus/gap here by u/SillyStranger5009

Some examples of things that could be checked (to build a rule against):

  • Whether there has been a gap in author activity within a date range
  • Within a date range find the baseline for frequency of activity and...
    • if there is a std deviation or percentage increase below or above the baseline
    • if the baseline frequency matches a user-defined value
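
As a rough illustration of the first item (detecting a gap in activity within a date range), here is a minimal TypeScript sketch. The `Gap` shape and the `findGaps` function are hypothetical; timestamps are assumed to be epoch milliseconds, and only gaps between consecutive activities inside the range are considered, not the time between the range boundaries and the nearest activity.

```typescript
interface Gap {
  from: number; // timestamp of the last activity before the gap
  to: number; // timestamp of the first activity after the gap
  durationMs: number;
}

// Returns every gap of at least minGapMs between consecutive activities
// that fall inside [rangeStart, rangeEnd]. The input array is not mutated.
function findGaps(
  timestamps: number[],
  rangeStart: number,
  rangeEnd: number,
  minGapMs: number
): Gap[] {
  const inRange = timestamps
    .filter((t) => t >= rangeStart && t <= rangeEnd)
    .sort((a, b) => a - b);
  const gaps: Gap[] = [];
  for (let i = 1; i < inRange.length; i++) {
    const durationMs = inRange[i] - inRange[i - 1];
    if (durationMs >= minGapMs) {
      gaps.push({ from: inRange[i - 1], to: inRange[i], durationMs });
    }
  }
  return gaps;
}
```

A rule built on this could trigger when the returned list is non-empty, or when it contains N or more gaps.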

How to apply?

  • Allow specifying these sets of behaviors as criteria with AND/OR operands
  • Allow specifying if these events must occur 1 or N times (per each or in total)
  • Allow white/blacklisting subreddits to count events from

Implement image fingerprint database for detecting reposts

It may be worthwhile to replicate repost sleuth bot on a (much) smaller scale, i.e. at the subreddit/bot level.

Things that would need to be done or considered:

  • Support more than just redis for database?
  • Definable (but optional) collections EX known spam, retired memes, all subreddit posts, etc...
  • UI requirements
    • Upload image by file or URL
    • Batch upload from local directory?
    • Progress indicator for processing
    • Stats (number of fingerprints, search performance)
  • Rule refactoring to allow searching all, by collection, and/or user history
  • user-configurable if database should be shared across bot/instance

CACHING=redis doesn't affect subs

After enabling redis caching in monolith mode, I had assumed that without a sub/op config the app would use this env and also cache reddit posts/comments, but checking redis I'm not seeing anything except web sessions.

Use redis via env?

Is there an env to use redis? I had assumed PROVIDER_STORE=redis would work.

Reduce memory consumption and increase performance for image comparison

  • Determine memory usage for individual image (both raw and loaded into resemble)
  • Determine what impact converting image to smaller size has on cumulative memory usage, comparison speed, and general cpu usage
  • Potentially refactor image comparison object usage (repeat/recent reference objects) based on results from above

Write MVP documentation

  • How bot works
  • Minimum reqs
  • Link to json schema validation example
  • Some basic examples
  • ENV table and arguments
  • Usage/logging

Create "getting started" and "starter" configurations for new users

Create some well-documented common configurations that can be used as a teaching tool as well as a valid jumping-off point for new users.

  • Add basic rule and action examples
  • Add advanced concepts guide (caching, ordering, etc.)
  • Add at least one yaml example
  • Add full markdown templates to show off advanced mustache features
  • Add footer example
  • Add partials example once it's implemented
  • Add at least one complete subreddit example (config, templates, real-world config values...)
  • Add a "Getting Started" document to run through all of the above plus deployment

Implement image comparison

When

  • Using attribution, repeat, or recent activity
  • AND using submission as reference
  • AND submission is an image

Implement a way to compare submission image to images from submissions in history.

Notes:

  • What library?
  • Need to determine if submission URL is an image
    • MIME type of downloaded resource?
    • By extension ending?
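
Both options raised in the notes (file extension and MIME type) can be sketched in a few lines of TypeScript. The function names and the extension list are hypothetical choices for illustration; the issue does not settle on either approach.

```typescript
// Option 1: guess from the URL path's extension (cheap, no download needed).
// Query strings are ignored because only the path is inspected.
function looksLikeImageUrl(url: string): boolean {
  try {
    const pathname = new URL(url).pathname;
    return /\.(jpe?g|png|gif|webp|bmp)$/i.test(pathname);
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Option 2: inspect the Content-Type header of the downloaded resource.
// Parameters such as "; charset=..." do not affect the check.
function isImageMimeType(contentType: string | null): boolean {
  return contentType !== null && /^image\//i.test(contentType.trim());
}
```

The extension check can miss images served from extensionless URLs, while the MIME check requires at least a request to the resource, so a combination (extension first, header as fallback) may be a reasonable compromise.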

uncaught exception Reddit returned a 404 for user history

2021-10-20T22:43:49+00:00 warn   : ~u/username~ {r/subreddit} [COM ID] [CHK low xp comment spam] Running rules failed due to uncaught exception Reddit returned a 404 for user history. Likely this user is shadowbanned.
SimpleError: Reddit returned a 404 for user history. Likely this user is shadowbanned.
    at Object.getAuthorActivities (CWD/src/Utils/SnoowrapUtils.js:92:19)
    at async cacheVal.cache.wrap.ttl (CWD/src/Subreddit/SubredditResources.js:340:24)

Implement detecting comment reply

If a submission or comment already has been replied to (at top level) from a moderator (or automod) then we don't want to also perform actions on it as we can assume it's already been manually actioned.

This shouldn't be an issue if the bot is using a fast poll time but want to cover all the bases.

Implement rule to check comments against top comments from other sources

Based on the description of karma farming from this thread.

Comments

Check the content of a comment activity against a list of "top" text comments based on (and retrieved from) the submission source:

  • if submission is external attempt to get comments using some scraping method:
    • implement youtube api call to get top comments
      • https://www.googleapis.com/youtube/v3/commentThreads?key=${API_KEY}&textFormat=plainText&order=relevance&part=snippet&filter=snippet&videoId={VIDEO_ID}&maxResults=100
    • implement twitter api to get top replies
  • if submission is reddit-based
    • check for cross-posted submissions and get top comments
    • search for other submissions with the same title and get top comments
    • allow user-defined white/blacklist for subreddits to include submissions from

For detecting a match:

  • use fuzzy searching with user-defined threshold for sameness
  • allow user-defined minimum character count
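
A minimal TypeScript sketch of the matching step follows. The issue does not name a fuzzy-matching algorithm; this sketch swaps in the Sørensen-Dice coefficient over character bigrams as one simple option. The function names and the `threshold`/`minChars` parameters are hypothetical.

```typescript
// Counts character bigrams after lowercasing and collapsing whitespace.
function bigrams(text: string): Map<string, number> {
  const normalized = text.toLowerCase().replace(/\s+/g, " ").trim();
  const counts = new Map<string, number>();
  for (let i = 0; i < normalized.length - 1; i++) {
    const gram = normalized.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Sørensen-Dice similarity on bigram multisets: 0 = nothing shared, 1 = identical.
function diceSimilarity(a: string, b: string): number {
  const gramsA = bigrams(a);
  const gramsB = bigrams(b);
  let sizeA = 0;
  let sizeB = 0;
  let overlap = 0;
  gramsA.forEach((count) => (sizeA += count));
  gramsB.forEach((count) => (sizeB += count));
  if (sizeA + sizeB === 0) return 0;
  gramsA.forEach((countA, gram) => {
    overlap += Math.min(countA, gramsB.get(gram) ?? 0);
  });
  return (2 * overlap) / (sizeA + sizeB);
}

// True when the comment is long enough and is at least `threshold` similar
// to any of the retrieved top comments.
function matchesTopComment(
  comment: string,
  topComments: string[],
  threshold: number,
  minChars: number
): boolean {
  if (comment.trim().length < minChars) return false;
  return topComments.some((top) => diceSimilarity(comment, top) >= threshold);
}
```

The minimum character count is checked first so that short, common replies are never compared at all.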

Submissions

  • Do a reddit search for submission title and use fuzzy searching with user-defined threshold for sameness
    • Allow user-defined white/blacklist for subreddits search

  • cache retrieved comments using a unique id based on submission source

  • Restricting subreddit search

    • Define subreddits to search with this syntax: https://reddit.com/r/mealtimevideos+videos/search?q=${QUERY}&restrict_sr=1
    • For blacklist just filter subreddits out of returned results
  • Search by submission url

    • Remove query string from submission url IE remove ?someQueryParam=... since reddit seems to only search by base url
    • Use url token in query to search by url IE .../search?q=url:${BASE_URL}
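
The two search notes above (restricting subreddits and searching by base URL) can be combined into a small URL builder. This is a sketch using the standard `URL` API; `baseUrl` and `buildUrlSearch` are hypothetical names, and whether Reddit's search accepts the percent-encoded `url:` query exactly as produced here is an assumption that would need to be verified.

```typescript
// Removes the query string (and fragment) from a submission URL,
// e.g. ".../watch?someQueryParam=1" becomes ".../watch".
function baseUrl(submissionUrl: string): string {
  const parsed = new URL(submissionUrl);
  parsed.search = "";
  parsed.hash = "";
  return parsed.toString();
}

// Builds a Reddit search URL using the url: token. When subreddits are given,
// they are joined with "+" and the search is restricted with restrict_sr=1.
function buildUrlSearch(submissionUrl: string, subreddits: string[] = []): string {
  const path =
    subreddits.length > 0 ? `/r/${subreddits.join("+")}/search` : "/search";
  const search = new URL(path, "https://reddit.com");
  search.searchParams.set("q", `url:${baseUrl(submissionUrl)}`);
  if (subreddits.length > 0) {
    search.searchParams.set("restrict_sr", "1");
  }
  return search.toString();
}
```

For a blacklist, the same builder could be called without subreddits and the unwanted subreddits filtered out of the returned results, as noted above.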

Add caching documentation

There are some good pointers in the ui but would be good to document in a readme along with hierarchy/tuning/etc.

Implement "public" rule result summaries

Right now all rule result summary data is uncensored IE thresholds, exact windows/totals are revealed in the text.

Need to implement user-facing summaries that redact specific, critical values so that spammers/bad actors can't determine how to fly under the radar when triggering rules, OR make sure there is enough raw data in the rule results for mods to build summaries themselves.

Rolling Avg: ~null/s

Noticed this in my logs.

Checks 1 | Rules => Total: 1 Unique: 1 Cached: 0 Rolling Avg: ~null/s | Actions 0

<html> element does not have a [lang] attribute

If a page doesn't specify a lang attribute, a screen reader assumes that the page is in the default language that the user chose when setting up the screen reader. If the page isn't actually in the default language, then the screen reader might not announce the page's text correctly. Learn more.

Shorten web log output

Instead of showing the full link at the end of the log maybe [COM ID] could link to it?

"non-actioned" events

I'd like to see a page with the results for every check that's run as well as a button to rerun the check on that post/comment.

[Web] Currently enabled rules

There needs to be a page/section in the web ui where you can see the rules parsed into nice html elements. Similar to how the actioned events page maybe?

Document and improve mustache templating

  • Provide more variables to contextual data in RuleResult that can be used for templating
  • On each rule, document what data is available (using annotations so it's available in the schema)

Implement mustache partials in rules

So that users can pre-define mustache rendering fragments for a rule

  • Add a partials property to rule json that is parsed on startup
  • Allow partials property to be a string (one partial, name of rule) or object so many partials can be defined per rule
  • Partial can be string to render or wiki: discriminator that retrieves wiki page contents that should then render
  • Enforce partial name uniqueness

Improve json schema documentation

  • Add missing property annotations/comments
  • Add default values
  • Refactor interfaces to consolidate repeated properties (will make documentation easier going forward)

Implement configurable delay before processing

A mod may want any other bots processing activities to run before RCB. This could be achieved by:

  • delaying activity processing after initial retrieval
  • refreshing activity state after delay before continuing processing
