pamaxie.scan_api's Introduction

Pamaxie

We have started development again. This is a long process and will take a while to get rolling. For now we will focus mostly on our web presence to drive more traffic to the project.

Documentation for this project can be found at Pamaxie's wiki. API credentials can be created at Pamaxie's website. Please let us know if the API misbehaves in any way, and we will assist you as soon as possible.

Content Detection
Pamaxie was developed to verify the safety of content and media on the internet. It allows developers of chat applications to moderate content automatically, for example by having a neural network scan images for certain properties. We plan on supporting other types of media as well. By combining machine learning with hand-crafted algorithms, we intend to create a service that makes the internet more secure. Our goal is to make the web more fun and safe to browse, and to prevent users from seeing content on websites that they may want to avoid.

We are developing this API with content hosters in mind. We will never share any data or sell it to 3rd parties. We will always treat all user data that we keep on our servers with the highest respect for privacy. This is one of the reasons that we chose to make this an open source project. Our API will be developed to be easy to interact with and stay as responsive as possible.

The end goal of the project is to make the internet a more secure place to browse. Our API can be trained with your own data. If you don't want to train it yourself, you can access our API for free by creating an account on our website.

Contribution

If you'd like to contribute to Pamaxie, feel free to check out our wiki article on how to do so! We are always looking for people to help us, no matter what your current skill level is.

Please make sure to read our contribution guidelines and code of conduct so you know what we expect from contributors.

We will release further updates on the wiki once the API is available for public signup.

Possible thanks to funding by:

Federal Ministry of Research and Education

Thanks to these partners helping us keep this project alive:

eclips.is

pamaxie.scan_api's People

Contributors

dawnruby

Forkers

prototypefund

pamaxie.scan_api's Issues

Validate current media detection

Validate that the library we are using for media detection works as intended and can detect most image types we will be working with.

Let users decide what kind of result they want

We currently return every result for every scan.
This is annoying for a user who is only looking for one of the many things we can scan for. It also increases their scan time and our load.
We should build a system that lets the user query for the specific scan properties they care about, so that we scan only for those. For example, we would only check whether something contains porn, gore, or racy content, and skip the other properties we offer (like Nazi symbolism or realism checks).
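The selective scanning described above could be sketched as follows. This is a minimal illustration and not the project's implementation: the `SCANNERS` registry, the property names, and the `scan` function are hypothetical, and the scanners are stubs that always return 0.0.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical registry: property name -> scanner returning a score in [0, 1].
# Real scanners would run a model; these stubs only illustrate the dispatch.
SCANNERS: Dict[str, Callable[[bytes], float]] = {
    "porn": lambda data: 0.0,
    "gore": lambda data: 0.0,
    "racy": lambda data: 0.0,
    "nazi_symbolism": lambda data: 0.0,
    "realism": lambda data: 0.0,
}


def scan(data: bytes, properties: Optional[Iterable[str]] = None) -> Dict[str, float]:
    """Run only the requested scanners; run all of them when none are named."""
    requested = list(SCANNERS) if properties is None else list(properties)
    unknown = [name for name in requested if name not in SCANNERS]
    if unknown:
        # Reject typos early instead of silently skipping a check.
        raise ValueError(f"unknown scan properties: {unknown}")
    return {name: SCANNERS[name](data) for name in requested}
```

A caller interested only in porn, gore, and racy content would call `scan(data, ["porn", "gore", "racy"])`, and the remaining scanners would never run.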

Implement perceptual hashing into scanning API

We require perceptual hashing and perceptual hash detection for our scanning API to be reliable and reasonably fast.
This means each image should be hashed with a perceptual hash, which is then used to search the keys in our database for the closest match and return the result stored for it.
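As a rough sketch of that idea, the example below uses an average hash and Hamming distance. The issue does not name a specific perceptual hash, so the choice of average hash is an assumption, as are the function names, the `max_distance` threshold, and the in-memory dictionary standing in for the database. The input is assumed to be a grayscale image already scaled down to a small grid (for example 8x8).

```python
from typing import Dict, List, Optional, Tuple


def average_hash(pixels: List[List[int]]) -> int:
    """One bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def closest_match(
    query: int, known: Dict[int, str], max_distance: int = 5
) -> Optional[Tuple[str, int]]:
    """Return (stored result, distance) for the nearest known hash, if close enough."""
    best: Optional[Tuple[str, int]] = None
    for key, result in known.items():
        distance = hamming(query, key)
        if distance <= max_distance and (best is None or distance < best[1]):
            best = (result, distance)
    return best
```

The linear scan over all keys is only for illustration. A real database lookup would need an index suited to nearest-neighbour search on hashes.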

Rework Hashing to Utilize Blake

Currently we are using MD5, which is vulnerable to collision attacks. This should be avoided. Please switch the hashing algorithm to Blake.
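For illustration, the switch might look like the sketch below. The issue does not say which member of the BLAKE family is meant, so BLAKE2b (available in Python's standard `hashlib`) is an assumption here, as are the function name `content_key` and the 32-byte digest size.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Hex digest used as a lookup key for scanned content.

    Assumption: BLAKE2b with a 32-byte digest; the issue only says "Blake".
    """
    return hashlib.blake2b(data, digest_size=32).hexdigest()
```

Existing keys computed with MD5 would not match the new digests, so stored keys would need to be recomputed or kept alongside the new ones during a migration.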

Create dedicated content detection endpoint

We currently check for media types as part of our content detection.
While this is nice, a dedicated endpoint would make much more sense than scanning the data directly after detecting its content type. A separate endpoint would also let users decide whether they want to continue scanning with our API or whether they were only interested in the content type of the media they provided.
Adding this endpoint would also allow us to remove some of the current return data (such as whether an image is a PNG, or whether the file is an image at all), since it would be redundant.

Implement Media Detection

We require media detection that works reliably:

A nice-to-have would be changeable data specifications, but this is not strictly required.
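One common way to detect media types is to compare the first bytes of a file against known signatures. The sketch below shows that technique for a few image formats. It is not the library the project uses, and the `SIGNATURES` table and `detect_media_type` function are hypothetical names. Keeping the signatures in a plain table is one possible way to make the specifications changeable.

```python
from typing import List, Optional, Tuple

# Leading byte signatures for a few common image formats.
SIGNATURES: List[Tuple[bytes, str]] = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"BM", "image/bmp"),
]


def detect_media_type(data: bytes) -> Optional[str]:
    """Return a media type based on leading bytes, or None if unrecognized."""
    for signature, media_type in SIGNATURES:
        if data.startswith(signature):
            return media_type
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None
```

Signature checks only identify the container format. They do not prove that the rest of the file is valid, so a decoder may still reject it.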

Implement worker collection

We need collections for a server / worker system so that we can distribute compute easily and let customers run predictions on their own image data if our servers are too slow for them. This requires a worker collection that can be queried for specific work (e.g. something submitted by a customer) and that works like a ring list, so we can take the latest item out of it and start prediction on it.
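A minimal in-memory sketch of such a collection is shown below. It reads "ring list" as a fixed-capacity buffer and "latest" as the most recently submitted job, both of which are interpretations. The `Job` fields and the method names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class Job:
    job_id: str
    customer: str
    payload: bytes


class WorkerCollection:
    def __init__(self, capacity: int) -> None:
        # A deque with maxlen acts like a ring: once full, appending
        # a new job silently drops the oldest one.
        self._jobs: Deque[Job] = deque(maxlen=capacity)

    def submit(self, job: Job) -> None:
        self._jobs.append(job)

    def take_latest(self) -> Optional[Job]:
        """Remove and return the most recently submitted job, if any."""
        return self._jobs.pop() if self._jobs else None

    def take_for(self, customer: str) -> Optional[Job]:
        """Remove and return the newest job submitted by one customer."""
        match = next(
            (job for job in reversed(self._jobs) if job.customer == customer),
            None,
        )
        if match is not None:
            self._jobs.remove(match)
        return match

    def __len__(self) -> int:
        return len(self._jobs)
```

Dropping the oldest job when the ring is full keeps memory bounded, but it means work can be lost under load. A production system would more likely reject new submissions or persist the overflow.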

Check if images are actually scanned after "taking" a job

We require a system that checks whether a client that has taken a job actually returns a result at some point. Currently we just assume that if a client takes a job, the image is scanned and a result is posted to our API eventually.
I'm considering using a dead letter queue or something similar: if an image has been sitting in that queue for a certain time, it is reposted to the processing queue. The time could be around 1 minute, since we should handle scans within a minute of taking them.

This is a more complex issue and we need help solving it. I feel like there has to be a better approach to handling clients that fail to scan an image, and ideally to letting them report to the API why they couldn't.
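One possible shape for the timeout idea is sketched below. It is not a dead letter queue in the strict sense. Instead it tracks when each job was taken and requeues jobs whose result has not arrived in time, similar in spirit to the visibility timeout offered by queues such as Amazon SQS. The `JobTracker` class, its method names, and the 60-second default are assumptions for illustration.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, List, Optional


class JobTracker:
    def __init__(
        self,
        timeout_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the logic can be tested
        self._pending: Deque[str] = deque()
        self._in_flight: Dict[str, float] = {}  # job id -> time it was taken

    def submit(self, job_id: str) -> None:
        self._pending.append(job_id)

    def take(self) -> Optional[str]:
        """Hand the oldest pending job to a client and start its timer."""
        if not self._pending:
            return None
        job_id = self._pending.popleft()
        self._in_flight[job_id] = self._clock()
        return job_id

    def complete(self, job_id: str) -> bool:
        """Record a result; returns False if the job was not in flight."""
        return self._in_flight.pop(job_id, None) is not None

    def requeue_expired(self) -> List[str]:
        """Move jobs whose client never answered back to the pending queue."""
        now = self._clock()
        expired = [
            job_id
            for job_id, taken_at in self._in_flight.items()
            if now - taken_at >= self._timeout
        ]
        for job_id in expired:
            del self._in_flight[job_id]
            self._pending.append(job_id)
        return expired
```

`requeue_expired` would need to be called periodically. A retry counter per job would also be worth adding so that an image that always fails is eventually set aside instead of being requeued forever.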

Migrate to other Real Time Communication Platform

We want to move away from Amazon's SQS because it locks our users into AWS.
The best solution I have found so far is RabbitMQ. We do not want to use Kafka because its clients heavily favor Java, which is a no-go for us.
If anyone has recommendations for other systems besides RabbitMQ, we are very willing to try out multiple solutions to find the best one for our needs.
