cnft-spam-filter

An open-source, lightweight, and portable spam classifier for cNFTs on Solana with 96% accuracy.

It runs anywhere WebAssembly runs: on a server, in a Lambda function, or entirely in your browser (demo video: demo.1.mp4).

Also included is the model training code and data, so you can train and bring your own model if the default model is not performing well.

Feature extraction combines on-chain data with OCR, using the tesseract.js library. Classification uses Naive Bayes trained on a hand-picked set of spam and ham cNFTs.
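To make the classification step concrete, here is a minimal sketch of multinomial Naive Bayes over string tokens with add-one smoothing. It is not the library's implementation: the function names (`trainNaiveBayes`, `classifyTokens`), the model shape, and the example tokens are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical sketch, not cnft-spam-filter's actual code or model format.

// Build per-label token counts from examples shaped like
// { tokens: ["free", "claim"], label: "spam" }.
function trainNaiveBayes(examples) {
  const model = { labels: new Map(), vocabulary: new Set(), totalDocs: 0 };
  for (const { tokens, label } of examples) {
    if (!model.labels.has(label)) {
      model.labels.set(label, { docCount: 0, totalTokens: 0, tokenCounts: new Map() });
    }
    const stats = model.labels.get(label);
    stats.docCount += 1;
    model.totalDocs += 1;
    for (const token of tokens) {
      stats.tokenCounts.set(token, (stats.tokenCounts.get(token) || 0) + 1);
      stats.totalTokens += 1;
      model.vocabulary.add(token);
    }
  }
  return model;
}

// Return the label with the highest log-posterior.
// Add-one (Laplace) smoothing keeps unseen tokens from zeroing out a label.
function classifyTokens(tokens, model) {
  const vocabSize = model.vocabulary.size;
  let bestLabel = null;
  let bestScore = -Infinity;
  for (const [label, stats] of model.labels) {
    let score = Math.log(stats.docCount / model.totalDocs); // log prior
    for (const token of tokens) {
      const count = stats.tokenCounts.get(token) || 0;
      score += Math.log((count + 1) / (stats.totalTokens + vocabSize));
    }
    if (score > bestScore) {
      bestScore = score;
      bestLabel = label;
    }
  }
  return bestLabel;
}
```

In this sketch the tokens would come from whatever the feature extraction step produces (for example, words read by OCR plus on-chain attributes); the real library may tokenize and weight features differently.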

Live Example

You can try a live (heavily rate limited) example of the library running on AWS Lambda here:

https://api.filtoor.xyz/classify?address=A1xhLVywcq6SeZnmRG1pUzoSWxVMpS6J5ShEbt3smQJr

Try a different cNFT by replacing the address={...} parameter. The classifier returns either "spam" or "ham" (or "error" if something went wrong).
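A rough sketch of calling this endpoint from JavaScript follows. The URL comes from the example above; everything else is an assumption: the helper names (`buildClassifyUrl`, `classifyViaApi`) are hypothetical, and the code assumes the response body is a short text (optionally quoted) containing "spam", "ham", or "error". Adjust the parsing if the actual response differs.

```javascript
// Endpoint taken from the live example above.
const CLASSIFY_ENDPOINT = "https://api.filtoor.xyz/classify";
const KNOWN_RESULTS = new Set(["spam", "ham", "error"]);

// Build the request URL for a given cNFT address.
function buildClassifyUrl(address) {
  const url = new URL(CLASSIFY_ENDPOINT);
  url.searchParams.set("address", address);
  return url.toString();
}

// Hypothetical helper. Assumption: the body is plain text (possibly quoted).
// fetchImpl is injectable so the function can be exercised without network access.
async function classifyViaApi(address, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(buildClassifyUrl(address));
  if (!response.ok) {
    return "error"; // e.g. the rate limit was hit
  }
  const body = (await response.text()).trim().toLowerCase().replace(/^"|"$/g, "");
  return KNOWN_RESULTS.has(body) ? body : "error";
}
```

Because the live endpoint is heavily rate limited, treating any non-OK response as "error" and retrying later is a reasonable default.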

If you'd like to use this API in your production project, please DM me to get set up!

Installation

First, install the library:

npm i cnft-spam-filter

then import the requisite function:

const { extractAndClassify } = require("cnft-spam-filter")

or

import { extractAndClassify } from "cnft-spam-filter"

Finally, call the function wherever you want to classify:

const classification = await extractAndClassify(assetId, rpcUrl);

Note that you'll need to bring your own rpcUrl that supports the DAS API. I recommend Helius (https://www.helius.dev/) for their generous free plan.

Examples

You can find a few lightweight examples of how to use the library in different environments in the /examples folder of the repository.

cnft-spam-filter aims to be portable, so you can run it in pretty much any environment that you want.

Training

You can train your own model and pass it to classify(tokens, model). Code for this is in the /train folder.

You'll see spam_ids.json and ham_ids.json there; these are the cNFTs used to train the model.

Testing

You can test the accuracy of a model using the code in the /test folder. Make sure that your training set and test set do not overlap. The test prints a confusion matrix as well as every mistake the model made.
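For readers unfamiliar with the terms, the sketch below shows one way to check for train/test overlap and to tally a confusion matrix. It is not the code in /test: the helper names (`findOverlap`, `confusionMatrix`) and the result shape `{ id, expected, predicted }` are hypothetical, and "spam" is treated as the positive class.

```javascript
// Hypothetical helpers, not the repository's /test code.

// Return the IDs that appear in both the training and test sets.
function findOverlap(trainIds, testIds) {
  const seen = new Set(trainIds);
  return testIds.filter((id) => seen.has(id));
}

// Tally predictions against known labels.
// Any prediction other than "spam" is counted as not-spam.
function confusionMatrix(results) {
  const matrix = { truePositive: 0, falsePositive: 0, trueNegative: 0, falseNegative: 0 };
  const mistakes = [];
  for (const result of results) {
    const expectedSpam = result.expected === "spam";
    const predictedSpam = result.predicted === "spam";
    if (expectedSpam && predictedSpam) matrix.truePositive += 1;
    else if (!expectedSpam && predictedSpam) matrix.falsePositive += 1;
    else if (!expectedSpam && !predictedSpam) matrix.trueNegative += 1;
    else matrix.falseNegative += 1;
    if (expectedSpam !== predictedSpam) mistakes.push(result);
  }
  const total = results.length;
  const accuracy = total === 0 ? 0 : (matrix.truePositive + matrix.trueNegative) / total;
  return { matrix, mistakes, accuracy };
}
```

A non-empty result from `findOverlap` means the reported accuracy is optimistic, since the model has already seen those cNFTs during training.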

Usage in Production

If you want to use cnft-spam-filter in production, I recommend setting up a caching layer so that you don't have to analyze each cNFT more than once. This should be done at the application level: you can use Redis, a database, localStorage, or whatever you prefer.
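As a starting point, here is a minimal in-memory version of such a cache. The wrapper name `withCache` is hypothetical, and the sketch assumes a failed classification surfaces as the string "error" (as the live API does); if the library signals failure differently, adjust the check. A production setup would likely swap the Map for Redis or a database.

```javascript
// Hypothetical wrapper: caches results of any async classifier by asset ID.
// The cache only needs has/get/set, so a Map works for a single process.
function withCache(classifyFn, cache = new Map()) {
  return async function cachedClassify(assetId, ...rest) {
    if (cache.has(assetId)) {
      return cache.get(assetId);
    }
    const result = await classifyFn(assetId, ...rest);
    // Assumption: "error" marks a failed run, which should be retried later
    // rather than stored.
    if (result !== "error") {
      cache.set(assetId, result);
    }
    return result;
  };
}

// Usage, assuming extractAndClassify was imported as shown in Installation:
//   const classify = withCache(extractAndClassify);
//   const classification = await classify(assetId, rpcUrl);
```

Keying on the asset ID alone assumes a cNFT's classification does not change over time; add an expiry if you retrain or swap models.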

Contributing

Feel free to open pull requests to contribute if you think this is interesting! I will try to get to them as best as I can. There are definitely some tasks that need to be implemented.

License

All code is released under the MIT license -- go crazy.

Solana/USDC donations are appreciated but not required by any means:

solarnius.sol

Contributors

solarnius, sjm7


cnft-spam-filter's Issues

Better model

Right now the model is trained from 5 ham and 5 spam cNFTs... should probably make this more like 500/500 or 5000/5000

JSDoc

Need to document the main functions properly

Better test set

Should implement some sort of testing to see how effective the spam detection really is
