whichx's Introduction

About Me

I am a software development engineer at Amazon Web Services, prototyping 'art of the possible' solutions for public sector and social good projects.

In my spare time I read, write, and maintain a few open-source projects.

whichx's People

Contributors

dependabot[bot], rudikershaw

whichx's Issues

Validate limit on index.js file size in build.

I couldn't find a simple way of doing this in the build. Write custom validation in JavaScript to prevent dist/index.js from growing above a certain size. If the custom validation works well, consider publishing it as a separate package.
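As a rough illustration of what such a check could look like, here is a minimal Node.js sketch. The function name `assertFileSizeWithin` and the 10 KiB limit are assumptions made for the example; the issue does not specify either.

```javascript
const fs = require("fs");

// Hypothetical limit; the issue does not state an actual number.
const MAX_BYTES = 10 * 1024;

/**
 * Returns the size of the file in bytes, or throws if it exceeds maxBytes.
 * The function name is illustrative and not part of whichx.
 */
function assertFileSizeWithin(filePath, maxBytes = MAX_BYTES) {
    const { size } = fs.statSync(filePath);
    if (size > maxBytes) {
        throw new Error(
            `${filePath} is ${size} bytes, which exceeds the limit of ${maxBytes} bytes.`
        );
    }
    return size;
}

// Example use in a build step (path taken from the issue text):
// assertFileSizeWithin("dist/index.js");
```

Because the function throws on violation, a build script that calls it would exit with a non-zero status when the file grows past the limit.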

Option to build model using word hashes for model anonymity

If the model is built from user input, sensitive user information may end up in it. If the model is then exposed (for example, in the browser), that information is exposed with it.

When creating a new WhichX object, we should allow a configuration option to hash all words added to the model. Words must then also be hashed during classification so that they can be compared against the hashed model.

  • Specify new configuration option to allow model hashing.
  • Check this configuration before adding to the model.
  • Check this configuration before classifying against the model.
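
The steps above could be sketched roughly as follows. This is not the whichx implementation: the `hashWords` option name, the helper functions, and the Map-based stand-in for the model are all assumptions for illustration. It uses SHA-256 from Node's built-in `crypto` module, so a browser build would need a different hash source.

```javascript
const crypto = require("crypto");

// Hash a single word so that the raw text never enters the model.
function hashWord(word) {
    return crypto.createHash("sha256").update(word, "utf8").digest("hex");
}

// Hypothetical helper: hashes only when the assumed `hashWords` option is set.
function prepareWords(words, config = {}) {
    return config.hashWords ? words.map(hashWord) : words.slice();
}

// Hypothetical stand-in for a model: Map of label -> Map of word -> count.
function addToModel(model, label, words, config = {}) {
    if (!model.has(label)) {
        model.set(label, new Map());
    }
    const counts = model.get(label);
    for (const word of prepareWords(words, config)) {
        counts.set(word, (counts.get(word) || 0) + 1);
    }
}

// Classification side: the same preparation must be applied before lookup,
// otherwise raw words would never match the hashed keys.
function countMatches(model, label, words, config = {}) {
    const counts = model.get(label) || new Map();
    return prepareWords(words, config)
        .reduce((sum, word) => sum + (counts.get(word) || 0), 0);
}
```

Both the add path and the lookup path go through `prepareWords`, which is the point of the second and third bullets. Note that an unsalted hash of a common word can still be guessed by hashing a dictionary, so this obscures the model contents rather than fully anonymising them.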

Incorrect error message when adding single duplicate label

Two error conditions now produce misleading error messages. These issues are new and unreleased; they exist only in 3.0.0-SNAPSHOT.

  • whichx.addLabels('total') should specify that labels must be unique. It complains about the type, which is incorrect.
  • Further, 'total' is not just a duplicate; it is a reserved keyword. The same can be said of other properties of object.

The logic for adding a string label should be extracted to a private function so the error handling logic does not need to be duplicated.
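
A minimal sketch of that extraction, under stated assumptions: the function names, the error wording, and the use of a `Set` to hold labels are all hypothetical and do not reflect the actual whichx internals. Only 'total' is listed as reserved because it is the only reserved name the issue mentions.

```javascript
// Assumed reserved name, taken from the issue text; others are not listed there.
const RESERVED_LABELS = new Set(["total"]);

// Single place where a string label is validated and added, so each
// failure mode gets its own accurate message.
function addStringLabel(labels, label) {
    if (typeof label !== "string") {
        throw new TypeError("Label must be a string.");
    }
    if (RESERVED_LABELS.has(label)) {
        throw new Error(`"${label}" is a reserved keyword and cannot be used as a label.`);
    }
    if (labels.has(label)) {
        throw new Error(`Labels must be unique; "${label}" already exists.`);
    }
    labels.add(label);
}

// Accepts either one label or an array of labels and reuses the same checks.
function addLabels(labels, input) {
    const list = Array.isArray(input) ? input : [input];
    list.forEach((label) => addStringLabel(labels, label));
    return labels;
}
```

Checking the reserved list before the duplicate check means `addLabels(labels, 'total')` reports the reserved-keyword problem rather than a type or uniqueness error. Storing labels in a `Set` in this sketch also avoids collisions with inherited object property names.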

Minify deliverable

One of the selling points of this classification library is that it is very small. It's worth taking that one step further and making it as small as it can be.

  • Add some kind of minify to the build.
  • Ensure the minified version is published for new versions.
  • Ensure that the project JSDoc is still displayed in the IDE of a project using whichx as a dependency.
  • Add checks to ensure deliverable stays below a certain size to protect against bloat.

Add developer FAQ into a separate documentation markdown file

Title just about says it all. There are some details that need documenting, for example how to export/import a model or change the stop words for a different language.

  • Create a documentation folder.
  • Add a faq.md file to the folder.
  • Link to the FAQ from the main README.md.

Congrats and suggestion

Hello

I was glad to get a good result from your library when solving a classification problem at work.
I actually noticed that your library worked better than https://github.com/ttezel/bayes (about 400 stars) with the same training data.
So I would recommend that you make your Naive Bayes code available under a better name, so that more people can find it (and benefit from it) beyond the pets domain.

Congrats!
