
mltype's People

Contributors

abhayaysola, harshcurious, jankrepl, rushankhan1

mltype's Issues

Provide pretrained models

We should provide some good pretrained models:

  • English - trained on Wikipedia
  • Python - trained on some open-source library

Ideally, one could just write
mlt download model_name url and the model would be downloaded automatically.

Make mlflow an optional dependency

Not everybody wants to use it for logging and, additionally, it does not seem to be well supported on Windows. Ideally,
one would specify via an mlt train option whether to use it.
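
A minimal sketch of how the optional dependency could be handled, assuming a boolean option on mlt train. The function names and the returned labels are hypothetical and not part of mltype:

```python
import importlib.util


def mlflow_installed():
    """Return True if the optional mlflow package can be found."""
    return importlib.util.find_spec("mlflow") is not None


def pick_logger(use_mlflow):
    """Decide which logging backend to use (hypothetical helper).

    Returns "mlflow" only when the user asked for it and it is installed.
    """
    if not use_mlflow:
        return "none"
    if not mlflow_installed():
        raise RuntimeError(
            "mlflow logging was requested but mlflow is not installed; "
            "try `pip install mlflow`."
        )
    return "mlflow"
```

With this shape, mlflow is only imported on the code path that actually needs it, so users who never request it do not have to install it.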

Improve the statistics

Currently, the statistics are very basic

  • WPM
  • Accuracy

It might be cool to add some more indicators

  • Keystrokes (see 10fastfingers)
  • speed evolution over the text (like typeracer) - plotting might be tricky (bashplotlib ?)
  • Average speed per character (one needs to condense this information)
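
One possible way to condense the per-character speed is a mean delay per character. The sketch below assumes keystrokes are recorded as (character, timestamp in seconds) pairs, which is an assumption about the data rather than mltype's actual format:

```python
from collections import defaultdict


def mean_seconds_per_char(keystrokes):
    """Average time taken to type each character.

    keystrokes: list of (char, timestamp) pairs in typing order.
    The delay for a character is the gap since the previous keystroke,
    so the very first keystroke has no delay and is skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (_, t_prev), (ch, t) in zip(keystrokes, keystrokes[1:]):
        totals[ch] += t - t_prev
        counts[ch] += 1
    return {ch: totals[ch] / counts[ch] for ch in totals}
```

The resulting dictionary could then be sorted to show, for example, only the slowest few characters.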

Description for saved dictionary

Probably just another entry in the pickled dictionary. Additionally, one should make this description visible in mlt list. Finally, one should be able to optionally provide this description in mlt train.

Bash autocomplete

Would be nice if we could autocomplete the names of the existing models in mlt sample

mlt sample slow in fresh venv

The first time trying to run mlt sample in a fresh venv takes 20 seconds. What is extremely confusing is that there is no progress bar and it just seems to be hanging.

No need for copying when creating features

When creating the one-hot encoding of characters we copy. However, there is no need to do that. But maybe the copy is a good thing in case somebody tries to manually change the features.

Add config file

One could control things that are not controllable via CLI:

  • colors
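
A small sketch of what reading such a config could look like with the standard library's configparser. The section name, the keys, and the default colors are all hypothetical:

```python
import configparser

# Hypothetical defaults; the real color names would depend on the UI code.
DEFAULTS = {"color_correct": "green", "color_wrong": "red"}


def load_colors(config_text):
    """Parse INI-style text and return the colors, falling back to defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    colors = dict(DEFAULTS)
    if parser.has_section("colors"):
        colors.update(parser["colors"])
    return colors
```

Any key missing from the file keeps its default, so an empty or absent config behaves like the current CLI.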

Generate text samples at epoch end

Currently, the only metric is the loss itself and one has no idea how good the model is until the training is done and one can mlt sample. However, one can sample a few texts at the end of each epoch and store them as mlflow artifacts.

Speed up intro CLI

Currently, it takes around 3 seconds to load. Note that it might be related to #4

Fix multidevice issues

  • sample_char and sample_text are broken when the network parameters lie on a GPU
  • Make it explicit in load_model that we want to do CPU inference

Make mlt sample faster

It seems like most of the time is spent on importing pytorch_lightning and then mlflow.
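
If the imports really are the bottleneck, one common remedy is to move the heavy imports into the functions that need them. The sketch below only illustrates the pattern: the command functions are hypothetical and json stands in for a heavy package such as pytorch_lightning or mlflow:

```python
def cmd_list(model_names):
    """A light command: needs no heavy import at all."""
    return sorted(model_names)


def cmd_train(config):
    """A heavy command: the import cost is paid only when it runs."""
    import json  # stand-in for a slow-to-import package

    return json.dumps(config, sort_keys=True)
```

Python caches imported modules, so the deferred import costs time only on the first call.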

Unroll LSTM correctly

Currently, the sample_text function requires a window_size because, for each new character, it starts from scratch and reruns the network over the last window_size characters, so predicting one character costs O(window_size). Instead, we should also return the hidden states of the LSTM, which makes the prediction of the next character O(1). There are 3 benefits:

  • speedup (especially for a large window_size)
  • long memory, since the context is no longer truncated to window_size
  • we remove the window_size hyperparameter at inference time
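
The difference can be illustrated with a toy recurrent cell in plain Python. This is not mltype's network: ToyCell and both sampling functions are made up for illustration, and a real version would instead pass the LSTM's hidden and cell states back in (PyTorch's nn.LSTM accepts an initial state and returns the final one).

```python
VOCAB = "abcdefgh"


class ToyCell:
    """Deterministic stand-in for one recurrent step; counts its calls."""

    def __init__(self):
        self.n_steps = 0

    def step(self, state, ch):
        self.n_steps += 1
        new_state = (state * 31 + ord(ch)) % 1009
        return new_state, VOCAB[new_state % len(VOCAB)]


def sample_windowed(cell, prompt, n_chars, window_size):
    """Restart from an empty state for every new character."""
    text = prompt
    for _ in range(n_chars):
        state, pred = 0, VOCAB[0]
        for ch in text[-window_size:]:
            state, pred = cell.step(state, ch)
        text += pred
    return text


def sample_stateful(cell, prompt, n_chars):
    """Warm up once on the prompt, then carry the state forward."""
    state, pred = 0, VOCAB[0]
    for ch in prompt:
        state, pred = cell.step(state, ch)
    text = prompt
    for _ in range(n_chars):
        text += pred
        state, pred = cell.step(state, pred)  # one step per new character
    return text
```

For a 4-character prompt and 10 generated characters, the windowed version with window_size 4 performs 40 steps while the stateful one performs 14, and the stateful version needs no window_size argument at all.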

Future: dependency conflict

With the current setup.py one gets

ERROR: pytorch-lightning 1.0.2 has requirement future>=0.17.1, but you'll have future 0.16.0 which is incompatible.

Allow newline to be viewable

Would be really cool for languages like Python and any other programming language.

Not sure what curses currently does when it is asked to addch a newline.
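
One option would be to append a visible marker to every line that ends in a newline before drawing it. The helper name and the default marker are hypothetical, and whether curses can display a non-ASCII marker depends on the terminal and locale:

```python
def visible_lines(text, marker="↵"):
    """Split text into screen lines, marking each newline visibly."""
    parts = text.split("\n")
    # Every part except the last was followed by a newline in the text.
    return [part + marker for part in parts[:-1]] + [parts[-1]]
```

Each returned line could then be drawn on its own row, so the typist sees exactly where ENTER is expected.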

Make getch nonblocking

Whenever we race an opponent the whole screen waits for an input character and only then it refreshes.
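
A sketch of a refresh loop that does not wait for input, using the timeout method of a curses window so that getch returns curses.ERR when no key arrives in time. The function race_loop, its parameters, and the ESC handling are hypothetical:

```python
import curses


def race_loop(stdscr, refresh_hz=30, max_frames=None):
    """Collect key codes while redrawing at a steady rate.

    max_frames exists only so the loop can be bounded (e.g. in tests).
    """
    # getch now waits at most this many milliseconds before giving up.
    stdscr.timeout(int(1000 / refresh_hz))
    typed = []
    frame = 0
    while max_frames is None or frame < max_frames:
        key = stdscr.getch()
        if key == 27:  # ESC ends the race
            break
        if key != curses.ERR:  # curses.ERR means no key was pressed
            typed.append(key)
        # The opponent's progress would be redrawn here on every frame.
        stdscr.refresh()
        frame += 1
    return typed
```

It would be started with curses.wrapper(race_loop). Calling stdscr.nodelay(True) is an alternative that makes getch return immediately, at the cost of needing an explicit sleep in the loop.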

Avoid end of the line confusion

Currently, ENTER is not treated as a special character, so one might get confused at the end of the line: pressing ENTER will result in an error (unless ENTER is in the dictionary).

Forbidden characters

Currently, a lot of the trained models contain characters that are not even available on the English keyboard :D

mlt train .... -f "~$@" could explicitly disallow characters when training.
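
One simple interpretation is to strip the forbidden characters from the training text before the vocabulary is built. The helper below is a hypothetical sketch of that filtering step, not existing mltype code:

```python
def drop_forbidden(text, forbidden):
    """Remove every character listed in `forbidden` from `text`."""
    banned = set(forbidden)
    return "".join(ch for ch in text if ch not in banned)
```

For example, drop_forbidden("cost: ~$5 @home", "~$@") returns "cost: 5 home". An alternative design would be to drop whole lines containing a forbidden character, which keeps the remaining text natural but discards more data.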

Custom cache path for train

By default it is ~/.mltype. However, Google Colab does not display hidden folders, which makes it impossible to see what is going on during the training.
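
A sketch of how the cache directory could be resolved, assuming an explicit CLI value takes priority over an environment variable, which in turn takes priority over the default. The variable name MLTYPE_CACHE_DIR and the function are hypothetical:

```python
import os
import pathlib


def get_cache_dir(cli_value=None, env=None):
    """Resolve the cache directory: CLI value, then env var, then default."""
    env = os.environ if env is None else env
    raw = cli_value or env.get("MLTYPE_CACHE_DIR") or "~/.mltype"
    return pathlib.Path(raw).expanduser()
```

On Colab one could then point the cache at a visible folder such as /content/mltype.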
