
lmsa's Introduction

I am Ukrainian. While the Russian army invades Ukraine, kills, rapes, destroys, and steals, I am in Kharkiv, my home city.

I am almost useless but I am volunteering to help those in need, and I am staying.

UPD: I joined the Ukrainian Armed Forces in March 2023.

🇺🇦 HELP UKRAINE WIN 🇺🇦

Let the text below become relevant again in other times.


Developer and writer from Ukraine. Ruby programming language committer.

Mostly interested in lucid code and open data, and writing a Substack about it. The range of “my” topics is united by an urge to understand and explain. Or, put differently, by the problems of acquiring knowledge (with code) and expressing meaning (with code).

Working on my first Ruby book, working title "41 Ruby Intuitions".

Recent/interesting work

Ruby programming language

Open data

  • Working on an API for the world's common knowledge (based on Wikipedia/Wikidata, but not limited to it):
    • First (discontinued) attempt: molybdenum-99 set of Ruby projects
    • Second (current) attempt: WikipediaQL Python library; writing on it: 0, 1, 2, TBC
  • Spylls: Python spellchecker, almost full port of Hunspell; an explanatory port to understand/show how it works. Series of articles: Rebuilding the spellchecker

Some Ruby libraries

  • time_calc: idiomatic, no-monkeypatching Time/Date math
  • saharspec: set of extensions for RSpec for DRYer specs
  • the_schema_is: ActiveRecord models annotation done right
  • yard-junk: YARD docs linter
  • whatthegem: Console tool for fetching information about gems (stats, usage, recent changes)
  • sho: Experimental "post-framework" views library

Fun and experiments


Full list of projects of various years


lmsa's Issues

hunspell port to Ruby

Project

Port the Hunspell open source spellchecker to pure Ruby.

Proposed code name: spelleology.

Plan

  1. Understand the Hunspell dictionary format.
  2. Create a Hunspell dictionary reader, using Hunspell's code and docs as a reference, along with its dictionary samples.
  3. Create a simplistic spell-checking solution (split text into words → remove punctuation → run against the dictionary).
  4. Wrap it into a proper Ruby gem, with executable and library usage (ver. ~0.0.1).
  5. Further development directions:
    • profiling and optimization
    • CI-readiness (different output formats, Rake task)
    • supplementary tools (dictionary downloader from OO repository)
    • pluggable integration with Markdown parsers and other markup formats, for proper reporting of the positions of spelling problems in marked-up files.
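Step 3 of the plan could look roughly like the sketch below. It is only an illustration: NaiveSpellchecker is a hypothetical name, and a plain word list stands in for a real Hunspell dictionary (no affix rules, no suggestions).

```ruby
require "set"

# Hypothetical, minimal checker: a plain word list stands in for a real
# Hunspell dictionary (no affix rules, no suggestions).
class NaiveSpellchecker
  def initialize(words)
    @dictionary = Set.new(words.map(&:downcase))
  end

  # Returns the words of +text+ that are absent from the dictionary.
  def misspelled(text)
    text
      .scan(/[[:alpha:]]+(?:'[[:alpha:]]+)*/) # split into words, dropping punctuation
      .reject { |word| @dictionary.include?(word.downcase) }
  end
end

checker = NaiveSpellchecker.new(%w[the quick brown fox jumps over lazy dog])
p checker.misspelled("The quikc brown fox, jumps over the lazy dgo!")
# → ["quikc", "dgo"]
```

A real port would replace the Set lookup with stem and affix matching against the parsed dictionary, which is where most of the complexity lives.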

Importance

Hunspell is currently the most popular open source spellchecking tool, and most up-to-date dictionaries are available in its format. But the tool itself is a fairly complicated piece of C++ software that is hard to integrate and use from Ruby.

A pure-Ruby Hunspell port could be easily integrated with other Ruby tools, like Markdown parsers (or even the Ruby parser: imagine being able to spellcheck your Rake task descriptions), Jekyll, CI tools and so on.

Skills and domains

You'll need to be able to at least read the C++ of Hunspell's sources. Expect a lot of optimization practice, too.

bundle whatsup: Bundler extension to report dependency updates

Project

The idea is to have a bundle whatsup command (currently, Bundler will translate that to a separate bundle-whatsup executable, if it is installed as a separate gem), which can report, for all or for specified dependencies from your bundle, "what has changed in the versions you've missed".

Plan

  1. Make a changelog parser that works for most of the popular variations of the format (because vandamme does not).
  2. Investigate ways of extracting the Changelog/History file for a gem (through its GitHub/Bitbucket/GitLab repo, if present; by downloading and unpacking the gem; probably through rubydoc.info).
  3. Investigate ways to report the significant changes in a terse yet readable manner.
  4. Pack everything into a library/executable.
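To make steps 1 and 3 more tangible, here is a minimal sketch of parsing one common changelog layout and reporting the versions between the installed and the latest one. The names parse_changelog, missed_changes and SAMPLE_CHANGELOG are hypothetical, and the regexp covers only Markdown headings with a version number followed by bullet entries; real changelogs are much fuzzier.

```ruby
require "rubygems" # for Gem::Version; normally already loaded

# Hypothetical heading matcher: "## 1.2.0", "## [1.1.0]", "# v2.0 - date".
VERSION_HEADING = /\A[#]+\s*\[?v?(\d+(?:\.\d+)+)\]?/

# Returns { Gem::Version => [entry, ...] } for the layout described above.
def parse_changelog(text)
  versions = {}
  current = nil
  text.each_line do |line|
    if (match = line.match(VERSION_HEADING))
      current = versions[Gem::Version.new(match[1])] = []
    elsif current && (entry = line[/\A\s*[-*]\s+(.+)/, 1])
      current << entry.strip
    end
  end
  versions
end

# Entries for versions newer than +installed+, up to and including +latest+.
def missed_changes(text, installed:, latest:)
  from = Gem::Version.new(installed)
  to = Gem::Version.new(latest)
  parse_changelog(text)
    .select { |version, _| version > from && version <= to }
    .sort_by { |version, _| version }
    .to_h
end

SAMPLE_CHANGELOG = <<~MD
  # Changelog

  ## 1.2.0 - 2020-05-01
  - Add JSON output
  * Fix crash on empty input

  ## [1.1.0]
  - Drop old Ruby support

  ## 1.0.0
  - Initial release
MD

missed_changes(SAMPLE_CHANGELOG, installed: "1.0.0", latest: "1.2.0").each do |version, entries|
  puts "#{version}: #{entries.join('; ')}"
end
```

Using Gem::Version for comparison avoids the usual string-sorting trap where "1.10.0" sorts before "1.9.0".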

Importance

The envisioned tool seems more fun than a really important piece of infrastructure, yet it can be pretty useful and popular. A powerful changelog parser, able to consume more formats than the existing ones, can be useful in itself for code analysis tools.

Skills and domains

The most non-trivial parts are reliable parsing of a fuzzy text format and a robust changelog finder. Also, some knowledge of Bundler and the gems infrastructure could be used or obtained in the process.

worldize: drawing of geographical data

Project

worldize is an early prototype of a geographical drawing library, but when it was published, it received a good amount of attention.

There is a clear plan of development, but unfortunately, I never had enough time to continue.

Plan

  1. Make a generic "map drawing" API (line from this geo point to that geo point, rectangle, polygon, write text near those coordinates, and so on): most of the work is already prototyped in the map branch of the repo, yet never finished.
  2. Different slices of the map, not always the entire world.
  3. Different globe projections and other map transformations.
  4. Map frame: title, legend and so on.
  5. Map tiles from open servers.

Ideally, step (1) should be released as a new useful gem, with docs and some specs, and then small releases should be made every now and then, following what (potential) library users could want and what use cases emerge.
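As an illustration of what "different globe projections" means in code, below is a sketch of two standard projections from latitude/longitude to pixel coordinates. The Projection module and its method names are hypothetical and are not part of worldize's API.

```ruby
# Hypothetical helpers, not part of worldize's API.
module Projection
  module_function

  # Equirectangular: longitude and latitude map linearly to x and y.
  # The top-left corner of the image is (lat 90, lng -180).
  def equirectangular(lat, lng, width:, height:)
    x = (lng + 180.0) / 360.0 * width
    y = (90.0 - lat) / 180.0 * height
    [x, y]
  end

  # Web Mercator on a square image; only usable for latitudes
  # within roughly -85..85 degrees, since the poles go to infinity.
  def mercator(lat, lng, size:)
    x = (lng + 180.0) / 360.0 * size
    lat_rad = lat * Math::PI / 180.0
    y = (1.0 - Math.log(Math.tan(lat_rad) + 1.0 / Math.cos(lat_rad)) / Math::PI) / 2.0 * size
    [x, y]
  end
end

p Projection.equirectangular(0, 0, width: 360, height: 180) # → [180.0, 90.0]
p Projection.equirectangular(90, -180, width: 360, height: 180) # → [0.0, 0.0]
p Projection.mercator(0, 0, size: 256)
```

A generic drawing API would take geo points everywhere and call something like this only at the last moment, so switching the projection does not touch the drawing code.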

Importance

While most of today's visualizations are indeed done in a browser with D3.js or something like it, server-side generation can still be useful (for PDF reports, inclusion in emails, visualizing simple scripting experiments, this kind of thing).

Skills and domains

Ruby's graphical libraries, geographical calculations (they are HARD), external APIs (for tiles).

RUDE: RUby Documentation Effort

Project

The idea is pretty rough yet clean: parse docs from Ruby's official repo (from all tags, starting from 2.0) and format them into one informative static site (published with GitHub Pages), auto-updated on new versions. Proposed differences from existing projects (say, ruby-doc.org):

  • Ruby-version agnostic URLs (instead of having .../2.4.2/Enumerable.html, have /Enumerable.html and effectively render all version-dependent differences there);
  • Better representation of "language core" docs (which are currently in doc/*.rdoc files of language repo);
  • more compact, easily browsable, modern representation.

Plan

  1. Parse ruby docs from source into RDoc or YARD internal structures;
  2. Store those structures into human-readable YAML for caching and investigating;
  3. Set up custom rendering for those structures, including a handmade TOC for generic Ruby documentation from the doc/ folder.
  4. Set up the ruby/ruby repo as a submodule of the docs repo, and, switching between version tags, generate YAML from each version;
  5. Render the docs of all versions into the same HTML files (like core/Enumerable.html), with version tags beside methods and JS switches to "show only version X".
  6. Wrap everything into a nicely documented set of Rake tasks and publish to GitHub pages, so anybody can fork, republish and play with styles and logic.
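The core of step 5 is merging per-version data into one structure with version tags. A minimal sketch of that merge, assuming the per-version method lists are already extracted (merge_versions and version_tags are hypothetical names, and the Enumerable lists below are heavily truncated sample data):

```ruby
require "rubygems" # for Gem::Version; normally already loaded

# Hypothetical input: for each Ruby version tag, the method names found
# in the docs of one class. Returns { method name => [versions having it] }.
def merge_versions(methods_by_version)
  ordered = methods_by_version.keys.sort_by { |version| Gem::Version.new(version) }
  ordered.each_with_object({}) do |version, merged|
    methods_by_version[version].each { |name| (merged[name] ||= []) << version }
  end
end

# One entry per method, tagged with the earliest parsed version that has it.
def version_tags(methods_by_version)
  merge_versions(methods_by_version).map do |name, versions|
    { name: name, since: versions.first, versions: versions }
  end
end

# Truncated sample data, for illustration only.
enumerable = {
  "2.7" => %w[map select sum tally],
  "2.3" => %w[map select],
  "2.4" => %w[map select sum]
}

version_tags(enumerable).each do |entry|
  puts "Enumerable##{entry[:name]} (since #{entry[:since]})"
end
```

Note that "since" here only means "the earliest version among those parsed", so a method that existed before the first parsed tag would still be reported with that tag.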

Importance

Existing Ruby docs (docs.ruby-lang.org and ruby-doc.org) are not Googlable in a good way, due to conflicting versions.

Also, it is unclear from the docs of a method whether it is present in your Ruby version, so typical online browsing of some doc goes like "Google for <class> <method> → go to URL → manually replace /2.5.0/ in URL with /2.3.1/ → ...".

Finally, the really detailed and well-written base Ruby docs (like the Syntax guide) are not really visible or navigable through documentation sites.

Skills and domains

You'll do a lot of text parsing, preprocessing and formatting, probably hacking with RDoc and/or YARD internals, some basic UI design.

The repository of conference talks videos

Project

Gather (automatically and semi-automatically) as many talk videos from Ruby conferences as exist, and provide a nice browsing interface for them.

Plan

  1. Go through http://rubyconferences.org/past/
  2. For each conference, define in Ruby (with the help of wombat?) a parser for its site (or the web.archive.org copy of it) to extract a structured list of talks, speakers, topics and so on;
  3. For each conference, find (on the site, or in a linked YouTube/Confreaks playlist) the list of talks, and define an extractor for it and a matcher with the conference's program;
  4. Define (with the help of Jekyll or another nice static site renderer) the rendering of this data, and provide extensive navigation by years, by speakers, by talk topics, keywords and so on, joining the same talk given at several conferences into "versions".
  5. Publish to GitHub pages
  6. Make sure to have a "framework" of scripts to update the site after future conferences
  7. Provide an ability for collaborative editing of talk classifications.
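The "versions" idea from step 4 could start with something as simple as grouping talks by a normalized speaker and title. This is only a sketch: the Talk struct, normalize and group_versions are hypothetical names, the sample data is made up, and real matching would need to be fuzzier (renamed talks, co-speakers, typos).

```ruby
# Hypothetical record for one scraped talk.
Talk = Struct.new(:title, :speaker, :conference, :year, keyword_init: true)

# Lowercase, strip punctuation, collapse whitespace.
def normalize(string)
  string.downcase.gsub(/[^[:alnum:]\s]/, "").split.join(" ")
end

# Groups talks with the same normalized speaker and title,
# ordering the "versions" of each talk by year.
def group_versions(talks)
  talks
    .group_by { |talk| [normalize(talk.speaker), normalize(talk.title)] }
    .values
    .map { |versions| versions.sort_by(&:year) }
end

# Made-up sample data, for illustration only.
talks = [
  Talk.new(title: "Refactoring, Live!", speaker: "Jane Doe", conference: "ExampleConf", year: 2016),
  Talk.new(title: "refactoring live", speaker: "Jane  Doe", conference: "SampleCamp", year: 2015),
  Talk.new(title: "Ruby Internals", speaker: "John Roe", conference: "ExampleConf", year: 2016)
]

group_versions(talks).each do |versions|
  first = versions.first
  places = versions.map { |talk| "#{talk.conference} #{talk.year}" }.join(", ")
  puts "#{first.speaker.squeeze(' ')}: #{first.title} (#{places})"
end
```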

Importance

There is a lot of good material out there, gathered through 25 years of Ruby, and a huge part of it is available online, offering learning, discovery and historical interest. Somebody should just do something about it.

Skills and domains

You will do a lot of automated web scraping, and site generation, as well as generalizing a lot of non-generic data in a bearable way.

magic_cloud: Ruby word cloud

Project

magic_cloud is a somewhat outdated, yet production-used, pretty word cloud generator. It can be made faster and more configurable, with a cleaner API.

Plan

Just some semi-random thoughts:

  • Faster layout (maybe with a C or Rust extension, millions of bit operations aren't cheap);
  • More flexible API to configure sizes, fonts, colors, layout randomness and so on;
  • Docs and specs (specs for random layout and for graphics are both incredibly challenging).
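To show where the "millions of bit operations" come from, here is a sketch of bitmask collision detection, a common technique for word cloud layout. It assumes each shape is stored as one Integer per pixel row; collides? and place! are hypothetical names, and this is not magic_cloud's actual code.

```ruby
# Each row of a shape is an Integer whose set bits mark occupied pixels
# (bit 0 is the leftmost pixel). x and y are expected to be non-negative.

# True if +sprite+, placed with its top-left corner at (x, y),
# overlaps anything already on the +board+.
def collides?(board, sprite, x, y)
  sprite.each_with_index.any? do |row, dy|
    board_row = board[y + dy] || 0
    (board_row & (row << x)) != 0
  end
end

# Marks the sprite's pixels as occupied on the board.
def place!(board, sprite, x, y)
  sprite.each_with_index do |row, dy|
    board[y + dy] = (board[y + dy] || 0) | (row << x)
  end
end

board = Array.new(4, 0)
square = [0b11, 0b11] # a 2x2 block

place!(board, square, 0, 0)
p collides?(board, square, 1, 1) # → true
p collides?(board, square, 2, 0) # → false
```

Every candidate position for every word runs this check over every row of the sprite, which is why the layout loop is the natural target for a native extension.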

Importance

While most of today's visualizations are indeed done in a browser with D3.js or something like it, server-side generation can still be useful (for PDF reports, inclusion in emails, visualizing simple scripting experiments, this kind of thing).

Skills and domains

This task will include a lot of tinkering with image libraries (maybe porting to vips?) and optimization (calculating precise word positions in "pretty randomness" is slooow).

languagetool port to Ruby

Project

LanguageTool is an open source style and grammar checker, written in Java.

We want to port it to Ruby (codename proofreader).

Plan

The plan is already formalized as a GitHub project.

Importance

This project is much more ambitious than the Hunspell port, as LanguageTool's rules format (deeply nested XML) and logic are much more complicated than Hunspell's simple dictionaries; besides that, not all of the grammar checking logic is stored in data: some rules and services are represented by Java classes.

Nevertheless, the task is bearable, and the first useful results could be achieved rather quickly. The gain for the community (integrating style checks into checking Markdown and docs, running in CI pipelines) seems to be even greater than with the spellchecker.

Skills and domains

The task will include reading Java, obviously some linguistics, a lot of optimization and comprehensive text processing.

tlaw: generic HTTP API wrapper

Project

tlaw is a somewhat experimental library/DSL for creating fast, thin and reliable wrappers for HTTP APIs. Currently, it supports only GET APIs and lacks some deeper HTTP features (like passing headers or authorization), yet it has already proven to be a promising way to the "right" thickness balance.

The idea is to develop several generally useful TLAW-based wrappers for common APIs, adding new TLAW features on the way, when necessary.

Plan

  1. Go through lists of public APIs (like this one) and select those of general importance or interesting to you personally.
  2. Create TLAW-based wrappers for them and commit those to the tlaws repo.
  3. A good wrapper should include:
    • Wrapper itself, with inline docs and proper hierarchical structure
    • Sanity tests for the wrapper (that it works and does what it is expected to do)
    • README/offline docs for it.

It is expected that, alongside the wrappers themselves, participants of the project will also:

  1. Develop some generic good ways of testing and describing small API wrappers;
  2. Enhance TLAW library with (no particular order):
    • Headers support
    • Authorization support
    • Non-GET requests support
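For a feeling of what "thin" means here, below is a plain-Ruby sketch of a GET endpoint description that validates parameters and builds the request URL. It deliberately does not use TLAW's DSL: ThinEndpoint, its arguments and the api.example.com URL are all hypothetical, and no request is actually sent.

```ruby
require "uri"

# Hypothetical illustration of a thin GET endpoint, not TLAW's actual DSL.
class ThinEndpoint
  def initialize(base_url, path, required: [], optional: [])
    @base_url = base_url
    @path = path
    @required = required
    @optional = optional
  end

  # Validates parameter names and returns the full request URL.
  def url_for(**params)
    missing = @required - params.keys
    raise ArgumentError, "missing params: #{missing.join(', ')}" unless missing.empty?

    unknown = params.keys - @required - @optional
    raise ArgumentError, "unknown params: #{unknown.join(', ')}" unless unknown.empty?

    uri = URI.join(@base_url, @path)
    uri.query = URI.encode_www_form(params)
    uri.to_s
  end
end

weather = ThinEndpoint.new(
  "https://api.example.com/v1/", "weather",
  required: [:city], optional: [:units]
)
puts weather.url_for(city: "Kharkiv", units: "metric")
# → https://api.example.com/v1/weather?city=Kharkiv&units=metric
```

Keeping URL building separate from the actual HTTP call is also what makes sanity tests cheap: most of a wrapper's logic can be checked without touching the network.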

Importance

A good abstraction layer for "thin" API wrappers, which can be developed by one person in a few hours and still be robust and useful, seems of general importance for the entire Ruby ecosystem. Also, a set of regular, small, well-documented wrappers for common APIs, all working in the same manner on a small yet solid foundation, could become a project of huge popularity.

Skills and domains

This task will require a lot of experimenting with API and architecture, and a serious intent to write less code, rewrite a lot, and document a lot.

yard-junk: Yard documentation quality checks

Project

yard-junk is an already established, production-used solution for checking the quality of YARD documentation of Ruby projects.

The current version only checks for errors in documentation (misused tags and such), but the next step would be to introduce Rubocop-style flexible quality checks, making it possible to establish and maintain a per-project preferred documentation style.

Plan

The plan is briefly discussed in this GitHub issue.

  1. Establish some "plugin architecture" for checks.
  2. Develop several new documentation quality checks incrementally.
  3. Work on configurability of checks.
  4. Work on usability of command-line tool and flexibility of its output.
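One possible shape for the "plugin architecture" from step 1 is a base class that registers every subclass, in the spirit of Rubocop's cops. The sketch below is an assumption about how it might look, not yard-junk's actual internals: Check, Offense, DocumentedMethod and both sample checks are hypothetical, and a small struct stands in for real YARD code objects.

```ruby
# Hypothetical sketch; not yard-junk's actual internals.
class Check
  Offense = Struct.new(:check, :object, :message)

  def self.registry
    @registry ||= []
  end

  # Every subclass registers itself, so adding a check is just defining a class.
  def self.inherited(subclass)
    super
    Check.registry << subclass
  end

  def self.run_all(objects, enabled: Check.registry)
    enabled.flat_map { |check| objects.flat_map { |object| check.new.call(object) } }
  end
end

# Stand-in for a documented code object (YARD has its own classes for this).
DocumentedMethod = Struct.new(:path, :docstring, :params, :param_tags, keyword_init: true)

class MissingDocstring < Check
  def call(object)
    return [] unless object.docstring.to_s.strip.empty?

    [Check::Offense.new(self.class.name, object.path, "has no docstring")]
  end
end

class UndocumentedParam < Check
  def call(object)
    (object.params - object.param_tags).map do |name|
      Check::Offense.new(self.class.name, object.path, "parameter #{name} lacks a @param tag")
    end
  end
end

objects = [
  DocumentedMethod.new(path: "Foo#bar", docstring: "", params: %w[a b], param_tags: %w[a])
]

Check.run_all(objects).each do |offense|
  puts "#{offense.object}: #{offense.message} [#{offense.check}]"
end
```

The enabled: argument is where per-project configuration would plug in, for example by filtering the registry according to a config file.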

Importance

YARD is the de facto standard of Ruby documentation (alongside RDoc, but yard as a tool supports RDoc too, so yard-junk can be useful for checking virtually any Ruby project's docs). Establishing a tool for docs quality checking (and auto-fixing if possible) seems to be of general use.

Skills and domains

While working on it, you'll need to dive really deep into YARD's internals and create a modular and pluggable architecture, like Rubocop with its cops.
