milton's People

Contributors

dependabot[bot], jasonrdsouza, mattx

Forkers

jasonrdsouza

milton's Issues

Replace Algolia with MeiliSearch

MeiliSearch looks pretty cool. It's written in Rust and open source. Not sure about its resource requirements, but maybe we can run it on a free instance or something. Lots of interesting features in the docs.

This will probably become more relevant once we start brushing up against the Algolia free tier limits.

Fix relative links

The readability service should go through all relative links in the document and make them absolute.
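As a rough illustration of the idea, here is a minimal Python sketch that resolves every `href` and `src` value against the page's URL using the standard library's `urljoin`. The function name `absolutize_links` is hypothetical, and the regex-based attribute matching is a simplification (it would also touch attributes such as `data-href`); the actual readability service would more likely do this with a proper HTML parser.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or src='...' and captures the prefix, quote char, and value.
# Assumption: attribute values are quoted; unquoted values are left untouched.
_LINK_ATTR = re.compile(r'''(\b(?:href|src)\s*=\s*)(["'])(.*?)\2''', re.IGNORECASE)


def absolutize_links(html, base_url):
    """Return html with relative href/src values resolved against base_url."""

    def fix(match):
        prefix, quote, value = match.groups()
        # urljoin leaves already-absolute URLs (https:, mailto:, ...) unchanged.
        return f"{prefix}{quote}{urljoin(base_url, value)}{quote}"

    return _LINK_ATTR.sub(fix, html)


if __name__ == "__main__":
    page = '<a href="/about">About</a> <img src="img.png">'
    print(absolutize_links(page, "https://example.com/blog/post"))
```

Relying on `urljoin` means root-relative paths, document-relative paths, and fragment-only links all follow the standard URL resolution rules without special cases.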

Deduplicate URLs based on text

When a new article is submitted, check whether an article with very high text similarity already exists, and if so, don't index it.

This would be more robust than trying to deduplicate URLs manually.
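One possible way to measure "very high text similarity" is Jaccard similarity over word shingles. The sketch below is only an assumption about how this could work: the function names, the shingle size of 3, and the 0.9 threshold are all hypothetical, and comparing against every stored article is linear in the number of articles, so a larger index would probably want something like MinHash instead.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_duplicate(new_text, existing_texts, threshold=0.9):
    """True if new_text is at least `threshold` similar to any existing text."""
    new = shingles(new_text)
    return any(jaccard(new, shingles(old)) >= threshold for old in existing_texts)


if __name__ == "__main__":
    stored = ["The quick brown fox jumps over the lazy dog"]
    print(is_duplicate("the quick brown fox jumps over the lazy dog!", stored))
    print(is_duplicate("An unrelated article about search engines", stored))
```

Because the comparison works on extracted text rather than URLs, the same article reached through tracking parameters or a mirror domain would still be caught.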

Auto-archive Submitted Links

When a link is submitted to Milton, it would be nice if we automatically submitted it to a web archiver like this to ensure that we have a reference to the original content in the face of link rot.

PDF support

It would be cool if we could submit links to PDFs and get a "reader" view similar to what happens for regular HTML webpages right now. Existing Milton-esque tools like Pocket don't support this use case for PDFs, which is especially frustrating on mobile and other small screens, where reading PDFs often requires lots of side-to-side scrolling.

There are various open source libraries, of which Mozilla's PDF.js seems like the most promising, but Apache Tika is also interesting since it supports a lot of formats via a single interface, which would be useful if we wanted to extend this functionality to other formats in the future (Microsoft Word documents come to mind).

For extra credit, it would be even cooler if Milton were smart enough to fetch the paper when given an arXiv or other paper aggregator link (similar to what it does for HN or Reddit?).
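For the arXiv case specifically, the abstract page and the PDF share the same identifier, so the mapping can be done on the URL alone. The following Python sketch assumes that layout (`/abs/<id>` for the abstract, `/pdf/<id>` for the PDF); the function name `arxiv_pdf_url` is hypothetical, the paper ID in the example is a placeholder, and other paper aggregators would each need their own rule.

```python
import re
from urllib.parse import urlparse

# Path of an arXiv abstract page: /abs/<paper id>, optionally with a trailing slash.
_ARXIV_ABS_PATH = re.compile(r"^/abs/(?P<paper_id>.+?)/?$")


def arxiv_pdf_url(url):
    """Map an arXiv abstract URL to its PDF URL, or return None if it is not one."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("arxiv.org", "www.arxiv.org"):
        return None
    match = _ARXIV_ABS_PATH.match(parsed.path)
    if match is None:
        return None
    return f"https://arxiv.org/pdf/{match.group('paper_id')}"


if __name__ == "__main__":
    # Placeholder identifier, used only to show the URL shapes.
    print(arxiv_pdf_url("https://arxiv.org/abs/1234.56789"))
```

Returning `None` for anything that is not an abstract link lets the caller fall back to the normal HTML or PDF handling.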

Index target page for aggregator links

When asked to index a comment page on an aggregator such as Reddit or Hacker News, we should index the thread the page is about, not the comment page.

We probably need to implement specialized logic for each aggregator.
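The per-aggregator logic could be organized as a small registry keyed by hostname. The Python sketch below is an assumption about the shape of such a registry, not Milton's actual design: the names `RESOLVERS`, `hn_item_api_url`, and `resolve_aggregator` are hypothetical. It only covers Hacker News, where it maps an item page to the public Hacker News API endpoint for that item; fetching that endpoint and reading the linked page from the returned JSON is left out.

```python
from urllib.parse import parse_qs, urlparse


def hn_item_api_url(url):
    """Map a Hacker News item page to its API endpoint, or None if not an item page."""
    parsed = urlparse(url)
    item_ids = parse_qs(parsed.query).get("id")
    if parsed.path != "/item" or not item_ids or not item_ids[0].isdigit():
        return None
    return f"https://hacker-news.firebaseio.com/v0/item/{item_ids[0]}.json"


# One resolver per aggregator hostname; Reddit and others would be added here.
RESOLVERS = {
    "news.ycombinator.com": hn_item_api_url,
}


def resolve_aggregator(url):
    """Return the API endpoint describing the submitted item, or None if unknown."""
    resolver = RESOLVERS.get(urlparse(url).netloc.lower())
    return resolver(url) if resolver else None


if __name__ == "__main__":
    print(resolve_aggregator("https://news.ycombinator.com/item?id=8863"))
    print(resolve_aggregator("https://example.com/some/article"))
```

Keeping each aggregator behind its own function means a new site can be supported by adding one entry to the registry, and URLs from unknown hosts fall through to regular indexing.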
