
remarkableocrsync's Introduction

reMarkable tablet sync and page OCR

A moderate hack for syncing notebooks off the reMarkable, converting them to PDF, and running OCR on the pages. It attempts to convert only changed pages and uses AWS Textract for OCR. It can sync from the cloud using rmapi or directly via SSH-over-USB, and switches seamlessly between the two sync mechanisms. Sync is one-way only: down, from the tablet.

Loosely tested with a reMarkable 2 on Linux and macOS (Intel).

I admit this is not entirely end-user-friendly. If you know your way around a Unix shell, you should be ok.

I wrote this in a couple of evenings and don't have the time to support it properly. If you like it, help me make it better.

Requirements

  • [brew/apt/dnf] install imagemagick jq awscli
  • pip install boto3 pypdf2
  • rm2pdf built and installed in your path
  • rmapi built and installed in your path
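A quick way to confirm the requirements above are in place is a small preflight check. This is only a sketch: the function names are made up for illustration, and it assumes ImageMagick is reachable as either `convert` or `magick` (which one the script calls is not stated here).

```python
import importlib.util
import shutil

# Executables from the requirements list. Each tuple lists acceptable
# alternatives: ImageMagick installs `convert` (v6) or `magick` (v7).
REQUIRED_TOOLS = [("convert", "magick"), ("jq",), ("aws",), ("rm2pdf",), ("rmapi",)]
# Import names for the pip packages boto3 and pypdf2.
REQUIRED_MODULES = ["boto3", "PyPDF2"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the first name of each tool group with no executable on PATH."""
    return [names[0] for names in tools if not any(which(n) for n in names)]


def missing_modules(modules=REQUIRED_MODULES, find_spec=importlib.util.find_spec):
    """Return the Python modules that cannot be found for import."""
    return [m for m in modules if find_spec(m) is None]


if __name__ == "__main__":
    for name in missing_tools():
        print(f"missing executable: {name}")
    for name in missing_modules():
        print(f"missing Python module: {name}")
```

If it prints nothing, everything in the list was found.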

Setup

AWS Textract Handwriting Recognition

aws configure

This may help (also look at pricing for OCR)
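To see whether your AWS credentials and Textract access work before running a full sync, something like the following can help. It is a minimal sketch, not the script's own code: it assumes the synchronous `detect_document_text` call from boto3 is a reasonable stand-in for what the script does, and `ocr_page` and `extract_lines` are hypothetical helper names.

```python
def extract_lines(response):
    """Return the text of every LINE block in a Textract response, in order."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def ocr_page(image_path, client=None):
    """OCR one page image (PNG or JPEG) and return its text, one line per row."""
    if client is None:
        # Imported lazily so extract_lines can be used without boto3 installed.
        import boto3

        client = boto3.client("textract")
    with open(image_path, "rb") as f:
        response = client.detect_document_text(Document={"Bytes": f.read()})
    return "\n".join(extract_lines(response))
```

Calling `ocr_page("page.png")` on a single exported page uses the credentials and region saved by aws configure, and each call is billed by AWS.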

For Web API sync

The first time you run the script, it will prompt you to get an authorization code from reMarkable. That's all.

For SSH-over-USB sync

Set up passwordless SSH and rsync on your tablet.

Example .ssh/config section:

Host remarkable
    User root
    ControlMaster no
    ControlPath none
    Hostname 10.11.99.1
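With that config in place, you can check that passwordless login really works before trying a sync. The sketch below only builds and runs standard `ssh` and `rsync` commands. The remote directory is an assumption based on where reMarkable tablets commonly keep notebook data (it is not stated in this README), and the function names are hypothetical.

```python
import subprocess

HOST = "remarkable"  # the Host alias from the .ssh/config example
# Assumed location of notebook files on the tablet; adjust if yours differs.
REMOTE_DIR = "/home/root/.local/share/remarkable/xochitl/"


def ssh_check_command(host=HOST):
    """Command that succeeds only if key-based login works without a prompt."""
    # BatchMode=yes makes ssh fail instead of asking for a password.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def rsync_command(dest, host=HOST, remote_dir=REMOTE_DIR):
    """Command for a one-way copy from the tablet into dest (dry run only)."""
    return ["rsync", "-a", "--dry-run", f"{host}:{remote_dir}", dest]


def can_connect(host=HOST):
    """Run the ssh check; True means passwordless login is set up."""
    result = subprocess.run(ssh_check_command(host), capture_output=True)
    return result.returncode == 0
```

`can_connect()` returning False usually means the key is not installed on the tablet or the USB network interface is down. The rsync command keeps `--dry-run` so nothing is copied while you are testing.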

Notebook Selection

Enter the names of the notebooks you want to sync, exactly as shown on the device, in notebooks.conf. Make sure you add a newline at the end of the file or the last notebook won't be processed. Example:

Quick sheets
My Other Notebook
Work Notes
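The trailing-newline requirement is easy to miss, since many editors hide it. A shell `while read` loop skips a final line that has no newline, which would explain the behavior, though that the script reads the file this way is my assumption. A small sketch to check and fix the file (function names are hypothetical):

```python
from pathlib import Path


def needs_newline(path):
    """True if the file has content but does not end with a newline."""
    data = Path(path).read_bytes()
    return bool(data) and not data.endswith(b"\n")


def ensure_trailing_newline(path):
    """Append a newline if one is missing; return True if the file changed."""
    if not needs_newline(path):
        return False
    with open(path, "ab") as f:
        f.write(b"\n")
    return True
```

Running `ensure_trailing_newline("notebooks.conf")` before a sync leaves a correct file untouched and repairs one that is missing the final newline.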

Usage

From the repo folder, update the notebook list as described above, then run ./rmocrsync.sh ssh or ./rmocrsync.sh web. It should work out of the box.

If it completes successfully, take a look in the notebooks folder. You should have a folder of OCR text files (one file per page), and an annotated PDF that embeds the text in each page.
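Since the OCR output is one text file per page, searching your handwriting becomes a plain text search. Here is a rough sketch, assuming the text files end in `.txt` (the actual file naming is not described here) and using a hypothetical `search_pages` helper:

```python
from pathlib import Path


def search_pages(root, term):
    """Yield (file, line number, line) for each case-insensitive match."""
    needle = term.lower()
    for path in sorted(Path(root).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                yield path, number, line


if __name__ == "__main__":
    for path, number, line in search_pages("notebooks", "meeting"):
        print(f"{path}:{number}: {line}")
```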

Note: The files in the meta folder are used to track changed pages across sync sessions. You probably shouldn't mess with these.
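For a feel of how changed-page tracking can work in general, here is a minimal sketch that hashes each page file and compares it against the hashes saved on the previous run. This is not the script's actual method and says nothing about the real format of the meta files. The JSON state file and both function names are invented for illustration.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's contents, as a hex string."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def changed_pages(page_dir, state_file):
    """Return names of pages that are new or modified since the last call.

    The state file should live outside page_dir so it is not hashed itself.
    """
    state_path = Path(state_file)
    previous = json.loads(state_path.read_text()) if state_path.exists() else {}
    current = {
        p.name: file_digest(p)
        for p in sorted(Path(page_dir).iterdir())
        if p.is_file()
    }
    changed = [name for name, digest in current.items() if previous.get(name) != digest]
    # Record the new state so the next run only reports fresh changes.
    state_path.write_text(json.dumps(current, indent=2))
    return changed
```

The first call reports every page, and later calls report only pages whose bytes changed. It also shows why editing tracking files by hand is risky: a wrong or missing entry makes a page look changed (and get re-OCRed) or unchanged (and get skipped).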

OCR Example

OCR text
