karlicoss / orger

Tool to convert data into searchable and interactive org-mode views

Home Page: https://beepb00p.xyz/orger.html

License: MIT License

Python 98.98% Shell 1.02%

org-mode pkm plaintext instapaper hypothesis pinboard kobo orger

orger's Introduction

Orger converts your data into a hierarchical Org-mode representation to allow for quick access and search.

I write in detail about use cases and motivation for it here; this readme is mostly the setup manual!

Installing

simplest: install from PyPI: pip3 install --user orger
After that you should be able to run orger modules via python3 -m:
```
python3 -m orger.modules.instapaper --help
```
editable install
This will allow you to quickly prototype and debug: local changes to the code are reflected immediately.
- clone: git clone https://github.com/karlicoss/orger /path/to/orger
- cd /path/to/orger
- pip3 install --user -e .
- after that you can use python3 -m orger.modules.modulename, the same way as in the previous section, or run modules/modulename.py directly
NOTE: most Orger modules rely on the HPI package for input data.
Please refer to the HPI install guide/documentation and make sure the corresponding data providers work (e.g. via the hpi doctor command).
[optional]: install pandoc; it might give you better org-mode output for some modules.
If you have pandoc installed but don’t want a module to use it, pass the --disable-pandoc flag to it.
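
A quick preflight check along these lines can confirm, for the interpreter you actually run modules with, that HPI's `my` package is importable and that pandoc is on your PATH. This is only a sketch and not part of Orger: the `preflight` function name is made up for illustration, and it does not replace hpi doctor, which checks the data providers themselves.

```python
import importlib.util
import shutil
import sys


def preflight() -> dict:
    """Report basic prerequisites for running Orger modules (hypothetical helper)."""
    return {
        # Which interpreter is running: useful when several Pythons are installed.
        "python": sys.executable,
        # HPI installs as the top-level package `my`.
        "hpi_importable": importlib.util.find_spec("my") is not None,
        # pandoc is optional; this only checks that the binary is on PATH.
        "pandoc_on_path": shutil.which("pandoc") is not None,
    }


if __name__ == "__main__":
    for name, value in preflight().items():
        print(f"{name}: {value}")
```

If `hpi_importable` is False, HPI is most likely installed for a different Python than the one printed on the first line.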

Usage and examples

I usually run Orger modules overnight via cron.

- see modules for all available modules
- Most modules use the HPI package for accessing the data. You can learn about setting it up and using it here.
- several examples here
- demonstration of Roam Research module, including a screencast
- pocket_demo: documented literate demo

and a short demo:

from orger import Mirror
from orger.inorganic import node, link
from orger.common import dt_heading

import my.coding.github as github_data

class Github(Mirror):
  def get_items(self):
    for event in github_data.get_events():
      yield node(dt_heading(event.dt, event.summary))

Github.main()

That ten line program, when run (./modules/github.py), results in a file Github.org:

# AUTOGENERATED BY /code/orger/github.py

* [2016-10-30 Sun 10:29] opened PR Add __enter__ and __exit__ to Pool stub
* [2016-11-10 Thu 09:29] opened PR Update gradle to 2.14.1 and gradle plugin to 2.1.1
* [2016-11-16 Wed 20:20] commented on issue Linker error makes it impossible to use a stack-provided ghc
* [2016-12-30 Fri 11:57] commented on issue Fix performance in the rare case of hashCode evaluating to zero
* [2019-09-21 Sat 16:51] commented on issue Tags containing letters outside of a-zA-Z
....

Types of modules

Mirror
Mirror (old name StaticView): mirrors all data from a source and is regenerated from scratch every time, hence read-only.

You can run such a module with:
```
./orger_module.py --to /path/to/output.org
```
Queue
Queue (old name InteractiveView): works as a queue, only previously unseen items from the data source are added to the output org-mode file.

To keep track of previously seen items, it uses a separate JSON state file.
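
The idea can be sketched in plain Python as follows. This is a simplified illustration, not Orger's actual implementation: the `feed_new` helper is hypothetical, and the only detail borrowed from Orger is that the state is a JSON file mapping item keys to a string representation of the item.

```python
import json
from pathlib import Path


def feed_new(state_path: Path, items):
    """Return only the (key, value) pairs not yet recorded in the state file.

    Hypothetical helper: newly seen keys are written back to the state file,
    so a later call with the same items returns nothing.
    """
    seen = json.loads(state_path.read_text()) if state_path.exists() else {}
    fresh = []
    for key, value in items:
        key = str(key)  # JSON object keys are always strings
        if key in seen:
            continue  # already added to the org file on a previous run
        seen[key] = repr(value)
        fresh.append((key, value))
    state_path.write_text(json.dumps(seen, indent=1, sort_keys=True))
    return fresh
```

Only the items returned by such a function would be appended to the org file, which is why edits you make to existing entries are left alone on later runs.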

A typical use case is a todo list or a content processing queue. You can use such a module as you would any other org-mode file: schedule/refile/comment/set priorities, etc.

Typically you’d want to use these as a source of tasks for your todo list. See ip2org as an example.

You can run such a module as:
```
./orger_module.py --to /path/to/output.org
```
This will keep the state file in your user config dir (e.g. ~/.config/orger/).

Alternatively, you can pass the state file explicitly:
```
./orger_module.py --to /path/to/output.org --state /path/to/state.json
```

FAQ

Why are the files output by some modules read only?
Mirror type modules output read-only files so that you don’t modify them by accident: they are overwritten every time.

If you want to temporarily lift this restriction (e.g. to experiment with the format), you can use chmod +w, or M-x toggle-read-only in Emacs.
How is it different from Memacs?
The main reason Orger exists is because I discovered Memacs after I wrote Orger! One day we might merge them, or at least reuse org-mode formatting routines.

That said there are some differences at the moment:
- Memacs is more of a lifelogging utility, generating a linear output with the intent to be used with your org agenda
- Orger’s Mirror modules are meant to be more of a full local reflection of a data source, preserving the hierarchy as much as possible
- Orger’s Queue modules: I believe they don’t have a Memacs analogue (but please correct me if I’m wrong)
- Orger modules are slim and rely on HPI to encapsulate data access. But you can also use HPI with Memacs; please ping me if you set up such an integration!
I want active timestamps for org-agenda integration
Pass the --timestamps argument to the module, for example:
```
modules/polar.py --timestamps active
```
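
For context, Org-mode writes active timestamps in angle brackets and inactive ones in square brackets, and only active timestamps make an entry show up in the agenda. The helper below is a hypothetical illustration of the two forms, not Orger's own formatting code.

```python
from datetime import datetime


def org_timestamp(dt: datetime, active: bool = False) -> str:
    """Format dt as an Org-mode timestamp (illustrative helper, not Orger's API)."""
    # %a is locale dependent; with Python's default C locale it yields Mon, Tue, ...
    body = dt.strftime("%Y-%m-%d %a %H:%M")
    return f"<{body}>" if active else f"[{body}]"


when = datetime(2016, 10, 30, 10, 29)
print(org_timestamp(when))               # inactive form, as in the Github.org sample
print(org_timestamp(when, active=True))  # active form, picked up by org-agenda
```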

Similar projects

Memacs by novoid


orger's Issues

Hypothesis: too heavy file

Hi, I have been using orger for my hypothesis database. However, the org file has become too big at this point.

Do you know of a way to start syncing into a second file after the file has become too big? (or start syncing into a second file from a predefined time point instead of syncing into the same file)? Thanks!

pandoc

Could I use orger to convert from docx or odt to org, just like pandoc?

Problem with sorting json state file

When I run this code for the first time, it works properly and creates the org file and
a json state file.

from orger import Queue
from orger.inorganic import node, link
from orger.common import todo

from canvas2org import assignments


class CanvasTodos(Queue):
    def get_items(self) -> Queue.Results:
        for assignment in assignments():
            yield assignment.id, todo( #assignment.id is an integer id
                dt=assignment.due_at_date,
                heading=assignment.name)


if __name__=="__main__":
    CanvasTodos.main()

When I run it the second time, it throws this error:

Traceback (most recent call last):
  File "/home/calvin/canvas-orgmode/main.py", line 27, in <module>
    CanvasTodos.main()
  File "/home/calvin/canvas-orgmode/env/lib/python3.9/site-packages/orger/org_view.py", line 258, in main
    inst._run(
  File "/home/calvin/canvas-orgmode/env/lib/python3.9/site-packages/orger/org_view.py", line 235, in _run
    state.feed(
  File "/home/calvin/canvas-orgmode/env/lib/python3.9/site-packages/orger/state.py", line 67, in feed
    self[key] = repr(value)
  File "/home/calvin/canvas-orgmode/env/lib/python3.9/site-packages/orger/state.py", line 48, in __setitem__
    json.dump(current, fo, indent=1, sort_keys=True)
  File "/home/calvin/miniconda3/lib/python3.9/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/home/calvin/miniconda3/lib/python3.9/json/encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/home/calvin/miniconda3/lib/python3.9/json/encoder.py", line 353, in _iterencode_dict
    items = sorted(dct.items())
TypeError: '<' not supported between instances of 'int' and 'str'

However, if I change the assignment.id to an assignment.name, a string value,
everything works properly and there are no errors.

The json state file generated by the first code looks like this:

{
 "458483": "OrgNode(heading='Assignment 1', todo='TODO', tags=(), scheduled=datetime.date(2022, 5, 27), properties={'CREATED': '[2022-05-27 Fri 06:59]'}, body=None, children=(), escaped=False)",
 "459994": "OrgNode(heading='Assignment 2', todo='TODO', tags=(), scheduled=datetime.date(2022, 6, 1), properties={'CREATED': '[2022-06-01 Wed 06:59]'}, body=None, children=(), escaped=False)"
}

It looks like the json sorter is having trouble comparing the original integers to the json integers encoded in strings.
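
A minimal reproduction of that diagnosis using only the standard library, assuming the state is loaded back from JSON (where object keys are always strings) and integer keys are then added to it on the second run:

```python
import json

# State as it comes back from the JSON file: object keys are always strings.
state = json.loads('{"458483": "Assignment 1"}')

# A second run adds an item keyed by the original integer id.
state[459994] = "Assignment 2"

try:
    json.dumps(state, indent=1, sort_keys=True)
    error = None
except TypeError as e:
    error = e  # str and int keys cannot be ordered against each other

# Converting every key to str before dumping avoids the mixed-type sort.
normalised = {str(k): v for k, v in state.items()}
dumped = json.dumps(normalised, sort_keys=True)
```

If this is indeed the cause, yielding str(assignment.id) as the key in get_items should sidestep the error, which would also match the observation that string keys work.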

Error upon installation

Hi! I'm finally getting to install this whole thing :)
After installing the package, I run

python3 -m orger.modules.instapaper --help

But get an error:

/Users/gahis/.local/lib/python3.7/site-packages/my/core/__init__.py:45: UserWarning: my.config package isn't found! (expected at /Users/gahis/.config/my). This is likely to result in issues.
  warnings.warn(f"my.config package isn't found! (expected at {mycfg_dir}). This is likely to result in issues.")
Traceback (most recent call last):
  File "/Users/gahis/opt/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/gahis/opt/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/gahis/.local/lib/python3.7/site-packages/orger/modules/instapaper.py", line 6, in <module>
    from my.instapaper import pages
  File "/Users/gahis/.local/lib/python3.7/site-packages/my/instapaper.py", line 7, in <module>
    from my.config import instapaper as config
ImportError: cannot import name 'instapaper' from 'my.config' (/Users/gahis/.local/lib/python3.7/site-packages/my/config/__init__.py)

Any help welcome, thanks.

Make it easier to set up orger

Some thoughts here karlicoss/kobuddy#1 (comment)

Include/support/contribute orgformat

Hi,

At least in #1 you mentioned positive effects in sharing code with https://github.com/novoid/Memacs

In recent days, I've been busy migrating https://github.com/novoid/orgformat to a repository of its own. I've added good documentation, valuable unit tests for all functions, and even type annotations to the library itself (playing around with mypy, which I recommend).

You may find some of the functions interesting for your project and I'm happy to receive pull requests of functions that you'll move out from your project to this general purpose Org mode library as well - if this is something you are thinking about.

README: similar projects

Hi,

https://github.com/novoid/Memacs is very similar to orger.

My README will feature a link to your project under "Similar Projects". You might also think of linking it back to Memacs and/or Memex (also mentioned in my README).

This way, people looking for a solution to their problem might find alternative projects that better match their use cases.

Improve annotation formatting

The heading structure right now ends up looking like a wall of text, not very easy to get an overall context of the highlights.

It would be nice to organize them according to chapters. So the structure would be:

Book name
** Chapter 1
highlight 1
** Chapter 1
highlight 2
** Chapter 2
highlight 1
...

Plus, the properties drawer can (maybe) have additional information about chapter progress.
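
To make the proposal concrete, here is a small plain-Python sketch of the grouping. It does not use Orger's API: the `render_book` function and the (chapter, text) input shape are assumptions for illustration, and the heading levels are one possible reading of the structure above.

```python
from itertools import groupby


def render_book(book: str, highlights) -> str:
    """Render highlights grouped under chapter headings (hypothetical helper).

    highlights: iterable of (chapter, text) pairs in reading order.
    """
    lines = [f"* {book}"]
    # groupby merges consecutive highlights that share a chapter.
    for chapter, group in groupby(highlights, key=lambda h: h[0]):
        lines.append(f"** {chapter}")
        lines.extend(f"*** {text}" for _, text in group)
    return "\n".join(lines)


print(render_book("Book name", [
    ("Chapter 1", "highlight 1"),
    ("Chapter 1", "highlight 2"),
    ("Chapter 2", "highlight 1"),
]))
```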

No module named 'my'

Hi there,

I successfully did the install (via a git clone) and got no errors or warnings, but when I try to run a test, I get the following traceback:

➜  orger git:(master) python3 -m orger.modules.instapaper --help
Traceback (most recent call last):
  File "/usr/local/Cellar/python@3.9/3.9.0_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/Cellar/python@3.9/3.9.0_1/Frameworks/Python.framework/Versions/3.9/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/mediapathic/Library/Python/3.9/lib/python/site-packages/orger/modules/instapaper.py", line 6, in <module>
    from my.instapaper import pages
ModuleNotFoundError: No module named 'my'

I would assume that this has something to do with various old versions of python on my system (a perpetual issue), but calling the test without python3 doesn't work at all, so maybe I'm wrong on that?

Any idea what might be going on here?

Thanks!

Make space prefix of body optional

"inorganic.py" inserts a single space at the start of each line in a body. I guess this is intended to improve the readability of the raw org file. However, it interferes with use cases where I want to modify the org file and then convert it back to whatever the source format was.

In principle it would be possible to strip out the spaces at the beginning of lines before conversion, but that seems like something that would blow up sooner or later.

I propose removing the indentation, or at least making it optional. I configure org-mode to display headings with indentation anyway, so the extra indentation does not help.
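
As a rough sketch of what an optional prefix could look like (this is not the actual code in inorganic.py; the `render_body` function and its `indent` parameter are hypothetical):

```python
def render_body(body: str, indent: str = " ") -> str:
    """Prefix each non-empty line of body with indent (hypothetical helper)."""
    return "\n".join(
        indent + line if line else line  # leave blank lines untouched
        for line in body.splitlines()
    )


print(render_body("first line\nsecond line"))             # current behaviour: one space
print(render_body("first line\nsecond line", indent=""))  # proposed opt-out
```

With `indent=""` the body round-trips unchanged, which is what converting the org file back to the source format would need.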