
Automatic RoamResearch backup


This script helps you backup your RoamResearch graphs!

This script automatically:

  • Downloads a markdown archive of your RoamResearch workspace
  • Downloads a JSON archive of your RoamResearch workspace
  • Downloads the full EDN export of your RoamResearch workspace
  • Unzips them into your git directory
  • Formats your markdown, including backlinks
  • Commits and pushes the changes to GitHub

What's new

v0.2:

  • Uses the Selenium library; roam-to-git is now much faster and more stable πŸ”₯
  • Downloads the EDN archive

Demo

See it in action! This repo is updated using roam-to-git.

πŸš€πŸš€ NEW πŸš€πŸš€: The unofficial backup of the official RoamResearch Help Database

Why use it

  • You have a backup if RoamResearch loses some of your data.
  • You have a history of your notes.
  • You can browse your GitHub repository easily from a mobile device.

Use it with GitHub Actions (recommended)

Note: Erik Newhard's guide shows an easy way of setting up GitHub Actions without using the CLI.

Create a (private) GitHub repository for all your notes

With gh: gh repo create notes (yes, it's private)

Or manually
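If you prefer the manual route, a minimal sketch (YOUR_USER is a placeholder; create the empty private repository in the GitHub web UI first):

```shell
# Initialize a local repository and point it at the (still empty) GitHub repo.
git init notes
cd notes
git remote add origin git@github.com:YOUR_USER/notes.git
```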

Configure GitHub secrets

  • Go to github.com/your/repository/settings/secrets
Regarding Google Account Authorization

Due to the limitations of OAuth and complexities with tokens, we are currently unable to snapshot accounts that are set up with the Login with Google option.

To set up a backup in this case, you will need to turn your Google account into a native account, which is as simple as using the reset-password link found in Roam.


Once you've reset your password, use the following steps to finish setting up your backup!

Configuring GitHub Secrets

Add 3 separate secrets with the following names:

ROAMRESEARCH_USER

ROAMRESEARCH_PASSWORD

ROAMRESEARCH_DATABASE

  • Refer to env.template for more information

  • When entering the values, do not use quotation marks or assignments; paste the raw value only


Add GitHub action

cd notes
mkdir -p .github/workflows/
curl https://raw.githubusercontent.com/MatthieuBizien/roam-to-git-demo/master/.github/workflows/main.yml > \
    .github/workflows/main.yml
git add .github/workflows/main.yml
git commit -m "Add .github/workflows/main.yml"
git push --set-upstream origin master

Check that the GitHub Action works

  • Go to github.com/your/repository/actions
  • Your CI job should start in a few seconds

Note:

If the backup does not start automatically, try pushing to the repository again.

Use with GitLab CI

This section is based on this article from the GitLab blog: https://about.gitlab.com/blog/2017/11/02/automating-boring-git-operations-gitlab-ci/

Create a project

Create a project for your notes. We will refer to it as YOUR_USER/YOUR_PROJECT.

Create key pair for pushing commits

Generate a new key pair that will be used by the CI job to push the new commits.

$ ssh-keygen -f gitlab-ci-commit
Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in gitlab-ci-commit
Your public key has been saved in gitlab-ci-commit.pub
The key fingerprint is:
SHA256:HoQUcbUPJU2Ur78EineqA6IVljk8ZD9XIxiGFUrBues agentydragon@pop-os
The key's randomart image is:
+---[RSA 3072]----+
|   .o=O*..o++.   |
|   .*+.o. o+o    |
|   +.=. .oo. .   |
|    X o..  o  .  |
|   . = oS   o.   |
|    + .. o ...   |
|   + . .o o ...  |
|  . E   .. o ..  |
|        .o.   .. |
+----[SHA256]-----+

DO NOT commit the private key (gitlab-ci-commit).

Add the public key as a deploy key

This step allows the CI job to push when identified by the public key.

Go to Project Settings β†’ Repository (https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/settings/repository) β†’ Deploy Keys.

Paste the content of the public key file (gitlab-ci-commit.pub), and enable write access for the public key.

Click "Add Key".

Add the private key as a CI variable

This step gives the CI job the private key so it can authenticate against GitLab.

Go to Project Settings β†’ Pipelines (https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/settings/ci_cd) β†’ Variables.

Click "Add variable", with name GIT_SSH_PRIV_KEY, and paste in the content of the private key file (gitlab-ci-commit). You probably want to mark "Protect". You might want to look up GitLab docs on protected branches.

Click "Add variable".

Also add the following variables with appropriate values:

  • ROAMRESEARCH_USER
  • ROAMRESEARCH_PASSWORD
  • ROAMRESEARCH_DATABASE

Create a gitlab_known_hosts file

In your repo, create and commit a gitlab_known_hosts file containing the needed SSH known_hosts entry/entries for the GitLab instance. It will be used by the CI job to check that it's talking to the right server.

This should work as of 2021-01-04:

# Generated by @agentydragon on 2021-01-04 by `git fetch`ing a GitLab repo
# with an empty ~/.ssh/known_hosts file.

|1|zIQlCRxv+s9xhVCAfGL2nvaZqdY=|jbPpD9GNaS/9Z4iJzE9gw2XCo20= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY=
|1|uj60xYhsW2vAM8BpQ+xZz51ZarQ=|BNIJlvu4rNcmrxd60fkqpChrf9A= ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBFSMqzJeV9rUzU4kWitGjeR4PWSa29SPqJ1fVkhtj3Hw9xjLVXVYrU9QlYWrOLXBpQ6KWjbjTDTdDkoohFzgbEY=
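Instead of trusting this snapshot, you can regenerate the entries yourself with ssh-keyscan (-H hashes the hostnames, matching the format above); verify the fingerprints against GitLab's published host keys before committing:

```shell
# Fetch and hash the current SSH host keys for gitlab.com.
ssh-keyscan -H gitlab.com > gitlab_known_hosts
```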

Create .gitlab-ci.yml

Create a .gitlab-ci.yml file:

backup:
  when: manual
  before_script:
    - 'which ssh-agent || ( apt-get update -y && apt-get install openssh-client -y )'
    - eval $(ssh-agent -s)
    - ssh-add <(echo "$GIT_SSH_PRIV_KEY")
    - git config --global user.email "[email protected]"
    - git config --global user.name "roam-to-git automated backup"
    - mkdir -p ~/.ssh
    - cat gitlab_known_hosts >> ~/.ssh/known_hosts

    # (Taken from: https://github.com/buildkite/docker-puppeteer/blob/master/Dockerfile)
    # We install Chrome to get all the OS level dependencies, but Chrome itself
    # is not actually used as it's packaged in the pyppeteer library.
    # Alternatively, we could include the entire dep list ourselves
    # (https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md#chrome-headless-doesnt-launch-on-unix)
    # but that seems too easy to get out of date.
    - apt-get install -y wget gnupg ca-certificates
    - wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
    - echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list
    - apt-get update
    - apt-get install google-chrome-stable libxss1 python3-pip -y

    - pip3 install git+https://github.com/MatthieuBizien/roam-to-git.git

    # TODO(agentydragon): Create and publish Docker image with all deps already
    # installed.
  script:
    # Need to clone the repo again over SSH, since by default GitLab clones
    # the repo for CI over HTTPS, for which we cannot authenticate pushes via
    # pubkey.
    - git clone --depth=1 [email protected]:YOUR_USER/YOUR_PROJECT
    - cd YOUR_PROJECT

    # --no-sandbox needed because Chrome refuses to run as root without it.
    - roam-to-git --browser-arg=--no-sandbox .

Commit and push.

To run the pipeline

Go to https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/jobs and press the Play button.

Shortly afterwards, you should see an automated commit added to master.

Scheduled backups

To run the script, say, every hour, go to the project's CI / CD β†’ Schedules (https://gitlab.com/YOUR_USER/YOUR_PROJECT/-/pipeline_schedules). Click "New schedule", fill out the form (for example with a cron expression for every 15 minutes), and submit.

Use it locally

Note: if your file system is not case-sensitive, notes whose names differ only in case will not all be backed up.

Install Roam-To-Git

With pipx (if you don't know pipx, you should look at it, it's wonderful!)

pipx install git+https://github.com/MatthieuBizien/roam-to-git.git

Create a (private) GitHub repository for all your notes

With gh: gh repo create notes (yes, it's private)

Or manually

Then run git push --set-upstream origin master

Configure environment variables

  • curl https://raw.githubusercontent.com/MatthieuBizien/roam-to-git/master/env.template > notes/.env
  • Fill the .env file: vi .env
  • Ignore it: echo .env > notes/.gitignore; cd notes; git add .gitignore; git commit -m "Initial commit"
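After filling it in, the .env file should look roughly like this (placeholder values; env.template is the authoritative reference for the variable names):

```shell
# notes/.env -- placeholder values; do not commit this file
ROAMRESEARCH_USER=you@example.com
ROAMRESEARCH_PASSWORD=your-roam-password
ROAMRESEARCH_DATABASE=your-database-name
```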

Manual backup

  • Run the script: roam-to-git notes/
  • Check your GitHub repository, it should be filled with your notes :)

Automatic backup

One-liner to run it with a cron every hour: echo "0 * * * * '$(which roam-to-git)' '$(pwd)/notes'" | crontab -

NB: there are known issues running this from cron on macOS.

Debug

Making roam-to-git foolproof is hard: it depends on Roam, on GitHub Actions or the local environment, on not-so-stable software (pyppeteer, we still love you πŸ˜‰), and on correct user configuration.

For debugging, please try the following:

  • Check that the environment variables ROAMRESEARCH_USER, ROAMRESEARCH_PASSWORD, and ROAMRESEARCH_DATABASE are correctly set up
  • Log into Roam using the username and the password. You may want to request a new password if you have enabled Google Login, as this has solved some users' problems.
  • Run roam-to-git --debug to check that authentication and download work
  • Look at the traceback
  • Look for similar issues
  • If nothing else works, create a new issue with as many details as possible. I will try my best to understand and help you, but no SLA is promised πŸ˜‡

Task list

Backup all RoamResearch data

  • Download automatically from RoamResearch
  • Create Cron
  • Write detailed README
  • Publish the repository on GitHub
  • Download images (they are currently visible on GitHub, but not included in the archive, so they are not saved in the repository πŸ˜•)

Format the backup to have a good UI

Link formatting to be compatible with GitHub markdown

  • Format [[links]]
  • Format #links
  • Format attribute::
  • Format [[ [[link 1]] [[link 2]] ]]
  • Format ((link))

Backlink formatting

  • Add backlinks reference to the notes files
  • Integrate the context into the backlink
  • Manage / in file names

Other formatting

  • Format {{TODO}} to be compatible with GitHub markdown
  • Format {{query}}

Make it for others

Some ideas I don't need myself, but PRs are welcome πŸ˜€

  • Test it/make it work on Windows
  • Pre-configure a CI server so it can run every hour without a computer. Thanks @Stvad for #4!

roam-to-git's People

Contributors

adithyabsk, agentydragon, bkono, caffo, everruler12, hezhizhen, ianqs, jborichevskiy, jrk, matthieubizien, mbakht, prayashm, sumukshashidhar, tdhtttt


roam-to-git's Issues

Slashes in page names are incorrectly assumed to be folder separators in the script

Describe the bug
Slashes in page names are incorrectly assumed to be folder separators in the script.

When running as a GitHub Action, a leading forward slash in a page name creates an error running the script. It seems to try to create a file at the filesystem root ("/"), instead of in the local directory, which is denied.

For example, I had a page called "/r/PersonalFinanceCanada" related to the subreddit, which caused the below error. When I was able to run it, a page called "What to do / what to prioritize" ended up creating a folder called "What to do " and a file called " what to prioritize.md".

To Reproduce
Steps to reproduce the behaviour:

  1. Create a new page in your Roam database with a leading "/"
  2. Execute the backup script
  3. The script will fail, trying to write to a file at the root of the filesystem instead of in the local directory.

Expected behavior
Page names should not affect the location of the file when backed up to git. Pages with special characters in their name ("/", mainly) should have those characters escaped.
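A sketch of the escaping idea (illustration only; roam-to-git would do this in its Python code, and the replacement character here is an arbitrary choice):

```shell
# Replace "/" in a page title before using it as a file name.
title="What to do / what to prioritize"
safe=$(printf '%s' "$title" | sed 's,/,-,g')
printf '%s.md\n' "$safe"   # prints "What to do - what to prioritize.md"
```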

Traceback
https://gist.github.com/phildenhoff/e408ba8e8dbd89dfc158e774b8527570

2020-07-14T15:46:34.5618020Z PermissionError: [Errno 13] Permission denied: '/r'

Run roam-to-git --debug notes/ and report what you get.
It should open a Chrome front-end and do the scraping. The repository content will not be modified. If applicable, add screenshots to help explain your problem.

Please complete the following information:

  • OS: GitHub
  • Do you use Github Action? Yes
  • Do you use multiple Roam databases? No
  • Did roam-to-git previously work for you? When precisely did it stop working? No. Until I removed the leading slash on the page, roam-to-git did not work. However, removing that slash did make the program work.
  • Do backup runs work intermittently? As far as I can tell, they always work now.

Additional context
Add any other context about the problem here.

assert dot_button is not None

The GitHub Action worked flawlessly the first time (i.e. when I set it up) but has been failing on every hourly execution since then.

2020-05-05 13:08:58.683 | DEBUG    | roam_to_git.scrapping:download_rr_archive:79 - Closed browser json
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.2/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 71, in main
    scrap(markdown_zip_path, json_zip_path, config)
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 253, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 69, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config,
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 123, in _download_rr_archive
    assert dot_button is not None
AssertionError
2020-05-05 13:08:58.732 | DEBUG    | roam_to_git.scrapping:_kill_child_process:216 - Terminate child process [psutil.Process(pid=2935, name='chrome', started='13:07:00'), psutil.Process(pid=2941, name='chrome', started='13:07:01'), psutil.Process(pid=2943, name='chrome', started='13:07:01'), psutil.Process(pid=2966, name='chrome', started='13:07:04')]
##[error]Process completed with exit code 1.

Backup fails - may be dupe of #46 & #47

Describe the bug
A clear and concise description of what the bug is.

Backup from actions has begun to fail repeatedly after 200+ successful commits.

To Reproduce
Steps to reproduce the behavior:

  1. Either allow the GitHub Action to run as scheduled, or run it manually

  2. See error in Gist here: https://gist.github.com/jmsidhu/875f9f89d5d8592cd463791168aa697c

Expected behavior
Expected action to run per design

Traceback
https://gist.github.com/jmsidhu/875f9f89d5d8592cd463791168aa697c

Run roam-to-git --debug notes/ and report what you get.
Not quite sure how to do this

Please complete the following information:

  • OS: Win10 & iOS
  • Do you use Github Action? Yes
  • Do you use multiple Roam databases? No
  • Did roam-to-git previously work for you? It used to
  • When precisely did it stop working? This morning, 9/15
  • Do some backup runs still work? No, consecutive backups are failing

Thank you

Just wanted to say thanks! Amazing piece of software. Good documentation. Works well. Set and forget! Thanks a bunch mate!

OSError: [Errno 30] Read-only file system: '/pdf hack.md'

Describe the bug
the first run of roam-to-git failed with:

2020-08-11 16:59:03.797 | ERROR    | __main__:<module>:8 - An error has been caught in function '<module>', process 'MainProcess' (53930), thread 'MainThread' (4442066368):
Traceback (most recent call last):
> File "/Users/ccp/.local/bin/roam-to-git", line 8, in <module>
    sys.exit(main())
    β”‚   β”‚    β”” <function main at 0x104440700>
    β”‚   β”” <built-in function exit>
    β”” <module 'sys' (built-in)>
  File "/Users/ccp/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/__main__.py", line 82, in main
    save_markdowns(git_path / "markdown", raws)
    β”‚              β”‚                      β”” {'February 9th, 2020.md': '- [[Areas]] \n- \n', 'Areas.md': '- [[Dartmouth]]\n- [[Home]]\n- \n', 'Dartmouth.md': '- [[Teachin...
    β”‚              β”” PosixPath('/Users/ccp/Development/notes')
    β”” <function save_markdowns at 0x103fdbaf0>
  File "/Users/ccp/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/fs.py", line 57, in save_markdowns
    with dest.open("w", encoding="utf-8") as f:
         β”‚    β”‚                              β”” <_io.TextIOWrapper name='/Users/ccp/Development/notes/markdown/V: Note-taking While Reading.md' mode='w' encoding='utf-8'>
         β”‚    β”” <function Path.open at 0x103628e50>
         β”” PosixPath('/pdf hack.md')
  File "/usr/local/Cellar/[email protected]/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/pathlib.py", line 1218, in open
    return io.open(self, mode, buffering, encoding, errors, newline,
           β”‚  β”‚    β”‚     β”‚     β”‚          β”‚         β”‚       β”” None
           β”‚  β”‚    β”‚     β”‚     β”‚          β”‚         β”” None
           β”‚  β”‚    β”‚     β”‚     β”‚          β”” 'utf-8'
           β”‚  β”‚    β”‚     β”‚     β”” -1
           β”‚  β”‚    β”‚     β”” 'w'
           β”‚  β”‚    β”” PosixPath('/pdf hack.md')
           β”‚  β”” <built-in function open>
           β”” <module 'io' from '/usr/local/Cellar/[email protected]/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/io.py'>
  File "/usr/local/Cellar/[email protected]/3.8.5/Frameworks/Python.framework/Versions/3.8/lib/python3.8/pathlib.py", line 1074, in _opener
    return self._accessor.open(self, flags, mode)
           β”‚    β”‚              β”‚     β”‚      β”” 438
           β”‚    β”‚              β”‚     β”” 16778753
           β”‚    β”‚              β”” PosixPath('/pdf hack.md')
           β”‚    β”” <member '_accessor' of 'Path' objects>
           β”” PosixPath('/pdf hack.md')

To Reproduce

Traceback
Please use http://gist.github.com/ or similar, and report the last line here.

Run roam-to-git --debug notes/ and report what you get.
It should open a Chrome front-end and do the scraping. The repository content will not be modified. If applicable, add screenshots to help explain your problem.

Please complete the following information:

  • OS: MAC OS Catalina
  • Do you use Github Action? Tried
  • Do you use multiple Roam databases? no
  • Did roam-to-git previously work for you? When precisely did it stop working? It failed on the first run
  • Do some backup runs still work? I don't think so

Additional context
Add any other context about the problem here.

Backup fails with "Permission denied: '/r'"

Describe the bug
Backup fails with this error:

2020-09-24 17:10:18.090 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser markdown
2020-09-24 17:10:18.126 | DEBUG    | roam_to_git.scrapping:download_rr_archive:78 - Closed browser markdown
2020-09-24 17:10:18.171 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser json
2020-09-24 17:10:18.215 | DEBUG    | roam_to_git.scrapping:download_rr_archive:78 - Closed browser json
2020-09-24 17:10:18.215 | DEBUG    | roam_to_git.scrapping:scrap:253 - Scrapping finished
2020-09-24 17:10:18.231 | DEBUG    | roam_to_git.fs:save_markdowns:50 - Saving markdown to /home/runner/work/notes/notes/markdown
2020-09-24 17:10:18.257 | ERROR    | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (3168), thread 'MainThread' (140306047739712):
Traceback (most recent call last):
> File "/opt/hostedtoolcache/Python/3.8.5/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
    β”” <function load_entry_point at 0x7f9b87ed0f70>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 82, in main
    save_markdowns(git_path / "markdown", raws)
    β”‚              β”‚                      β”” {'general advice.md': '- [[applause lights]]\n    - If you can invert a piece of advice and it still makes sense, it might be...
    β”‚              β”” PosixPath('/home/runner/work/notes/notes')
    β”” <function save_markdowns at 0x7f9b8679a160>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/fs.py", line 54, in save_markdowns
    dest.parent.mkdir(parents=True, exist_ok=True)  # Needed if a new directory is used
    β”‚    β”” <property object at 0x7f9b87b69680>
    β”” PosixPath('/r/drama.md')
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/pathlib.py", line 1284, in mkdir
    self._accessor.mkdir(self, mode)
    β”‚    β”‚               β”‚     β”” 511
    β”‚    β”‚               β”” PosixPath('/r')
    β”‚    β”” <member '_accessor' of 'Path' objects>
    β”” PosixPath('/r')

PermissionError: [Errno 13] Permission denied: '/r'
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.5/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/loguru/_logger.py", line 1149, in catch_wrapper
    return function(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 82, in main
    save_markdowns(git_path / "markdown", raws)
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/fs.py", line 54, in save_markdowns
    dest.parent.mkdir(parents=True, exist_ok=True)  # Needed if a new directory is used
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/pathlib.py", line 1284, in mkdir
    self._accessor.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/r'
Error: Process completed with exit code 1.

I'm using Github Actions and only have one database. This is my first attempt at using it. Any help would be much appreciated!

Exporting of media

It would probably be very useful if images and PDFs stored on Roam were exported as well.

Via this thread (closed forum, but props where they are due), the URLs look easy to find with a regex:

"string": "![](https://firebasestorage.googleapis.com/[..]/o/imgs%2Fapp%2FMY-GRAPH%2F-[IMAGEID].png?[AUTH_DELETED]

Alas, I don't know how complicated it would actually be to have the attachments downloaded just because they're easy to find.
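Assuming the export is already on disk, collecting the URLs could look like this (illustrative sketch; the markdown/ path and the pattern are assumptions, not part of roam-to-git):

```shell
# List unique Firebase-hosted attachment URLs referenced in the export.
grep -rhoE 'https://firebasestorage\.googleapis\.com/[^)"]+' markdown/ 2>/dev/null | sort -u
```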

Backup step failing with "No such file or directory: '/prod/local/cloudStorage'"

Describe the bug
Since the last few days, the backup keeps failing in the backup step with the following error
2020-06-21T00:13:41.2018676Z FileNotFoundError: [Errno 2] No such file or directory: '/prod/local/cloudStorage'
It happens both on GitHub Actions and locally on my laptop.

Traceback
Full-stack trace here: https://0bin.net/paste/zTN4wwwS3NuyKtX8#CYIsVbTZPAK4dwy36L3v1jpdcRX0VwRf-vpPwfqu+5H

Run roam-to-git --debug notes/ and report what you get.
This is the log of github action with --debug added: https://0bin.net/paste/LKsEnalQb++EvmIG#0g5e4wrSNGwihI5nZj8u-bMzNC2BXFR/aiP2K6eKvtf

Please complete the following information:

  • OS: MacOs on my laptop
  • Do you use Github Action? Yes
  • Do you use multiple Roam databases? Yes. But I'm only backing up one DB. I have set the DB name in both the GitHub action secret and in the env file on my laptop.
  • Did roam-to-git previously work for you? When precisely did it stop working?
    Yes, I started using it on 12th June. It worked correctly for two days. It then started failing on 15th June
  • Do some backup runs still work? No. All runs consistently fail.

Seeing `CERTIFICATE_VERIFY_FAILED` on first run

Describe the bug

Seeing CERTIFICATE_VERIFY_FAILED on first run.

To Reproduce

  • run roam-to-git

Expected behavior

  • roam-to-git completes successfully

Traceback

2020-07-21 10:46:41.161 | ERROR    | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (5334), thread 'MainThread' (4312686016):
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
                       β”‚    β”” <function HTTPConnectionPool._make_request at 0x7fc8e8419040>
                       β”” <urllib3.connectionpool.HTTPSConnectionPool object at 0x7fc90833e130>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3/connectionpool.py", line 381, in _make_request
    self._validate_conn(conn)
    β”‚    β”‚              β”” <urllib3.connection.HTTPSConnection object at 0x7fc9083577f0>
    β”‚    β”” <function HTTPSConnectionPool._validate_conn at 0x7fc8e8419550>
    β”” <urllib3.connectionpool.HTTPSConnectionPool object at 0x7fc90833e130>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3/connectionpool.py", line 976, in _validate_conn
    conn.connect()
    β”‚    β”” <function HTTPSConnection.connect at 0x7fc8e8405e50>
    β”” <urllib3.connection.HTTPSConnection object at 0x7fc9083577f0>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3/connection.py", line 361, in connect
    self.sock = ssl_wrap_socket(
    β”‚    β”‚      β”” <function ssl_wrap_socket at 0x7fc8d8128280>
    β”‚    β”” None
    β”” <urllib3.connection.HTTPSConnection object at 0x7fc9083577f0>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 377, in ssl_wrap_socket
    return context.wrap_socket(sock, server_hostname=server_hostname)
           β”‚       β”‚           β”‚                     β”” 'storage.googleapis.com'
           β”‚       β”‚           β”” <socket.socket [closed] fd=-1, family=AddressFamily.AF_INET6, type=SocketKind.SOCK_STREAM, proto=6>
           β”‚       β”” <function SSLContext.wrap_socket at 0x7fc8f818b5e0>
           β”” <ssl.SSLContext object at 0x7fc90834b3c0>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
           β”‚    β”‚               β”” <classmethod object at 0x7fc8f818c610>
           β”‚    β”” <class 'ssl.SSLSocket'>
           β”” <ssl.SSLContext object at 0x7fc90834b3c0>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
    β”‚    β”” <function SSLSocket.do_handshake at 0x7fc8f81924c0>
    β”” <ssl.SSLSocket [closed] fd=-1, family=AddressFamily.AF_INET6, type=SocketKind.SOCK_STREAM, proto=0>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
    β”‚    β”” None
    β”” <ssl.SSLSocket [closed] fd=-1, family=AddressFamily.AF_INET6, type=SocketKind.SOCK_STREAM, proto=0>

ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)

Run roam-to-git --debug notes/ and report what you get.

  • N/A, Chrome has not been downloaded yet

Please complete the following information:

  • OS: latest macOS
  • Do you use Github Action? no
  • Do you use multiple Roam databases? no
  • Did roam-to-git previously work for you? When precisely did it stop working? No
  • Do some backup runs still work? No

Assertion error "assert self.user"

Describe the bug
Seems like the user information isn't being extracted from the secrets? I'm DEFINITELY setting the credentials correctly as per #41
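A quick sanity check one could add to the workflow to confirm the secrets actually reach the job's environment (sketch; printenv only sees exported variables, and no secret values are printed):

```shell
# Report whether each required variable is set, without leaking its value.
for v in ROAMRESEARCH_USER ROAMRESEARCH_PASSWORD ROAMRESEARCH_DATABASE; do
  if [ -n "$(printenv "$v")" ]; then echo "$v is set"; else echo "$v is MISSING"; fi
done
```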

To Reproduce
I don't know how to reproduce it since it doesn't seem like anyone else is having the issue

Expected behavior
Not running into the error

Traceback

Run roam-to-git --skip-git .
  roam-to-git --skip-git .
  shell: /bin/bash -e {0}
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.8.5/x64
    ROAMRESEARCH_USER: 
    ROAMRESEARCH_PASSWORD: 
    ROAMRESEARCH_DATABASE: 
2020-08-11 22:07:08.984 | DEBUG    | roam_to_git.__main__:main:53 - No secret found at /home/runner/work/Roam-Research-Backup/Roam-Research-Backup/.env
2020-08-11 22:07:08.984 | ERROR    | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (3027), thread 'MainThread' (139709369939776):
Traceback (most recent call last):
> File "/opt/hostedtoolcache/Python/3.8.5/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
    β”” <function load_entry_point at 0x7f109b2b7f70>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 58, in main
    config = Config(args.database, debug=args.debug, sleep_duration=float(args.sleep_duration))
             β”‚      β”‚    β”‚               β”‚    β”‚                           β”‚    β”” 2.0
             β”‚      β”‚    β”‚               β”‚    β”‚                           β”” Namespace(database=None, debug=False, directory='.', skip_fetch=False, skip_git=True, skip_push=False, sleep_duration=2.0)
             β”‚      β”‚    β”‚               β”‚    β”” False
             β”‚      β”‚    β”‚               β”” Namespace(database=None, debug=False, directory='.', skip_fetch=False, skip_git=True, skip_push=False, sleep_duration=2.0)
             β”‚      β”‚    β”” None
             β”‚      β”” Namespace(database=None, debug=False, directory='.', skip_fetch=False, skip_git=True, skip_push=False, sleep_duration=2.0)
             β”” <class 'roam_to_git.scrapping.Config'>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 39, in __init__
    assert self.user
           β”‚    β”” ''
           β”” <roam_to_git.scrapping.Config object at 0x7f1097c8b490>

Run roam-to-git --debug notes/ and report what you get.
It should open a Chrome front-end and do the scraping. The repository content will not be modified. If applicable, add screenshots to help explain your problem.

Please complete the following information:

  • OS: [e.g. MacOs, Linux] Linux
  • Do you use Github Action? Yes
  • Do you use multiple Roam databases? No
  • Did roam-to-git previously work for you? When precisely did it stop working?
  • Do some backup runs still work? Nope :(

Additional context
Add any other context about the problem here.

ROAMRESEARCH_PASSWORD - name is invalid

Describe the bug
Unable to add the roamresearch_password secret in all caps; GitHub rejects it with the error "name is invalid".

To Reproduce

  1. Add Secret
  2. Name: ROAMRESEARCH_PASSWORD
  3. Value: Password
  4. Add secret β†’ Name is invalid - not added

Expected behavior
Should add like the others
It does get added if I change the name to lowercase, but I'm not sure whether this affects scripts downstream.

Please complete the following information:

  • OS: MacOS Catalina
  • Do you use multiple Roam databases? Yes

Check roam-to-git version in GitHub action

Hi! Thanks for writing this.

I'm feeling a little bit nervous that the GitHub action template says: pip install git+https://github.com/MatthieuBizien/roam-to-git.git

This means that if @MatthieuBizien's GitHub account ever gets compromised, the action that everyone using this script runs, with credentials to read their personal Roam database, could execute malicious code.

I don't know whether anything really that useful can be done against this.
What I'm personally going to do is just change my GitHub action to say:

pip install git+https://github.com/MatthieuBizien/roam-to-git.git@8628900c8429d2af8d7c2406b9eed94f23db9f8c

(that's HEAD as of the time of writing). Skimming the code, it looks OK to me. (I don't know the state of Git hash security these days; maybe a really motivated attacker could produce malicious code with the same Git hash. Oh well, fingers crossed...)

Maybe:

  • roam-to-git could have signed releases.
  • The GitHub action could try to jail the process to communicate only with Roam Research website?

I really don't know if this is even worth solving at this point but wanted to bring it up just in case.
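The pinning suggestion translates directly into the GitHub Action's dependency step (the hash below is the one quoted above, used purely as an example):

```yaml
- name: Setup dependencies
  run: |
    # Pin to an exact commit instead of tracking the default branch,
    # so future pushes to the repo cannot change the code you run.
    pip install git+https://github.com/MatthieuBizien/roam-to-git.git@8628900c8429d2af8d7c2406b9eed94f23db9f8c
```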

[macOS] backup command fails only when run from crontab

Describe the bug
On macOS Catalina, the same command that works when run from the console fails when run as a cron job.

To Reproduce

  1. Run the backup command from the command line (I anonymized e-mail address and graph name):
/Users/daniel/.local/bin/roam-to-git /Users/daniel/BrainBackup
2020-08-17 10:40:18.015 | INFO     | roam_to_git.__main__:main:50 - Loading secrets from /Users/daniel/BrainBackup/.env
2020-08-17 10:40:18.020 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-08-17 10:40:18.400 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-08-17 10:40:19.052 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmphp6ahdp0
2020-08-17 10:40:19.091 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-17 10:40:19.147 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmptfrz3qf0
2020-08-17 10:40:19.186 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-17 10:40:23.731 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '[email protected]'
2020-08-17 10:40:23.778 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '[email protected]'
2020-08-17 10:40:28.943 | DEBUG    | roam_to_git.scrapping:signin:192 - Fill password
2020-08-17 10:40:28.994 | DEBUG    | roam_to_git.scrapping:signin:192 - Fill password
2020-08-17 10:40:30.103 | DEBUG    | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-08-17 10:40:30.154 | DEBUG    | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-08-17 10:40:32.363 | DEBUG    | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/myroam'
2020-08-17 10:40:32.407 | DEBUG    | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/myroam'
2020-08-17 10:40:33.415 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-08-17 10:40:33.438 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-08-17 10:40:39.683 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:130 - Launch download popup
2020-08-17 10:40:39.705 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:130 - Launch download popup
2020-08-17 10:40:42.043 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:144 - Checking download type
2020-08-17 10:40:42.072 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:144 - Checking download type
2020-08-17 10:40:42.091 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:159 - Downloading output of type markdown
2020-08-17 10:40:42.116 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:148 - Changing output type to json
2020-08-17 10:40:42.433 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:164 - Wait download of markdown to /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmphp6ahdp0
2020-08-17 10:40:43.435 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:174 - File /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmphp6ahdp0/Roam-Export-1597653643328.zip found for markdown
2020-08-17 10:40:44.436 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser markdown
2020-08-17 10:40:44.489 | DEBUG    | roam_to_git.scrapping:download_rr_archive:78 - Closed browser markdown
2020-08-17 10:40:46.480 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:159 - Downloading output of type json
2020-08-17 10:40:46.816 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:164 - Wait download of json to /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmptfrz3qf0
2020-08-17 10:40:47.818 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:174 - File /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmptfrz3qf0/Roam-Export-1597653647235.zip found for json
2020-08-17 10:40:48.821 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser json
2020-08-17 10:40:48.867 | DEBUG    | roam_to_git.scrapping:download_rr_archive:78 - Closed browser json
2020-08-17 10:40:48.867 | DEBUG    | roam_to_git.scrapping:scrap:253 - Scrapping finished
2020-08-17 10:40:48.872 | DEBUG    | roam_to_git.fs:save_markdowns:50 - Saving markdown to /Users/daniel/BrainBackup/markdown
2020-08-17 10:40:48.885 | DEBUG    | roam_to_git.fs:unzip_and_save_json_archive:62 - Saving json to /Users/daniel/BrainBackup/json
2020-08-17 10:40:48.945 | DEBUG    | roam_to_git.fs:save_markdowns:50 - Saving markdown to /Users/daniel/BrainBackup/formatted
2020-08-17 10:40:49.056 | DEBUG    | roam_to_git.fs:commit_git_directory:79 - Committing git repository /Users/daniel/BrainBackup/.git
2020-08-17 10:40:49.141 | DEBUG    | roam_to_git.fs:push_git_repository:85 - Pushing to origin
  1. Add the same command as crontab (crontab -e)
0 * * * * /Users/daniel/.local/bin/roam-to-git /Users/daniel/BrainBackup

Expected behavior
I would expect the same output and a successful backup.

What I get (I just take the latest one I received in my mailbox):

Subject: Cron <daniel@mymacbook> /Users/daniel/.local/bin/roam-to-git /Users/daniel/BrainBackup
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <PATH=/usr/bin:/bin>
X-Cron-Env: <LOGNAME=daniel>
X-Cron-Env: <USER=daniel>
Date: Sun, 16 Aug 2020 22:00:01 +0200 (CEST)

2020-08-16 22:00:01.019 | INFO     | roam_to_git.__main__:main:50 - Loading secrets from /Users/daniel/BrainBackup/.env
2020-08-16 22:00:01.026 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-08-16 22:00:01.389 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-08-16 22:00:03.232 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /tmp/tmpij4r1sjm
2020-08-16 22:00:03.272 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /tmp/tmpcg12_9fi
2020-08-16 22:00:03.336 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-16 22:00:03.372 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-16 22:00:10.268 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '[email protected]'
2020-08-16 22:00:10.572 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '[email protected]'
2020-08-16 22:00:22.892 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser json
2020-08-16 22:00:22.903 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser markdown
2020-08-16 22:00:23.059 | DEBUG    | roam_to_git.scrapping:download_rr_archive:78 - Closed browser json
2020-08-16 22:00:23.060 | ERROR    | __main__:<module>:8 - An error has been caught in function '<module>', process 'MainProcess' (20024), thread 'MainThread' (4372716992):
Traceback (most recent call last):
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 85, in evaluateHandle
    'userGesture': True,

pyppeteer.errors.NetworkError: Protocol error (Runtime.evaluate): Cannot find context with specified id


During handling of the above exception, another exception occurred:


Traceback (most recent call last):
> File "/Users/daniel/.local/bin/roam-to-git", line 8, in <module>
    sys.exit(main())
    β”‚   β”‚    β”” <function main at 0x7fb9100ba290>
    β”‚   β”” <built-in function exit>
    β”” <module 'sys' (built-in)>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/__main__.py", line 76, in main
    scrap(markdown_zip_path, json_zip_path, config)
    β”‚     β”‚                  β”‚              β”” <roam_to_git.scrapping.Config object at 0x7fb928052b50>
    β”‚     β”‚                  β”” PosixPath('/tmp/tmpcg12_9fi')
    β”‚     β”” PosixPath('/tmp/tmpij4r1sjm')
    β”” <function scrap at 0x7fb9100ba0e0>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 252, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
    β”‚       β”‚                                   β”‚       β”‚       β”” [<coroutine object download_rr_archive at 0x7fb9100ba560>, <coroutine object download_rr_archive at 0x7fb9100ba7a0>]
    β”‚       β”‚                                   β”‚       β”” <function gather at 0x7fb9507d2c20>
    β”‚       β”‚                                   β”” <module 'asyncio' from '/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/__init__.py'>
    β”‚       β”” <built-in function get_event_loop>
    β”” <module 'asyncio' from '/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/__init__.py'>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
           β”‚      β”” <method 'result' of '_asyncio.Future' objects>
           β”” <_GatheringFuture finished exception=NetworkError('Execution context was destroyed, most likely because of a navigation.')>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 69, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config)
                 β”‚                    β”‚         β”‚            β”‚                 β”” <roam_to_git.scrapping.Config object at 0x7fb928052b50>
                 β”‚                    β”‚         β”‚            β”” PosixPath('/tmp/tmpcg12_9fi')
                 β”‚                    β”‚         β”” 'json'
                 β”‚                    β”” <pyppeteer.page.Page object at 0x7fb950b1a5d0>
                 β”” <function _download_rr_archive at 0x7fb93009c9e0>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 97, in _download_rr_archive
    await signin(document, config, sleep_duration=config.sleep_duration)
          β”‚      β”‚         β”‚                      β”‚      β”” 2.0
          β”‚      β”‚         β”‚                      β”” <roam_to_git.scrapping.Config object at 0x7fb928052b50>
          β”‚      β”‚         β”” <roam_to_git.scrapping.Config object at 0x7fb928052b50>
          β”‚      β”” <pyppeteer.page.Page object at 0x7fb950b1a5d0>
          β”” <function signin at 0x7fb93009c830>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 188, in signin
    email_elem = await document.querySelector("input[name='email']")
                       β”‚        β”” <function Page.querySelector at 0x7fb9100a49e0>
                       β”” <pyppeteer.page.Page object at 0x7fb950b1a5d0>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/page.py", line 371, in querySelector
    return await frame.querySelector(selector)
                 β”‚     β”‚             β”” "input[name='email']"
                 β”‚     β”” <function Frame.querySelector at 0x7fb910063320>
                 β”” <pyppeteer.frame_manager.Frame object at 0x7fb8f0010d90>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 316, in querySelector
    document = await self._document()
                     β”‚    β”” <function Frame._document at 0x7fb9100633b0>
                     β”” <pyppeteer.frame_manager.Frame object at 0x7fb8f0010d90>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 326, in _document
    document = (await context.evaluateHandle('document')).asElement()
                      β”‚       β”” <function ExecutionContext.evaluateHandle at 0x7fb928071710>
                      β”” <pyppeteer.execution_context.ExecutionContext object at 0x7fb8f00218d0>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 88, in evaluateHandle
    _rewriteError(e)
    β”” <function _rewriteError at 0x7fb92806b710>
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 237, in _rewriteError
    raise type(error)(msg)
               β”‚      β”” 'Execution context was destroyed, most likely because of a navigation.'
               β”” NetworkError('Protocol error (Runtime.evaluate): Cannot find context with specified id')

pyppeteer.errors.NetworkError: Execution context was destroyed, most likely because of a navigation.
Traceback (most recent call last):
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 85, in evaluateHandle
    'userGesture': True,
pyppeteer.errors.NetworkError: Protocol error (Runtime.evaluate): Cannot find context with specified id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/daniel/.local/bin/roam-to-git", line 8, in <module>
    sys.exit(main())
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/loguru/_logger.py", line 1149, in catch_wrapper
    return function(*args, **kwargs)
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/__main__.py", line 76, in main
    scrap(markdown_zip_path, json_zip_path, config)
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 252, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 69, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config)
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 97, in _download_rr_archive
    await signin(document, config, sleep_duration=config.sleep_duration)
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/roam_to_git/scrapping.py", line 188, in signin
    email_elem = await document.querySelector("input[name='email']")
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/page.py", line 371, in querySelector
    return await frame.querySelector(selector)
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 316, in querySelector
    document = await self._document()
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/frame_manager.py", line 326, in _document
    document = (await context.evaluateHandle('document')).asElement()
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 88, in evaluateHandle
    _rewriteError(e)
  File "/Users/daniel/.local/pipx/venvs/roam-to-git/lib/python3.7/site-packages/pyppeteer/execution_context.py", line 237, in _rewriteError
    raise type(error)(msg)
pyppeteer.errors.NetworkError: Execution context was destroyed, most likely because of a navigation.
2020-08-16 22:00:23.086 | DEBUG    | roam_to_git.scrapping:_kill_child_process:215 - Terminate child process [psutil.Process(pid=20032, name='Chromium', status='zombie', started='22:00:01')]

Traceback
Please use http://gist.github.com/ or similar, and report the last line here.

Run roam-to-git --debug notes/ and report what you get.

/Users/daniel/.local/bin/roam-to-git --debug /Users/daniel/BrainBackup
2020-08-17 10:51:15.974 | INFO     | roam_to_git.__main__:main:50 - Loading secrets from /Users/daniel/BrainBackup/.env
2020-08-17 10:51:15.977 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-08-17 10:51:17.654 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-17 10:51:22.112 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '[email protected]'
Future exception was never retrieved
future: <Future finished exception=NetworkError('Protocol error (Runtime.releaseObject): Cannot find context with specified id')>
pyppeteer.errors.NetworkError: Protocol error (Runtime.releaseObject): Cannot find context with specified id
2020-08-17 10:51:27.398 | DEBUG    | roam_to_git.scrapping:signin:192 - Fill password
2020-08-17 10:51:28.572 | DEBUG    | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-08-17 10:51:30.821 | DEBUG    | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/myroam'
2020-08-17 10:51:31.919 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-08-17 10:51:38.225 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:130 - Launch download popup
2020-08-17 10:51:40.586 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:144 - Checking download type
2020-08-17 10:51:40.635 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:159 - Downloading output of type markdown
2020-08-17 10:51:40.970 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:164 - Wait download of markdown to /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmpy8e_6hso
2020-08-17 10:51:40.970 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-08-17 10:51:42.647 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-17 10:51:46.838 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '[email protected]'
Future exception was never retrieved
future: <Future finished exception=NetworkError('Protocol error (Runtime.releaseObject): Cannot find context with specified id')>
pyppeteer.errors.NetworkError: Protocol error (Runtime.releaseObject): Cannot find context with specified id
2020-08-17 10:51:52.303 | DEBUG    | roam_to_git.scrapping:signin:192 - Fill password
2020-08-17 10:51:53.458 | DEBUG    | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-08-17 10:51:55.713 | DEBUG    | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/myroam'
2020-08-17 10:51:56.786 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-08-17 10:52:03.071 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:130 - Launch download popup
2020-08-17 10:52:05.426 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:144 - Checking download type
2020-08-17 10:52:05.475 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:148 - Changing output type to json
2020-08-17 10:52:09.842 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:159 - Downloading output of type json
2020-08-17 10:52:10.183 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:164 - Wait download of json to /var/folders/nh/jr3vkxsx14d3c4pss49krrcc0000gn/T/tmp77q3gvgy
2020-08-17 10:52:10.183 | WARNING  | roam_to_git.scrapping:scrap:249 - Exiting without updating the git repository, because we can't get the downloads with the option --debug
2020-08-17 10:52:10.183 | DEBUG    | roam_to_git.__main__:main:78 - waiting for the download...

Please complete the following information:

  • OS: macOS Catalina, latest version
  • Do you use Github Action? No, a self-hosted Gitea server.
  • Do you use multiple Roam databases? No. My account has just one associated graph.
  • Does roam-to-git use to work for you? When precisely did it stopped to work? Works perfectly when triggered manually.
  • Does some backup runs are still working? I guess? They look fine.

Additional context
Add any other context about the problem here.
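The X-Cron-Env headers above show that cron runs with PATH=/usr/bin:/bin and almost no environment, which is a common reason a command works interactively but fails under cron. One workaround to try (an assumption, not a confirmed fix for this crash) is to set PATH in the crontab and capture the output:

```
# crontab -e: cron runs with PATH=/usr/bin:/bin, so extend it explicitly.
# Paths below mirror the ones in this report and are illustrative.
PATH=/usr/local/bin:/usr/bin:/bin:/Users/daniel/.local/bin
0 * * * * /Users/daniel/.local/bin/roam-to-git /Users/daniel/BrainBackup >> /tmp/roam-to-git.log 2>&1
```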

Umlauts break "formatted" links

Describe the bug
While pages with umlauts get exported just fine (nice!), the conversion to formatted markdown stumbles over them: the link is cut off right before the umlaut and everything after it is truncated.

To Reproduce

  1. create a tag or link to a page with an umlaut
  2. have it exported
  3. have a look at the formatted / compare it to raw markdown

Expected behavior
The whole tag / link should be converted.

Please complete the following information:

  • OS: macOS
  • Do you use Github Action? yes
  • Do you use multiple Roam databases? yes
  • Does roam-to-git use to work for you? When precisely did it stopped to work? yes (it didn't stop)
  • Does some backup runs are still working? yes (only affects conversion to formatted)

Comparison:

Raw:
01_raw

Formatted:
02_formatted

Unrelated:
I thank you soooo much for Roam to Git. Without this GitHub Action, I wouldn't have the confidence to actually use Roam. This functionality is so good it belongs right in Roam's core.

What am I missing?

I am trying to get this set up on a private repository. I set up a repo called 'notes', and copied over the action from the demo repo. It ran fine the first time, but now the scheduled runs are failing with the error message:

.github#L1
repository 'https://github.com/stevenhill1/notes/' not found

I can't see what I have done wrong.

Seeing An Assertion

I've not been able to back up at all; I keep running into this issue via the Action.

2020-05-14 17:01:30.850 | DEBUG | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /tmp/tmpjneln7kw
2020-05-14 17:01:30.872 | DEBUG | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-14 17:01:30.885 | DEBUG | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-14 17:01:35.739 | DEBUG | roam_to_git.scrapping:signin:185 - Fill email ''
2020-05-14 17:01:35.835 | DEBUG | roam_to_git.scrapping:signin:185 - Fill email '
'
2020-05-14 17:01:37.006 | DEBUG | roam_to_git.scrapping:signin:190 - Fill password
2020-05-14 17:01:37.115 | DEBUG | roam_to_git.scrapping:signin:190 - Fill password
2020-05-14 17:01:38.181 | DEBUG | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-14 17:01:38.285 | DEBUG | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-14 17:01:40.419 | DEBUG | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-14 17:01:40.512 | DEBUG | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-14 17:05:05.206 | DEBUG | roam_to_git.scrapping:download_rr_archive:75 - Closing browser markdown
2020-05-14 17:05:05.206 | DEBUG | roam_to_git.scrapping:download_rr_archive:75 - Closing browser json
2020-05-14 17:05:05.235 | DEBUG | roam_to_git.scrapping:download_rr_archive:77 - Closed browser markdown
2020-05-14 17:05:05.252 | DEBUG | roam_to_git.scrapping:download_rr_archive:77 - Closed browser json
2020-05-14 17:05:05.253 | ERROR | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (3394), thread 'MainThread' (139937934346048):
Traceback (most recent call last):

> File "/opt/hostedtoolcache/Python/3.8.2/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
    β”” <function load_entry_point at 0x7f45d2ac65e0>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
    scrap(markdown_zip_path, json_zip_path, config)
    β”‚ β”‚ β”‚ β”” <roam_to_git.scrapping.Config object at 0x7f45cf4fd0d0>
    β”‚ β”‚ β”” PosixPath('/tmp/tmpjneln7kw')
    β”‚ β”” PosixPath('/tmp/tmpv0t13sqw')
    β”” <function scrap at 0x7f45cf4f9a60>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 250, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
    β”‚ β”‚ β”‚ β”‚ β”” [<coroutine object download_rr_archive at 0x7f45cf825dc0>, <coroutine object download_rr_archive at 0x7f45cf92b140>]
    β”‚ β”‚ β”‚ β”” <function gather at 0x7f45d1472160>
    β”‚ β”‚ β”” <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/__init__.py'>
    β”‚ β”” <built-in function get_event_loop>
    β”” <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/__init__.py'>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
    β”‚ β”” <method 'result' of '_asyncio.Future' objects>
    β”” <_GatheringFuture finished exception=AssertionError('All roads leads to Roam, but that one is too long. Try again when Roam s...
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 68, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config)
    β”‚ β”‚ β”‚ β”‚ β”” <roam_to_git.scrapping.Config object at 0x7f45cf4fd0d0>
    β”‚ β”‚ β”‚ β”” PosixPath('/tmp/tmpv0t13sqw')
    β”‚ β”‚ β”” 'markdown'
    β”‚ β”” <pyppeteer.page.Page object at 0x7f45cedbb1f0>
    β”” <function _download_rr_archive at 0x7f45cf4f9820>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 119, in _download_rr_archive
    assert dot_button is not None, "All roads leads to Roam, but that one is too long. Try "
    β”” None

AssertionError: All roads leads to Roam, but that one is too long. Try again when Roam servers are faster.
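The assertion fires when the export ("dot") button never appears within the polling window, e.g. when Roam is slow to load. The poll-until-timeout pattern behind such a check can be sketched like this (illustrative only, not roam-to-git's actual code):

```python
import asyncio

async def wait_for(find, timeout=10.0, interval=0.5):
    """Poll find() until it returns a value or timeout elapses; None on timeout."""
    elapsed = 0.0
    while elapsed < timeout:
        result = find()
        if result is not None:
            return result
        await asyncio.sleep(interval)
        elapsed += interval
    return None

# Simulate an element that only appears on the third poll.
state = {"ticks": 0}

def find_button():
    state["ticks"] += 1
    return "dot-button" if state["ticks"] >= 3 else None

button = asyncio.run(wait_for(find_button, timeout=5.0, interval=0.01))
assert button is not None, "All roads lead to Roam, but that one is too long."
print(button)
```

If Roam's servers are just slow, increasing the `--sleep-duration` option (visible as `sleep_duration=2.0` in the tracebacks above) gives each step more time before the assertion trips.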

Github Workflow backup does not start an action

Describe the bug

I've followed all the instructions but when I push the repo I do not see anything being run in the "Actions" tab.

To Reproduce

  1. Create a new private repo
  2. Add secrets like in the env.template (but obviously I changed it to my credentials)
  3. Add the github action
  4. Check that the github action works
  • It has been 15 minutes and my github action has not yet kicked off yet. As far as I can tell nothing has run and I don't know if there are any debug logs that I can look for?

Expected behavior
Backup action to run

Traceback

  • None? I'm using Github actions

Please complete the following information:

  • OS: [e.g. MacOs, Linux] Linux
  • Do you use Github Action? Yes
  • Do you use multiple Roam databases? No
  • Does roam-to-git use to work for you? When precisely did it stopped to work? Haven't done it - I assume I can do either method and that I don't need to use the local one to get the actions to run
  • Does some backup runs are still working? Nope

Additional context
Add any other context about the problem here.


My directory structure looks like this

my_dir/
  | .github/
    | workflows/
      | main.yml
  | .git

where my main.yml contains

name: "Roam Research backup"

on:
  push:
    branches:
      - master
  schedule:
    -   cron: "0 * * * *"

jobs:
  backup:
    runs-on: ubuntu-latest
    name: Backup
    timeout-minutes: 15
    steps:
      -   uses: actions/checkout@v2
      -   name: Set up Python 3.8
          uses: actions/setup-python@v1
          with:
            python-version: 3.8

      -   name: Setup dependencies
          run: |
            pip install git+https://github.com/MatthieuBizien/roam-to-git.git
      -   name: Run backup
          run: roam-to-git --skip-git .
          env:
            ROAMRESEARCH_USER: ${{ secrets.ROAMRESEARCH_USER }}
            ROAMRESEARCH_PASSWORD: ${{ secrets.ROAMRESEARCH_PASSWORD }}
            ROAMRESEARCH_DATABASE: ${{ secrets.ROAMRESEARCH_DATABASE }}

      -   name: Commit changes
          uses: elstudio/actions-js-build/commit@v3
          with:
            commitMessage: Automated snapshot
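If nothing appears under Actions, two common causes worth checking (assumptions, not confirmed from this report): scheduled workflows only run against the repository's default branch, and cron triggers can be delayed well past the scheduled time. A hedged tweak to the `on:` block adds a manual trigger for debugging:

```yaml
on:
  push:
    branches:
      - master
  schedule:
    -   cron: "0 * * * *"
  # Added for debugging (not in the original file): lets you trigger the
  # workflow manually from the Actions tab to verify secrets and setup.
  workflow_dispatch:
```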

getting assertion error on self.user

Can't seem to debug this error. Any help would be appreciated! Thanks for this amazing tool.

  • First time using roam-to-git, running on macOS.
  • I just set up the Github action exactly as specified, and this is the first time it is running the action.
  • Setting the env variables using Github secrets.

Output from the Github Actions UI when it runs:

  roam-to-git --skip-git .
  shell: /bin/bash -e {0}
  env:
    pythonLocation: /opt/hostedtoolcache/Python/3.8.3/x64
    ROAMRESEARCH_USER: 
    ROAMRESEARCH_PASSWORD: 
    ROAMRESEARCH_DATABASE: 
2020-06-12 03:40:20.028 | DEBUG    | roam_to_git.__main__:main:53 - No secret found at /home/runner/work/roam-notes/roam-notes/.env
2020-06-12 03:40:20.028 | ERROR    | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (3049), thread 'MainThread' (140371337385792):
Traceback (most recent call last):

> File "/opt/hostedtoolcache/Python/3.8.3/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
    β”” <function load_entry_point at 0x7faabb81f5e0>
  File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 58, in main
    config = Config(args.database, debug=args.debug, sleep_duration=float(args.sleep_duration))
             β”‚      β”‚    β”‚               β”‚    β”‚                           β”‚    β”” 2.0
             β”‚      β”‚    β”‚               β”‚    β”‚                           β”” Namespace(database=None, debug=False, directory='.', skip_fetch=False, skip_git=True, skip_push=False, sleep_duration=2.0)
             β”‚      β”‚    β”‚               β”‚    β”” False
             β”‚      β”‚    β”‚               β”” Namespace(database=None, debug=False, directory='.', skip_fetch=False, skip_git=True, skip_push=False, sleep_duration=2.0)
             β”‚      β”‚    β”” None
             β”‚      β”” Namespace(database=None, debug=False, directory='.', skip_fetch=False, skip_git=True, skip_push=False, sleep_duration=2.0)
             β”” <class 'roam_to_git.scrapping.Config'>
  File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 39, in __init__
    assert self.user
           β”‚    β”” ''
           β”” <roam_to_git.scrapping.Config object at 0x7faab81aa8b0>

AssertionError: assert self.user
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.3/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
  File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/loguru/_logger.py", line 1210, in catch_wrapper
    return function(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 58, in main
    config = Config(args.database, debug=args.debug, sleep_duration=float(args.sleep_duration))
  File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 39, in __init__
    assert self.user
AssertionError
##[error]Process completed with exit code 1.
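The locals in the traceback above show args.database is None and self.user is '', i.e. the ROAMRESEARCH_* secrets never reached the environment. A minimal pre-flight check along these lines (a sketch, not part of roam-to-git; the variable names are the ones from env.template) would fail with a clearer message before the scrape even starts:

```python
import os

# Secret names expected by roam-to-git, per env.template.
REQUIRED = ("ROAMRESEARCH_USER", "ROAMRESEARCH_PASSWORD", "ROAMRESEARCH_DATABASE")

def check_secrets(environ=None):
    """Return the names of required secrets that are missing or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED if not env.get(name)]
```

Running such a check at the top of the workflow would turn the bare `AssertionError: assert self.user` into an actionable "missing secret" message.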

All files were deleted during a backup

Four hours ago there was a commit that deleted the entire contents of my DB, and three hours ago there was a commit that appears to restore everything.

Backup logs from the commit that deleted everything:

Run roam-to-git --skip-git .
2020-05-08 14:10:43.784 | DEBUG    | roam_to_git.__main__:main:53 - No secret found at /home/runner/work/roam/roam/.env
2020-05-08 14:10:43.799 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.

  0%|          | 0/108773488 [00:00<?, ?it/s]
 18%|█▊        | 19896320/108773488 [00:00<00:00, 198908852.59it/s]
 35%|███▌      | 38307840/108773488 [00:00<00:00, 194190292.74it/s]
 54%|█████▎    | 58327040/108773488 [00:00<00:00, 195925297.80it/s]
 71%|███████▏  | 77660160/108773488 [00:00<00:00, 192858088.73it/s]
 90%|█████████ | 98324480/108773488 [00:00<00:00, 196760428.21it/s]
100%|██████████| 108773488/108773488 [00:00<00:00, 198165231.91it/s]
[W:pyppeteer.chromium_downloader] 
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: /home/runner/.local/share/pyppeteer/local-chromium/588429
2020-05-08 14:10:49.574 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-05-08 14:10:50.119 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /tmp/tmp3cr8s8ho
2020-05-08 14:10:50.130 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /tmp/tmp5fxbr42q
2020-05-08 14:10:50.153 | DEBUG    | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-08 14:10:50.165 | DEBUG    | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-08 14:10:54.844 | DEBUG    | roam_to_git.scrapping:signin:185 - Fill email '***'
2020-05-08 14:10:55.096 | DEBUG    | roam_to_git.scrapping:signin:185 - Fill email '***'
2020-05-08 14:10:56.015 | DEBUG    | roam_to_git.scrapping:signin:190 - Fill password
2020-05-08 14:10:56.283 | DEBUG    | roam_to_git.scrapping:signin:190 - Fill password
2020-05-08 14:10:56.936 | DEBUG    | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-08 14:10:57.195 | DEBUG    | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-08 14:10:59.164 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-08 14:10:59.416 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-08 14:11:24.210 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:128 - Launch download popup
2020-05-08 14:11:25.485 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:128 - Launch download popup
2020-05-08 14:11:26.568 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:142 - Checking download type
2020-05-08 14:11:26.591 | DEBUG    | roam_to_git.scrapping:download_rr_archive:75 - Closing browser json
2020-05-08 14:11:26.614 | DEBUG    | roam_to_git.scrapping:download_rr_archive:77 - Closed browser json
2020-05-08 14:11:26.614 | ERROR    | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (2906), thread 'MainThread' (140148425639744):
Traceback (most recent call last):
> File "/opt/hostedtoolcache/Python/3.8.2/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
    └ <function load_entry_point at 0x7f76d4ee55e0>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
    scrap(markdown_zip_path, json_zip_path, config)
    │     │                  │              └ <roam_to_git.scrapping.Config object at 0x7f76d18ec0a0>
    │     │                  └ PosixPath('/tmp/tmp5fxbr42q')
    │     └ PosixPath('/tmp/tmp3cr8s8ho')
    └ <function scrap at 0x7f76d18e8a60>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 250, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
    │       │                                   │       │       └ [<coroutine object download_rr_archive at 0x7f76d1c14dc0>, <coroutine object download_rr_archive at 0x7f76d1d1b140>]
    │       │                                   │       └ <function gather at 0x7f76d3860160>
    │       │                                   └ <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/__init__.py'>
    │       └ <built-in function get_event_loop>
    └ <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/__init__.py'>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
           │      └ <method 'result' of '_asyncio.Future' objects>
           └ <_GatheringFuture finished exception=AssertionError()>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 68, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config)
                 │                    │         │            │                 └ <roam_to_git.scrapping.Config object at 0x7f76d18ec0a0>
                 │                    │         │            └ PosixPath('/tmp/tmp5fxbr42q')
                 │                    │         └ 'json'
                 │                    └ <pyppeteer.page.Page object at 0x7f76d12244c0>
                 └ <function _download_rr_archive at 0x7f76d18e8820>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 143, in _download_rr_archive
    button, button_text = await get_dropdown_button()
                                └ <function _download_rr_archive.<locals>.get_dropdown_button at 0x7f76d0f91670>
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 136, in get_dropdown_button
    assert dropdown_button is not None
           └ None

AssertionError: assert dropdown_button is not None
2020-05-08 14:11:26.633 | DEBUG    | roam_to_git.scrapping:_kill_child_process:213 - Terminate child process [psutil.Process(pid=2925, name='chrome', started='14:10:46'), psutil.Process(pid=2931, name='chrome', started='14:10:47'), psutil.Process(pid=2952, name='chrome', started='14:10:49'), psutil.Process(pid=2963, name='chrome', started='14:10:49'), psutil.Process(pid=2933, name='chrome', started='14:10:47'), psutil.Process(pid=2951, name='chrome', started='14:10:49')]

[Win10] Documentation Name: Solved!

Hey, guys! After hours of struggling with roam-to-git and Win10, I finally found a workaround (it isn't a beautiful solution, but it is an effective one). So, who is this for?

  1. Using Windows 10
  2. When debugging, everything seems to work flawlessly: roam-to-git opens Chrome and downloads your files, but doesn't unzip them
  3. When using the normal command (roam-to-git notes/), you get an error like: can't create path for "the_name_of_your_note?"

The problem is: when we use Roam Research, we freely put in symbols like ", ;, ? and |. But when unzipping to send to git, the files need to be saved on Windows, and Windows can't create files with names like 'What's the capital of France?.md', because it reserves those characters. Therefore, roam-to-git can't do its magic.


Solution

Just substitute the reserved characters with others. Problem: if you have a lot of files in Roam Research this will be a huge pain (there's probably some smarter way to do it; I went with brute force). My suggestions:

? ⇒ ？ (full width)
: ⇒ - (hyphen)
| ⇒ - (hyphen)
" ⇒ ' (single quote)

I think these are the most frequent. If you need to use other characters, you can search for alternatives in the Character Map app (Win10 native).


Appendix:
Windows reserves the following characters:

< (less than)
> (greater than)
: (colon)
" (double quote)
/ (forward slash)
\ (backslash)
| (vertical bar or pipe)
? (question mark)
* (asterisk)
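The substitutions above can be automated instead of renaming pages by hand. A brute-force sketch (not part of roam-to-git; the replacement choices are just the suggestions from this post, adjust to taste):

```python
# Map Windows-reserved filename characters to safe look-alikes.
WINDOWS_RESERVED = {
    "<": "(",
    ">": ")",
    ":": "-",
    '"': "'",
    "/": "-",
    "\\": "-",
    "|": "-",
    "?": "？",   # full-width question mark
    "*": "＊",   # full-width asterisk
}

def sanitize_filename(name: str) -> str:
    """Replace characters that Windows refuses in filenames."""
    return "".join(WINDOWS_RESERVED.get(ch, ch) for ch in name)
```

Running every page name through such a function before writing the .md files avoids the "can't create path" error.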

Formatted view can't handle "complex" page names

Thank you for building this - it is glorious.

The issue I noticed is that the formatted view does not properly support complex page names (and I use them extensively).
By complex page names I mean names that contain references to other pages. For example, [[[[search]] [[algorithm]]]] is a complex page name that links to both [[search]] and [[algorithm]]. The exporter currently does not parse/render such links properly.

To add another example - I have a page [[Respecting [[level of [[abstraction]]]]]] (looks scary but blessedly you can hide brackets :p)
Currently the exporter renders the link as follows:
/blob/master/formatted/Respecting%20%5B%5Blevel%20of%20%5B%5Babstraction.md (the address is broken and does not in fact point to the page because of the missing brackets).
Also, it attaches the link to abstraction (the innermost page) instead of the whole thing.
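Handling such names needs balanced-bracket matching rather than a flat regex. A sketch of a stack-based scan that pulls out every [[...]] span, innermost and outermost alike (illustrative only, not the exporter's actual code):

```python
def find_page_refs(text: str):
    """Return every [[...]] span in text, including nested references."""
    refs, stack = [], []
    i = 0
    while i < len(text):
        if text.startswith("[[", i):
            stack.append(i)          # remember where this reference opened
            i += 2
        elif text.startswith("]]", i) and stack:
            start = stack.pop()      # close the most recent open reference
            refs.append(text[start + 2:i])
            i += 2
        else:
            i += 1
    return refs
```

For the example above, the scan yields search, algorithm, and the full outer name, which is what the formatted view would need to link all three correctly.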

All backups starting from ~16 hours ago are failing with NetworkError: Execution context was destroyed, most likely because of a navigation.

See a multitude of WF runs here: https://github.com/Stvad/roam-notes-workflow/actions

Example log:

2020-05-24T20:16:16.8253367Z [W:pyppeteer.chromium_downloader] chromium extracted to: /home/runner/.local/share/pyppeteer/local-chromium/588429
2020-05-24T20:16:18.5849601Z 2020-05-24 20:16:18.584 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-05-24T20:16:19.1361020Z 2020-05-24 20:16:19.135 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /tmp/tmpv9fg_q9r
2020-05-24T20:16:19.1477421Z 2020-05-24 20:16:19.147 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /tmp/tmplzquag8l
2020-05-24T20:16:19.1712150Z 2020-05-24 20:16:19.170 | DEBUG    | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-24T20:16:19.1837365Z 2020-05-24 20:16:19.183 | DEBUG    | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-24T20:16:24.0680539Z 2020-05-24 20:16:24.067 | DEBUG    | roam_to_git.scrapping:signin:185 - Fill email '***'
2020-05-24T20:16:24.1400072Z 2020-05-24 20:16:24.139 | DEBUG    | roam_to_git.scrapping:signin:185 - Fill email '***'
2020-05-24T20:16:24.9369538Z 2020-05-24 20:16:24.936 | DEBUG    | roam_to_git.scrapping:signin:190 - Fill password
2020-05-24T20:16:25.0070283Z 2020-05-24 20:16:25.006 | DEBUG    | roam_to_git.scrapping:signin:190 - Fill password
2020-05-24T20:16:26.4999082Z 2020-05-24 20:16:26.499 | DEBUG    | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-24T20:16:26.5701911Z 2020-05-24 20:16:26.569 | DEBUG    | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-24T20:16:28.7279610Z 2020-05-24 20:16:28.727 | DEBUG    | roam_to_git.scrapping:go_to_database:205 - Load database from url 'https://roamresearch.com/#/app/***'
2020-05-24T20:16:28.7834035Z 2020-05-24 20:16:28.782 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-24T20:16:28.7986602Z 2020-05-24 20:16:28.798 | DEBUG    | roam_to_git.scrapping:go_to_database:205 - Load database from url 'https://roamresearch.com/#/app/***'
2020-05-24T20:16:28.8578568Z 2020-05-24 20:16:28.857 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-24T20:16:29.5277156Z 2020-05-24 20:16:29.527 | DEBUG    | roam_to_git.scrapping:download_rr_archive:75 - Closing browser markdown
2020-05-24T20:16:29.6028381Z 2020-05-24 20:16:29.602 | DEBUG    | roam_to_git.scrapping:download_rr_archive:75 - Closing browser json
2020-05-24T20:16:30.9261565Z 2020-05-24 20:16:30.925 | DEBUG    | roam_to_git.scrapping:download_rr_archive:77 - Closed browser markdown
2020-05-24T20:16:30.9356856Z 2020-05-24 20:16:30.926 | ERROR    | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (2691), thread 'MainThread' (140435242338112):
2020-05-24T20:16:30.9357586Z Traceback (most recent call last):
2020-05-24T20:16:30.9358093Z 
2020-05-24T20:16:30.9360296Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 99, in evaluateHandle
2020-05-24T20:16:30.9361514Z     _obj = await self._client.send('Runtime.callFunctionOn', {
2020-05-24T20:16:30.9362580Z                  │    │       └ <function CDPSession.send at 0x7fb999b105e0>
2020-05-24T20:16:30.9363511Z                  │    └ <pyppeteer.connection.CDPSession object at 0x7fb998ec7370>
2020-05-24T20:16:30.9364390Z                  └ <pyppeteer.execution_context.ExecutionContext object at 0x7fb998edf9a0>
2020-05-24T20:16:30.9364854Z 
2020-05-24T20:16:30.9365356Z pyppeteer.errors.NetworkError: Protocol error (Runtime.callFunctionOn): Cannot find context with specified id
2020-05-24T20:16:30.9365723Z 
2020-05-24T20:16:30.9366046Z 
2020-05-24T20:16:30.9366456Z During handling of the above exception, another exception occurred:
2020-05-24T20:16:30.9366766Z 
2020-05-24T20:16:30.9367075Z 
2020-05-24T20:16:30.9367473Z Traceback (most recent call last):
2020-05-24T20:16:30.9367796Z 
2020-05-24T20:16:30.9368558Z > File "/opt/hostedtoolcache/Python/3.8.3/x64/bin/roam-to-git", line 11, in <module>
2020-05-24T20:16:30.9369415Z     load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
2020-05-24T20:16:30.9370278Z     └ <function load_entry_point at 0x7fb99c8a15e0>
2020-05-24T20:16:30.9371337Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
2020-05-24T20:16:30.9372125Z     scrap(markdown_zip_path, json_zip_path, config)
2020-05-24T20:16:30.9372971Z     │     │                  │              └ <roam_to_git.scrapping.Config object at 0x7fb99922d8b0>
2020-05-24T20:16:30.9374212Z     │     │                  └ PosixPath('/tmp/tmplzquag8l')
2020-05-24T20:16:30.9374991Z     │     └ PosixPath('/tmp/tmpv9fg_q9r')
2020-05-24T20:16:30.9375757Z     └ <function scrap at 0x7fb9992344c0>
2020-05-24T20:16:30.9376722Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 250, in scrap
2020-05-24T20:16:30.9377305Z     asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
2020-05-24T20:16:30.9378223Z     │       │                                   │       │       └ [<coroutine object download_rr_archive at 0x7fb99923e540>, <coroutine object download_rr_archive at 0x7fb99923e140>]
2020-05-24T20:16:30.9380260Z     │       │                                   │       └ <function gather at 0x7fb99b223310>
2020-05-24T20:16:30.9382183Z     │       │                                   └ <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/__init__.py'>
2020-05-24T20:16:30.9384002Z     │       └ <built-in function get_event_loop>
2020-05-24T20:16:30.9385838Z     └ <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/__init__.py'>
2020-05-24T20:16:30.9387481Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
2020-05-24T20:16:30.9388921Z     return future.result()
2020-05-24T20:16:30.9391000Z            │      └ <method 'result' of '_asyncio.Future' objects>
2020-05-24T20:16:30.9393194Z            └ <_GatheringFuture finished exception=NetworkError('Execution context was destroyed, most likely because of a navigation.')>
2020-05-24T20:16:30.9395283Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 68, in download_rr_archive
2020-05-24T20:16:30.9396865Z     return await _download_rr_archive(document, output_type, output_directory, config)
2020-05-24T20:16:30.9398755Z                  │                    │         │            │                 └ <roam_to_git.scrapping.Config object at 0x7fb99922d8b0>
2020-05-24T20:16:30.9403948Z                  │                    │         │            └ PosixPath('/tmp/tmpv9fg_q9r')
2020-05-24T20:16:30.9404986Z                  │                    │         └ 'markdown'
2020-05-24T20:16:30.9405688Z                  │                    └ <pyppeteer.page.Page object at 0x7fb998ec7550>
2020-05-24T20:16:30.9406348Z                  └ <function _download_rr_archive at 0x7fb999234280>
2020-05-24T20:16:30.9407222Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 105, in _download_rr_archive
2020-05-24T20:16:30.9407974Z     dot_button = await document.querySelector(".bp3-icon-more")
2020-05-24T20:16:30.9408686Z                        │        └ <function Page.querySelector at 0x7fb999298f70>
2020-05-24T20:16:30.9409376Z                        └ <pyppeteer.page.Page object at 0x7fb998ec7550>
2020-05-24T20:16:30.9410181Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/page.py", line 371, in querySelector
2020-05-24T20:16:30.9410601Z     return await frame.querySelector(selector)
2020-05-24T20:16:30.9411214Z                  │     │             └ '.bp3-icon-more'
2020-05-24T20:16:30.9411861Z                  │     └ <function Frame.querySelector at 0x7fb999261c10>
2020-05-24T20:16:30.9412764Z                  └ <pyppeteer.frame_manager.Frame object at 0x7fb998efc400>
2020-05-24T20:16:30.9413589Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/frame_manager.py", line 317, in querySelector
2020-05-24T20:16:30.9414058Z     value = await document.querySelector(selector)
2020-05-24T20:16:30.9414661Z                   │        │             └ '.bp3-icon-more'
2020-05-24T20:16:30.9415316Z                   │        └ <function ElementHandle.querySelector at 0x7fb9995db550>
2020-05-24T20:16:30.9416005Z                   └ <pyppeteer.element_handle.ElementHandle object at 0x7fb998edf7c0>
2020-05-24T20:16:30.9417123Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/element_handle.py", line 358, in querySelector
2020-05-24T20:16:30.9417560Z     handle = await self.executionContext.evaluateHandle(
2020-05-24T20:16:30.9418161Z                    │    └ <property object at 0x7fb999699bd0>
2020-05-24T20:16:30.9418859Z                    └ <pyppeteer.element_handle.ElementHandle object at 0x7fb998edf7c0>
2020-05-24T20:16:30.9419631Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 108, in evaluateHandle
2020-05-24T20:16:30.9420053Z     _rewriteError(e)
2020-05-24T20:16:30.9420644Z     └ <function _rewriteError at 0x7fb99969a0d0>
2020-05-24T20:16:30.9421411Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 237, in _rewriteError
2020-05-24T20:16:30.9421844Z     raise type(error)(msg)
2020-05-24T20:16:30.9422490Z                │      └ 'Execution context was destroyed, most likely because of a navigation.'
2020-05-24T20:16:30.9423226Z                └ NetworkError('Protocol error (Runtime.callFunctionOn): Cannot find context with specified id')
2020-05-24T20:16:30.9423508Z 
2020-05-24T20:16:30.9423858Z pyppeteer.errors.NetworkError: Execution context was destroyed, most likely because of a navigation.
2020-05-24T20:16:30.9424200Z Traceback (most recent call last):
2020-05-24T20:16:30.9424913Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 99, in evaluateHandle
2020-05-24T20:16:30.9425772Z     _obj = await self._client.send('Runtime.callFunctionOn', {
2020-05-24T20:16:30.9426253Z pyppeteer.errors.NetworkError: Protocol error (Runtime.callFunctionOn): Cannot find context with specified id
2020-05-24T20:16:30.9426514Z 
2020-05-24T20:16:30.9426804Z During handling of the above exception, another exception occurred:
2020-05-24T20:16:30.9427060Z 
2020-05-24T20:16:30.9427359Z Traceback (most recent call last):
2020-05-24T20:16:30.9428091Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/bin/roam-to-git", line 11, in <module>
2020-05-24T20:16:30.9428776Z     load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
2020-05-24T20:16:30.9429593Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/loguru/_logger.py", line 1210, in catch_wrapper
2020-05-24T20:16:30.9430289Z     return function(*args, **kwargs)
2020-05-24T20:16:30.9432358Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
2020-05-24T20:16:30.9432617Z     scrap(markdown_zip_path, json_zip_path, config)
2020-05-24T20:16:30.9433178Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 250, in scrap
2020-05-24T20:16:30.9433417Z     asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
2020-05-24T20:16:30.9433705Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
2020-05-24T20:16:30.9433923Z     return future.result()
2020-05-24T20:16:30.9434487Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 68, in download_rr_archive
2020-05-24T20:16:30.9434751Z     return await _download_rr_archive(document, output_type, output_directory, config)
2020-05-24T20:16:30.9435308Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 105, in _download_rr_archive
2020-05-24T20:16:30.9435763Z     dot_button = await document.querySelector(".bp3-icon-more")
2020-05-24T20:16:30.9436289Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/page.py", line 371, in querySelector
2020-05-24T20:16:30.9436523Z     return await frame.querySelector(selector)
2020-05-24T20:16:30.9437051Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/frame_manager.py", line 317, in querySelector
2020-05-24T20:16:30.9437521Z     value = await document.querySelector(selector)
2020-05-24T20:16:30.9438109Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/element_handle.py", line 358, in querySelector
2020-05-24T20:16:30.9438342Z     handle = await self.executionContext.evaluateHandle(
2020-05-24T20:16:30.9438896Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 108, in evaluateHandle
2020-05-24T20:16:30.9439111Z     _rewriteError(e)
2020-05-24T20:16:30.9439636Z   File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 237, in _rewriteError
2020-05-24T20:16:30.9439875Z     raise type(error)(msg)
2020-05-24T20:16:30.9440107Z pyppeteer.errors.NetworkError: Execution context was destroyed, most likely because of a navigation.
2020-05-24T20:16:30.9440690Z 2020-05-24 20:16:30.942 | DEBUG    | roam_to_git.scrapping:_kill_child_process:213 - Terminate child process [psutil.Process(pid=2734, name='chrome', started='20:16:17')]
2020-05-24T20:16:30.9902387Z ##[error]Process completed with exit code 1.
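The failure pattern is a race: the sign-in triggers a navigation while querySelector(".bp3-icon-more") is in flight, so the execution context it runs in is destroyed. Until the scraper explicitly waits for the page to settle, one workaround is to retry the query when it hits this error. A generic sketch under that assumption (the helper name is hypothetical, not part of roam-to-git):

```python
import asyncio

async def retry_async(make_coro, attempts=3, delay=1.0, exceptions=(Exception,)):
    """Re-run an async operation that can race a page navigation.

    make_coro must be a zero-argument callable returning a fresh coroutine,
    so the operation can be awaited again on each attempt.
    """
    for attempt in range(attempts):
        try:
            return await make_coro()
        except exceptions:
            if attempt == attempts - 1:
                raise          # out of attempts: surface the original error
            await asyncio.sleep(delay)

# Against pyppeteer this might be used as:
# button = await retry_async(lambda: document.querySelector(".bp3-icon-more"))
```

This only papers over the race; the robust fix is to await the post-login navigation before querying the DOM.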

Initial backup attempt error: Process completed with exit code 1.

Describe the bug
I followed the directions for installing with GitHub Actions, and after running for 8m 6s the action ended in an error: Process completed with exit code 1. (Check failure on line 1 in .github)

To Reproduce
Steps to reproduce the behavior:
I just followed the directions in the readme for GitHub Actions. I created the three separate secrets.

Expected behavior
The backup to complete.

Traceback
Process completed with exit code 1. (Check failure on line 1 in .github)

Run roam-to-git --debug notes/ and report what you get.
(I don't believe this feature is possible with the GitHub Actions method?)

Please complete the following information:

  • OS: MacOS
  • Do you use Github Action? Yes
  • Do you use multiple Roam databases? Yes
  • Did roam-to-git use to work for you? When precisely did it stop working? No
  • Are some backup runs still working? No

Additional context
Thank you for any and all help in advance! I will try it locally now, but figured the bug report was at least useful regardless.

Namespaced pages are missing in formatted output

Describe the bug
While browsing the snapshot of my database I have noticed that namespaced pages are missing in formatted output directory, but are available in markdown output directory.

To Reproduce
Steps to reproduce the behavior:

  1. Create namespaced page, e.g. Roam Research/Change Log
  2. Wait for the automated snapshot to finish.
  3. Browse formatted directory.
  4. See that the namespaced page is missing.

Expected behavior
The namespaced page should be exported to formatted output using the same directory snapshot as markdown output.

Additional context
I've created a few pages to illustrate the issue in the demo database:

Also, see snapshot commit.
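A likely cause is that the formatted writer saves Roam Research/Change Log.md without first creating the Roam Research/ directory, which the markdown writer evidently does. A sketch of the safe write pattern (illustrative, not the project's actual code):

```python
from pathlib import Path

def write_page(root: Path, page_name: str, content: str) -> Path:
    """Write a (possibly namespaced) page, creating parent directories."""
    path = root / f"{page_name}.md"
    # A namespaced name like "Roam Research/Change Log" implies
    # intermediate directories that must exist before writing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return path
```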

[remote rejected] master -> master (refusing to allow an OAuth App to create or update workflow

Describe the bug
When doing git push, I get this error message:
! [remote rejected] master -> master (refusing to allow an OAuth App to create or update workflow .github/workflows/main.yml without workflow scope)

To Reproduce

  1. git push --set-upstream origin master

Expected behavior
A clear and concise description of what you expected to happen.

Traceback
Please use http://gist.github.com/ or similar, and report the last line here.

Run roam-to-git --debug notes/ and report what you get.
! [remote rejected] master -> master (refusing to allow an OAuth App to create or update workflow .github/workflows/main.yml without workflow scope)

Please complete the following information:

  • OS: [MacOs]
  • Do you use Github Action? No
  • Do you use multiple Roam databases? No
  • Did roam-to-git use to work for you? When precisely did it stop working? Never worked
  • Are some backup runs still working? No

Additional context
Add any other context about the problem here.
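For reference, this error means the OAuth token git is pushing with lacks the workflow scope that GitHub requires for creating or updating .github/workflows/main.yml. Two common fixes, sketched with placeholders (the token and repository path are hypothetical):

```shell
# Option 1: with the GitHub CLI, grant the workflow scope to your token
gh auth refresh -h github.com -s workflow

# Option 2: switch the remote to a personal access token created with
# the "workflow" scope checked, then push again
git remote set-url origin https://YOUR_TOKEN@github.com/your/notes.git
git push --set-upstream origin master
```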

login through google?

I log in to Roam using Google - is that supported? I assume I shouldn't use my Google password. Is there a way in Roam to add a regular login to an account that already exists?

No Google Account Authentication

Related to "Backup fails" #47

It would be great if you could update the README to explain that Google Account integration is currently not available. The password must be set manually via "forgotten password". See below for more details. Thanks for the amazing tool.

Having the same issue also - I only discovered this tool yesterday, so I never had a working backup.

The cause actually turned out to be that I was using Google Account authentication rather than just a plain email and password.

In order to switch to email and password, you have to go through the "forgotten password" process. Currently there's no way of setting your password in your account settings, which is something that's expected in most UIs with account details nowadays.

Hope this helps someone! πŸ˜…

Originally posted by @emettely in #47 (comment)

Unzip fails

Describe the bug
roam-to-git fails when run with GitHub actions.

To Reproduce
Steps to reproduce the behavior:
Start the backup.

Expected behavior
Run the backup.

Traceback

Run roam-to-git --skip-git .
2020-07-31 15:27:48.613 | DEBUG    | roam_to_git.__main__:main:53 - No secret found at /home/runner/work/notes/notes/.env
2020-07-31 15:27:48.615 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
[W:pyppeteer.chromium_downloader] start chromium download.
Download may take a few minutes.
  0%|          | 0/108773488 [00:00<?, ?it/s]
  6%|▌         | 6686720/108773488 [00:00<00:01, 66762209.15it/s]
 19%|█▉        | 21186560/108773488 [00:00<00:01, 79635894.98it/s]
 34%|███▍      | 36884480/108773488 [00:00<00:00, 93369311.19it/s]
 48%|████▊     | 52295680/108773488 [00:00<00:00, 105887183.63it/s]
 62%|██████▏   | 67112960/108773488 [00:00<00:00, 103089937.41it/s]
 75%|███████▍  | 81397760/108773488 [00:00<00:00, 112412818.82it/s]
 88%|████████▊ | 95375360/108773488 [00:00<00:00, 119398460.96it/s]
100%|██████████| 108773488/108773488 [00:00<00:00, 127783948.48it/s]
[W:pyppeteer.chromium_downloader] 
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: /home/runner/.local/share/pyppeteer/local-chromium/588429
2020-07-31 15:27:54.933 | DEBUG    | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-07-31 15:27:55.461 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /tmp/tmpcdjcmlv6
2020-07-31 15:27:55.473 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /tmp/tmplhhksw1h
2020-07-31 15:27:55.496 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-07-31 15:27:55.508 | DEBUG    | roam_to_git.scrapping:signin:183 - Opening signin page
2020-07-31 15:28:00.998 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '***'
2020-07-31 15:28:01.051 | DEBUG    | roam_to_git.scrapping:signin:187 - Fill email '***'
2020-07-31 15:28:04.091 | DEBUG    | roam_to_git.scrapping:signin:192 - Fill password
2020-07-31 15:28:04.285 | DEBUG    | roam_to_git.scrapping:signin:192 - Fill password
2020-07-31 15:28:04.714 | DEBUG    | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-07-31 15:28:04.914 | DEBUG    | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-07-31 15:28:06.952 | DEBUG    | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/***'
2020-07-31 15:28:07.143 | DEBUG    | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/***'
2020-07-31 15:28:07.884 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-07-31 15:28:07.990 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-07-31 15:28:10.113 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:130 - Launch download popup
2020-07-31 15:28:10.202 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:130 - Launch download popup
2020-07-31 15:28:12.457 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:144 - Checking download type
2020-07-31 15:28:12.457 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:144 - Checking download type
2020-07-31 15:28:12.501 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:159 - Downloading output of type markdown
2020-07-31 15:28:12.538 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:144 - Checking download type
2020-07-31 15:28:12.583 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:148 - Changing output type to json
2020-07-31 15:28:12.809 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:164 - Wait download of markdown to /tmp/tmpcdjcmlv6
2020-07-31 15:28:13.812 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:174 - File /tmp/tmpcdjcmlv6/Roam-Export-1596209293568.zip found for markdown
2020-07-31 15:28:14.813 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser markdown
2020-07-31 15:28:15.129 | DEBUG    | roam_to_git.scrapping:download_rr_archive:78 - Closed browser markdown
2020-07-31 15:28:17.220 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:159 - Downloading output of type json
2020-07-31 15:28:17.524 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:164 - Wait download of json to /tmp/tmplhhksw1h
2020-07-31 15:28:18.528 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:174 - File /tmp/tmplhhksw1h/Roam-Export-1596209297749.zip found for json
2020-07-31 15:28:19.530 | DEBUG    | roam_to_git.scrapping:download_rr_archive:76 - Closing browser json
2020-07-31 15:28:19.562 | DEBUG    | roam_to_git.scrapping:download_rr_archive:78 - Closed browser json
2020-07-31 15:28:19.562 | DEBUG    | roam_to_git.scrapping:scrap:253 - Scrapping finished
2020-07-31 15:28:19.564 | ERROR    | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (3172), thread 'MainThread' (140397351790400):
Traceback (most recent call last):
> File "/opt/hostedtoolcache/Python/3.8.5/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
    β”” <function load_entry_point at 0x7fb0ca124f70>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 81, in main
    raws = unzip_markdown_archive(markdown_zip_path)
           β”‚                      β”” PosixPath('/tmp/tmpcdjcmlv6')
           β”” <function unzip_markdown_archive at 0x7fb0c8969310>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/fs.py", line 43, in unzip_markdown_archive
    contents = {file.filename: zip_file.read(file.filename).decode()
                               β”‚        β”” <function ZipFile.read at 0x7fb0cbfbe790>
                               β”” <zipfile.ZipFile [closed]>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/fs.py", line 45, in <dictcomp>
    if not file.is_dir()}
           β”‚    β”” <function ZipInfo.is_dir at 0x7fb0cbfb98b0>
           β”” <unprintable ZipInfo object>
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/zipfile.py", line 551, in is_dir
    return self.filename[-1] == '/'
           β”‚    β”” <member 'filename' of 'ZipInfo' objects>
           β”” <unprintable ZipInfo object>

IndexError: string index out of range
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.5/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/loguru/_logger.py", line 1149, in catch_wrapper
    return function(*args, **kwargs)
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 81, in main
    raws = unzip_markdown_archive(markdown_zip_path)
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/fs.py", line 43, in unzip_markdown_archive
    contents = {file.filename: zip_file.read(file.filename).decode()
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/fs.py", line 45, in <dictcomp>
    if not file.is_dir()}
  File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/zipfile.py", line 551, in is_dir
    return self.filename[-1] == '/'
IndexError: string index out of range
##[error]Process completed with exit code 1.
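The `IndexError` above comes from `ZipInfo.is_dir()`, which evaluates `self.filename[-1]` and therefore blows up on an archive entry whose filename is empty. A defensive version of the unzip step could simply skip such entries. This is a hypothetical sketch, not roam-to-git's actual code; `unzip_markdown_archive_safe` is an invented name:

```python
import zipfile


def unzip_markdown_archive_safe(zip_path):
    """Read the markdown files from a Roam export zip.

    Skips directory entries and entries with an empty filename,
    which make ZipInfo.is_dir() raise IndexError on the Python
    version shown in the log (3.8).
    """
    with zipfile.ZipFile(zip_path) as zip_file:
        return {
            info.filename: zip_file.read(info.filename).decode()
            for info in zip_file.infolist()
            # filter on the raw filename before touching is_dir()
            if info.filename and not info.filename.endswith("/")
        }
```

Filtering on `info.filename` before calling `is_dir()` sidesteps the empty-name entry that this particular Roam export apparently contained.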
 Commit changes (0s)
  Post Run actions/checkout@v2 (0s)
  Complete job

Please complete the following information:

  • Do you use GitHub Actions? Yes.
  • Do you use multiple Roam databases? No.
  • Did roam-to-git previously work for you? When precisely did it stop working? No, first setup.
  • Are some backup runs still working? No.

An error has been caught in function '<module>', process 'MainProcess'

I am getting the following error and cannot resolve it.

2020-08-27 02:28:03.015 | DEBUG | roam_to_git.scrapping:scrap:253 - Scrapping finished
2020-08-27 02:28:03.101 | DEBUG | roam_to_git.fs:save_markdowns:50 - Saving markdown to /home/runner/work/RoamNotes/RoamNotes/markdown
2020-08-27 02:28:03.138 | ERROR | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (2999), thread 'MainThread' (140571112994624):

Has anyone else seen this? Any ideas on what to try?

Backup fails

Describe the bug
I've set up the Action to run, but it fails.

To Reproduce
Steps to reproduce the behavior:
I followed the instructions to set up the Action and secrets, but the backup fails.

Expected behavior
Backup to succeed.

Traceback
##[section]Starting: Request a runner to run this job
Can't find any online and idle self-hosted runner in current repository that matches the required labels: 'ubuntu-latest'
Can't find any online and idle self-hosted runner in current repository's account/organization that matches the required labels: 'ubuntu-latest'
Found online and idle hosted runner in current repository's account/organization that matches the required labels: 'ubuntu-latest'
##[section]Finishing: Request a runner to run this job
Current runner version: '2.273.0'
##[group]Operating System
Ubuntu
18.04.5
LTS
##[endgroup]
##[group]Virtual Environment
Environment: ubuntu-18.04
Version: 20200817.1
Included Software: https://github.com/actions/virtual-environments/blob/ubuntu18/20200817.1/images/linux/Ubuntu1804-README.md
##[endgroup]
Prepare workflow directory
Prepare all required actions
Download action repository 'actions/checkout@v2'
Download action repository 'actions/setup-python@v1'
Download action repository 'elstudio/actions-js-build@v3'
##[group]Build container for action use: '/home/runner/work/_actions/elstudio/actions-js-build/v3/commit/Dockerfile'.
##[command]/usr/bin/docker build -t 3b3ac6:5d85631f2edb4c238a2db0c2dc76337b -f "/home/runner/work/_actions/elstudio/actions-js-build/v3/commit/Dockerfile" "/home/runner/work/_actions/elstudio/actions-js-build/v3/commit"
Sending build context to Docker daemon 77.31kB

Step 1/13 : FROM alpine:3.10
---> be4e4bea2c2e
Step 2/13 : LABEL version="1.0.0"
---> Running in 0adffe704988
Removing intermediate container 0adffe704988
---> 64b635ca3eb0
Step 3/13 : LABEL repository="https://github.com/elstudio/actions-js-build"
---> Running in 37ac4670e420
Removing intermediate container 37ac4670e420
---> 5fd55af56296
Step 4/13 : LABEL homepage="https://github.com/elstudio/actions-js-build"
---> Running in 448b9c199a7f
Removing intermediate container 448b9c199a7f
---> 5f6e95f76b47
Step 5/13 : LABEL maintainer="el-studio Actions [email protected]"
---> Running in 0eaceba85109
Removing intermediate container 0eaceba85109
---> b6e56198e83b
Step 6/13 : LABEL com.github.actions.name="GitHub Action for git commit"
---> Running in a92546333669
Removing intermediate container a92546333669
---> ca210162d4f1
Step 7/13 : LABEL com.github.actions.description="Commits any changed files and pushes the result back to origin."
---> Running in 3ce79b11e3b2
Removing intermediate container 3ce79b11e3b2
---> df5e3cee6c86
Step 8/13 : LABEL com.github.actions.icon="git-commit"
---> Running in e81ffe676943
Removing intermediate container e81ffe676943
---> 232836aa8db4
Step 9/13 : LABEL com.github.actions.color="green"
---> Running in 351320b5c397
Removing intermediate container 351320b5c397
---> 416e50091176
Step 10/13 : COPY LICENSE README.md THIRD_PARTY_NOTICE.md /
---> e38a8c328057
Step 11/13 : RUN apk --update --no-cache add git
---> Running in 3cb66673d15d
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.10/community/x86_64/APKINDEX.tar.gz
(1/6) Installing ca-certificates (20191127-r2)
(2/6) Installing nghttp2-libs (1.39.2-r1)
(3/6) Installing libcurl (7.66.0-r0)
(4/6) Installing expat (2.2.8-r0)
(5/6) Installing pcre2 (10.33-r0)
(6/6) Installing git (2.22.4-r0)
Executing busybox-1.30.1-r3.trigger
Executing ca-certificates-20191127-r2.trigger
OK: 21 MiB in 20 packages
Removing intermediate container 3cb66673d15d
---> 53efe912012e
Step 12/13 : COPY "entrypoint.sh" "/entrypoint.sh"
---> e66cab7335ac
Step 13/13 : ENTRYPOINT ["/entrypoint.sh"]
---> Running in c5193f57d696
Removing intermediate container c5193f57d696
---> 3625bb579e7f
Successfully built 3625bb579e7f
Successfully tagged 3b3ac6:5d85631f2edb4c238a2db0c2dc76337b
##[endgroup]
##[group]Run actions/checkout@v2
with:
repository: quickfold/roam-backup
token: ***
ssh-strict: true
persist-credentials: true
clean: true
fetch-depth: 1
lfs: false
submodules: false
##[endgroup]
Syncing repository: quickfold/roam-backup
##[group]Getting Git version info
Working directory is '/home/runner/work/roam-backup/roam-backup'
[command]/usr/bin/git version
git version 2.28.0
##[endgroup]
Deleting the contents of '/home/runner/work/roam-backup/roam-backup'
##[group]Initializing the repository
[command]/usr/bin/git init /home/runner/work/roam-backup/roam-backup
Initialized empty Git repository in /home/runner/work/roam-backup/roam-backup/.git/
[command]/usr/bin/git remote add origin https://github.com/quickfold/roam-backup
##[endgroup]
##[group]Disabling automatic garbage collection
[command]/usr/bin/git config --local gc.auto 0
##[endgroup]
##[group]Setting up auth
[command]/usr/bin/git config --local --name-only --get-regexp core.sshCommand
[command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core.sshCommand' && git config --local --unset-all 'core.sshCommand' || :
[command]/usr/bin/git config --local --name-only --get-regexp http.https://github.com/.extraheader
[command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http.https://github.com/.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :
[command]/usr/bin/git config --local http.https://github.com/.extraheader AUTHORIZATION: basic ***
##[endgroup]
##[group]Fetching the repository
[command]/usr/bin/git -c protocol.version=2 fetch --no-tags --prune --progress --no-recurse-submodules --depth=1 origin +01363b876c7eca53f45ee1ac353e03ce16fa2b09:refs/remotes/origin/master
remote: Enumerating objects: 7, done.
remote: Counting objects: 14% (1/7)
remote: Counting objects: 28% (2/7)
remote: Counting objects: 42% (3/7)
remote: Counting objects: 57% (4/7)
remote: Counting objects: 71% (5/7)
remote: Counting objects: 85% (6/7)
remote: Counting objects: 100% (7/7)
remote: Counting objects: 100% (7/7), done.
remote: Compressing objects: 25% (1/4)
remote: Compressing objects: 50% (2/4)
remote: Compressing objects: 75% (3/4)
remote: Compressing objects: 100% (4/4)
remote: Compressing objects: 100% (4/4), done.
remote: Total 7 (delta 0), reused 0 (delta 0), pack-reused 0
From https://github.com/quickfold/roam-backup

  • [new ref] 01363b876c7eca53f45ee1ac353e03ce16fa2b09 -> origin/master
    ##[endgroup]
    ##[group]Determining the checkout info
    ##[endgroup]
    ##[group]Checking out the ref
    [command]/usr/bin/git checkout --progress --force -B master refs/remotes/origin/master
    Reset branch 'master'
    Branch 'master' set up to track remote branch 'master' from 'origin'.
    ##[endgroup]
    [command]/usr/bin/git log -1
    commit 01363b876c7eca53f45ee1ac353e03ce16fa2b09
    Author: quickfold [email protected]
    Date: Sat Aug 29 17:22:06 2020 +0800

    Create main.yml
    ##[group]Run actions/setup-python@v1
    with:
    python-version: 3.8
    architecture: x64
    ##[endgroup]
    Successfully setup CPython (3.8.5)
    ##[group]Run pip install git+https://github.com/MatthieuBizien/roam-to-git.git
    pip install git+https://github.com/MatthieuBizien/roam-to-git.git
    shell: /bin/bash -e {0}
    env:
    pythonLocation: /opt/hostedtoolcache/Python/3.8.5/x64
    ##[endgroup]
    Collecting git+https://github.com/MatthieuBizien/roam-to-git.git
    Cloning https://github.com/MatthieuBizien/roam-to-git.git to /tmp/pip-req-build-7g6retmo
    Collecting gitpython>=3.1.*
    Downloading GitPython-3.1.7-py3-none-any.whl (158 kB)
    Collecting loguru==0.4.*
    Downloading loguru-0.4.1-py3-none-any.whl (54 kB)
    Collecting pyppeteer>=0.0.25
    Downloading pyppeteer-0.2.2-py3-none-any.whl (145 kB)
    Collecting python-dotenv>=0.10.*
    Downloading python_dotenv-0.14.0-py2.py3-none-any.whl (17 kB)
    Collecting psutil>=5.6.0
    (460 kB)
    Collecting gitdb<5,>=4.0.1
    Downloading gitdb-4.0.5-py3-none-any.whl (63 kB)
    Collecting websockets<9.0,>=8.1
    Downloading websockets-8.1-cp38-cp38-manylinux2010_x86_64.whl (78 kB)
    Collecting tqdm<5.0.0,>=4.42.1
    Downloading tqdm-4.48.2-py2.py3-none-any.whl (68 kB)
    Collecting pyee<8.0.0,>=7.0.1
    Downloading pyee-7.0.2-py2.py3-none-any.whl (12 kB)
    Collecting appdirs<2.0.0,>=1.4.3
    Downloading appdirs-1.4.4-py2.py3-none-any.whl (9.6 kB)
    Collecting urllib3<2.0.0,>=1.25.8
    Downloading urllib3-1.25.10-py2.py3-none-any.whl (127 kB)
    Collecting smmap<4,>=3.0.1
    Downloading smmap-3.0.4-py2.py3-none-any.whl (25 kB)
    Using legacy 'setup.py install' for roam-to-git, since package 'wheel' is not installed.
    Using legacy 'setup.py install' for psutil, since package 'wheel' is not installed.
    Installing collected packages: smmap, gitdb, gitpython, loguru, websockets, tqdm, pyee, appdirs, urllib3, pyppeteer, python-dotenv, psutil, roam-to-git
    Running setup.py install for psutil: started
    Running setup.py install for psutil: finished with status 'done'
    Running setup.py install for roam-to-git: started
    Running setup.py install for roam-to-git: finished with status 'done'
    Successfully installed appdirs-1.4.4 gitdb-4.0.5 gitpython-3.1.7 loguru-0.4.1 psutil-5.7.2 pyee-7.0.2 pyppeteer-0.2.2 python-dotenv-0.14.0 roam-to-git-0.1 smmap-3.0.4 tqdm-4.48.2 urllib3-1.25.10 websockets-8.1
    ##[group]Run roam-to-git --skip-git .
    roam-to-git --skip-git .
    shell: /bin/bash -e {0}
    env:
    pythonLocation: /opt/hostedtoolcache/Python/3.8.5/x64
    ROAMRESEARCH_USER: ***
    ROAMRESEARCH_PASSWORD: ***
    ROAMRESEARCH_DATABASE: ***
    ##[endgroup]
    2020-08-29 12:08:55.093 | INFO | roam_to_git.__main__:main:50 - Loading secrets from /home/runner/work/roam-backup/roam-backup/.env
    2020-08-29 12:08:55.095 | DEBUG | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
    [W:pyppeteer.chromium_downloader] start chromium download.
    Download may take a few minutes.

0%| | 0/108773488 [00:00<?, ?it/s]
[...]
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 108773488/108773488 [00:01<00:00, 59685707.90it/s]
[W:pyppeteer.chromium_downloader]
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: /home/runner/.local/share/pyppeteer/local-chromium/588429
2020-08-29 12:09:03.369 | DEBUG | roam_to_git.scrapping:download_rr_archive:55 - Creating browser
2020-08-29 12:09:03.931 | DEBUG | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /tmp/tmps7r9ciu6
2020-08-29 12:09:03.943 | DEBUG | roam_to_git.scrapping:_download_rr_archive:92 - Configure downloads to /tmp/tmp1q33m0i7
2020-08-29 12:09:03.977 | DEBUG | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-29 12:09:03.988 | DEBUG | roam_to_git.scrapping:signin:183 - Opening signin page
2020-08-29 12:09:09.431 | DEBUG | roam_to_git.scrapping:signin:187 - Fill email ''
2020-08-29 12:09:09.460 | DEBUG | roam_to_git.scrapping:signin:187 - Fill email '
'
2020-08-29 12:09:13.202 | DEBUG | roam_to_git.scrapping:signin:192 - Fill password
2020-08-29 12:09:13.922 | DEBUG | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-08-29 12:09:13.981 | DEBUG | roam_to_git.scrapping:signin:192 - Fill password
2020-08-29 12:09:14.688 | DEBUG | roam_to_git.scrapping:signin:197 - Click on sign-in
2020-08-29 12:09:16.159 | DEBUG | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/'
2020-08-29 12:09:16.913 | DEBUG | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-08-29 12:09:16.942 | DEBUG | roam_to_git.scrapping:go_to_database:207 - Load database from url 'https://roamresearch.com/#/app/
'
2020-08-29 12:09:17.723 | DEBUG | roam_to_git.scrapping:_download_rr_archive:102 - Wait for interface to load
2020-08-29 12:16:01.849 | DEBUG | roam_to_git.scrapping:download_rr_archive:76 - Closing browser markdown
2020-08-29 12:16:01.882 | DEBUG | roam_to_git.scrapping:download_rr_archive:78 - Closed browser markdown
2020-08-29 12:16:01.883 | ERROR | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (3110), thread 'MainThread' (140230760871744):
Traceback (most recent call last):

> File "/opt/hostedtoolcache/Python/3.8.5/x64/bin/roam-to-git", line 11, in <module>
load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
β”” <function load_entry_point at 0x7f8a007aaf70>
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
scrap(markdown_zip_path, json_zip_path, config)
β”‚ β”‚ β”‚ β”” <roam_to_git.scrapping.Config object at 0x7f8a034f77c0>
β”‚ β”‚ β”” PosixPath('/tmp/tmp1q33m0i7')
β”‚ β”” PosixPath('/tmp/tmps7r9ciu6')
β”” <function scrap at 0x7f89fd17f9d0>
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 252, in scrap
asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
β”‚ β”‚ β”‚ β”‚ β”” [<coroutine object download_rr_archive at 0x7f89fd4ef8c0>, <coroutine object download_rr_archive at 0x7f89fda107c0>]
β”‚ β”‚ β”‚ β”” <function gather at 0x7f89ff128550>
β”‚ β”‚ β”” <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/asyncio/__init__.py'>
β”‚ β””
β”” <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/asyncio/__init__.py'>
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
β”‚ β”” <method 'result' of '_asyncio.Future' objects>
β”” <_GatheringFuture finished exception=AssertionError('All roads leads to Roam, but that one is too long. Try again when Roam s...
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 69, in download_rr_archive
return await _download_rr_archive(document, output_type, output_directory, config)
β”‚ β”‚ β”‚ β”‚ β”” <roam_to_git.scrapping.Config object at 0x7f8a034f77c0>
β”‚ β”‚ β”‚ β”” PosixPath('/tmp/tmps7r9ciu6')
β”‚ β”‚ β”” 'markdown'
β”‚ β”” <pyppeteer.page.Page object at 0x7f89fcf49df0>
β”” <function _download_rr_archive at 0x7f89fd17f790>
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 121, in _download_rr_archive
assert dot_button is not None, "All roads leads to Roam, but that one is too long. Try "
β”” None

AssertionError: All roads leads to Roam, but that one is too long. Try again when Roam servers are faster.
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.8.5/x64/bin/roam-to-git", line 11, in <module>
load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/loguru/_logger.py", line 1149, in catch_wrapper
return function(*args, **kwargs)
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
scrap(markdown_zip_path, json_zip_path, config)
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 252, in scrap
asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 69, in download_rr_archive
return await _download_rr_archive(document, output_type, output_directory, config)
File "/opt/hostedtoolcache/Python/3.8.5/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 121, in _download_rr_archive
assert dot_button is not None, "All roads leads to Roam, but that one is too long. Try "
AssertionError: All roads leads to Roam, but that one is too long. Try again when Roam servers are faster.
2020-08-29 12:16:01.893 | DEBUG | roam_to_git.scrapping:_kill_child_process:215 - Terminate child process [psutil.Process(pid=3368, name='chrome', status='sleeping', started='12:09:02'), psutil.Process(pid=3379, name='chrome', status='sleeping', started='12:09:02'), psutil.Process(pid=3401, name='chrome', status='sleeping', started='12:09:02'), psutil.Process(pid=3411, name='chrome', status='sleeping', started='12:09:02'), psutil.Process(pid=3381, name='chrome', status='sleeping', started='12:09:02'), psutil.Process(pid=3400, name='chrome', status='sleeping', started='12:09:02')]
##[error]Process completed with exit code 1.
Post job cleanup.
[command]/usr/bin/git version
git version 2.28.0
[command]/usr/bin/git config --local --name-only --get-regexp core.sshCommand
[command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'core.sshCommand' && git config --local --unset-all 'core.sshCommand' || :
[command]/usr/bin/git config --local --name-only --get-regexp http.https://github.com/.extraheader
http.https://github.com/.extraheader
[command]/usr/bin/git config --local --unset-all http.https://github.com/.extraheader
[command]/usr/bin/git submodule foreach --recursive git config --local --name-only --get-regexp 'http.https://github.com/.extraheader' && git config --local --unset-all 'http.https://github.com/.extraheader' || :
Cleaning up orphan processes
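The assertion fires when the export menu button never appears before the internal timeout. One generic mitigation for flaky browser-automation steps like this is to retry the whole download with exponential backoff. A minimal sketch with a hypothetical `retry_async` helper, not part of roam-to-git:

```python
import asyncio


async def retry_async(coro_factory, attempts=3, base_delay=5.0):
    """Retry an async operation with exponential backoff.

    coro_factory is called once per attempt and must return a fresh
    coroutine; AssertionError (what the scraper raises on timeout)
    triggers a retry, any other exception propagates immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await coro_factory()
        except AssertionError:
            if attempt == attempts:
                raise
            # back off before the next attempt: base, 2*base, 4*base, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

Wrapping each `download_rr_archive` call this way would turn a transiently slow Roam page into a delay instead of a failed backup run.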

Please complete the following information:

  • OS: Win10
  • Do you use GitHub Actions? yes
  • Do you use multiple Roam databases? no
  • Did roam-to-git previously work for you? When precisely did it stop working? never worked
  • Are some backup runs still working? no

Additional context
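The `Fill email ''` lines in the log suggest the `ROAMRESEARCH_USER` secret was empty or contained a stray newline, which would leave the sign-in form unfilled and make the interface wait time out. A defensive secret loader, shown here as a hypothetical sketch (`load_secret` is an invented name), could strip and validate the values before use:

```python
import os


def load_secret(name):
    """Read an environment variable for a credential.

    Strips stray whitespace and newlines (a common copy-paste
    issue when setting GitHub secrets) and fails loudly if the
    value is missing or empty, instead of signing in with ''.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise ValueError(f"{name} is missing or empty")
    return value
```

Failing fast here would turn a seven-minute silent timeout into an immediate, actionable error message.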

Few minor suggestions

Hi. Just tried it -- awesome tool, especially thanks for keeping everything tidy and modular!
My use case is exporting all Roam Research data into JSON (I want to use it in other tools), and I was pleased to be able to use your library for that in a very straightforward way :)

I figured none of my suggestions is worth a separate issue, so I've piled them under a single one. Hopefully that's OK.

  1. Add export to the repository topics? That way if someone searches for 'roam export' your repository would pop up. ('backup'/'export' are pretty interchangeable)

  2. The library logs to stdout, which might be annoying if someone tries to use it with other tools; it's better to log to stderr. I've hacked it in my script, but it would be nice to use print(..., file=sys.stderr) throughout the library instead?

  3. The call to patch_pyppeteer -- do you think it makes sense to simply move it inside download_rr_archive? Then could remove the call from main too.

  4. Expose a method that simply returns the JSON? E.g. scrapping.export_json(...) -> str.

    The JSON contains everything the user potentially needs for exporting notes, and if the library exposes it, the export would be as easy as a single function call.

    It requires extracting the JSON from the zip, but the zipfile library is part of the standard library, so that wouldn't create any extra dependencies. It's just four lines of code. But I'd understand if you'd rather not complicate your module.

  5. Add username/password as arguments for Config?

    E.g it would look like:

    def __init__(self, database: Optional[str], debug: bool,
                 user: str = os.environ.get("ROAMRESEARCH_USER"),
                 password: str = os.environ.get("ROAMRESEARCH_PASSWORD")):
        self.user = user
        self.password = password
        assert self.user
        assert self.password
    

    That way you wouldn't necessarily have to use environment variables for passing the username/password. Not a biggie, but sometimes it's more convenient.

Let me know what you think, and I'm happy to contribute the changes you approve!
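Suggestion 4 could indeed stay small. A possible sketch with a hypothetical `read_json_export` helper, assuming the export zip contains a single `.json` entry:

```python
import json
import zipfile


def read_json_export(zip_path):
    """Extract the first .json entry from a Roam export zip
    and return it parsed as Python objects."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.filename.endswith(".json"):
                return json.loads(zf.read(info.filename).decode())
    raise FileNotFoundError("no .json entry in archive")
```

Only the standard-library `zipfile` and `json` modules are involved, so exposing something like this wouldn't add dependencies.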

Detect potential duplicates

Roam does not seem to have an advanced de-duplication algorithm for note titles.

  1. A trailing space in a note title is unintentional most of the time
  2. Unicode is not normalized

E.g. for 2.: [[Charlène]] and [[Charlène]] look identical, but if we print the bytes they differ: b'Charle\xcc\x80ne' versus b'Charl\xc3\xa8ne'. If I apply unicodedata.normalize to both, they become identical.

We could detect them and save the list of errors in a dedicated file.
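A detector along those lines could normalize each title and group the collisions. A minimal sketch with a hypothetical helper, assuming NFC as the canonical form:

```python
import unicodedata


def find_potential_duplicates(titles):
    """Group note titles that collapse to the same canonical form.

    The canonical key is the NFC-normalized title with trailing and
    leading spaces stripped, covering both issues described above.
    """
    groups = {}
    for title in titles:
        key = unicodedata.normalize("NFC", title).strip()
        groups.setdefault(key, []).append(title)
    # keep only keys with more than one original spelling
    return {key: spellings for key, spellings in groups.items()
            if len(spellings) > 1}


titles = ["Charle\u0300ne", "Charl\u00e8ne", "Inbox"]
dups = find_potential_duplicates(titles)
# the two byte-wise different spellings of "Charlène" collapse to one key
```

The resulting mapping of canonical title to original spellings is exactly what would go into the dedicated error file.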

Backup via Action failing since paywall went up

I'm running roam-to-git backups via Actions. Since the paywall went up, the backup fails while waiting for the interface to load. I assume it's because the login flow has changed.

2020-06-10T17:02:57.2539326Z 2020-06-10 17:02:57.253 | DEBUG | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-06-10T17:02:57.4390657Z 2020-06-10 17:02:57.438 | DEBUG | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-06-10T17:09:42.3139891Z 2020-06-10 17:09:42.313 | DEBUG | roam_to_git.scrapping:download_rr_archive:75 - Closing browser markdown
2020-06-10T17:09:42.3146177Z 2020-06-10 17:09:42.314 | DEBUG | roam_to_git.scrapping:download_rr_archive:75 - Closing browser json
2020-06-10T17:09:42.3441770Z 2020-06-10 17:09:42.343 | DEBUG | roam_to_git.scrapping:download_rr_archive:77 - Closed browser markdown
2020-06-10T17:09:42.3722804Z 2020-06-10 17:09:42.371 | DEBUG | roam_to_git.scrapping:download_rr_archive:77 - Closed browser json
2020-06-10T17:09:42.3786501Z 2020-06-10 17:09:42.372 | ERROR | __main__:<module>:11 - An error has been caught in function '<module>', process 'MainProcess' (3978), thread 'MainThread' (139625180018496):
2020-06-10T17:09:42.3787119Z Traceback (most recent call last):
2020-06-10T17:09:42.3787532Z
2020-06-10T17:09:42.3788453Z > File "/opt/hostedtoolcache/Python/3.8.3/x64/bin/roam-to-git", line 11, in <module>
2020-06-10T17:09:42.3788866Z load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
2020-06-10T17:09:42.3789345Z β”” <function load_entry_point at 0x7efd011015e0>
2020-06-10T17:09:42.3790877Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
2020-06-10T17:09:42.3791187Z scrap(markdown_zip_path, json_zip_path, config)
2020-06-10T17:09:42.3791633Z β”‚ β”‚ β”‚ β”” <roam_to_git.scrapping.Config object at 0x7efcfda8d820>
2020-06-10T17:09:42.3792002Z β”‚ β”‚ β”” PosixPath('/tmp/tmpuwmqnsjb')
2020-06-10T17:09:42.3792335Z β”‚ β”” PosixPath('/tmp/tmpf1f99acm')
2020-06-10T17:09:42.3792661Z β”” <function scrap at 0x7efcfda954c0>
2020-06-10T17:09:42.3793172Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 251, in scrap
2020-06-10T17:09:42.3793399Z asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
2020-06-10T17:09:42.3793922Z β”‚ β”‚ β”‚ β”‚ β”” [<coroutine object download_rr_archive at 0x7efcfde7de40>, <coroutine object download_rr_archive at 0x7efcfe317dc0>]
2020-06-10T17:09:42.3794344Z β”‚ β”‚ β”‚ β”” <function gather at 0x7efcffa7f310>
2020-06-10T17:09:42.3794883Z β”‚ β”‚ β”” <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/__init__.py'>
2020-06-10T17:09:42.3795237Z β”‚ β””
2020-06-10T17:09:42.3795641Z β”” <module 'asyncio' from '/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/__init__.py'>
2020-06-10T17:09:42.3795924Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
2020-06-10T17:09:42.3796123Z return future.result()
2020-06-10T17:09:42.3796491Z β”‚ β”” <method 'result' of '_asyncio.Future' objects>
2020-06-10T17:09:42.3796954Z β”” <_GatheringFuture finished exception=AssertionError('All roads leads to Roam, but that one is too long. Try again when Roam s...
2020-06-10T17:09:42.3797787Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 68, in download_rr_archive
2020-06-10T17:09:42.3798020Z return await _download_rr_archive(document, output_type, output_directory, config)
2020-06-10T17:09:42.3798483Z β”‚ β”‚ β”‚ β”‚ β”” <roam_to_git.scrapping.Config object at 0x7efcfda8d820>
2020-06-10T17:09:42.3798906Z β”‚ β”‚ β”‚ β”” PosixPath('/tmp/tmpf1f99acm')
2020-06-10T17:09:42.3799273Z β”‚ β”‚ β”” 'markdown'
2020-06-10T17:09:42.3799662Z β”‚ β”” <pyppeteer.page.Page object at 0x7efcfd3867f0>
2020-06-10T17:09:42.3800034Z β”” <function _download_rr_archive at 0x7efcfda95280>
2020-06-10T17:09:42.3800537Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 120, in _download_rr_archive
2020-06-10T17:09:42.3800876Z assert dot_button is not None, "All roads leads to Roam, but that one is too long. Try "
2020-06-10T17:09:42.3801212Z β”” None
2020-06-10T17:09:42.3801314Z
2020-06-10T17:09:42.3801498Z AssertionError: All roads leads to Roam, but that one is too long. Try again when Roam servers are faster.
2020-06-10T17:09:42.3803728Z Traceback (most recent call last):
2020-06-10T17:09:42.3804481Z File "/opt/hostedtoolcache/Python/3.8.3/x64/bin/roam-to-git", line 11, in <module>
2020-06-10T17:09:42.3804919Z load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
2020-06-10T17:09:42.3805567Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/loguru/_logger.py", line 1210, in catch_wrapper
2020-06-10T17:09:42.3805784Z return function(*args, **kwargs)
2020-06-10T17:09:42.3806249Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 76, in main
2020-06-10T17:09:42.3806503Z scrap(markdown_zip_path, json_zip_path, config)
2020-06-10T17:09:42.3807079Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 251, in scrap
2020-06-10T17:09:42.3807296Z asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
2020-06-10T17:09:42.3807548Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
2020-06-10T17:09:42.3807764Z return future.result()
2020-06-10T17:09:42.3808266Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 68, in download_rr_archive
2020-06-10T17:09:42.3808499Z return await _download_rr_archive(document, output_type, output_directory, config)
2020-06-10T17:09:42.3809012Z File "/opt/hostedtoolcache/Python/3.8.3/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 120, in _download_rr_archive
2020-06-10T17:09:42.3809259Z assert dot_button is not None, "All roads leads to Roam, but that one is too long. Try "
2020-06-10T17:09:42.3809472Z AssertionError: All roads leads to Roam, but that one is too long. Try again when Roam servers are faster.
2020-06-10T17:09:42.4315148Z ##[error]Process completed with exit code 1.
2020-06-10T17:09:42.4370242Z Post job cleanup.

pyppeteer.errors.ElementHandleError: Evaluation failed: TypeError: Cannot read property 'textContent' of null

I've been running this every 10 minutes for the last day (#4). This is an occasional error I observe:

Downloading output of type markdown
Checking download type
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.8.2/x64/bin/roam-to-git", line 11, in <module>
    load_entry_point('roam-to-git==0.1', 'console_scripts', 'roam-to-git')()
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/__main__.py", line 68, in main
    scrap(markdown_zip_path, json_zip_path, config)
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 188, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 107, in download_rr_archive
    button, button_text = await get_dropdown_button()
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 101, in get_dropdown_button
    dropdown_button_text = await get_text(document, dropdown_button)
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 25, in get_text
    text = await page.evaluate('(element) => element.textContent', b)
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/pyppeteer/page.py", line 1158, in evaluate
    return await frame.evaluate(pageFunction, *args, force_expr=force_expr)
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/pyppeteer/frame_manager.py", line 294, in evaluate
    return await context.evaluate(
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 54, in evaluate
    handle = await self.evaluateHandle(
  File "/opt/hostedtoolcache/Python/3.8.2/x64/lib/python3.8/site-packages/pyppeteer/execution_context.py", line 113, in evaluateHandle
    raise ElementHandleError('Evaluation failed: {}'.format(
pyppeteer.errors.ElementHandleError: Evaluation failed: TypeError: Cannot read property 'textContent' of null
    at __pyppeteer_evaluation_script__:1:23
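
The traceback shows `page.evaluate('(element) => element.textContent', b)` being handed a null element handle. A minimal sketch of a defensive variant of such a text helper (hypothetical code, not roam-to-git's actual `get_text`; it only assumes the page object exposes pyppeteer's `evaluate` coroutine):

```python
async def get_text_safe(page, element):
    """Return an element's textContent, or None when the handle is
    missing or stale, instead of letting evaluate() raise
    ElementHandleError on a null element."""
    if element is None:
        return None
    # Guard inside the page function too, in case the node detached.
    return await page.evaluate("(el) => el && el.textContent", element)
```

Callers can then treat `None` as "retry the query" rather than crashing the whole backup run.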

What to put as password for Roam account created via Google (no password)?

Describe the bug
I am trying to set this up, but I have no Roam password as I created my Roam account via Google Auth sign-in.

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://roamresearch.com/#/signup
  2. Create an account by clicking "Or sign up with Google"
  3. Create account via Google

Expected behavior
I don't know what to put in for my ROAMRESEARCH_PASSWORD key

Traceback
n/a

Please complete the following information:

  • OS: MacOS Catalina
  • Do you use Github Action? I would but don't know what to put for this secret
  • Do you use multiple Roam databases? No
  • Does roam-to-git use to work for you? When precisely did it stopped to work?
  • Does some backup runs are still working?

/ in page name creates a directory

Describe the bug
My Someday/Maybe page ends up being {formatted/markdown}/Someday/Maybe.md.

Expected behavior
I was expecting a file named something like Someday%!@#$@#^!@3@@#$Maybe.md with the %!@#$@#^!@3@@#$ part representing a / in some obscure encoding. :)

Please complete the following information:

  • Github Actions

Backup fails 1-2 times per day

I get "Run failed for master" at least once per day for my Roam Research backup workflow. The message I get is "Process completed with exit code 1".

Any idea of what this issue is related to? Does it have anything to do with how often backup is done?

download_rr_archive: timeout in scrapping.py

I changed the number of iterations at line 103 to 500, but it still times out. Refreshing my db doesn't take that long, so maybe there's something else wrong. Or is it simply due to the long DB load-times these days in Roam? Is this supposed to work and I'm missing something?
Thanks for making this script available: it's awesome to see how puppeteer can be used via Python!

2020-05-10 11:30:43.538 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-05-10 11:30:44.320 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /var/folders/dn/6sb0pjt535qfrtnc1jyvk9lh0000gn/T/tmp11gi5ais
2020-05-10 11:30:44.354 | DEBUG    | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-10 11:30:49.194 | DEBUG    | roam_to_git.scrapping:signin:185 - Fill email '[email protected]'
2020-05-10 11:30:50.540 | DEBUG    | roam_to_git.scrapping:signin:190 - Fill password
2020-05-10 11:30:51.395 | DEBUG    | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-10 11:30:53.648 | DEBUG    | roam_to_git.scrapping:go_to_database:205 - Load database from url 'https://roamresearch.com/#/app/xxxxxxx'
2020-05-10 11:30:53.704 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-10 11:48:01.981 | DEBUG    | roam_to_git.scrapping:download_rr_archive:75 - Closing browser json
2020-05-10 11:48:02.024 | DEBUG    | roam_to_git.scrapping:download_rr_archive:77 - Closed browser json
Traceback (most recent call last):
  File "./roamresearch.py", line 61, in <module>
    main()
  File "./roamresearch.py", line 49, in main
    asyncio.get_event_loop().run_until_complete(atask)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 584, in run_until_complete
    return future.result()
  File "/Users/pchalasani/Dropbox-personal/GitForks/Roam/roam-to-git/roam_to_git/scrapping.py", line 68, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config)
  File "/Users/pchalasani/Dropbox-personal/GitForks/Roam/roam-to-git/roam_to_git/scrapping.py", line 119, in _download_rr_archive
    assert dot_button is not None, "All roads leads to Roam, but that one is too long. Try " \
AssertionError: All roads leads to Roam, but that one is too long. Try again when Roam servers are faster.
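
Rather than editing a hard-coded iteration count at line 103, the wait could be expressed as a polling loop with an explicit timeout. A sketch of such a helper (hypothetical names; it assumes `get` is any coroutine returning the element or `None`):

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")

async def poll_until(get: Callable[[], Awaitable[Optional[T]]],
                     timeout: float = 300.0,
                     interval: float = 1.0) -> T:
    """Call `get` every `interval` seconds until it returns a value,
    raising TimeoutError once `timeout` seconds have elapsed."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        result = await get()
        if result is not None:
            return result
        if loop.time() >= deadline:
            raise TimeoutError(f"element not found after {timeout}s")
        await asyncio.sleep(interval)
```

Raising `timeout` is then a single argument change, and slow Roam load times surface as a clear `TimeoutError` instead of an assertion failure.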

Timeout on signin step

When running for the first time (locally, with a pipx install on macOS 10.15.4), I consistently get pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded. seemingly in the first "Opening signin page" step (logs below).

This same setup, with the --debug flag, shows the Chromium GUI as expected and works with no long lags, right up through the end (where I believe it fails to actually unpack the zip files with a warning that this doesn't work in --debug mode).

2020-04-20 17:53:44.195 | INFO     | roam_to_git.__main__:main:43 - Loading secrets from /tmp/roamtest/.env
2020-04-20 17:53:44.198 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-04-20 17:53:44.329 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-04-20 17:53:44.899 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:95 - Configure downloads to /var/folders/_y/wm_9h7f555x60zdv6k4f74gr0000gn/T/tmp8nfgjem2
2020-04-20 17:53:44.911 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:95 - Configure downloads to /var/folders/_y/wm_9h7f555x60zdv6k4f74gr0000gn/T/tmpvev98ff8
2020-04-20 17:53:44.936 | DEBUG    | roam_to_git.scrapping:signin:184 - Opening signin page
2020-04-20 17:53:44.948 | DEBUG    | roam_to_git.scrapping:signin:184 - Opening signin page
2020-04-20 17:54:14.940 | DEBUG    | roam_to_git.scrapping:download_rr_archive:77 - Closing browser markdown
[E:pyppeteer.connection] connection unexpectedly closed
Task exception was never retrieved
future: <Task finished name='Task-60' coro=<Connection._async_send() done, defined at /Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py:69> exception=InvalidStateError('invalid state')>
Traceback (most recent call last):
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 827, in transfer_data
    message = await self.read_message()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 895, in read_message
    frame = await self.read_data_frame(max_size=self.max_size)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 971, in read_data_frame
    frame = await self.read_frame(max_size)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 1047, in read_frame
    frame = await Frame.read(
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/framing.py", line 105, in read
    data = await reader(2)
  File "/usr/local/Cellar/[email protected]/3.8.2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/streams.py", line 721, in readexactly
    raise exceptions.IncompleteReadError(incomplete, n)
asyncio.exceptions.IncompleteReadError: 0 bytes read on a total of 2 expected bytes

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 73, in _async_send
    await self.connection.send(msg)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 555, in send
    await self.ensure_open()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/websockets/protocol.py", line 803, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: code = 1006 (connection closed abnormally [internal]), no reason

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 79, in _async_send
    await self.dispose()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 170, in dispose
    await self._on_close()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/connection.py", line 151, in _on_close
    cb.set_exception(_rewriteError(
asyncio.exceptions.InvalidStateError: invalid state
2020-04-20 17:54:14.953 | DEBUG    | roam_to_git.scrapping:download_rr_archive:79 - Closed browser markdown
Traceback (most recent call last):
  File "/Users/jrk/.local/bin/roam-to-git", line 8, in <module>
    sys.exit(main())
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/__main__.py", line 69, in main
    scrap(markdown_zip_path, json_zip_path, config)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 253, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
  File "/usr/local/Cellar/[email protected]/3.8.2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 69, in download_rr_archive
    return await _download_rr_archive(document, output_type, output_directory, config,
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 100, in _download_rr_archive
    await signin(document, config, sleep_duration=sleep_duration)
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 185, in signin
    await document.goto('https://roamresearch.com/#/signin')
  File "/Users/jrk/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/pyppeteer/page.py", line 862, in goto
    raise error
pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 30000 ms exceeded.
2020-04-20 17:54:14.962 | DEBUG    | roam_to_git.scrapping:_kill_child_process:216 - Terminate child process [psutil.Process(pid=91694, name='Chromium', started='17:53:44'), psutil.Process(pid=91696, name='Chromium', started='17:53:44')]
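
Since the failure is a one-off navigation timeout on the very first `goto`, two mitigations suggest themselves: pyppeteer's `goto` accepts a `timeout` option (milliseconds) that could be raised above the 30 s default, and the sign-in step could be retried a few times. A generic retry sketch (hypothetical helper, not part of roam-to-git):

```python
import asyncio

async def retry_async(coro_factory, retry_on=Exception,
                      attempts: int = 3, delay: float = 5.0):
    """Run `coro_factory()` up to `attempts` times, sleeping `delay`
    seconds between tries, re-raising the last error on exhaustion."""
    last_exc = None
    for attempt in range(1, attempts + 1):
        try:
            return await coro_factory()
        except retry_on as exc:
            last_exc = exc
            if attempt < attempts:
                await asyncio.sleep(delay)
    raise last_exc
```

For this issue, `retry_on` would be pyppeteer's `TimeoutError` and `coro_factory` a closure around the sign-in navigation.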

Public mode export

It would be great if we could publish some of the exported notes for public consumption. One option for how it may work:

  • Add "only formatted" export (so raw notes won't be exposed)
  • Add an ability to filter export based on on presence of some tag/page
  • Run the result on top of GitHub pages repo.

Consistent failure: download_rr_archive - ValueError: not enough values to unpack

Starting this night it consistently fails with the following error:

Launch popup
Traceback (most recent call last):
  File "/Users/sitalov/.local/bin/roam-to-git", line 8, in <module>
    sys.exit(main())
  File "/Users/sitalov/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/__main__.py", line 68, in main
    scrap(markdown_zip_path, json_zip_path, config)
  File "/Users/sitalov/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 188, in scrap
    asyncio.get_event_loop().run_until_complete(asyncio.gather(*tasks))
  File "/usr/local/Cellar/[email protected]/3.8.2/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Users/sitalov/.local/pipx/venvs/roam-to-git/lib/python3.8/site-packages/roam_to_git/scrapping.py", line 95, in download_rr_archive
    export_all, = [b for b in divs_pb3 if await get_text(document, b) == 'export all']
ValueError: not enough values to unpack (expected 1, got 0)
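
The bare tuple-unpacking at scrapping.py line 95 turns "the 'export all' menu entry isn't rendered yet (or Roam's UI changed)" into an opaque `ValueError`. A sketch of a clearer selection step (hypothetical helper, shown over plain text values rather than pyppeteer handles):

```python
def pick_unique(texts, target):
    """Return the index of the single item equal to `target`, with a
    readable error instead of a bare unpacking ValueError."""
    matches = [i for i, t in enumerate(texts) if t == target]
    if len(matches) != 1:
        raise RuntimeError(
            f"expected exactly one {target!r} element, found {len(matches)}; "
            "Roam's export menu may have changed or not finished loading")
    return matches[0]
```

Combined with a retry, this would distinguish "menu not loaded yet" from a genuine UI change.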

Starting today, the backup almost always times out (15+ min)

See the log of the runs here: https://github.com/Stvad/roam-notes-workflow/actions

An example run result (https://github.com/Stvad/roam-notes-workflow/runs/665482134?check_suite_focus=true)

The log looks like:

100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 108773488/108773488 [00:00<00:00, 136886394.78it/s]
[W:pyppeteer.chromium_downloader] 
chromium download done.
[W:pyppeteer.chromium_downloader] chromium extracted to: /home/runner/.local/share/pyppeteer/local-chromium/588429
2020-05-12 01:43:11.518 | DEBUG    | roam_to_git.scrapping:download_rr_archive:54 - Creating browser
2020-05-12 01:43:12.066 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /tmp/tmpqo3ac2yi
2020-05-12 01:43:12.078 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:91 - Configure downloads to /tmp/tmpcg50uiot
2020-05-12 01:43:12.101 | DEBUG    | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-12 01:43:12.113 | DEBUG    | roam_to_git.scrapping:signin:181 - Opening signin page
2020-05-12 01:43:16.447 | DEBUG    | roam_to_git.scrapping:signin:185 - Fill email '***'
2020-05-12 01:43:16.487 | DEBUG    | roam_to_git.scrapping:signin:185 - Fill email '***'
2020-05-12 01:43:17.247 | DEBUG    | roam_to_git.scrapping:signin:190 - Fill password
2020-05-12 01:43:17.284 | DEBUG    | roam_to_git.scrapping:signin:190 - Fill password
2020-05-12 01:43:18.900 | DEBUG    | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-12 01:43:18.937 | DEBUG    | roam_to_git.scrapping:signin:195 - Click on sign-in
2020-05-12 01:43:21.143 | DEBUG    | roam_to_git.scrapping:go_to_database:205 - Load database from url 'https://roamresearch.com/#/app/***'
2020-05-12 01:43:21.169 | DEBUG    | roam_to_git.scrapping:go_to_database:205 - Load database from url 'https://roamresearch.com/#/app/***'
2020-05-12 01:43:21.204 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
2020-05-12 01:43:21.225 | DEBUG    | roam_to_git.scrapping:_download_rr_archive:101 - Wait for interface to load
##[error]The operation was canceled.

Basically it gets stuck for 15+ minutes at that step and gets cancelled :(
