maboroshy / note-station-to-markdown

The cross-platform script that converts notes from Synology Note Station to markdown files

License: Apache License 2.0

notes synology markdown converter nsx cross-platform python note-station

note-station-to-markdown's Introduction

This script converts notes from Synology Note Station to plain-text markdown notes.
The script is written in Python and should work on any desktop platform.

After conversion you get:

Directories named like the exported notebooks;
Notes in those directories as markdown-syntax plain text files with all in-line images in-place;
Assigned tags and links to attachments at the beginning of note texts;
All images and attached files in media subdirectories inside notebook directories.

Local installation

The script requires Python 3.5+ and pandoc installed on your system. Get the installation packages or use the package manager of your OS.
Put nsx2md.py in the directory where you want to convert notes.

Usage

Export your Synology Note Station notebooks via Setting -> Import and Export -> Export. You will get an .nsx file.
Adjust the .nsx file permissions if required.
Copy the .nsx file(s) to the directory where you've put nsx2md.py.
Set script settings if required - see the "Optional settings" section below.
Run python nsx2md.py to convert all the .nsx files in the directory or python nsx2md.py path/to/export.nsx to convert a specific file.

Docker setup

Build the Docker image

docker build -t nsx2md .

Run the Docker image

docker run -it -v "$PWD:/nsx2md" nsx2md <file.nsx>

Optional settings

Inside the script you can make some adjustments to the link format and notes metadata:
Select metadata options:
meta_data_in_yaml - True to put the metadata options below that are set to True into a YAML block, False to insert that metadata as text;
insert_title - True to insert note title as a markdown heading at the first line, False to disable;
insert_ctime - True to insert note creation time to the beginning of the note text, False to disable;
insert_mtime - True to insert note modification time to the beginning of the note text, False to disable;
tags - True to insert list of tags, False to disable;
tag_prepend - string to prepend each tag in a tag list inside the note, default is empty;
tag_delimiter - string to delimit tags, default is comma separated list;
no_spaces_in_tags - True to replace spaces in tag names with '_', False to keep spaces.

Select file link options:
prepend_links_with - Prepends file links with the set string (e.g. 'file://'), '' for no prepend;
encode_links_as_uri - Encodes links' special characters with "percent-encoding": True for /link%20target style links, False for /link target style links;
absolute_links - True for absolute links, False for relative links.
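
As a rough illustration of what the link options above change, the sketch below builds a link with the standard library. The function `build_link` and its parameters are hypothetical stand-ins that mirror the setting names; this is not the script's actual code.

```python
from urllib.parse import quote


def build_link(path, prepend_links_with="", encode_links_as_uri=True):
    """Hypothetical helper: format a media link the way the options describe."""
    if encode_links_as_uri:
        # quote() percent-encodes special characters but keeps '/' by default.
        path = quote(path)
    return prepend_links_with + path


print(build_link("media/link target.png"))
print(build_link("media/link target.png", encode_links_as_uri=False))
print(build_link("/notes/media/link target.png", prepend_links_with="file://"))
```

With encoding enabled the space becomes %20; with it disabled the path is left untouched.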

Select File/Attachments/Media options:
media_dir_name - name of the directory inside the produced directory where all images and attachments will be stored;
md_file_ext - extension for produced markdown syntax note files;
creation_date_in_filename - True to insert note creation time to the note file name, False to disable;

For QOwnNotes users

There are several ways to get tags from converted notes to work in QOwnNotes:

Import tags to QOwnNotes native way

Convert .nsx files with default nsx2md.py settings;
Add notebook directories produced by nsx2md.py as QOwnNotes note folders;
Set one of these note folders as current;
Enable the provided import_tags.qml script in QOwnNotes (Note -> Settings -> Scripting) (remove_tag_line.py should be in the same directory);
The script will add 2 new buttons and menu items:
1. Import tags - to import tags from the tag lines of all the notes in the current note folder
2. Remove tag lines - to remove the tag lines from all the notes in the current folder
Use the buttons in that order; any previous QOwnNotes tag data for the note folder will be lost;
Move to the next note folder produced by nsx2md.py, repeat #5;
Disable import_tags.qml script. That is obligatory.

"@tag tagging in note text (experimental)" QOwnNotes script

For default @ tag prepends use the following nsx2md.py settings:

tag_prepend = '@'  # string to prepend each tag in a tag list inside the note, default is empty
tag_delimiter = ' '  # string to delimit tags, default is comma separated list
no_spaces_in_tags = True  # True to replace spaces in tag names with '_', False to keep spaces

Convert .nsx files;
Add notebook directories produced by nsx2md.py as QOwnNotes note folders.
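
To make the effect of the three tag settings concrete, here is a minimal sketch of how a tag line could be assembled from them. `format_tag_line` is a hypothetical helper written for this illustration, and the example tags are made up; only the setting names and the "Tags: " prefix come from this page.

```python
def format_tag_line(tags, tag_prepend="", tag_delimiter=", ", no_spaces_in_tags=False):
    """Hypothetical helper: build the tag line written at the top of a note."""
    if no_spaces_in_tags:
        tags = [tag.replace(" ", "_") for tag in tags]
    return "Tags: " + tag_delimiter.join(tag_prepend + tag for tag in tags)


# Settings suggested above for the "@tag tagging" QOwnNotes script.
print(format_tag_line(["home lab", "nas"], tag_prepend="@",
                      tag_delimiter=" ", no_spaces_in_tags=True))
```

With these settings the example prints `Tags: @home_lab @nas`.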

note-station-to-markdown's Issues

Convert to HTML?

Is there any chance to add functionality to convert NSX to HTML for subsequent importing into other applications (e.g., OneNote)?

tables containing block elements will be dropped completely from Markdown output

Unless you use the 'multiline_tables' extension to pandoc, any table containing a block element such as a DIV will be completely replaced by the string "[TABLE]", potentially causing significant data loss. Note Station apparently inserts all kinds of cruft into its tables.
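
Following the suggestion in this report, a pandoc command line with the extra extension could be assembled as sketched below. `pandoc_args` is a hypothetical helper; the base output format is copied from the pandoc command quoted in the timeout report further down this page, and whether `multiline_tables` rescues a given table is not verified here.

```python
def pandoc_args(input_path, output_path, extra_extensions=("multiline_tables",)):
    """Hypothetical helper: build a pandoc command with extra markdown extensions."""
    output_format = "markdown_strict+pipe_tables-raw_html"
    for extension in extra_extensions:
        output_format += "+" + extension
    return ["pandoc", "-f", "html", "-t", output_format, "--wrap=none",
            "-o", str(output_path), str(input_path)]


print(pandoc_args("note.html", "note.md"))
```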

TypeError: '<' not supported between instances of 'str' and 'int'

Hi Maboroshy,

When I run this script on Windows 10 with Python 3.7.3, I get the following error output:

Found pandoc exe 2.7.3
Traceback (most recent call last):
File "nsx2md.py", line 66, in <module>
if distutils.version.LooseVersion(pandoc_ver) < distutils.version.LooseVersion('1.16'):
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.7_3.7.1008.0_x64__qbz5n2kfra8p0\lib\distutils\version.py", line 52, in __lt__
c = self._cmp(other)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.7_3.7.1008.0_x64__qbz5n2kfra8p0\lib\distutils\version.py", line 337, in _cmp
if self.version < other.version:
TypeError: '<' not supported between instances of 'str' and 'int'

Can I do something to fix it or is it a problem of the code?
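
This TypeError arises when LooseVersion ends up comparing a text component with a number. One way around it, sketched below under the assumption that only the dotted numeric part of the version matters, is to parse that part into a tuple of integers. `version_tuple` is a hypothetical helper, not part of the script; it also avoids distutils, which was removed from the standard library in Python 3.12.

```python
import re


def version_tuple(version_text):
    """Return the first dotted-number version found in the text as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)*", version_text)
    if match is None:
        raise ValueError("no version number found in {!r}".format(version_text))
    return tuple(int(part) for part in match.group().split("."))


# Tuples of ints compare element by element, so 2.7.3 is not older than 1.16.
print(version_tuple("2.7.3") < version_tuple("1.16"))
```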

Only title instead of note

The converted files only contain the title of the note, not the content.
NSX attached renamed as ZIP.
20240331_144226_11102_Lagavulin.zip
Thanks for having a look at this.

Tables

Hi, great script! Question, are tables expected to be converted to [TABLE] with discarded content? If not, are you planning to implement table conversion, or at least embed its content between code tags perhaps?

OSError: [Errno 63] File name too long

Hi,

I got this OSError: [Errno 63] File name too long on MacOS for a file named /media/b9e5d7dfb4f4315b201e813e77057476_1_2_3_4_5_6_7_8_9_10_11_12_13_14_15_16_17_18_19_20_21_22_23_24_25_26_27_28_29_30_31_32_33_34_35_36_37_38_39_40_41_42_43_44_45_46_47_48_49_50_51_52_53_54_55_56_57_58_59_60_61_62_63_64_65_66_67_68_69_70_71_72_73_74_75_76.jpeg

Is it possible to at least skip this file, or better, rename it somehow?
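
One possible way to rename such files, sketched here as an assumption rather than as the script's behavior, is to cut the name to a fixed length while keeping the extension and appending a short hash so that shortened names stay distinct. `shorten_file_name` and the limit of 200 characters are hypothetical; note that file systems usually limit names in bytes, so names with non-ASCII characters may need a lower limit.

```python
import hashlib
from pathlib import PurePath


def shorten_file_name(name, max_length=200):
    """Hypothetical helper: shorten an over-long file name but keep its extension."""
    if len(name) <= max_length:
        return name
    suffix = PurePath(name).suffix
    # A short hash of the full name keeps shortened names distinct.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    keep = max_length - len(suffix) - len(digest) - 1
    return "{}_{}{}".format(name[:keep], digest, suffix)


long_name = "b9e5d7df" + "_1" * 150 + ".jpeg"
print(len(long_name), len(shorten_file_name(long_name)))
```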

Really long titles crash

I have one note with a title length of 414 characters, this causes a crash on line 223 md_file_path.write_text(content, 'utf-8').

Shortening the title in the sanitise_path_string(path_str) method resolves the issue:

    if truncate_title and len(path_str) > truncate_title_chars:
        path_str = path_str[:truncate_title_chars]

But this requires adding new settings as for example:

truncate_title = True # Catch problems with titles that are too long
truncate_title_chars = 200 # Do not allow more than X characters for a title

It seems the image is not displaying correctly, even though the image file can be located

The image file is available in the media folder, yet it doesn't appear correctly in Obsidian. Could you investigate this problem? Thank you.

![](file://media/59661686846628447.png)

Syntax Error: invalid syntax

I'm sure this is a blatant noobie error, but ... I get...

F:\nsx2md>python nsx2md.py
File "nsx2md.py", line 7

^
SyntaxError: invalid syntax

I'm running Win10x64 with Python37 (x64) and the latest Pandoc (x64). Both Python and Pandoc are in PATH. I tried moving the py script and the nsx file into the Python folder, but I get the same message.

Error on Linux

any suggestions?

Traceback (most recent call last):
  File "nsx2md.py", line 33, in <module>
    config_data = json.loads(nsx.read('config.json'))
  File "/usr/lib64/python3.4/json/__init__.py", line 312, in loads
    s.__class__.__name__))
TypeError: the JSON object must be str, not 'bytes'

Notestation 2.4.3-0810
Python 2.7.5
pandoc 1.12.3.1
RHEL 7.3
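
The traceback above goes through Python 3.4, where `json.loads` accepts only `str`; bytes input is accepted from Python 3.6 onward. A minimal sketch of a version-independent read follows, assuming `nsx` is a `zipfile.ZipFile` as the `nsx.read('config.json')` call suggests. The archive contents here are made up for the demonstration.

```python
import io
import json
import zipfile

# Build a tiny in-memory archive standing in for an .nsx export (made-up data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("config.json", json.dumps({"note": ["note_1"]}))

with zipfile.ZipFile(buffer) as nsx:
    raw = nsx.read("config.json")  # ZipFile.read() returns bytes
    # Decoding first works on old and new Python 3 versions alike.
    config_data = json.loads(raw.decode("utf-8"))

print(config_data)
```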

ModuleNotFoundError: No module named 'distutils'

I get the error "ModuleNotFoundError: No module named 'distutils'" when running the script, but it seems distutils is legacy?

https://docs.python.org/3.10/library/distutils.html

Am I doing it wrong?

Thanks in advance.

Thank you so much

Needed just this today and really appreciate that you made it and released it.

Feel free to close this issue, just wanted to say thanks.

the content of markdown file only title

For example, the content of foo_hello_world.md will be:

foo_hello_world
======

Crash when trash is not empty

When the trash is not empty, it gives me this error on line 132:
parent_notebook = notebook_id_to_path_index[parent_notebook_id]

KeyError: '1027_#00000000'

Emptying the trash and re-exporting the notebook solves the issue.
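
The KeyError suggests that trashed notes point at a notebook ID that is missing from the index. A tolerant lookup could look like the sketch below; `notebook_path_for`, the fallback directory name, and the sample IDs are hypothetical, and only `notebook_id_to_path_index` is a name taken from the report.

```python
def notebook_path_for(notebook_id_to_path_index, parent_notebook_id, fallback="Recycle bin"):
    """Hypothetical helper: look up a notebook path, tolerating unknown IDs."""
    return notebook_id_to_path_index.get(parent_notebook_id, fallback)


index = {"nb_1": "Work", "nb_2": "Personal"}  # made-up notebook IDs
print(notebook_path_for(index, "nb_1"))
print(notebook_path_for(index, "1027_#00000000"))
```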

Make pandoc.wait(5) time adjustable

A few of my notes were triggering this error, which appears to be a 5s wait for pandoc to do its thing:

Traceback (most recent call last):
  File "./nsx2md.py", line 133, in <module>
    pandoc.wait(5)
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/subprocess.py", line 990, in wait
    return self._wait(timeout=timeout)
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/subprocess.py", line 1616, in _wait
    raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['pandoc', '-f', 'html', '-t', 'markdown_strict+pipe_tables-raw_html', '--wrap=none', '--atx-headers', '-o', '/var/folders/gg/8fhvnc85693bnnpcw53k1jt80000gn/T/tmpqskz8j8t', '/var/folders/gg/8fhvnc85693bnnpcw53k1jt80000gn/T/tmpvw_o1sn3']' timed out after 5 seconds

I solved it by changing the wait to 20s:

pandoc = subprocess.Popen(pandoc_args)
pandoc.wait(20)

Maybe the delay could be extended by default, or just break it out as a setting like the other options at the top of the file?
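
A sketch of how the wait could be exposed as a setting is shown below. The names `pandoc_timeout` and `run_converter` are hypothetical, and the demonstration runs the Python interpreter instead of pandoc so that it works without pandoc installed.

```python
import subprocess
import sys

pandoc_timeout = 20  # seconds; hypothetical setting name


def run_converter(args, timeout=pandoc_timeout):
    """Run a command and report whether it finished within the timeout."""
    try:
        subprocess.run(args, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before re-raising.
        return False
    return True


# Stand-in command; in the script this would be the pandoc argument list.
print(run_converter([sys.executable, "-c", "pass"]))
```

Returning a flag instead of raising lets the caller decide whether to skip the note or stop.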

Thanks a million, the script works great! I'm currently using it to migrate from DS Note to Joplin.

Error if attachments with identical file names

The script raises an error if two notes have attachments with identical filenames.

The error message reads as follows:

Traceback (most recent call last):
  File "nsx2md.py", line 119, in <module>
    '{}/{}/{}'.format(notebook_title, media_dir, name))
FileExistsError: [WinError 183] Cannot create a file when that file already exists:

Would it be possible to change the code in a way that it attaches increasing numbers to filenames if they already exist?
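
The requested behavior could be sketched as follows. `unique_path` is a hypothetical helper, and the `_1`, `_2` suffix scheme is one choice among several.

```python
import tempfile
from pathlib import Path


def unique_path(path):
    """Return `path`, or `stem_1.ext`, `stem_2.ext`, ... if it already exists."""
    path = Path(path)
    candidate = path
    counter = 1
    while candidate.exists():
        candidate = path.with_name("{}_{}{}".format(path.stem, counter, path.suffix))
        counter += 1
    return candidate


with tempfile.TemporaryDirectory() as media_dir:
    first = unique_path(Path(media_dir) / "scan.pdf")
    first.write_bytes(b"")
    second = unique_path(Path(media_dir) / "scan.pdf")
    print(first.name, second.name)
```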

wrong "interpretation" of some media file

Wonderful script, you saved my life.
I just found that some media file names were not "translated" correctly. This happened with file names containing the % character.
Example: the file name in the media folder is 5-petit-de%CC%81jeuner.png
and the link was media 5-petit-de%25CC%2581jeuner.png
As you can see, the script systematically turns each % into %25.
Not a big deal, but an opportunity for improvement.
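
One plausible reading of this report is that the link is the percent-encoded form of a file name that itself contains `%` characters, so each `%` is written as `%25`. The sketch below reproduces that with the standard library; whether the script encodes links exactly this way is an assumption.

```python
from urllib.parse import quote, unquote

file_name = "5-petit-de%CC%81jeuner.png"  # name as reported in the media folder

encoded = quote(file_name)
print(encoded)                        # every '%' has become '%25'
print(unquote(encoded) == file_name)  # decoding once restores the file name
```

Applications that decode links once would find the file; for those that do not, disabling encode_links_as_uri may be worth trying.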

Convert Notes with tables and pictures

Converting plain text notes works quite well on Windows 11, but I'm currently having two problems:

1. Notes with tables
2. Notes with images/screenshots

Message for problem 1:

Unconverted table found. Tweaking pandoc options may (or may not) allow to convert it.

How do I tweak pandoc?
This error does not stop the conversion.

Message for problem 2:

Traceback (most recent call last):
File "C:\Users\Lagavulin\Documents\Einstellungen\NAS\Notes\nsx2md.py", line 236, in <module>
link_path = urllib.request.pathname2url(link_path).replace('///', '')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^.
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.12_3.12.752.0_x64__qbz5n2kfra8p0\Lib\nturl2path.py", line 73, in pathname2url
raise OSError(error)
OSError: Bad path: media/2456https://static.tp-link.com/image023_1493261584939q.png

This error is particularly annoying because it aborts the conversion.
It would be helpful if, when an error occurs, the conversion would skip the erroneous note and continue with the next one. Then, if necessary, I would copy the incorrect notes by hand.
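
The skip-and-continue behavior asked for here could be sketched as below. `convert_all` and `demo_convert` are hypothetical names, the notes are made-up data, and the real conversion step is replaced by a stand-in that fails on one input.

```python
def convert_all(notes, convert_note):
    """Convert every note, collecting failures instead of stopping at the first."""
    failed = []
    for title, note_data in notes.items():
        try:
            convert_note(note_data)
        except Exception as error:  # deliberately broad: keep the batch going
            failed.append((title, repr(error)))
    return failed


def demo_convert(note_data):
    # Stand-in for the real conversion; fails on one made-up input.
    if "bad" in note_data:
        raise OSError("Bad path: " + note_data)


failures = convert_all({"Note A": "ok", "Note B": "bad link"}, demo_convert)
for title, reason in failures:
    print("Skipped {}: {}".format(title, reason))
```

Collecting the titles also gives a list of the notes that need to be copied by hand afterwards.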

Thanks again for having a look in this.

attachment_id in note_data.get('attachment', ''):

This is my first time using GitHub or Python, so go easy on me please! I'm getting this error every time:

File "nsx2md.py", line 130, in <module>
for attachment_id in note_data.get('attachment', ''):

Attachments are not linked correctly

Attachments seem not to be incorporated correctly. They are apparently not working for two reasons.

They must have an absolute path
Blanks need to be escaped via %20

Problems with import-tag-to-QON.qml

I successfully installed the import-tag-to-QON.qml script. The script dialogue tells me that the script seems to be valid.

However, if I click on an imported note, no tag will be created. The note starts with

Tags: tagName

List notes that are not converted

After the script completes I get the message:

Converted 16 notebooks and 371 out of 384 notes. Press Enter to quit...

Is it possible to get more info about what was not converted? Thanks!

Enhancement: Possibility to put the tags with @

When using the "@tag tagging in note text (experimental)" script in QOwnNotes, it is useful to import the tags directly with an @, so you could add the following setting:
use_at_tags = False # True when you want to use experimental @tags of QOwnNotes

Modification for the tag writing:

if note_data.get('tag', ''):
    if use_at_tags:
        content = 'Tags: @{}  \n{}'.format(', @'.join(note_data['tag']), content)
    else:
        content = 'Tags: {}  \n{}'.format(', '.join(note_data['tag']), content)

Cannot see Pictures after Export with your Script

Hi,
first of all, thank you for your great job.
I have this issue with Pictures.
I exported all my DS Notes and then imported in Obsidian.
I can't see the pictures, when the format is like this after the export:
But if I try to link it like this ![[15101598275384230.png]], it works.
Do you have any suggestions?
Thank you very much!
Michael