nbmerge's Introduction

nbmerge: merge / concatenate Jupyter notebooks

Installation

Usage

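For the usage as originally specified by @fperez's gist, pass the notebooks as arguments and redirect standard output to the merged file, e.g. nbmerge f1.ipynb f2.ipynb > output.ipynb.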

Alternatively, nbmerge can recursively collect all notebooks in the current directory and below. After collection, it sorts them lexicographically. You can use a regular expression as a file name predicate. All .ipynb_checkpoints are automatically ignored. You can also use the -i option to ignore any notebook whose name is prefixed with an underscore (think pseudo-private in Python).
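The collection rules described above can be sketched in a few lines. This is a minimal illustration, not nbmerge's actual code: the function name collect_notebooks and its parameters are hypothetical, and matching the regular expression against the bare file name (rather than the full path) is an assumption.

```python
import os
import re


def collect_notebooks(root=".", pattern=None, ignore_private=False):
    """Return notebook paths under root, sorted lexicographically.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    predicate = re.compile(pattern) if pattern else None
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune checkpoint directories in place so os.walk never enters them.
        dirnames[:] = [d for d in dirnames if d != ".ipynb_checkpoints"]
        for name in filenames:
            if not name.endswith(".ipynb"):
                continue
            # Skip pseudo-private notebooks such as _scratch.ipynb on request.
            if ignore_private and name.startswith("_"):
                continue
            # Assumption: the regular expression is tested against the file name.
            if predicate and not predicate.search(name):
                continue
            found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Calling collect_notebooks(".", pattern="intro", ignore_private=True) would then return every notebook below the current directory whose name contains intro and does not start with an underscore.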

For example, the following command collects all notebooks in your project that have the word intro in the file name and saves the merged result to a file named _merged.ipynb,

Finally, you can also instruct the script to demarcate the boundary between each original file with the -b / -boundary [BOUNDARY] flag. The src_nb value in the metadata for the first cell in each original notebook will then contain the path of the original notebook, relative to the cwd at the point of script execution.
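As a rough sketch of what the boundary annotation amounts to, the function below writes the source path into the first cell's metadata. It works on plain dictionaries shaped like notebook JSON rather than on nbformat objects, which is a simplification. The function name mirrors the one visible in the traceback of the Windows issue on this page, and the src_nb default comes from the paragraph above.

```python
import os


def annotate_source_path(nb, base_dir, nb_path, boundary_key="src_nb"):
    """Record nb_path, relative to base_dir, on the first cell of nb.

    Simplified sketch: nb is a plain dict in notebook JSON layout.
    """
    cells = nb.get("cells", [])
    if cells:
        # Only the first cell of each original notebook carries the marker.
        metadata = cells[0].setdefault("metadata", {})
        metadata[boundary_key] = os.path.relpath(nb_path, base_dir)
    return nb
```

A preprocessor or template can then look for the key on each cell to find where one original notebook ends and the next begins.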

More details

Read the docs: here.

nbmerge's Issues

Boundary key source path annotation fails on Windows when using notebooks from different drive

    annotate_source_path(nb, base_dir, path, boundary_key)
  File "C:\Users\pete\AppData\Local\Continuum\anaconda3\lib\site-packages\nbmerge\__init__.py", line 37, in annotate_source_path
    cells[0].metadata[boundary_key] = os.path.relpath(nb_path, base_dir)
  File "C:\Users\pete\AppData\Local\Continuum\anaconda3\lib\ntpath.py", line 584, in relpath
    path_drive, start_drive))
ValueError: path is on mount 'c:', start on mount 'D:'
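The ValueError comes from os.path.relpath, which on Windows cannot express a path relative to a directory on another drive. One possible workaround, shown here only as a sketch and not as the project's fix, is to fall back to the absolute path. The helper name relpath_or_absolute and its pathmod parameter are hypothetical; ntpath is used so the cross-drive case can be reproduced on any platform.

```python
import ntpath
import os


def relpath_or_absolute(path, start, pathmod=os.path):
    """Return path relative to start, or an absolute path if none exists."""
    try:
        return pathmod.relpath(path, start)
    except ValueError:
        # Windows: no relative path exists between two different drives.
        return pathmod.abspath(path)


# ntpath applies Windows path rules regardless of the host platform.
print(relpath_or_absolute(r"C:\notebooks\a.ipynb", r"D:\project", pathmod=ntpath))
```

Falling back to an absolute path keeps the merge running, at the cost of boundary values that are no longer portable between machines.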

Move all imports to the top

Just wondering if it would be nice to add a feature where it moves all imports to the top and deduplicates them?

FIX argparse bug

When I used a pattern I don't often use,

nbmerge f1.ipynb f2.ipynb > output.ipynb

the program failed. For some reason, the full path to the nbmerge script is being included in the array argparse aggregates for the files parameter. I quickly fixed it with a kludge (see: 03dcbec). But, I need to figure out why argparse is behaving this way, then fix it correctly.
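For reference, one common way to get this exact symptom is handing argparse the full argument vector, script path included. Whether that is what happened here is not established by the issue; the parser below is a stand-in built for illustration, not nbmerge's own.

```python
import argparse


def build_parser():
    # Hypothetical parser with a single positional that gathers file names.
    parser = argparse.ArgumentParser(prog="nbmerge-sketch")
    parser.add_argument("files", nargs="*")
    return parser


# Stand-in for sys.argv as the interpreter would populate it.
argv = ["/usr/local/bin/nbmerge", "f1.ipynb", "f2.ipynb"]

# Passing the whole list makes the script path the first positional value.
with_script = build_parser().parse_args(argv).files

# Dropping argv[0], which parse_args() does by default, leaves only notebooks.
without_script = build_parser().parse_args(argv[1:]).files

print(with_script)
print(without_script)
```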

Add notebook demarcation to cell metadata

Generally, you use multiple notebooks because there are semantic differences between them. The merger exists mostly for serialization and, for me, ease of LaTeX document preparation (bibliographies!). Dropping the demarcation completely is an information loss. Preprocessors and templates should be able to use the original source information (e.g. for header rewriting, implicit titles, etc).

nbformat.reader.NotJSONError

Tried to merge a couple jupyterlab notebooks, but got an error:

nbmerge a.ipynb b.ipynb > a.ipynb
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/nbformat/reader.py", line 14, in parse_json
    nb_dict = json.loads(s, **kwargs)
  File "/usr/local/Cellar/[email protected]/3.7.10_3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/usr/local/Cellar/[email protected]/3.7.10_3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/local/Cellar/[email protected]/3.7.10_3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/nbmerge", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/nbmerge/__init__.py", line 194, in main
    plan['boundary_key'])
  File "/usr/local/lib/python3.7/site-packages/nbmerge/__init__.py", line 72, in merge_notebooks
    nb = read_notebook(fp, as_version=4)
  File "/usr/local/lib/python3.7/site-packages/nbformat/__init__.py", line 143, in read
    return reads(buf, as_version, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/nbformat/__init__.py", line 73, in reads
    nb = reader.reads(s, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/nbformat/reader.py", line 58, in reads
    nb_dict = parse_json(s, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/nbformat/reader.py", line 17, in parse_json
    raise NotJSONError(("Notebook does not appear to be JSON: %r" % s)[:77] + "...") from e
nbformat.reader.NotJSONError: Notebook does not appear to be JSON: ''...

I can't vouch for these notebooks being valid JSON, but they are valid jupyter notebooks.
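A possible explanation, offered as a guess from the command line shown rather than a confirmed diagnosis: the output is redirected to a.ipynb, which is also the first input. The shell truncates the redirect target before the command starts, so nbmerge would read an empty file, which matches the empty string in the NotJSONError message. The sketch below reproduces the same effect in Python with a temporary file.

```python
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.ipynb")
    with open(path, "w") as fp:
        json.dump({"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 5}, fp)

    # Opening for writing truncates the file, much like `> a.ipynb` does
    # before the command on the left of the redirect begins to run.
    out = open(path, "w")
    with open(path) as fp:
        content = fp.read()
    out.close()

try:
    json.loads(content)
    error = None
except json.JSONDecodeError as exc:
    error = str(exc)

print(repr(content), error)
```

If this is the cause, writing to a file that is not among the inputs avoids the problem.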

Add recursive descent merging with private notebook filtering

That is, something like,

nbmerge --recursive --exclude-private > merged.ipynb

BookBook also uses <number>-<name>.ipynb semantics for sorting. Specifically, it's a glob over -.ipynb (latex.py:143), lexicographically sorted. My convention also uses lexicographic sorting. However, instead of that glob, it aggregates all files ending with .ipynb that do not begin with _. This lets me name files for inclusion as ###_Title_of_Notebook.ipynb. I think this is convenient because:

  • My eyes don't see the underscores;
  • You can produce formal titles by name.replace('_', ' '), stripping (optional) number prefix;
  • There are no spaces to worry about when performing shell actions.
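The naming convention above could be sketched as follows. Both helper names are hypothetical, and treating the number prefix as digits followed by a single underscore is an assumption based on the ###_Title_of_Notebook.ipynb example.

```python
import os
import re


def is_included(name):
    """Keep .ipynb files whose names do not begin with an underscore."""
    return name.endswith(".ipynb") and not name.startswith("_")


def title_from_filename(path):
    """Turn '010_Title_of_Notebook.ipynb' into 'Title of Notebook'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Strip the optional number prefix (assumed: digits, then one underscore).
    stem = re.sub(r"^\d+_", "", stem)
    return stem.replace("_", " ")


names = ["020_Methods.ipynb", "_scratch.ipynb", "010_Title_of_Notebook.ipynb"]
ordered = sorted(n for n in names if is_included(n))
titles = [title_from_filename(n) for n in ordered]
print(titles)
```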
