
jupyter-fairly's Introduction

License: MIT

Jupyter Fairly

A JupyterLab extension for the fairly package, enabling seamless integration of Jupyter-based research environments and research data repositories.

This extension is composed of a Python package named jupyter_fairly for the server extension and an NPM package named jupyter-fairly for the frontend extension.

Requirements

  • JupyterLab >= 3.0 < 4
  • fairly == 0.4.1

This is the last version that supports JupyterLab 3.x.

Install

To install the extension, execute:

pip install jupyter_fairly

Configurations are stored in .fairly/config.json in the user's home directory. This is where the extension stores access tokens for data repositories.
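For illustration, the token storage might look like the sketch below. Note that the actual schema of config.json is defined by the fairly package; the repository names and field layout here are assumptions, and a temporary directory stands in for the user's home directory so the example is safe to run.

```python
import json
import tempfile
from pathlib import Path

# Illustrative layout only: the real schema is defined by the fairly
# package; the repository names and fields here are assumptions.
config = {
    "4tu": {"token": "YOUR-4TU-TOKEN"},
    "zenodo": {"token": "YOUR-ZENODO-TOKEN"},
}

config_dir = Path(tempfile.mkdtemp()) / ".fairly"  # stands in for ~/.fairly
config_dir.mkdir()
config_path = config_dir / "config.json"
config_path.write_text(json.dumps(config, indent=2))

# Reading a token back, roughly as the extension might do
loaded = json.loads(config_path.read_text())
print(loaded["zenodo"]["token"])
```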

To add an access token, use the Fairly menu in the JupyterLab main menu bar.

Fairly Menu

Uninstall

To remove the extension, execute:

pip uninstall jupyter_fairly

Troubleshoot

If you are seeing the frontend extension, but it is not working, check that the server extension is enabled:

jupyter server extension list

If the server extension is installed and enabled, but you are not seeing the frontend extension, check that the frontend extension is installed:

jupyter labextension list

Contributing

Development install

Note: You will need NodeJS to build the extension package.

The jlpm command is JupyterLab's pinned version of yarn that is installed with JupyterLab. You may use yarn or npm in lieu of jlpm below.

# Clone the repo to your local environment
# Change directory to the jupyter_fairly directory
# Install package in development mode
pip install -e ".[test]"
# Link your development version of the extension with JupyterLab
jupyter labextension develop . --overwrite
# Server extension must be manually installed in develop mode
jupyter server extension enable jupyter_fairly
# Rebuild extension Typescript source after making changes
jlpm build

You can watch the source directory and run JupyterLab at the same time in different terminals to watch for changes in the extension's source and automatically rebuild the extension.

# Watch the source directory in one terminal, automatically rebuilding when needed
jlpm watch
# Run JupyterLab in another terminal
jupyter lab

With the watch command running, every saved change will immediately be built locally and available in your running JupyterLab. Refresh JupyterLab to load the change in your browser (you may need to wait several seconds for the extension to be rebuilt).

By default, the jlpm build command generates the source maps for this extension to make it easier to debug using the browser dev tools. To also generate source maps for the JupyterLab core extensions, you can run the following command:

jupyter lab build --minimize=False

Development uninstall

# Server extension must be manually disabled in develop mode
jupyter server extension disable jupyter_fairly
pip uninstall jupyter_fairly

In development mode, you will also need to remove the symlink created by the jupyter labextension develop command. To find its location, you can run jupyter labextension list to figure out where the labextensions folder is located. Then you can remove the symlink named jupyter-fairly within that folder.

Testing the extension

Server tests

This extension is using Pytest for Python code testing.

Install test dependencies (needed only once):

pip install -e ".[test]"
# Each time you install the Python package, you need to restore the front-end extension link
jupyter labextension develop . --overwrite

To execute them, run:

pytest -vv -r ap --cov jupyter_fairly

Frontend tests

This extension is using Jest for JavaScript code testing.

To execute them, execute:

jlpm
jlpm test

Integration tests

This extension uses Playwright for the integration tests (aka user level tests). More precisely, the JupyterLab helper Galata is used to handle testing the extension in JupyterLab.

More information is provided in the ui-tests README.

Packaging the extension

See RELEASE

Citation

Please cite this software as follows:

Garcia Alvarez, M., Girgin, S., & Urra Llanusa, J., Jupyter-fairly: a JupyterLab extension for the fairly package [Computer software]

Acknowledgements

This research is funded by the Dutch Research Council (NWO) Open Science Fund, File No. 203.001.114.

Project members:

jupyter-fairly's People

Contributors

dependabot[bot], girgink, j535d165, jurra, manugil


jupyter-fairly's Issues

Edit metadata locally

Story

I want to edit metadata locally with my favourite text editor/IDE so that I can create and update it easily.

Given that

  • The user can edit metadata in some way in the working environment (at minimum, a text editor with YAML support)
  • When a new request is created from the working environment (command line or GUI), the metadata is passed as input to create the new record.

Some requirements and considerations

  • The metadata needs to be validated against the standard before making the request
  • We are sticking to a basic standard, something like Dublin core (Business rule)

Implementation ideas

  • Metadata is created in YAML by the user
  • It is transformed to XML
  • The XML is validated against a schema, e.g. with the XMLSchema package
  • The validated metadata is passed as the request body to the API.

Any ideas or issues? This raises other questions, such as downloading and getting the latest version with the YAML format, for instance...
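The YAML-to-XML step above can be sketched with the standard library alone. In practice the metadata would be parsed from a YAML file (e.g. with PyYAML) and validated with the XMLSchema package; here a plain dict stands in for the parsed YAML, and the Dublin Core element names are only a working assumption.

```python
import xml.etree.ElementTree as ET

# Stand-in for metadata parsed from a YAML file (e.g. via PyYAML).
# Element names follow Dublin Core; the exact schema is an open question.
metadata = {"title": "My dataset", "creator": "J. Doe", "date": "2023-01-01"}

DC_NS = "http://purl.org/dc/elements/1.1/"
root = ET.Element("metadata")
for field, value in metadata.items():
    # Qualify each element with the Dublin Core namespace
    ET.SubElement(root, f"{{{DC_NS}}}{field}").text = value

xml_str = ET.tostring(root, encoding="unicode")
print(xml_str)
```

The resulting XML string would then be validated against a schema before being sent as the request body.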

Unexpected error when installing extension

Is this something related to fairly?
This error first appeared when trying to update the fairly package in the extension; that is, a re-installation was triggered.
I started the installation process in a new venv, and now the error appears when attempting to install the extension in develop mode with jupyter labextension develop --overwrite .

ModuleNotFoundError: There is no labextension at .. Errors encountered: 
[TypeError("the 'package' argument is required to perform a relative import for '.'"), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL')]
See the log file for details:  /tmp/jupyterlab-debug-k2ryluqd.log

Create a data record from working environment

Story

I want to create a metadata record directly from my working environment so that I can perform the task without leaving my working environment.

Given that

  • A user is logged in to a server, JupyterLab server, or console
  • When the user sends a request using a command (either via console or some GUI)
  • Then the application checks the validity of the record
  • Then the request is sent
  • Then the server/data provider returns a link to where the record and the item live, so that the user can also check it
  • Then a confirmation that the request was successful is returned

Some requirements and considerations

  • See #11. The user needs to create a record via either a text editor or some kind of form extension in jupyter lab.
  • It would be nice to use some generic protocol, instead of specific API

Implementation ideas

  1. Basic option: Do this via the 4TU API
  2. Write a wrapper class that provides a single interface with different implementations per repository. This idea can be explored using the OpenAPI (Swagger) code generator for repositories that are compatible with the OpenAPI specification.
  3. Find a common protocol in the spirit of SWORD and OAI-PMH
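Implementation idea 2 could look roughly like the sketch below: an abstract interface with one concrete client per repository. All class and method names here are hypothetical; the real client would POST to the repository API instead of returning a dummy URL.

```python
from abc import ABC, abstractmethod


class RepositoryClient(ABC):
    """Hypothetical common interface for all repository backends."""

    @abstractmethod
    def create_record(self, metadata: dict) -> str:
        """Create a record from metadata and return its URL."""


class FourTUClient(RepositoryClient):
    """Hypothetical 4TU backend; a real one would call the 4TU API."""

    def create_record(self, metadata: dict) -> str:
        # Placeholder: a real implementation would POST the metadata
        # and return the URL reported by the repository.
        return f"https://example.org/records/{metadata['title']}"


client: RepositoryClient = FourTUClient()
print(client.create_record({"title": "demo"}))
```

The caller only depends on RepositoryClient, so a Zenodo or Dataverse backend could be swapped in without changing the calling code.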

Incompatibility with jupyter_server_terminals

Enabling the server extension fails when jupyter_server_terminals is installed as an extension in the conda environment. This is due to changes in newer versions of Jupyter Server (~1.23).

Temporary solution

  • Uninstall jupyter_server_terminals using pip

Upload data from working environment

I want to upload data files directly from my working environment so that I can perform the task without using a web-based GUI provided by the data repository.

Given that

  • A user has a working environment in JupyterLab with datasets ready to be archived in a data repository.
  • A user has created an empty data entry in a repository and has the root URL for uploading the data
  • The repository uses the Figshare API to upload datasets.

Some requirements and considerations

  • The root URL of a data entry in a repository shall be stored in a configuration file in the working environment
  • Large datasets shall be uploaded in parts whenever supported by the data repository.
  • The extension shall log the status of the data upload.
  • If the upload fails halfway the client shall be able to use the log to restart the upload process.
  • If the data repository doesn't support uploading large files in parts, on failure, the upload process shall start from the beginning.

Implementation ideas

  • As a first implementation step, the user can manually create a data entry in the 4TU data repository and copy the root URL to a configuration file in the working environment. The configuration file can be implemented as a .env or .yaml file, using the Python packages python-dotenv or Hydra.
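The part-wise upload with a resumable log, as required above, can be sketched as follows. The chunk size, log format, and `send` callback are all assumptions; a real implementation would call the repository's upload API and use much larger chunks.

```python
import json
import tempfile
from pathlib import Path

CHUNK_SIZE = 4  # tiny for demonstration; real uploads would use megabytes


def upload_in_parts(path: Path, log_path: Path, send) -> None:
    """Upload `path` in chunks, recording progress in `log_path` so a
    failed upload can resume from the last completed chunk.
    `send(offset, chunk)` stands in for the repository upload API."""
    offset = 0
    if log_path.exists():  # resume from a previous, interrupted run
        offset = json.loads(log_path.read_text())["offset"]
    with path.open("rb") as f:
        f.seek(offset)
        while chunk := f.read(CHUNK_SIZE):
            send(offset, chunk)
            offset += len(chunk)
            # Persist progress only after the chunk is accepted
            log_path.write_text(json.dumps({"offset": offset}))


# Simulated upload: collect chunks into a dict keyed by offset
tmp = Path(tempfile.mkdtemp())
data_file = tmp / "data.bin"
data_file.write_bytes(b"hello world!")
received = {}
upload_in_parts(data_file, tmp / "upload.log",
                lambda off, c: received.update({off: c}))
print(b"".join(received[k] for k in sorted(received)))
```

If the process dies between chunks, rerunning `upload_in_parts` with the same log file picks up at the recorded offset instead of starting over.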

License for repository

The NWO requires that the outputs of the project be published under a CC-BY license.
We will eventually have to change this.

Warning message for Mac installation

When installing using pip install jupyter-fairly in a new conda environment on Mac, the following warning message occurs:

DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at Homebrew/homebrew-core#76621

Cleanup JupyterFAIR repo

We should do some cleanup on the repo to avoid clutter. We need to look for unused branches that need to be deleted, issues that were solved but not closed, and issues that are no longer relevant but remain open.

Upload to 4TU and Zenodo fails

  • Attempting to upload to 4TU fails with the following:
[W 2023-03-19 23:50:58.222 ServerApp] wrote error: 'Unhandled error'
    Traceback (most recent call last):
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 1711, in _execute
        result = method(*self.path_args, **self.path_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 3208, in wrapper
        return method(self, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/Documents/devel/jupyter-fairly/jupyter_fairly/jupyter_fairly/handlers.py", line 253, in post
        local_dataset.upload(client)
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/dataset/local.py", line 354, in upload
        dataset = client.create_dataset(self.metadata)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 292, in create_dataset
        id = self._create_dataset(metadata)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/figshare.py", line 753, in _create_dataset
        result, _ = self._request("account/articles", "POST", data={"title": metadata.get("title", "")})
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 355, in _request
        response.raise_for_status()
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://api.figshare.com/v2/account/articles
  • Attempting to upload to Zenodo fails with the following:
[W 2023-03-19 23:54:16.875 ServerApp] wrote error: 'Unhandled error'
    Traceback (most recent call last):
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 1711, in _execute
        result = method(*self.path_args, **self.path_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 3208, in wrapper
        return method(self, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/Documents/devel/jupyter-fairly/jupyter_fairly/jupyter_fairly/handlers.py", line 253, in post
        local_dataset.upload(client)
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/dataset/local.py", line 354, in upload
        dataset = client.create_dataset(self.metadata)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 292, in create_dataset
        id = self._create_dataset(metadata)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/zenodo.py", line 291, in _create_dataset
        result, _ = self._request("deposit/depositions", "POST", data={})
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 355, in _request
        response.raise_for_status()
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 403 Client Error: FORBIDDEN for url: https://zenodo.org/api/deposit/depositions

Experiment with SWORD protocol for pushing and pulling data

Background

Researchers want to deposit data in different archives for several reasons; the point is that there is no single repository that all research relies on. The SWORD protocol has been implemented to achieve interoperability for depositing.

Falsifiable hypothesis

We believe that implementing a SWORD client in our package can make the package more reusable across data providers that have implemented the SWORD protocol.

Experiment method

  • Count how many repositories have implemented the SWORD protocol, and which version
  • For 4TU, this would imply setting up a SWORD server that talks to the 4TU repository.
  • Implement/reuse a SWORD client for repositories that implement SWORD (perhaps Dataverse)

Variables and methods

Quantitative: Repositories that implement a SWORD server.

Qualitative: X number of users didn't upload until metadata was complete

Results

Quantitative results:

Qualitative results:

Validated learning

Validated or invalidated
Summarize the learning

Next steps

Cloning is broken in new version of Requests

Calling the clone dataset function produces the following error on the server.

HTTPServerRequest(protocol='http', host='localhost:8888', method='POST', uri='/jupyter-fairly/clone?1692199914244', version='HTTP/1.1', remote_ip='127.0.0.1')
    Traceback (most recent call last):
      File "/home/manuel/.local/lib/python3.10/site-packages/requests_toolbelt/_compat.py", line 48, in <module>
        from requests.packages.urllib3.contrib import appengine as gaecontrib
    ImportError: cannot import name 'appengine' from 'requests.packages.urllib3.contrib' (/home/manuel/miniconda3/envs/jupyterfair3/lib/python3.10/site-packages/urllib3/contrib/__init__.py)

It might be related to the newest version of the requests package: https://stackoverflow.com/questions/76175487/sudden-importerror-cannot-import-name-appengine-from-requests-packages-urlli. A likely fix is to pin urllib3 below 2.0 or upgrade requests-toolbelt, since older requests-toolbelt versions import requests.packages.urllib3.contrib.appengine, which was removed in urllib3 2.0.

Experiment: Reusing package generated from swagger code generator

Background

Some repositories implement the OpenAPI specification, which can be used with Swagger code generation to produce SDKs.

Falsifiable hypothesis

Using Swagger to generate the different implementations for the different clients could facilitate implementing some jupyFAIR functionality. This should be easier than writing the implementation ourselves for each repository.

Experiment method

Generate or reuse two Swagger implementations in one package or codebase, to perform the following commands: upload and download.

If they are, for instance, generated as Python packages, we could import them in a wrapper package that acts more like a CLI.

Variables and methods

Quantitative: % of something

Qualitative: X amount of users didnt upload until metadata was complete

Results

Quantitative results:

Qualitative results:

Validated learning

Validated or invalidated
Summarize the learning

Next steps

Specify which data files and folders to upload as a dataset

Story

I want to specify which data files and folders are included in the dataset so that I can manage the dataset easily.

Given that

  • A user has a working environment (local or remote) with files and folders that she doesn't want to archive to the data repository.

Some requirements and considerations

  • The user shall specify in advance which files or folders to include in the data archiving process.
  • The user shall provide a list of files and folders to include or exclude from the data archiving process.
  • A user might need to specify a different list for different data repositories.

Implementation ideas

  • The user can provide a list of files and folders to include as a YAML/plain text file (a manifest) which the application can use to upload datasets to a data repository.
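A minimal sketch of such a manifest and its filtering logic is shown below. The manifest format (one glob pattern per line, "!" marking exclusions) is an assumption for illustration, not a format the extension defines.

```python
from fnmatch import fnmatch

# Hypothetical plain-text manifest: one glob per line, "!" excludes.
manifest = """\
data/*.csv
notebooks/*.ipynb
!data/raw.csv
"""

includes, excludes = [], []
for line in manifest.splitlines():
    (excludes if line.startswith("!") else includes).append(line.lstrip("!"))


def selected(path: str) -> bool:
    """True if `path` should be part of the uploaded dataset."""
    return any(fnmatch(path, p) for p in includes) and not any(
        fnmatch(path, p) for p in excludes
    )


files = ["data/clean.csv", "data/raw.csv", "notebooks/analysis.ipynb", "README.md"]
print([f for f in files if selected(f)])
```

Per-repository lists could be supported by keeping one manifest per repository, e.g. keyed by repository name in a YAML file.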
