
jupyter-fairly's Introduction

License: MIT

Jupyter Fairly

A JupyterLab extension for the fairly package, enabling seamless integration of Jupyter-based research environments and research data repositories.

This extension is composed of a Python package named jupyter_fairly for the server extension and an NPM package named jupyter-fairly for the frontend extension.

Requirements

  • JupyterLab >= 3.0 < 4
  • fairly == 0.4.1

This is the last version that supports JupyterLab 3.x.

Install

To install the extension, execute:

pip install jupyter_fairly

Configurations are stored in .fairly/config.json in the user's home directory. This is where the extension stores access tokens for data repositories.
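For illustration, the token storage might look like the sketch below. Note that the actual schema of config.json is defined by the fairly package; the repository names and field layout here are assumptions, and a temporary directory stands in for the user's home directory so the example is safe to run.

```python
import json
import tempfile
from pathlib import Path

# Illustrative layout only: the real schema is defined by the fairly
# package; the repository names and fields here are assumptions.
config = {
    "4tu": {"token": "YOUR-4TU-TOKEN"},
    "zenodo": {"token": "YOUR-ZENODO-TOKEN"},
}

config_dir = Path(tempfile.mkdtemp()) / ".fairly"  # stands in for ~/.fairly
config_dir.mkdir()
config_path = config_dir / "config.json"
config_path.write_text(json.dumps(config, indent=2))

# Reading a token back, roughly as the extension might do
loaded = json.loads(config_path.read_text())
print(loaded["zenodo"]["token"])
```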

To add an access token, use the Fairly menu in the JupyterLab main menu bar.

Fairly Menu

Uninstall

To remove the extension, execute:

pip uninstall jupyter_fairly

Troubleshoot

If you are seeing the frontend extension, but it is not working, check that the server extension is enabled:

jupyter server extension list

If the server extension is installed and enabled, but you are not seeing the frontend extension, check that the frontend extension is installed:

jupyter labextension list

Contributing

Development install

Note: You will need NodeJS to build the extension package.

The jlpm command is JupyterLab's pinned version of yarn that is installed with JupyterLab. You may use yarn or npm in lieu of jlpm below.

# Clone the repo to your local environment
# Change directory to the jupyter_fairly directory
# Install package in development mode
pip install -e ".[test]"
# Link your development version of the extension with JupyterLab
jupyter labextension develop . --overwrite
# Server extension must be manually installed in develop mode
jupyter server extension enable jupyter_fairly
# Rebuild extension Typescript source after making changes
jlpm build

You can watch the source directory and run JupyterLab at the same time in different terminals to watch for changes in the extension's source and automatically rebuild the extension.

# Watch the source directory in one terminal, automatically rebuilding when needed
jlpm watch
# Run JupyterLab in another terminal
jupyter lab

With the watch command running, every saved change will immediately be built locally and available in your running JupyterLab. Refresh JupyterLab to load the change in your browser (you may need to wait several seconds for the extension to be rebuilt).

By default, the jlpm build command generates the source maps for this extension to make it easier to debug using the browser dev tools. To also generate source maps for the JupyterLab core extensions, you can run the following command:

jupyter lab build --minimize=False

Development uninstall

# Server extension must be manually disabled in develop mode
jupyter server extension disable jupyter_fairly
pip uninstall jupyter_fairly

In development mode, you will also need to remove the symlink created by the jupyter labextension develop command. To find its location, you can run jupyter labextension list to figure out where the labextensions folder is located. Then you can remove the symlink named jupyter-fairly within that folder.

Testing the extension

Server tests

This extension is using Pytest for Python code testing.

Install test dependencies (needed only once):

pip install -e ".[test]"
# Each time you install the Python package, you need to restore the front-end extension link
jupyter labextension develop . --overwrite

To execute them, run:

pytest -vv -r ap --cov jupyter_fairly

Frontend tests

This extension is using Jest for JavaScript code testing.

To execute them, execute:

jlpm
jlpm test

Integration tests

This extension uses Playwright for the integration tests (aka user level tests). More precisely, the JupyterLab helper Galata is used to handle testing the extension in JupyterLab.

More information is provided in the ui-tests README.

Packaging the extension

See RELEASE

Citation

Please cite this software as follows:

Garcia Alvarez, M., Girgin, S., & Urra Llanusa, J., Jupyter-fairly: a JupyterLab extension for the fairly package [Computer software]

Acknowledgements

This research is funded by the Dutch Research Council (NWO) Open Science Fund, File No. 203.001.114.

Project members:

jupyter-fairly's People

Contributors

dependabot[bot], girgink, j535d165, jurra, manugil


jupyter-fairly's Issues

Edit metadata locally

Story

I want to edit metadata locally with my favourite text editor/IDE so that I can create and update it easily.

Given that

  • The user can edit metadata in some way in the working environment (at minimum, a text editor with YAML support)
  • When a new request is created from the working environment (command line or GUI), the metadata is passed as input to create the new record.

Some requirements and considerations

  • The metadata needs to be validated against the standard before making the request
  • We are sticking to a basic standard, something like Dublin core (Business rule)

Implementation ideas

  • Metadata is created in YAML by the user
  • It is transformed to XML
  • The XML is validated against a schema, e.g. with the XMLSchema package
  • The validated metadata is passed as the request body to the API.

Any ideas or issues? This raises other questions, such as downloading and getting the latest version with the YAML format, for instance...
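The YAML-to-XML step above can be sketched with the standard library alone. In practice the metadata would be parsed from a YAML file (e.g. with PyYAML) and validated with the XMLSchema package; here a plain dict stands in for the parsed YAML, and the Dublin Core element names are only a working assumption.

```python
import xml.etree.ElementTree as ET

# Stand-in for metadata parsed from a YAML file (e.g. via PyYAML).
# Element names follow Dublin Core; the exact schema is an open question.
metadata = {"title": "My dataset", "creator": "J. Doe", "date": "2023-01-01"}

DC_NS = "http://purl.org/dc/elements/1.1/"
root = ET.Element("metadata")
for field, value in metadata.items():
    # Qualify each element with the Dublin Core namespace
    ET.SubElement(root, f"{{{DC_NS}}}{field}").text = value

xml_str = ET.tostring(root, encoding="unicode")
print(xml_str)
```

The resulting XML string would then be validated against a schema before being sent as the request body.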

Unexpected error when installing extension

Is this something related to fairly?
This error first appeared when trying to update the fairly package in the extension; that is, a re-installation was triggered.
I started the installation process in a new venv, and now the error appears when attempting to install the extension in develop mode with jupyter labextension develop --overwrite .

ModuleNotFoundError: There is no labextension at .. Errors encountered: 
[TypeError("the 'package' argument is required to perform a relative import for '.'"), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), 
AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL'), AttributeError('Invalid client_id orcid:APP-IEL')]
See the log file for details:  /tmp/jupyterlab-debug-k2ryluqd.log

Create a data record from working environment

Story

I want to create a metadata record directly from my working environment so that I can perform the task without leaving my working environment.

Given that

  • A user is logged in to a server, JupyterLab server, or console
  • When the user sends a request using a command (either via console or some GUI)
  • Then the application checks the validity of the record
  • Then the request is sent
  • Then the server/data provider returns a link to where the record and the item live, so that the user can also check it
  • Then a confirmation that the request was successful is returned

Some requirements and considerations

  • See #11. The user needs to create a record via either a text editor or some kind of form extension in jupyter lab.
  • It would be nice to use some generic protocol, instead of specific API

Implementation ideas

  1. Basic option: Do this via the 4TU API
  2. Write a wrapper class that provides a single interface with different implementations per repository. This idea can be explored using the OpenAPI (Swagger) code generator for repositories that are compatible with the OpenAPI specification.
  3. Find a common protocol in the spirit of SWORD and OAI-PMH
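Implementation idea 2 could look roughly like the sketch below: an abstract interface with one concrete client per repository. All class and method names here are hypothetical; the real client would POST to the repository API instead of returning a dummy URL.

```python
from abc import ABC, abstractmethod


class RepositoryClient(ABC):
    """Hypothetical common interface for all repository backends."""

    @abstractmethod
    def create_record(self, metadata: dict) -> str:
        """Create a record from metadata and return its URL."""


class FourTUClient(RepositoryClient):
    """Hypothetical 4TU backend; a real one would call the 4TU API."""

    def create_record(self, metadata: dict) -> str:
        # Placeholder: a real implementation would POST the metadata
        # and return the URL reported by the repository.
        return f"https://example.org/records/{metadata['title']}"


client: RepositoryClient = FourTUClient()
print(client.create_record({"title": "demo"}))
```

The caller only depends on RepositoryClient, so a Zenodo or Dataverse backend could be swapped in without changing the calling code.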

Incompatibility with jupyter_server_terminals

Enabling the server extension fails when jupyter_server_terminals is installed as an extension in the conda environment. This is due to changes in newer versions of Jupyter Server (~1.23).

Temporary solution

  • Uninstall jupyter_server_terminals using pip

Upload data from working environment

I want to upload data files directly from my working environment so that I can perform the task without using a web-based GUI provided by the data repository.

Given that

  • A user has a working environment in JupyterLab with datasets ready to be archived in a data repository.
  • A user has created an empty data entry in a repository and has the root URL for uploading the data
  • The repository uses the Figshare API to upload datasets.

Some requirements and considerations

  • The root URL of a data entry in a repository shall be stored in a configuration file in the working environment
  • Large datasets shall be uploaded in parts whenever supported by the data repository.
  • The extension shall log the status of the data upload.
  • If the upload fails halfway the client shall be able to use the log to restart the upload process.
  • If the data repository doesn't support uploading large files in parts, on failure, the upload process shall start from the beginning.

Implementation ideas

  • As a first implementation step, the user can manually create a data entry in the 4TU data repository and copy the root URL to a configuration file in the working environment. The configuration file can be implemented as a .env or .yaml file, using the Python packages python-dotenv or Hydra.
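The part-wise upload with a resumable log, as required above, can be sketched as follows. The chunk size, log format, and `send` callback are all assumptions; a real implementation would call the repository's upload API and use much larger chunks.

```python
import json
import tempfile
from pathlib import Path

CHUNK_SIZE = 4  # tiny for demonstration; real uploads would use megabytes


def upload_in_parts(path: Path, log_path: Path, send) -> None:
    """Upload `path` in chunks, recording progress in `log_path` so a
    failed upload can resume from the last completed chunk.
    `send(offset, chunk)` stands in for the repository upload API."""
    offset = 0
    if log_path.exists():  # resume from a previous, interrupted run
        offset = json.loads(log_path.read_text())["offset"]
    with path.open("rb") as f:
        f.seek(offset)
        while chunk := f.read(CHUNK_SIZE):
            send(offset, chunk)
            offset += len(chunk)
            # Persist progress only after the chunk is accepted
            log_path.write_text(json.dumps({"offset": offset}))


# Simulated upload: collect chunks into a dict keyed by offset
tmp = Path(tempfile.mkdtemp())
data_file = tmp / "data.bin"
data_file.write_bytes(b"hello world!")
received = {}
upload_in_parts(data_file, tmp / "upload.log",
                lambda off, c: received.update({off: c}))
print(b"".join(received[k] for k in sorted(received)))
```

If the process dies between chunks, rerunning `upload_in_parts` with the same log file picks up at the recorded offset instead of starting over.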

License for repository

The NWO requires that the outputs of the project be published under a CC-BY license.
We will eventually have to change this.

Warning message for Mac installation

When installing using pip install jupyter-fairly in a new conda environment on Mac, the following warning message occurs:

DEPRECATION: Configuring installation scheme with distutils config files is deprecated and will no longer work in the near future. If you are using a Homebrew or Linuxbrew Python, please see discussion at Homebrew/homebrew-core#76621

Cleanup JupyterFAIR repo

We should do some cleanup on the repo to avoid clutter. We need to look for unused branches that need to be deleted, issues that were solved but not closed, and issues that are no longer relevant but remain open.

Upload to 4TU and Zenodo fails

  • Attempting to upload to 4TU fails with the following:
[W 2023-03-19 23:50:58.222 ServerApp] wrote error: 'Unhandled error'
    Traceback (most recent call last):
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 1711, in _execute
        result = method(*self.path_args, **self.path_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 3208, in wrapper
        return method(self, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/Documents/devel/jupyter-fairly/jupyter_fairly/jupyter_fairly/handlers.py", line 253, in post
        local_dataset.upload(client)
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/dataset/local.py", line 354, in upload
        dataset = client.create_dataset(self.metadata)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 292, in create_dataset
        id = self._create_dataset(metadata)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/figshare.py", line 753, in _create_dataset
        result, _ = self._request("account/articles", "POST", data={"title": metadata.get("title", "")})
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 355, in _request
        response.raise_for_status()
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://api.figshare.com/v2/account/articles
  • Attempting to upload to Zenodo fails with the following:
[W 2023-03-19 23:54:16.875 ServerApp] wrote error: 'Unhandled error'
    Traceback (most recent call last):
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 1711, in _execute
        result = method(*self.path_args, **self.path_kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/tornado/web.py", line 3208, in wrapper
        return method(self, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/Documents/devel/jupyter-fairly/jupyter_fairly/jupyter_fairly/handlers.py", line 253, in post
        local_dataset.upload(client)
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/dataset/local.py", line 354, in upload
        dataset = client.create_dataset(self.metadata)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 292, in create_dataset
        id = self._create_dataset(metadata)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/zenodo.py", line 291, in _create_dataset
        result, _ = self._request("deposit/depositions", "POST", data={})
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/fairly/client/__init__.py", line 355, in _request
        response.raise_for_status()
      File "/home/manuel/miniconda3/envs/jupyterfairly/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
        raise HTTPError(http_error_msg, response=self)
    requests.exceptions.HTTPError: 403 Client Error: FORBIDDEN for url: https://zenodo.org/api/deposit/depositions

Experiment with SWORD protocol for pushing and pulling data

Background

Researchers want to deposit data in different archives for several reasons; the point is that there is no single repository that all research relies on. The SWORD protocol has been implemented to achieve interoperability for depositing.

Falsifiable hypothesis

We believe that implementing a SWORD client in our package can make the package more reusable across data providers that have implemented the SWORD protocol.

Experiment method

  • Count how many repositories have implemented the SWORD protocol, and which version
  • For 4TU, this would imply setting up a SWORD server that talks to the 4TU repository.
  • Implement/reuse a SWORD client for repositories that implement SWORD (perhaps Dataverse)

Variables and methods

Quantitative: Repositories that implement a SWORD server.

Qualitative: X number of users didn't upload until metadata was complete

Results

Quantitative results:

Qualitative results:

Validated learning

Validated or invalidated
Summarize the learning

Next steps

Cloning is broken in new version of Requests

Calling the clone dataset function produces the following error on the server.

HTTPServerRequest(protocol='http', host='localhost:8888', method='POST', uri='/jupyter-fairly/clone?1692199914244', version='HTTP/1.1', remote_ip='127.0.0.1')
    Traceback (most recent call last):
      File "/home/manuel/.local/lib/python3.10/site-packages/requests_toolbelt/_compat.py", line 48, in <module>
        from requests.packages.urllib3.contrib import appengine as gaecontrib
    ImportError: cannot import name 'appengine' from 'requests.packages.urllib3.contrib' (/home/manuel/miniconda3/envs/jupyterfair3/lib/python3.10/site-packages/urllib3/contrib/__init__.py)

It might be related to the newest version of the requests package: https://stackoverflow.com/questions/76175487/sudden-importerror-cannot-import-name-appengine-from-requests-packages-urlli. A likely fix is to pin urllib3 below 2.0 or upgrade requests-toolbelt, since older requests-toolbelt versions import requests.packages.urllib3.contrib.appengine, which was removed in urllib3 2.0.

Experiment: Reusing package generated from swagger code generator

Background

Some repositories implement the OpenAPI specification, which can be used with Swagger code generation to produce SDKs.

Falsifiable hypothesis

Using Swagger to generate the different implementations for the different clients could facilitate implementing some jupyFAIR functionality. This should be easier than writing the implementation ourselves for each repository.

Experiment method

Generate or reuse two Swagger implementations in one package or codebase, to perform the following commands: upload and download.

If they are, for instance, generated as Python packages, we could import them in a wrapper package that acts more like a CLI.

Variables and methods

Quantitative: % of something

Qualitative: X amount of users didnt upload until metadata was complete

Results

Quantitative results:

Qualitative results:

Validated learning

Validated or invalidated
Summarize the learning

Next steps

Specify which data files and folders to upload as a dataset

Story

I want to specify which data files and folders are included in the dataset so that I can manage the dataset easily.

Given that

  • A user has a working environment (local or remote) with files and folders that she doesn't want to archive to the data repository.

Some requirements and considerations

  • The user shall specify in advance which files or folders to include in the data archiving process.
  • The user shall provide a list of files and folders to include or exclude from the data archiving process.
  • A user might need to specify a different list for different data repositories.

Implementation ideas

  • The user can provide a list of files and folders to include as a YAML/plain text file (a manifest) which the application can use to upload datasets to a data repository.
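A minimal sketch of such a manifest and its filtering logic is shown below. The manifest format (one glob pattern per line, "!" marking exclusions) is an assumption for illustration, not a format the extension defines.

```python
from fnmatch import fnmatch

# Hypothetical plain-text manifest: one glob per line, "!" excludes.
manifest = """\
data/*.csv
notebooks/*.ipynb
!data/raw.csv
"""

includes, excludes = [], []
for line in manifest.splitlines():
    (excludes if line.startswith("!") else includes).append(line.lstrip("!"))


def selected(path: str) -> bool:
    """True if `path` should be part of the uploaded dataset."""
    return any(fnmatch(path, p) for p in includes) and not any(
        fnmatch(path, p) for p in excludes
    )


files = ["data/clean.csv", "data/raw.csv", "notebooks/analysis.ipynb", "README.md"]
print([f for f in files if selected(f)])
```

Per-repository lists could be supported by keeping one manifest per repository, e.g. keyed by repository name in a YAML file.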
