reana's Issues

cli: easy deletion of pods

During REANA development, when one tests some feature branch of e.g. r-j-controller, one currently deletes the pods "manually" after building the new version, for example:

$ cd reana-job-controller
$ reana git-checkout -b . 72 --fetch
$ reana docker-build -c .
$ kubectl delete pod -l app=job-controller

We can simplify this process by providing a new command:

$ reana-dev delete-pod -c .

that will basically wrap the kubectl delete pod -l app=... statements to ease the developer's task.

Note that e.g. the r-w-e-yadage component creates pods named like yadage-default-worker-855d498f7f-nd7fh, so we need a configuration table translating component names to pod prefixes, for example:

COMPONENT_PODS = {
  'reana-workflow-engine-yadage': 'yadage-default-worker',
  'reana-job-controller': 'job-controller',
   }

so that kubectl delete pod -l app=... statements could be launched.

P.S. Note that reana-dev delete-pod would by default delete all the cluster pods, since no component is specified...
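A minimal sketch of how such a command could be implemented (the helper names are hypothetical; the pod prefixes come from the table above):

```python
import subprocess

# Hypothetical mapping from component names to pod label prefixes,
# mirroring the COMPONENT_PODS table sketched above.
COMPONENT_PODS = {
    'reana-workflow-engine-yadage': 'yadage-default-worker',
    'reana-job-controller': 'job-controller',
}

def kubectl_delete_command(component):
    """Build the kubectl command that deletes the pods of one component."""
    prefix = COMPONENT_PODS.get(component, component)
    return ['kubectl', 'delete', 'pod', '-l', 'app={0}'.format(prefix)]

def delete_pod(component):
    """Delete the pods belonging to the given REANA component."""
    subprocess.check_call(kubectl_delete_command(component))
```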

sprint: Internal-Consolidation kick-off

Internal-Consolidation sprint kick-off

Dates

  • 2018-04-18 kick-off
  • 2018-05-02 closure

People

Goals

Consolidate the REANA internals and the collaboration between components. Separate DB into a real service. Share models and schemas across the ecosystem. Improve logging and traceback propagation. Introduce simple serial workflow engine. Prepare basic CI.

Planning

internal-consolidation

Kanban

components: upgrade yadage/packtivity

After the milestone goals are reached and v0.1.0 released, we should look at upgrading yadage/packtivity everywhere:

./reana-client/setup.py:    'yadage-schemas==0.7.4',
./reana-workflow-engine-yadage/setup.py:    'yadage-schemas==0.7.4',
./reana-workflow-engine-yadage/setup.py:    'yadage==0.12.13',
./reana-workflow-monitor/setup.py:    'yadage==0.10.8',
./reana-workflow-monitor/setup.py:    'yadage-schemas==0.6.0',

./reana-workflow-engine-yadage/setup.py:    'packtivity==0.8.8',
./reana-workflow-monitor/setup.py:    'packtivity==0.5.18',

These are versions that should work:

packtivity==0.10.0
yadage[viz]==0.13.5
yadage-schemas==0.7.1

Stemmed from reanahub/reana-client#54 (comment)

RFC license change

In order to facilitate collaboration with external non-GPL-licensed systems, we are considering changing the REANA platform license from GNU GPL to the more permissive MIT License.

According to the change log, here is the full list of authors and contributors:

Can we please all explicitly express our agreement/disagreement?

tests: setup CI for end to end tests

Get a Kubernetes cluster running on the CI environment with a service account token (needed by REANA-Job-Controller) and test that the examples work using the reana CLI (reana install-client, reana install-cluster, reana run-example...). Some known options:

Alternatives for live editing on REANA cluster

Since we have experienced many troubles with minikube lately, we have investigated the alternatives and compared them:

                  Shared volume support   Live feedback   Stable
Telepresence      no                      yes             yes
Rebuild/restart   yes                     no              yes
Volume mounting   yes                     yes             no

Since we depend on shared volume mounting for REANA, Telepresence is not yet an option (maybe once we support an S3 storage backend). Regarding the other two options, the only virtualization backend for minikube which supports volume mounting is VirtualBox, but it causes problems from time to time, so the rebuild-and-kill-pods technique is the alternative (optimization of the Dockerfiles will come later to facilitate this workflow).

@reanahub/developers

RFC: clarify the role of "code"

To reiterate the 4 questions:

  1. What is your input data?
  2. What is your environment?
  3. Which code analyses it?
  4. Which steps did you take?

the role of "code" is not really clear yet. On the one hand it is treated just like the data by being bind-mounted in; on the other hand a lot of the analysis code (if not all of it) is already captured in the environment (i.e. the Docker image).

I think it would be useful to collect thoughts to clarify this. In my understanding:

code (from 3) + environment (from 2) -> runtime (container image or filesystem)

there are multiple ways to arrive at this image:

  • build the image via a Dockerfile
  • various source2image solutions, like:
    • buildpacks (https://buildpacks.io)
    • source2image (https://github.com/openshift/source-to-image.git)
    • repo2docker (https://github.com/jupyter/repo2docker)
  • ad-hoc mounting in of the code (only works for dynamic languages, which is what we do now), e.g. docker run -v code:/code

For a workflow there might be multiple code bases (for each workflow), multiple images, etc., so we should try to define this carefully.

Then the data (from 1) is made available by mounting in a directory and running the step from 4):

ctr run -v data:/data container_image "./mycommand.sh /data/in.root /data/out.root"

reana get reana-workflow-controller does not return a URL

$ reana get reana-workflow-controller
127.0.0.1:NodePort:                 http  30226

But the docs assume that it does:

# check workflows:
curl "http://$(reana get reana-workflow-controller)/api/workflows"
{
  "message": "An organization should be provided"
}
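One possible fix, sketched under the assumption that `reana get` keeps its current output format, is to parse the host and port out of that line and print a full URL (function name hypothetical):

```python
def component_url(line):
    """Turn a line like '127.0.0.1:NodePort:   http  30226'
    (format taken from the output quoted above) into a usable URL."""
    host = line.split(':', 1)[0]
    scheme, port = line.split()[-2:]
    return '{0}://{1}:{2}'.format(scheme, host, port)
```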

Deploying on Rancher

Issue:

When deploying Reana-Cluster onto a Rancher Kubernetes Cluster, I'm running into some certificate issues. Kubectl, on the other hand, still works without problems.

...
HTTPSConnectionPool(host='192.168.1.10', port=8443): Max retries exceeded with url: 
/k8s/clusters/c-rqbzb/api/v1/namespaces/default/secrets?includeUninitialized=false
(Caused by SSLError(CertificateError("hostname '192.168.1.10' doesn't match '192.168.1.10'",),))

Rancher is using port 8443; the k8s API is available at https://192.168.1.10:8443/k8s/clusters/c-rqbzb. I am able to access the URL https://192.168.1.10:8443/k8s/clusters/c-rqbzb/api/v1/namespaces/default/secrets in my browser.
The certificate for Rancher is auto-generated (self-signed). Could this be the problem? Btw. my kube-config file contains the certificate-authority-data section, and kubectl is not complaining about any SSL issues.

I'm trying to start my Reana-Cluster with the following commands:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout /tmp/tls.key -out /tmp/tls.crt \
    -subj "/CN=192.168.1.10"

./kubectl delete secrets reana-ssl-secrets
./kubectl create secret tls reana-ssl-secrets \
      --key /tmp/tls.key --cert /tmp/tls.crt

reana-cluster init # <-- exception occurs here

Steps to reproduce:

  1. Run the Rancher UI:
     # run rancher
     docker run -d --name=rancher --restart=unless-stopped -p 8080:80 -p 8443:443 rancher/rancher:v2.0.8
  2. log in (https://localhost:8443) and create a new cluster ("custom"): leave the default settings and just click on "next"
  3. make sure to check "etcd", "Control Plane" and "Worker"
  4. copy the generated command into the CLI
  5. wait until the cluster is initialized, click on "Kubeconfig file" and place the content into ~/.kube/config
  6. run the commands shown in the issue section above

installation: upper boundaries of dependencies

Now that reana-commons and reana-db are released on PyPI, let us put an upper boundary in our setup.py dependencies so that REANA v0.3.0 will be re-buildable and re-installable even in several months when we start working on and releasing REANA 0.4.0 or later versions.

(1) We should set upper boundaries for our own shared packages, for example:

 'reana-commons>=0.3.0,<0.4.0',

This needs to be done in many repositories:

$ grep 0.3.0 r*/setup.py | grep reana-
reana-client/setup.py:    'reana-commons>=0.3.0',
reana-cluster/setup.py:    'reana-commons>=0.3.0',
reana-job-controller/setup.py:        'sphinxcontrib-openapi>=0.3.0'
reana-job-controller/setup.py:    'reana-commons>=0.3.0',
reana-job-controller/setup.py:    'reana-db>=0.3.0',
reana-server/setup.py:        'sphinxcontrib-openapi>=0.3.0'
reana-server/setup.py:    'reana-commons>=0.3.0',
reana-server/setup.py:    'reana-db>=0.3.0',
reana-workflow-controller/setup.py:        'sphinxcontrib-openapi>=0.3.0',
reana-workflow-controller/setup.py:    'reana-commons>=0.3.0',
reana-workflow-controller/setup.py:    'reana-db>=0.3.0',
reana-workflow-engine-cwl/setup.py:        'sphinxcontrib-openapi>=0.3.0'
reana-workflow-engine-cwl/setup.py:    'reana-commons>=0.3.0',
reana-workflow-engine-serial/setup.py:    'reana-commons>=0.3.0'
reana-workflow-engine-yadage/setup.py:    'reana-commons>=0.3.0',

(2) We should consider pinning all important dependencies, for example here is how we depend on Celery:

$ grep -i celery r*/setup.py
reana-workflow-controller/setup.py:    'celery==3.1.17',
reana-workflow-engine-cwl/setup.py:    'celery==4.1.0',
reana-workflow-engine-serial/setup.py:    'celery>=4.1.0',
reana-workflow-engine-yadage/setup.py:    'celery>=4.2.1',
reana-workflow-monitor/setup.py:    'celery==3.1.17',

If Celery 4.3 is released in a few months and changes the APIs that we rely on, then REANA v0.3.0 may no longer be installable.

We may want to pin fully (celery==4.1.0) or we may rather pin for a reasonable interval (celery>=4.1.0,<4.2.0). The best pinning philosophy will depend on each concrete dependency.

(3) Sometimes we don't even indicate any version, e.g. only zmq. We should also add the versions that are known to work at the time of this v0.3.0 release.
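Putting the three points together, here is a sketch of what bounded specifiers in a setup.py could look like (the exact upper bounds, and the celery interval in particular, are illustrative judgment calls, not decided values):

```python
# Sketch of bounded dependency specifiers for install_requires in setup.py.
install_requires = [
    'reana-commons>=0.3.0,<0.4.0',  # own shared package: bound to the 0.3 series
    'reana-db>=0.3.0,<0.4.0',
    'celery>=4.1.0,<4.2.0',         # third party: pin to a known-good interval
]
```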

cli: configurable outputs for run-example

Developers can use reana CLI to test all workflow engines for a particular example via one command:

$ cd reana-demo-worldpopulation
$ reana run-example -c .
...
[reana-demo-worldpopulation] reana-client list -w worldpopulation.yadage | grep -q plot.png
[reana-demo-worldpopulation] echo OK
OK

The command tests for the presence of plot.png in the output.

However some examples, such as reana-demo-helloworld, produce different output files, e.g. greetings.txt.

We should make this configurable, for example introduce a new configuration variable in the beginning of the script:

EXAMPLE_OUTPUTS = {
    'reana-demo-helloworld': ('greetings.txt',),
    '*': ('plot.png',),
}

and amend run_example() to test for given files.
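A minimal sketch of the lookup run_example() could use, falling back to the '*' entry for examples without a dedicated entry (helper name hypothetical):

```python
# Hypothetical lookup mirroring the EXAMPLE_OUTPUTS table above.
EXAMPLE_OUTPUTS = {
    'reana-demo-helloworld': ('greetings.txt',),
    '*': ('plot.png',),
}

def expected_outputs(example):
    """Return the output files run_example() should test for."""
    return EXAMPLE_OUTPUTS.get(example, EXAMPLE_OUTPUTS['*'])
```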

sprint: Basic-Tests kick-off

Basic tests sprint kick-off

Dates

GanttStart: 2018-10-01
GanttDue: 2018-10-12

People

Goals

  • Implement the minimum test infrastructure for all REANA components, involving mostly mocking of:
    • AMQP
    • Celery
    • REST API (OpenAPI specs)
    • Filesystem
  • Extract the test fixtures to a common repository to share across components.

Kanban

Tasks

docs: General development instructions for REANA components

In order to develop REANA components you need to somehow create pods based on the new, development-time Docker images in your Kubernetes cluster.

Current install / instantiation scripts make Kubernetes pull the images of each REANA component from Docker Hub.

Of course one could push development-time images to Docker Hub, but in my opinion it should only contain images that are known to work (development-time images might not be) and that are released through a proper release process.

To support development efforts, we probably should specify, or at least say a thing or two about, the process of updating the Kubernetes cluster with a new image without pushing it to Docker Hub.

With Minikube, which I presume we would use and recommend as the local development environment, you have a couple of options:

1. Use the local cache of the Docker daemon running inside the Minikube VM, and update pod(s) by deleting the previous pod(s) or by triggering a rolling update for the Kubernetes cluster

  • Attach your Docker client to Docker daemon running in Minikube VM:
    eval "$(minikube docker-env)"

  • Build your Dockerfile (e.g. docker build . -t reana-message-broker).
    Name the image according to the REANA component naming schema (e.g. 'reana-message-broker').
    Don't add an extra tag in the build command (i.e. avoid docker build . -t reana-message-broker:development).
    The new image gets stored in the local cache of the Docker daemon running in the Minikube VM.

  • Delete the existing pod which is based on the old Docker image:
    kubectl delete pods <POD_NAME>, where <POD_NAME> is e.g. 'message-broker-1410199975-gtcr8'
    (You can list pods with kubectl get pods.)
    When Kubernetes sees that the pod is destroyed, it will create a new pod using the latest image it has available, either in a) the local cache or b) the Docker registry specified to it (defaults to Docker Hub). In this case it should use the new image from the local cache.
    Alternatively you can trigger a rolling update for all of the REANA components deployed in the Kubernetes cluster, which will re-create the pods using the latest available images. Kubernetes should use the new image from the local cache.

  • Optionally you can check that the new pod was created based on the new image by comparing sha256 digest of the image reported by Docker client and sha256 digest of a pod's image reported by Kubernetes:

    • docker images --no-trunc reana-message-broker
      --> sha256:ec7b2076d2744361332be1397224025456910feb83d41fcc4ab8e409bab5cf6a
    • kubectl describe pod message-broker- | grep Image
      --> Image ID: docker://sha256:6a5233dac19eb8e24445cda16fcb7ab4bfaca4d32f1cad0f6701c4060a184987
  • Note: If imagePullPolicy: IfNotPresent or imagePullPolicy: Never is not explicitly set in the Kubernetes manifest file, this method is going to fail. Currently the manifests of each REANA component specify imagePullPolicy: IfNotPresent; Kubernetes seems to default to imagePullPolicy: Always.
    "By default, the kubelet will try to pull each image from the specified registry. However, if the imagePullPolicy property of the container is set to IfNotPresent or Never, then a local image is used (preferentially or exclusively, respectively)." (https://kubernetes.io/docs/concepts/containers/images/#pre-pulling-images)
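The optional digest comparison described above can be sketched as a small helper (function name hypothetical; the docker:// prefix comes from the kubectl output quoted above):

```python
def digests_match(docker_digest, kubernetes_image_id):
    """Compare the digest printed by `docker images --no-trunc` with the
    Image ID reported by `kubectl describe pod` (docker:// prefix stripped)."""
    return kubernetes_image_id.replace('docker://', '') == docker_digest
```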

2. Use a local Docker registry together with the "imagePullPolicy: Always" policy for the deployment of REANA components in the Kubernetes cluster. Update pod(s) by deleting the previous pod(s) or by triggering a rolling update for the Kubernetes cluster

  • Create a local Docker registry. (e.g. docker run -d -p 5000:5000 --restart=always --name registry registry:2)
    Optionally prepopulate it with Docker images of REANA components from Docker Hub:
    docker pull reanahub/reana-message-broker
    docker tag reanahub/reana-message-broker localhost:5000/reana-message-broker
    docker push localhost:5000/reana-message-broker

  • When starting Minikube specify that you want to use a private / custom Docker registry for the images. (e.g. minikube start --insecure-registry localhost:5000)

  • Build your Dockerfile (e.g. docker build . -t reana-message-broker).
    Name the image according to the REANA component naming schema (e.g. 'reana-message-broker').
    Don't add an extra tag in the build command (i.e. avoid docker build . -t reana-message-broker:development).

  • Push the new image to your local Docker registry (e.g. docker push localhost:5000/reana-message-broker)

  • Delete the existing pod which uses the old image.
    When Kubernetes sees that the pod is destroyed, it will create a new pod using the latest image it can find in the local Docker registry. In this case it should use the new image from the local Docker registry.
    Alternatively you can trigger a rolling update for all of the REANA components deployed in the Kubernetes cluster, which will re-create the pods using the latest available images. Kubernetes should use the new image from the local Docker registry.

  • Optionally you can check that the new pod was created based on the new image by comparing sha256 digest of the image reported by Docker client and sha256 digest of a pod's image reported by Kubernetes:

    • docker images --no-trunc reana-message-broker
      --> sha256:ec7b2076d2744361332be1397224025456910feb83d41fcc4ab8e409bab5cf6a
    • kubectl describe pod message-broker- | grep Image
      --> Image ID: docker://sha256:6a5233dac19eb8e24445cda16fcb7ab4bfaca4d32f1cad0f6701c4060a184987
  • Note: This method might fail if imagePullPolicy: IfNotPresent is used and the image available in the local cache of the Docker daemon running in the Minikube VM is newer than the one available in the local Docker registry. Also, imagePullPolicy: Never won't work.

Things to consider:

  • I don't know if either of the methods would work on a "real" Kubernetes cluster the same way as on Minikube (they should, but you never know without trying), or what kind of configuration effort is needed, e.g. to specify which Docker registry to use when deploying a REANA component to a Kubernetes cluster that you are not administrating (it might not even be possible).

  • Docker's instructions are not that clear on what is considered to be the latest image.
    Somewhere I read that if you already have an image in a registry (or in the local cache) and you build another image with the same name while specifying a tag in the build command, this new image is not considered to be the latest image!
    It will be considered the latest image only if you don't specify a tag in the build command itself and instead tag the image after building with a separate 'docker tag' command (e.g. docker tag reana-message-broker reana-message-broker:development).
    Docker seems to add the ':latest' tag automatically if you don't specify a tag in the build command.
    I don't (yet) know whether the image is considered the latest if one specifies the ':latest' tag in the build command (e.g. docker build . -t reana-message-broker:latest).
    This ambiguity affects both of the previously described methods, at least when the developer tags the image in the build command without also providing the ':latest' tag.

tests: prepare local development test cluster

In order to have a local working cluster the following things need to be done:

Creation of reana-db

Currently, reana-commons contains the DB models and Session handling. For this reason it could not be included in the workflow engines. A stand-alone repository reana-db can be created to host all DB-related code (and Alembic recipes in the future), allowing reana-workflow-commons and reana-commons to be merged.

docs: "getting started with development" instructions

Currently docs are mostly written for end-users that use reana-client to interact with REANA cloud.

There is a "getting started with development" section, but it focuses on bringing up a REANA cluster.
It might be beneficial to add more detailed development instructions here, or to direct the user to the REANA cluster documentation and write the development instructions there.

tests: REST API calls

Unit tests have to include the REST API calls between components. For end-to-end tests the components must be deployed, but for unit testing we care about contract testing, i.e. that the call was made with the correct arguments.

One option to do this would be to mock the part where the actual request is sent from the SwaggerClient, but do the full validation based on the openapi specification. An example of this case could be the following:

def test_mock_rwc_client(app, default_user, mock_rwc_client):
    client = SwaggerClient.from_spec(
        rwc_json_spec,
        http_client=Mock(),  # replacing the actual http client
        config={'also_return_response': True})

    with pytest.raises(ValidationError):
        res = mock_rwc_client.api.create_workflow(
            workflow={'specification': {},
                      # 'type': 'serial', leaving out a required parameter should raise a ValidationError
                      'name': 'test'},
            user=str(default_user.id_)).result()

    res = mock_rwc_client.api.create_workflow(
        workflow={'specification': {},
                  'type': 'serial',
                  'name': 'test'},
        user=str(default_user.id_)).result()  # the result is a Mock but there was no exception

A second option would be to have a mock server running, using one of the designated openapi.tools, like Prism (https://stoplight.io/platform/prism/). If it is deployed with Travis or our CI tool of choice, it should respond with preset sample responses.

Prism
Deploying it on Travis is simple: it can be installed in the before_script section and launched with the path to the JSON OpenAPI spec:

# in .travis.yml
before_script:
- curl https://raw.githubusercontent.com/stoplightio/prism/master/install.sh | sudo sh
- prism run --mock --list --spec /home/travis/build/dinosk/reana-server/reana_server/openapi_connections/reana_workflow_controller.json &

# a test case calling the prism server
def test_prism_server(app, default_user):
    rwc_api_client = create_openapi_client('reana-workflow-controller')
    with pytest.raises(ValidationError):
        res = rwc_api_client.api.create_workflow(
            workflow={'specification': {},
                      # 'type': 'serial',
                      'name': 'test'},
            user=str(default_user.id_)).result()

    _, res = rwc_api_client.api.create_workflow(
        workflow={'specification': {},
                  'type': 'serial',
                  'name': 'test'},
        user=str(default_user.id_)).result()
    assert res.status_code == 201

Sandbox
https://github.com/getsandbox/sandbox MIT
This would require creating the mock endpoints and their responses in JavaScript. The main website takes the OpenAPI specification and spawns a test server available for one hour, but it doesn't seem to allow this auto-generation from the command line, which we would need to deploy it on Travis.

Microcks
Installation from binaries is not there yet; the other installation options are Kubernetes and OpenShift, and it is unclear how to use it with Travis CI. http://microcks.github.io/using/index.html#installation

Apisprout
No Swagger version 2 support; doesn't generate sample responses based on the schema.

cli: simplified setup-environment

REANA developers can use the following one-liner to set up their environments:

$ reana setup-environment

The current implementation uses kubectl system calls, since it was written before reana-cluster env got the --include-admin-token option.

Let us simplify the implementation now that the --include-admin-token option is available.
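A minimal sketch of what the simplified implementation could build on (the helper name is hypothetical; only the option name comes from the issue text):

```python
def env_command(include_admin_token=True):
    """Build the reana-cluster invocation replacing the raw kubectl calls."""
    cmd = ['reana-cluster', 'env']
    if include_admin_token:
        cmd.append('--include-admin-token')
    return cmd
```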

docs: add section on "reanification" of analyses

Add a section explaining how to "reanify" an analysis, i.e. how to start locally by checking the environment using Docker, then choosing a workflow engine and running CWL/Yadage locally, and only then trying out reana.yaml and friends.

This used to be part of various REANA examples, but it makes sense to centralise it here and use a shorter version in the examples.

(stemmed from reanahub/reana-demo-worldpopulation#25)

docs: improve sphinxcontrib-openapi rendering

The current output for a given OpenAPI spec using sphinxcontrib-openapi is very poor at the moment (see REANA Job Controller API).

We are looking for:

  • Display all the useful information contained in the specification
  • Structure the information in a meaningful way
  • Provide an option to select which HTTP methods the user wants to display for a given endpoint (currently the only available option selects the endpoint, so all the methods for the given endpoint are displayed, see here)
  • Submit a PR with tests and docs upstream to sphinxcontrib-openapi

Some sources of inspiration:

There is some code here, which can serve as a starting point. This would be the output:
[screenshot: example rendered output, 2017-05-24]

SwaggerClient instantiation abstraction

Each component instantiating a SwaggerClient has its own methods or class doing the work.
There could be an abstract base class for this placed in reana-commons.
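A hedged sketch of what such a base class could look like (all names here are hypothetical; the real version would delegate to SwaggerClient.from_spec as the components do today):

```python
import json

class BaseAPIClient(object):
    """Hypothetical shared base class for reana-commons: subclasses only
    declare which component and OpenAPI spec file they talk to."""

    component = None   # e.g. 'reana-workflow-controller'
    spec_file = None   # path to the component's OpenAPI JSON spec

    def load_spec(self):
        with open(self.spec_file) as f:
            return json.load(f)

    def build_client(self, http_client=None):
        # The real implementation would call something like
        # SwaggerClient.from_spec(self.load_spec(), http_client=http_client).
        raise NotImplementedError


class WorkflowControllerClient(BaseAPIClient):
    component = 'reana-workflow-controller'
    spec_file = 'reana_workflow_controller.json'
```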

scripts: `reana down` does not destroy shared folder

reana down should also destroy the shared /reana folder on the Minikube VM; otherwise people doing reana down && reana init will run into the problem described in reanahub/reana-workflow-controller#33.

Is the word "down" sufficiently strong to express the meaning that everything will be deleted forever?

Shall we take inspiration from docker-compose and accept this, while also introducing some softer "up" and "stop" commands?

Shall we take inspiration from vagrant and require a stronger word "vagrant destroy"?

Shall we take inspiration from minikube and require a stronger word "minikube delete" and introduce a softer command "stop"?

global: move away from "experiment" paradigm to general "access scope"

This will need some discussion, but maybe we can seed it here. Right now the notion of an "experiment" (as in ATLAS, CMS, etc.) is used in a lot of the code. In order to be generic, it's probably best to define a number of "access scopes" and use that name across the code. If we want REANA to be installed on-premises elsewhere, a "scope" could for example be a specific professor's lab, where various labs use the same REANA instance but have their own secrets, etc.
