adama's Introduction

Adaptable Data And Microservices API

This project implements a data federation strategy for the Arabidopsis Information Portal.

Quickstart

Grab a token from the Araport API Store. Once you have it, check your access to Adama with:

$ export TOKEN=<my token>
$ curl -L -k https://api.araport.org/community/v0.3/status \
    -H "Authorization: Bearer $TOKEN"
{
    "api": "Adama v0.3", 
    "hash": "c08ae56c7b47e62c0247de22c75e9511c462c0e0", 
    "status": "success"
}   

Access to Adama is granted if the response looks like the one above. Otherwise, please report the output to @waltermoreira or on the issues page.

If everything looks ok, proceed to the tutorial.

Documentation

Documentation is now being consolidated in one document at Read The Docs. The sources for the documentation live in docs/full.

Before graduating to a nice place like "Read The Docs", the preliminary and rough docs live in the wiki.

You can also read the live docs of the API.

License

Free software: BSD License

adama's People

Contributors

mwvaughn, waltermoreira

adama's Issues

ADAMA should return provenance information in its responses.

There's quite a bit of design work in this one before we decide on an implementation:

ADAMA should be able to send back developer-specified provenance/attribution with its responses. In our service types that return JSON, this may need to be an extra entry in the response document. In services where we don't control the response content, we may consider injecting a custom Header into our response.
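
As a starting point for the design discussion, here is a minimal sketch of the two options mentioned above: an extra entry in a JSON response document, and a custom header for responses whose body we don't control. The entry name "provenance", the header name X-Adama-Provenance, and both function names are assumptions made for illustration, not part of Adama.

```python
import json

# Hypothetical header name; nothing in Adama defines it yet.
PROVENANCE_HEADER = "X-Adama-Provenance"


def with_provenance_entry(document: dict, provenance: dict) -> dict:
    """Return a copy of a JSON response document with a 'provenance' entry added."""
    enriched = dict(document)  # shallow copy, the original document is left untouched
    enriched["provenance"] = provenance
    return enriched


def provenance_headers(provenance: dict) -> dict:
    """Encode provenance as one header, for responses whose body we cannot modify."""
    return {PROVENANCE_HEADER: json.dumps(provenance, sort_keys=True)}
```

Encoding the header value as JSON keeps both options carrying the same structure, at the cost of a longer header.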

Implement type of adapter: "generic"

A generic adapter is a generalization of a query or structured adapter in which the adapter can return content types other than JSON (for example, images or CSV files).

Should we add a map-concat adapter as a generalization of map-filter?

Brought up by Alex in the all-hands meeting:

A map-concat adapter generalizes a map-filter. The drawback is that the function has to return a list of length 1 for the most usual case of just map.

Alternatively, we could support it as another type, i.e., having map-filter, and map-concat as different types, and letting the user choose.
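
To make the trade-off concrete, here is a small sketch of both patterns. It assumes a map-filter function signals "drop this item" by returning None; the issue does not say how filtering is expressed, so that convention and all function names are illustrative only.

```python
from typing import Any, Callable, Iterable, List, Optional


def run_map_filter(fn: Callable[[Any], Optional[Any]], items: Iterable[Any]) -> List[Any]:
    """Apply fn to each item; a None result drops the item (assumed convention)."""
    out = []
    for item in items:
        result = fn(item)
        if result is not None:
            out.append(result)
    return out


def run_map_concat(fn: Callable[[Any], List[Any]], items: Iterable[Any]) -> List[Any]:
    """Apply fn to each item and concatenate the returned lists."""
    out = []
    for item in items:
        out.extend(fn(item))
    return out


def as_map_concat(fn: Callable[[Any], Optional[Any]]) -> Callable[[Any], List[Any]]:
    """Wrap a map-filter function so it runs unchanged under map-concat."""
    def wrapped(item):
        result = fn(item)
        return [] if result is None else [result]
    return wrapped


def keep_even_doubled(n: int) -> Optional[int]:
    """Example map-filter function: double even numbers, drop odd ones."""
    return n * 2 if n % 2 == 0 else None
```

The wrapper shows the drawback mentioned above: under map-concat, every plain map has to return a one-element list, either written by hand or produced by a shim like as_map_concat.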

Atted adapter raises exception

The following request raises an exception:

$ curl -skL -H "Authorization: Bearer $TOKEN" http://api.araport.org/community/v0.3/aip/atted_coexpressed_by_locus_v0.1/search\?locus\=AT5G49450\&threshold\=0.1\&relationship_type\=correlation_coefficient
{"result": [
{"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.8824}]}], "related_entity": "AT5G49448", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.7257}]}], "related_entity": "AT1G02610", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.5754}]}], "related_entity": "AT2G05540", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.6867}]}], "related_entity": "AT5G56550", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.6903}]}], "related_entity": "AT3G15450", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.6109}]}], "related_entity": "AT1G49500", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.665}]}], "related_entity": "AT1G70290", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.568}]}], "related_entity": "AT5G54540", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.665}]}], "related_entity": "AT2G18700", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.6209}]}], "related_entity": "AT3G62950", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.6621}]}], "related_entity": "AT5G22920", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.5292}]}], "related_entity": "AT1G80630", "class": "locus_relationship", "locus": "AT5G49450"}
, {"relationships": [{"direction": "undirected", "type": "coexpression", "scores": [{"correlation_coefficient": 0.506}]}], "related_entity": "AT3G15358", "class": "locus_relationship", "locus": "AT5G49450"}
, {"traceback": "Traceback (most recent call last):\n  File \"/root/worker.py\", line 27, in callback\n    t = self.operation(message)\n  File \"/root/worker.py\", line 44, in operation\n    getattr(self.module, endpoint)(json.loads(body))\n  File \"/root/user_code/main.py\", line 59, in search\n    agi_locus_from_entrez = resolve_locus(result['gene'])\n  File \"/root/user_code/main.py\", line 80, in resolve_locus\n    result = response.json()['results'][0]['data'][0]['row'][0]\nIndexError: list index out of range\n", "error": "list index out of range"}
],
"metadata": {"time_in_main": null},
"status": "success"}
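
The traceback points at an unguarded lookup in resolve_locus, where the upstream response has no rows for some gene. One possible fix is sketched below, assuming the adapter should skip genes that cannot be resolved rather than fail; the helper name first_row_value is hypothetical, and only the nested key path is taken from the traceback.

```python
from typing import Any, Optional


def first_row_value(payload: dict) -> Optional[Any]:
    """Return payload['results'][0]['data'][0]['row'][0], or None if any level is missing or empty."""
    try:
        return payload["results"][0]["data"][0]["row"][0]
    except (KeyError, IndexError, TypeError):
        return None
```

The caller would then check for None and skip or report that locus explicitly instead of letting an IndexError escape.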

Namespaces are being created with empty name

We should not allow an empty name:

$ curl -k https://adama-dev.tacc.utexas.edu/community/v0.3/namespaces -H "Authorization: Bearer ...."  
{
    "result": [
        {
            "description": "Developer workspace for @", 
            "name": "", 
            "url": null
        }, 
        {
            "description": null, 
            "name": "walter-dev", 
            "url": null
        }, 
        {
            "description": "Developer workspace for @mwvaughn", 
            "name": "vaughn-dev", 
            "url": null
        }
    ], 
    "status": "success"
}
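
A minimal sketch of the check, assuming it runs before the namespace is stored. The function name and the error message are illustrative, not existing Adama code.

```python
def validate_namespace_name(name):
    """Return the cleaned name, or raise ValueError if it is missing or blank."""
    if not isinstance(name, str) or not name.strip():
        raise ValueError("namespace name must be a non-empty string")
    return name.strip()
```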

Provide an idiomatic way of failing in adapters, for every language

We want to make it easier for developers to specify errors in their adapters. Adama could provide a fail function that aborts and returns user-provided information through the standard error channel. The function can be injected into the namespace of the adapter, so the user doesn't have to import any extra module.
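
For Python adapters, the injection could look roughly like the sketch below. AdapterFailure, run_adapter, and the shape of the error response are assumptions for illustration; the real worker loads user code differently.

```python
class AdapterFailure(Exception):
    """Raised by fail(); carries the user-provided error information."""

    def __init__(self, message, **details):
        super().__init__(message)
        self.message = message
        self.details = details


def fail(message, **details):
    """Abort the adapter and report message and details to the caller."""
    raise AdapterFailure(message, **details)


def run_adapter(source, endpoint, args):
    """Execute adapter source with fail() pre-injected, then call one endpoint."""
    namespace = {"fail": fail}  # injected, so the adapter needs no import
    exec(compile(source, "<adapter>", "exec"), namespace)
    try:
        return {"status": "success", "result": namespace[endpoint](args)}
    except AdapterFailure as exc:
        return {"status": "error", "message": exc.message, "details": exc.details}


# Example adapter code as a developer might write it (hypothetical).
ADAPTER_SOURCE = '''
def search(args):
    if "locus" not in args:
        fail("missing parameter", parameter="locus")
    return {"locus": args["locus"]}
'''
```

Calling run_adapter(ADAPTER_SOURCE, "search", {}) returns an error document built from the arguments passed to fail, without a traceback.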

Map should be map_filter

map_filter is the proper name for the pattern. It also makes it easier to understand that an adapter can map, filter, or do both at the same time.

Import user's code at registration time

We want to fail early, at registration time, if the user's code has syntax errors or missing requirements. Currently, such failures only surface at the time of the first request.
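
A rough sketch of such a registration-time check for Python adapters is below. It only compiles the source (catching syntax errors) and looks up top-level modules, which is weaker than a full import; it also assumes requirement names match importable module names, which is not always true for pip packages. The function name is hypothetical.

```python
import importlib.util


def check_adapter_source(source, requirements=()):
    """Return a list of problems found without serving any request."""
    problems = []
    try:
        compile(source, "main.py", "exec")  # syntax check only, nothing is executed
    except SyntaxError as exc:
        problems.append(f"syntax error at line {exc.lineno}: {exc.msg}")
    for module_name in requirements:
        # find_spec returns None when a top-level module cannot be found
        if importlib.util.find_spec(module_name) is None:
            problems.append(f"missing requirement: {module_name}")
    return problems
```

An empty list means registration can proceed; anything else can be returned to the developer in the registration response.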

Add a generator of adapter templates

Following the app generator for front end apps, it would be nice to have an adapter generator that builds a complete template for the several types of adapter. Using the git checkout feature, the template would include all the metadata necessary for registering the adapter.

In the future, this generator could be integrated with the app generator, so developers can have everything in one package.

Status of services should include health of workers

The status endpoint should check whether the workers for a service are really running. There should be an option to restart them, or to manage them. It is not clear who should have access to such features: the user who registers the adapters, or only Adama admins?

Issue #23 also needs the restart (and rebuild, in case images are lost).

Allow moving adapters across namespaces

From Matt V.:

I wanted to chat about the case where a service is moved (or copied) from one namespace to another. Potential use cases for this involve turning over “maintenance” of a service to someone else, moving services out of a “development” namespace into a “production” namespace (vaughn vs vaughn-dev for example), or moving services to a namespace owned by the AIP (or other tenant, someday) project as part of a “publication” process.

The REST implementation seems simple: via a PUT operation on a known service, allow update of the “namespace” attribute as long as the user has write access to that namespace. The question becomes what happens to the previous instance of the service. I would argue that we need to keep it around, but resolve it with a 301 Moved Permanently unless someone explicitly issues a DELETE on the old resource.

/community/vaughn-dev/starbuck

moves, after PUT operation, to

/community/vaughn/starbuck

GET to /community/vaughn-dev/starbuck returns

301 Moved Permanently
Location: /community/vaughn/starbuck

until I issue a DELETE on /community/vaughn-dev/starbuck, after which the same GET returns

404 Not Found
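
The sequence above can be modeled with a small in-memory registry. This is only a sketch of the proposed behavior: the class and method names are made up, and the write-access check on the target namespace is left out.

```python
class ServiceRegistry:
    """Toy model of services that leave a redirect behind when moved."""

    def __init__(self):
        self._services = {}  # path -> service description
        self._moved = {}     # old path -> new path

    def register(self, path, service):
        self._services[path] = service

    def move(self, old_path, new_path):
        """What a PUT that changes the namespace would do."""
        self._services[new_path] = self._services.pop(old_path)
        self._moved[old_path] = new_path

    def delete(self, path):
        """DELETE removes a live service or a leftover redirect."""
        self._services.pop(path, None)
        self._moved.pop(path, None)

    def get(self, path):
        """Return (status code, headers, body) for a GET on path."""
        if path in self._services:
            return 200, {}, self._services[path]
        if path in self._moved:
            return 301, {"Location": self._moved[path]}, None
        return 404, {}, None
```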

Add endpoint /access for passthrough adapters

A passthrough adapter operates via an /access endpoint, similar to how the other adapters operate via /search and /list. This leaves the / endpoint free for returning a representation of the Adama service object.

ADAMA should be pushing out a live Swagger spec

ADAMA should be publishing a Swagger spec for its core abilities (namespaces, services) but also, we should see if there's a simple way for devs to specify either a) Swagger or b) sufficient metadata that we can roll usage documentation from community APIs into ADAMA's documentation. In combination with the master Agave swagger spec, this should allow the App builder SDK to display and interpret usage info about community APIs. This should be VERY powerful.

Improved error handling and response for fields expected to be URIs

ADAMA should check and reject invalid URIs at POST/PUT time rather than accepting the request and failing internally. For example, submitting a bad URL as a git_repository field results in an attempt to check out that repo and generates a nasty stack trace. As another example, when one provides 'url' as a metadata field, it is not checked for validity. This could be a security risk.
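
A minimal sketch of such a check using only the standard library. It assumes that only http and https URLs with a host are acceptable; if git_repository should also accept other forms (for example SSH-style remotes), the allowed set would need to change.

```python
from urllib.parse import urlsplit

# Assumption: only these schemes are acceptable for 'url' and 'git_repository'.
ALLOWED_SCHEMES = {"http", "https"}


def is_valid_url(value):
    """Return True if value parses as an http(s) URL with a host part."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value.strip())
        host = parts.hostname
    except ValueError:  # raised by urlsplit for some malformed inputs
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(host)
```

A request whose field fails this check could be rejected with a 400 before any check-out is attempted.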

Should we add a verbose option to queries?

The JSON returned on error contains the full traceback, making the output long and scary. Should we add a verbose option (true/false) to enable or disable this behavior? If so, what would the default value be?
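
One way the option could shape the error document is sketched below. The function name is hypothetical, and defaulting verbose to False is only one possible answer to the open question above.

```python
def format_error(error, traceback_text, verbose=False):
    """Build an error document; include the traceback only when verbose is set."""
    body = {"status": "error", "error": error}
    if verbose:
        body["traceback"] = traceback_text
    return body
```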

Exceptions in map adapters don't abort the request

If a map adapter raises an exception, we want to immediately abort the request and return the traceback. Currently, the adapter keeps being applied to every JSON object, and an array of errors is returned:

{"result": [
{"error": "integer division or modulo by zero"}
, {"error": "integer division or modulo by zero"}
],
"metadata": {},
"status": "success"}
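
The desired behavior can be sketched as a loop that stops at the first failure. The function name apply_map and the exact shape of the error document are assumptions; the real worker streams results, which this sketch ignores.

```python
import traceback


def apply_map(fn, objects):
    """Apply fn to each object, aborting the whole request on the first exception."""
    results = []
    for obj in objects:
        try:
            results.append(fn(obj))
        except Exception as exc:
            # Stop here: no further objects are processed.
            return {
                "status": "error",
                "error": str(exc),
                "traceback": traceback.format_exc(),
            }
    return {"result": results, "metadata": {}, "status": "success"}
```

With this structure a failing adapter produces a single error document with status "error", rather than an array of per-object errors reported under status "success".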

Namespaces are subject to the same rules as adapter names

This may be a bug or a feature. Since the namespace is part of the full name of the images for the containers, it cannot contain uppercase letters. If we don't like this, we may need to add a mapping between user names and internal names. This is related to issue #3.
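
If we go the mapping route, one option is a reversible encoding whose output uses only lowercase letters and the digits 2 to 7. The sketch below uses Base32 for that; the scheme and both function names are suggestions, not a decided design.

```python
import base64


def to_internal_name(user_name: str) -> str:
    """Encode any user-facing name as lowercase letters and digits 2-7."""
    encoded = base64.b32encode(user_name.encode("utf-8")).decode("ascii")
    return encoded.rstrip("=").lower()


def from_internal_name(internal_name: str) -> str:
    """Recover the user-facing name from its internal form."""
    padded = internal_name.upper() + "=" * (-len(internal_name) % 8)
    return base64.b32decode(padded).decode("utf-8")
```

The internal names are not human-readable, which may matter when debugging container images; a lookup table would be the alternative.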

Add option to preserve structure of data source response

A map adapter currently applies a transformation to an array of objects and returns the result as an array with label "results" (usual Agave 3-stanza format).

Preserving the structure of the original response of the third-party service is a valid use case. It would allow using the Adama URL of the registered service in place of the original service.

Allow developer to specify name of main file

Pros:

  • Repositories of adapters are easier to understand
  • Self-documenting
  • Allows more than one adapter in the same directory

Cons:

  • One more option (but this is mitigated when #4 is implemented)

ADAMA should support validation of JSON responses against defined JSON schemas (Draft 4)

Eventually, query (and perhaps map_*) adapter types should support (and maybe even enforce) validation of JSON responses against a) AIP-defined JSON schemas b) External JSON schemas. This is pretty easy to code up in the adapters themselves but I'd like to promote this as core ADAMA functionality. We're a ways off from needing this since we haven't even defined our schemas yet.

See https://pypi.python.org/pypi/jsonschema for more information on a jumping off point towards implementation.

Security model

From Matt V.:


After you implement the additional mediator types, we will need a security model in place for ADAMA. I think we discussed this as our initial draft:

  1. All namespaces are publicly discoverable after creation (consistent with Adama’s behavior now)
  2. A namespace can be created by any user. It may only be edited or deleted by that user
  3. All services are publicly discoverable after creation (consistent with Adama’s behavior now)
  4. A service can be created by any user. It may only be edited or deleted by that user

In the longer run, we will want to adopt Agave’s ACL model where

READ: Discover metadata about the service or namespace
WRITE: Update or delete the service or namespace
EXEC: Query against the service. Has no meaning for namespaces.
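
The three permissions could be modeled as flags, as in the sketch below. The ACL is represented as a plain dict from user name to granted permissions, which is an assumption for illustration and not how Agave stores ACLs.

```python
from enum import Flag, auto


class Permission(Flag):
    READ = auto()   # discover metadata about the service or namespace
    WRITE = auto()  # update or delete the service or namespace
    EXEC = auto()   # query against the service


def is_allowed(acl, user, needed, resource_kind="service"):
    """Check whether user holds the needed permission in acl (user -> Permission)."""
    if needed == Permission.EXEC and resource_kind == "namespace":
        return False  # EXEC has no meaning for namespaces
    granted = acl.get(user, Permission(0))
    return needed in granted
```

The initial draft (everything discoverable, creator-only edits) maps onto this model by granting READ to every user and WRITE only to the creator.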

Make workers more fault tolerant on queue failures

The worker containers die when their persistent connection to the queue is interrupted. They should instead retry a reasonable number of times before giving up, which would make them resilient to temporary queue failures.
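
A sketch of the retry loop with exponential backoff is below. It assumes the queue client surfaces a dropped connection as ConnectionError; the real client library may raise its own exception type, and the attempt count and delays are placeholders.

```python
import time


def run_with_retries(connect_and_consume, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect_and_consume, retrying on ConnectionError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return connect_and_consume()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up: let the container die as it does today
            sleep(base_delay * 2 ** (attempt - 1))
```

The sleep parameter exists so the delays can be observed or skipped in tests.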

Duplicate entries in whitelist

This service automatically detected the whitelist, and in doing so duplicated the apps.araport.org entry. I assume this is a more pervasive issue. The functionality of the service was not affected by this bug.

curl -sk -L -X GET $API/$NS/services -H "Authorization: Bearer $TOKEN"
{
    "result": [
        {
            "code_dir": "/tmp/tmpYeTTFX/user_code", 
            "description": "Query Thalemine", 
            "json_path": "", 
            "language": "python", 
            "main_module": "main.py", 
            "metadata": "", 
            "name": "query_thalemine_as_jbrowse_index", 
            "namespace": "$NS", 
            "notify": "", 
            "requirements": [], 
            "type": "map_filter", 
            "url": "https://apps.araport.org/thalemine/service/jbrowse/names/3702", 
            "version": 0.1, 
            "whitelist": [
                "apps.araport.org", 
                "apps.araport.org"
            ], 
            "workers": [
                "b78265f520a66265685f49ae52e7bc5c53551e3f913454766614c609a2983445"
            ]
        }
    ], 
    "status": "success"
}
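
A possible fix is to deduplicate the whitelist when it is built, keeping the original order. The sketch below also lowercases entries, on the assumption that they are host names and therefore case-insensitive; the function name is made up.

```python
def dedupe_whitelist(hosts):
    """Remove duplicate hosts while preserving first-seen order."""
    return list(dict.fromkeys(host.strip().lower() for host in hosts))
```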

Convert checking-out repo to an asynchronous operation

If the check-out of a repository takes too long, WSO2 times out. We need to do the check-out asynchronously to avoid this. We'll return a temporary URL with a hash or token for the user to check the progress, since we won't know the metadata of the adapter until the repo is cloned.
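
The token-based flow could be sketched with a thread pool, as below. The clone callable stands in for the real check-out step, and the class name, status values, and in-memory job table are all assumptions; a real implementation would need persistence across restarts.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class CheckoutJobs:
    """Run check-outs in the background and report progress by token."""

    def __init__(self, clone, max_workers=2):
        self._clone = clone  # callable(repo_url) -> adapter metadata
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}      # token -> Future

    def submit(self, repo_url):
        """Start a check-out and return a token for polling."""
        token = uuid.uuid4().hex
        self._jobs[token] = self._pool.submit(self._clone, repo_url)
        return token

    def wait(self, token, timeout=None):
        """Block until the job finishes (handy in tests; clients would poll status)."""
        self._jobs[token].result(timeout=timeout)

    def status(self, token):
        """Return the current state of the job identified by token."""
        future = self._jobs.get(token)
        if future is None:
            return {"status": "unknown"}
        if not future.done():
            return {"status": "pending"}
        if future.exception() is not None:
            return {"status": "error", "error": str(future.exception())}
        return {"status": "success", "result": future.result()}
```

The registration request would return immediately with a URL containing the token, and the adapter metadata becomes available through that URL once the clone completes.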

Provide Referrer to 3rd party Web Services invoked via ADAMA

From Jason Miller: "Third party Web Services would like to track their usage. If AIP Science Apps use their services, all their users will appear to be at AIP. We should pass through other information such as the source of the Science App or IP address of the user. This idea came up in our discussion with Nick Provart, whose web services do track usage."
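
One way to pass this information along is through standard request headers on the outgoing call. The sketch below uses Referer for the Science App's URL and X-Forwarded-For for the user's IP address; choosing these two headers, and the function name, are assumptions, and forwarding user IP addresses may have privacy implications to discuss first.

```python
def tracking_headers(app_url=None, client_ip=None):
    """Build headers that tell a third-party service where a request originated."""
    headers = {}
    if app_url:
        headers["Referer"] = app_url  # source of the Science App
    if client_ip:
        headers["X-Forwarded-For"] = client_ip  # address of the end user
    return headers
```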
