
api.npolar.no's People

Contributors

baelter, cnrdh, rdux, rexso, srldl, trevlovett


api.npolar.no's Issues

Enhancement request: Auto-tag entries with parent category if necessary

Users will often tag an entry with only the most relevant tag. This tag is sometimes a "child" tag (e.g. ICE Fluxes), and this will result in a missing "parent" tag (e.g. ICE).

The result is somewhat counter-intuitive filtering: a filter for a parent tag may produce far fewer hits than its child filter. Real-world example:
ICE Fluxes -> 50 hits
ICE -> 5 hits

Proposed solution: Automagically apply the parent tag (before saving) whenever a user has applied only a child tag (see the sketch below).

The issue has been identified for ICE and its sub-programmes (e.g. ICE Fluxes), and some area tags, like areas contained within the Arctic. (E.g. entries may be tagged only "Svalbard", not "The Arctic".)

Example 1: http://www.npolar.no/en/publications/ (see "Programme" filters)
Example 2: http://www.npolar.no/en/projects/ (see "Area" filters)
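
A minimal sketch of the proposed pre-save step, assuming a hard-coded child-to-parent map (the map, the "tags" field name, and the hook are illustrative assumptions, not the actual schema):

    # Hypothetical child -> parent tag map; the real hierarchy would come from configuration.
    PARENT_TAGS = {
      "ICE Fluxes" => "ICE",
      "Svalbard"   => "The Arctic"
    }

    # Add any missing parent tags before the document is saved.
    def add_parent_tags(document)
      tags = document["tags"] || []
      parents = tags.map { |tag| PARENT_TAGS[tag] }.compact
      document["tags"] = (tags + parents).uniq
      document
    end

    add_parent_tags({ "tags" => ["ICE Fluxes"] }) # => { "tags" => ["ICE Fluxes", "ICE"] }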

External data publishing, validating, logging

With external data dumped to some sink, like rsync to a disk folder, we need a way to prove after an import that the system actually contains the data it should.

We can inject source metadata into /source and compare it (e.g. the number of documents for that day, hour, or other timeframe) against the actual content in the destination API.
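
One way to do the comparison, assuming the /source metadata records an expected document count per timeframe and the destination count comes from querying the API (the field names and the count lookup are assumptions):

    # Hypothetical /source metadata entry written at import time.
    source_metadata = { "day" => "2014-05-01", "documents" => 1200 }

    # Placeholder: in reality this would query the destination API for that day.
    def destination_count(day)
      1187
    end

    expected = source_metadata["documents"]
    actual   = destination_count(source_metadata["day"])

    if expected == actual
      puts "OK: #{actual} documents for #{source_metadata["day"]}"
    else
      warn "MISMATCH for #{source_metadata["day"]}: expected #{expected}, got #{actual}"
    end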

Support CORS

For configuration-free CORS:

  • All npolar.no origins and all HTTP verbs enabled over https
  • "Access-Control-Allow-Methods: GET, HEAD, OPTIONS" for HTTP

Schema API should switch to file backend

To avoid inconsistency issues, we switch /schema to a file system backed API.

There are a few simple rules to follow:

  0. Naming convention is semantic, like /schema/dataset-1.x.y[.dev].json (where x is the major and y the minor edit number)
  1. Never update a file once published (increment the version instead)
  2. Copy (don't symlink) the latest versioned file into a file without versioning
  3. Always declare explicit versions on the client side
  4. Use the same version in the filename and in the schema's id
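
A small sketch of a check for rule 4, assuming the schema's identifier is an "id" property ending in the versioned filename (the property name and layout are assumptions):

    require "json"

    # True when the version in the filename matches the version in the schema's id,
    # e.g. "dataset-1.2.0.json" and "http://api.npolar.no/schema/dataset-1.2.0.json".
    def version_consistent?(path)
      filename = File.basename(path)
      schema   = JSON.parse(File.read(path))
      schema["id"].to_s.end_with?(filename)
    end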

Schema modifications needed for project and person

Employee and project entries suffer from a potential conflict between timestamps and their related "labels":

  • Employees: NPI Employee and On Leave
  • Projects: State

These labels should ideally be "auto-updated" by the system, deduced from the timestamps. (As opposed to the current solution, which requires manual updating. Alternatively, they could be dropped – they are, after all, redundant. However, they do make very useful filters.)

Example situation:
An employee can be set as "currently employed" by adding a "hired" timestamp, but the NPI Employee field could indicate the opposite until manually updated. The same applies vice versa, and the same type of conflict exists for On Leave, as well as for projects and their state (planned, active, etc.).

The current system effectively forces editors to continuously monitor all these timestamps/statuses and update the entries by hand. Furthermore, such updates must be made at very specific times (the exact times specified in the timestamps).
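
A sketch of the suggested deduction for the employee case, assuming "hired" and "terminated" timestamps and a boolean NPI Employee flag (the field names are assumptions based on the description above):

    require "time"

    # Deduce "currently employed" from timestamps instead of maintaining it by hand.
    def npi_employee?(person, now = Time.now)
      hired      = person["hired"]      && Time.parse(person["hired"])
      terminated = person["terminated"] && Time.parse(person["terminated"])

      return false if hired.nil? || hired > now
      terminated.nil? || terminated > now
    end

    npi_employee?({ "hired" => "2010-08-01T00:00:00Z" }) # => true, until a past "terminated" is added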

Couch storage feed includes _design documents

Description: When requesting a feed from Couch, it returns information on all documents in the database, including the _design documents holding CouchDB views.

Reproduction:

  • Create a view in CouchDB. After saving the view, it should show up in the database.
  • Do a feed query with the API against the database containing the design view.
  • The results will contain one record with _design in the ID.

Affected code: couch.rb
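
A sketch of a fix at the storage layer, assuming the feed is built from CouchDB _all_docs-style rows (the row structure follows CouchDB's response format; where exactly the filter belongs in couch.rb is an assumption):

    # Drop CouchDB design documents from the rows used to build the feed.
    def feed_rows(response)
      (response["rows"] || []).reject { |row| row["id"].to_s.start_with?("_design/") }
    end

    response = { "rows" => [{ "id" => "_design/feed" }, { "id" => "a1b2c3" }] }
    feed_rows(response).map { |row| row["id"] } # => ["a1b2c3"]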

Republish all services

Some services will not survive republishing if the Elasticsearch mapping is out of sync with the data.

Also, we need to kill the old, manually set up indexing rivers.

OAI-PMH provider for /dataset API

Time based scalable bulk operations and endpoints

So far we don't have official URIs for getting storage-layer data out of the API.
We have a couple of hacks, like _feed, but these aren't a scalable solution.

  1. We need official support and persistent URIs for getting data out in bulk
  2. Bulk endpoints should be built as separate middleware, not hacked into the single-document pipeline (see the sketch after this list)
  3. Documentation
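
A rough sketch of point 2 as separate Rack middleware that streams documents for a time window as newline-delimited JSON (the /_bulk path, the from/until parameters, and the fetch_documents helper are all assumptions, not an agreed design):

    require "json"
    require "rack"

    # Hypothetical bulk middleware: streams documents for a time window as NDJSON.
    class BulkExport
      def initialize(app)
        @app = app
      end

      def call(env)
        request = Rack::Request.new(env)
        return @app.call(env) unless request.path.end_with?("/_bulk")

        documents = fetch_documents(request.params["from"], request.params["until"])
        body = Enumerator.new do |yielder|
          documents.each { |doc| yielder << doc.to_json << "\n" }
        end
        [200, { "Content-Type" => "application/x-ndjson" }, body]
      end

      # Placeholder: would read from the storage layer for the given time window.
      def fetch_documents(from, till)
        []
      end
    end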

Updated => Edited

Auto-updating "updated" is sometimes unfortunate: when external data comes in, you lose the original "updated" date.

In AtomPub, "edited" is used for the system-generated edit time, while "updated" can be user-provided. Suggest we move to that.
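
A sketch of that convention as a before-save step (the hook is an assumption; the field semantics follow AtomPub as described above):

    require "time"

    # Always stamp "edited" with the system time; keep "updated" as provided (default it once).
    def stamp_times(document, now = Time.now.utc)
      document["edited"]   = now.iso8601
      document["updated"] ||= now.iso8601
      document
    end

    stamp_times({ "updated" => "2013-06-01T12:00:00Z" })
    # => "updated" stays as provided, "edited" is set to the current system time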

Improve RESTfulness for collection URIs

We need a robust mechanism to GET all documents in a collection.

The API co-evolved from ideas similar to those expressed in http://tools.ietf.org/html/draft-kelly-json-hal-06, with widespread use of hyperlinks between documents. However, at the collection level, we lack links from the collection URI to bundles of documents (similar to a database view). With such links, RESTful clients could follow a simple set of rules to download entire collections.
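
A sketch of such a client, assuming HAL-style links where pages of a collection are chained with a "next" relation and carry a "documents" array (the relation and field names are assumptions):

    require "json"
    require "net/http"
    require "uri"

    # Follow "next" links from a collection URI and gather every document.
    def download_collection(uri)
      documents = []
      while uri
        page = JSON.parse(Net::HTTP.get(URI(uri)))
        documents.concat(page["documents"] || [])
        uri = page.dig("_links", "next", "href")
      end
      documents
    end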

Create Monitoring API

For environmental monitoring timeseries.

Endpoint: /monitoring

Workspace: monitoring

Three document types (collections):

  • timeseries
  • parameter
  • indicator

Support async work flows

One of the major features missing in version 1.0.0 is support for async workflows, like streaming.
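
A minimal sketch of a streaming response in Rack, where the body is an enumerator producing chunks lazily instead of buffering the whole result (what gets streamed here is just an illustration):

    # Mountable with `run streaming_app` in a config.ru.
    streaming_app = lambda do |env|
      body = Enumerator.new do |yielder|
        10.times { |i| yielder << "chunk #{i}\n" }  # e.g. rows fetched one at a time
      end
      [200, { "Content-Type" => "text/plain" }, body]
    end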

Semantic schema versioning

Every client should declare explicit schema versions, and the server should reject (422) documents whose declared schema version is lower than the minimum accepted.
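
A sketch of that server-side check (how the declared version is extracted and the minimum value itself are assumptions); Gem::Version is used only for the semantic comparison:

    require "json"

    MINIMUM_SCHEMA_VERSION = Gem::Version.new("1.2.0") # hypothetical minimum

    # Returns a 422 response when the declared schema version is below the minimum accepted.
    def reject_outdated_schema(declared_version)
      return nil unless Gem::Version.new(declared_version) < MINIMUM_SCHEMA_VERSION

      [422, { "Content-Type" => "application/json" },
       [{ "error" => "schema version #{declared_version} < #{MINIMUM_SCHEMA_VERSION}" }.to_json]]
    end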

Implement data release date (data policy)

From the data policy:

The protection period shall be given a fixed date, and should normally not be set to more than 2 years after data collection has ended. Within the framework of the contract terms and external conditions that apply to the project, the protection period may be made longer if necessary to safeguard the right of first publication. The section leader is responsible for setting the release date and may extend the protection period to up to 4 years after data collection has ended. Extending the protection period beyond 4 years shall only happen in exceptional cases and must be approved by the department director.

When the protection period has expired, or when no protection period has been given a fixed date, data that are fully processed, documented, and stored at the institute will be openly available from data.npolar.no.
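
A sketch of how a release date could gate open access, assuming a "released" date field on the dataset (the field name is an assumption):

    require "date"

    # A dataset is openly available when no release date is set or the date has passed.
    def open_access?(dataset, today = Date.today)
      released = dataset["released"]
      released.nil? || Date.parse(released) <= today
    end

    open_access?({ "released" => "2016-01-01" }, Date.new(2014, 6, 1)) # => false
    open_access?({})                                                   # => true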

Protect against path re-use in the service API.

Currently it is possible to define the same path multiple times in different service documents.

When two service documents with different names use the same path, the API will boot and show the same /endpoint twice; both appear to point to the same API.
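
A sketch of a guard at boot time, assuming each service document exposes its path under a "path" key (how service documents are loaded is an assumption):

    # Fail fast at boot if two service documents claim the same path.
    def assert_unique_paths!(services)
      duplicates = services.group_by { |service| service["path"] }
                           .select { |_path, group| group.size > 1 }
                           .keys
      raise "Duplicate service paths: #{duplicates.join(', ')}" unless duplicates.empty?
    end

    assert_unique_paths!([{ "path" => "/dataset" }, { "path" => "/project" }]) # passes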

JWT authorizer

Create an authorizer middleware that can validate a JWT.
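
A minimal sketch using the ruby-jwt gem, reading a bearer token from the Authorization header (the shared secret, the HS256 algorithm, and where the claims end up in the Rack env are assumptions):

    require "jwt"

    # Rack middleware that rejects requests without a valid JWT bearer token.
    class JwtAuthorizer
      def initialize(app, secret)
        @app, @secret = app, secret
      end

      def call(env)
        token = env["HTTP_AUTHORIZATION"].to_s[/\ABearer (.+)\z/, 1]
        return unauthorized if token.nil?

        claims, _header = JWT.decode(token, @secret, true, { algorithm: "HS256" })
        env["jwt.claims"] = claims
        @app.call(env)
      rescue JWT::DecodeError
        unauthorized
      end

      def unauthorized
        [401, { "Content-Type" => "application/json" }, ['{"error":"invalid or missing token"}']]
      end
    end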
