datahub's Introduction

Datahub

The Datahub is a metadata aggregator. This application allows data providers to aggregate and publish metadata describing objects on the web through a RESTful API leveraging standardized exchange formats.

The Datahub is built with the Symfony framework and MongoDB.

Features

  • A RESTful API which supports:
    • Ingest and retrieval of individual metadata records.
    • Validation of ingested records against XSD schemas.
    • OAuth to restrict access to the API.
  • An OAI-PMH endpoint for harvesting metadata records.
  • Includes support for LIDO XML but can be extended to include MARC XML, Dublin Core or other formats.

Requirements

This project requires the following dependencies:

  • PHP = 5.6.* or 7.0.*
    • With the php-cli, php-intl, php-mbstring and php-mcrypt extensions.
    • The PECL Mongo (PHP5) or PECL Mongodb (PHP7) extension. Note that the mongodb extension must be version 1.2.0 or higher. Notably, the package included in Ubuntu 16.04 (php-mongodb) is only at 1.1.5.
  • MongoDB >= 3.2.10

Install

Via Git:

$ git clone https://github.com/thedatahub/Datahub.git datahub
$ cd datahub
$ composer install # Composer will ask you to fill in any missing parameters before it continues

You will be asked to configure the connection to your MongoDB database. You will need to provide these details:

  • The connection to your MongoDB instance (e.g. mongodb://127.0.0.1:27017)
  • The username of the user (e.g. datahub)
  • The password of the user
  • The database where your data will persist (e.g. datahub)

Before you install, ensure that you have a running MongoDB instance and that you have created a user with the right permissions. From the Mongo shell (https://docs.mongodb.com/getting-started/shell/client/), run these commands to create the required artefacts in MongoDB:

> use datahub
> db.createUser(
   {
     user: "datahub",
     pwd: "password",
     roles: [ "readWrite", "dbAdmin" ]
   }
)
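The four values requested during installation map onto a standard MongoDB connection URI. The sketch below, written in Python purely for illustration, shows how a client script might assemble one. The function name build_mongodb_uri is hypothetical; the Datahub itself reads these values from parameters.yml rather than from code like this.

```python
from urllib.parse import quote_plus, urlsplit


def build_mongodb_uri(server: str, username: str, password: str, database: str) -> str:
    """Combine the installer's four answers into one MongoDB connection URI."""
    parts = urlsplit(server)  # e.g. mongodb://127.0.0.1:27017
    # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    return f"{parts.scheme}://{credentials}@{parts.netloc}/{database}"


print(build_mongodb_uri("mongodb://127.0.0.1:27017", "datahub", "p@ssword", "datahub"))
```

Encoding the credentials matters as soon as the password contains reserved characters; an unencoded '@' would otherwise be read as the start of the host name.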

The configuration parameters will be stored in app/config/parameters.yml.
You'll need to run an initial one-time setup script, which will scaffold the database structure, generate CSS assets and create the application 'admin' user.

$ app/console app:setup
$ app/console doctrine:mongodb:fixtures:load --append

If you want to run the datahub for testing or development purposes, execute this command:

$ app/console server:run

Open a browser and navigate to http://127.0.0.1:8000. You should now see the welcome screen.

Refer to the Symfony setup documentation to complete your installation using a fully featured web server to make your installation operational in a production environment.

Usage

Credentials

The application is installed with the default username admin and the default password datahub. Changing these credentials is highly recommended.

The REST API

The REST API is available at api/v1/data. Documentation about the available API methods can be found at /docs/api.

POST and PUT actions

The PUT and POST actions expect an XML-formatted body in the HTTP request. The Content-Type HTTP request header also needs to be set accordingly. Currently, only application/lido+xml is supported. Finally, you will need to add a valid OAuth token via the access_token query parameter.

A valid POST HTTP request looks like this:

POST /api/v1/data?access_token=MThmYWMxMjFlZWZmYjVmZDU2NDNmZWIzYTE0YmNiYTk3YTc5ODJmMWJjOGI1MjE5MWY4ZjEyZWZlZmM2ZmZmNg HTTP/1.1
Host: example.org
Content-Type: application/lido+xml
Cache-Control: no-cache

<?xml version="1.0" encoding="UTF-8"?>
<lido:lido xmlns:lido="http://www.lido-schema.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.lido-schema.org http://www.lido-schema.org/schema/v1.0/lido-v1.0.xsd">
	<lido:lidoRecID lido:source="Deutsches Dokumentationszentrum für Kunstgeschichte - Bildarchiv Foto Marburg" lido:type="local">DE-Mb112/lido-obj00154983</lido:lidoRecID>
	<lido:category>
...
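The same request can be prepared from a script. This is a minimal sketch in Python using only the standard library; the helper name build_post_request, the base URL and the placeholder token are assumptions made for illustration, and the request is only constructed, not sent.

```python
import urllib.request
from urllib.parse import urlencode


def build_post_request(base_url: str, access_token: str, lido_xml: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying one LIDO record."""
    url = f"{base_url}/api/v1/data?{urlencode({'access_token': access_token})}"
    return urllib.request.Request(
        url,
        data=lido_xml.encode("utf-8"),  # the raw XML record is the whole body
        headers={"Content-Type": "application/lido+xml"},
        method="POST",
    )


request = build_post_request("http://localhost:8000", "TOKEN", "<lido:lido/>")
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Keeping construction separate from sending makes it easy to inspect the URL and headers before any record reaches the server.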

GET actions

Sending a GET HTTP request to the api/v1/data endpoint will return a paginated list of all the records available in the API. The endpoint will return an HTTP response with a JSON formatted body. The endpoint respects the HATEOAS constraint.

Content negotiation is currently only supported via a file extension on individual resource URLs. Negotiation via the HTTP Accept header is on the roadmap.

GET api/v1/data               # only JSON supported
GET api/v1/data/objectPID     # return JSON
GET api/v1/data/objectPID.xml # return XML
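Because the format is chosen through the file extension, a client only needs to vary the end of the URL. The following Python sketch shows one way to do that; record_url is a hypothetical helper, and percent-encoding the identifier is an assumption about how identifiers containing reserved characters should be passed, not something the Datahub documentation confirms.

```python
from urllib.parse import quote


def record_url(base_url: str, object_pid: str, fmt: str = "json") -> str:
    """Return the URL of one record; XML is selected via a file extension."""
    if fmt not in ("json", "xml"):
        raise ValueError(f"unsupported format: {fmt}")
    suffix = ".xml" if fmt == "xml" else ""
    # Assumption: reserved characters in the identifier are percent-encoded.
    return f"{base_url}/api/v1/data/{quote(object_pid, safe='')}{suffix}"


print(record_url("http://localhost:8000", "objectPID"))
print(record_url("http://localhost:8000", "objectPID", "xml"))
```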

The OAI endpoint

The datahub supports the OAI-PMH protocol. The endpoint is available via the /oai path.

GET oai/?metadataPrefix=oai_lido&verb=ListIdentifiers
GET oai/?metadataPrefix=oai_lido&verb=ListSets
GET oai/?metadataPrefix=oai_lido&verb=ListRecords
GET oai/?metadataPrefix=oai_lido&verb=ListRecords&set=creator:brueghel_pieter_ii
GET oai/?metadataPrefix=oai_lido&verb=GetRecord&identifier=objectPID
GET oai/?metadataPrefix=oai_lido&verb=ListIdentifiers&from=2017-06-29T05:22:30Z&until=2017-07-14T04:22:30Z
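Requests like these are easy to mistype by hand, for instance by repeating an argument. A small Python helper can build them from named arguments instead; oai_url and the base URL are hypothetical, and the sketch only assembles the URL without contacting a server.

```python
from urllib.parse import urlencode


def oai_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL: the verb first, then the other arguments."""
    # A dict cannot hold the same key twice, so no argument can be duplicated.
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}/oai/?{query}"


print(oai_url("http://localhost:8000", "ListRecords",
              metadataPrefix="oai_lido", set="creator:brueghel_pieter_ii"))

# 'from' is a reserved word in Python, so it has to be passed via a dict.
print(oai_url("http://localhost:8000", "ListIdentifiers",
              **{"metadataPrefix": "oai_lido", "from": "2017-06-29T05:22:30Z"}))
```

urlencode percent-encodes the colons in set names and timestamps, which is equivalent to the unencoded form shown in the examples.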

The datahub implements grouping of records into sets, but no soft deletes. As such, the OAI endpoint doesn't indicate whether a record has been deleted.

OAuth support and security

The datahub API can be set up to be either a public or a private API. The public_api_method_access parameter in parameters.yml allows you to configure which parts of the API are public or private:

    # Setting this to some unknown value like [FOO] disables public api access
    # Leaving this option empty [] means allowing all methods for anonymous access
    # public_api_method_access: [FOO]
    public_api_method_access: [GET]
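The rules described in the comments can be restated as a small decision function. This Python sketch only mirrors those comments; it is not the Datahub's actual access check, the name is_public is hypothetical, and the case-insensitive comparison is an assumption.

```python
def is_public(method: str, public_api_method_access: list[str]) -> bool:
    """Decide whether an HTTP method may be used without an access token."""
    if not public_api_method_access:
        return True  # an empty list opens every method to anonymous clients
    allowed = {entry.upper() for entry in public_api_method_access}
    return method.upper() in allowed


print(is_public("GET", ["GET"]))   # True: GET is listed
print(is_public("POST", ["GET"]))  # False: POST still needs a token
print(is_public("GET", ["FOO"]))   # False: no real method matches FOO
print(is_public("DELETE", []))     # True: empty list allows everything
```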

The datahub requires OAuth authentication to ingest or retrieve metadata records through the non-public parts of the API. The administrator has to issue a user account with a client_id and a client_secret to individual users or client applications. Before clients can access the API, they have to request an access token:

curl 'http://localhost:8000/oauth/v2/token?grant_type=password&username=admin&password=datahub&client_id=slightlylesssecretpublicid&client_secret=supersecretsecretphrase'

Example output:

{
    "access_token": "ZDIyMGFiZGZkZWUzY2FjMmY4YzNmYjU0ODZmYmQ2ZGM0NjZiZjBhM2Q0Y2ZjMGNiMjc0ZWIyMmYyODMzMGJjZg",
    "expires_in": 3600,
    "token_type": "bearer",
    "scope": "internal web external",
    "refresh_token":  "MzhkYzY0MzMxM2FmNmQyODhiOWM4YzEzZjI3YzViZjg3ZThlMTA2YWY4ZTc2YjUwYzgxNzVhNTlmYTBkYWZhNQ"
}
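A client typically builds the token URL from its credentials and then keeps only a few fields of the response. The Python sketch below shows both steps with the standard library; token_url and parse_token_response are hypothetical helper names, and no request is actually sent.

```python
import json
from urllib.parse import urlencode


def token_url(base_url: str, username: str, password: str,
              client_id: str, client_secret: str) -> str:
    """Build the password-grant token URL used in the curl example."""
    query = urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
        "client_id": client_id,
        "client_secret": client_secret,
    })
    return f"{base_url}/oauth/v2/token?{query}"


def parse_token_response(body: str) -> dict:
    """Keep only the fields a client needs from the JSON token response."""
    payload = json.loads(body)
    return {
        "access_token": payload["access_token"],
        "refresh_token": payload["refresh_token"],
        "expires_in": int(payload["expires_in"]),
    }


print(token_url("http://localhost:8000", "admin", "datahub", "id", "secret"))
```

Note that credentials placed in a query string can end up in server and proxy logs, so this form is best kept to local testing.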

The OAuth endpoint can also be used to revoke both access and refresh tokens.

curl 'http://localhost:8000/oauth/v2/revoke?token=ZDIyMGFiZGZkZWUzY2FjMmY4YzNmYjU0ODZmYmQ2ZGM0NjZiZjBhM2Q0Y2ZjMGNiMjc0ZWIyMmYyODMzMGJjZg'

Example output:

{
    "result": "success",
    "message": "The token has been revoked."
}

Change log

Please see CHANGELOG for more information on what has changed recently.

Testing

Testing will require a MongoDB instance, as well as Catmandu installed. You can either take care of this yourself, or run the tests using the provided Docker container.

Please ensure you've taken care of the initial setup described above before attempting to run the tests.

Running tests:

./scripts/run_tests

Running tests using Docker:

./scripts/run_tests_docker

Front end development

Front end workflows are managed via yarn and webpack-encore.

The layout is based on Bootstrap 3.3 and managed via sass. The code can be found under app/resources/public/sass.

JavaScript files can be found under app/resources/public/js. Dependencies are managed via yarn. Add vendor modules using require.

Files are built and stored in web/build and included in app/views/app/base.html.twig via the asset() function.

The workflow configuration can be found in webpack.config.js.

Get started:

# Install all dependencies
$ yarn install
# Build everything in development
$ yarn run encore dev
# Watch files and build automatically
$ yarn run encore dev --watch
# Build for production
$ yarn run encore production

Contributing

Please see CONTRIBUTING for details.

Authors

All Contributors

Copyright and license

The Datahub is copyright (c) 2016 by Vlaamse Kunstcollectie vzw and PACKED vzw.

This is free software; you can redistribute it and/or modify it under the terms of the GPLv3 License (GPL). Please see License File for more information.

datahub's People

Contributors

bert-packed, dependabot[bot], hero-solutions, kalmanolah, nassia, netsensei, pieterdp, robwyse, tomvda

datahub's Issues

Move bundle services to app/config/services.yml

Detailed description

The app is split into a few core bundles that each manage their own services.yml with a custom DependencyInjection class. Since we don't plan to release those bundles as separate, standalone packages, it's overkill to spread all the service definitions across bundles.

Instead, we should centralise them in app/config/services.yml.

Context

This is important because it improves maintainability. Additionally, this would allow us to use Auto Wiring for the application logic.

Possible implementation

  • Consolidate all the services.yml files into app/config/services.yml
  • Remove the DependencyInjection / APIExtension classes from the bundles

We should be aware of issues with namespacing and renaming bundles and services. See #51.

[Users] Integrate FOS UserBundle profile support

Detailed description

The Datahub uses FOS User Bundle, which comes with a /profile route (and subroutes) that allows the currently logged-in user to change their own email address and password. Currently, the interface doesn't link to those paths, and the forms and views on those paths aren't themed.

It would be a good, user-friendly enhancement to include them in the Datahub.

Context

  • User-friendly way of changing your own profile details, password, etc.
  • Extensible to include other features we might want to include in the long term future.

MongoException 'invalid name' when updating/deleting records

Updating/deleting records results in a MongoException complaining about an invalid name.

Error message

request.CRITICAL: Uncaught PHP Exception MongoException: "MongoDB::__construct(): invalid name " at /vagrant/vendor/doctrine/mongodb/lib/Doctrine/MongoDB/Database.php line 579 {"exception":"[object] (MongoException(code: 2): MongoDB::__construct(): invalid name  at /vagrant/vendor/doctrine/mongodb/lib/Doctrine/MongoDB/Database.php:579)"} []

Dashboard: indications of the contents of an instance

When a user navigates to the dashboard, there should be an indication or notification of the active format (LIDO, Dublin Core, MODS, etc.). This allows the user to anticipate the format of the data stored in the hub and made available through the API.

Preferably the notification displays this information:

  • The canonical name of the format (e.g. LIDO, Dublin Core)
  • A link to the homepage of the format (e.g. http://lido-schema.org)
  • A link to the Wikipedia page of the format with more information

These should be stored in the PHP classes that define the format types.

The dashboard should also feature a short overall description of the data it contains, e.g. "Archival records stored in the City Archives" or "Library records from the University of Foo Bar" or "Miscellaneous blog posts aggregated from several sources" (Dublin Core), etc.

This should be part of the general configuration in parameters.yml

Support format: MODS

From Wikipedia:

The Metadata Object Description Schema (MODS) is an XML-based bibliographic description schema developed by the United States Library of Congress' Network Development and Standards Office. MODS was designed as a compromise between the complexity of the MARC format used by libraries and the extreme simplicity of Dublin Core metadata.

See: http://www.loc.gov/standards/mods/

Todo:

  • Check the copyright status of the XSD schemas
  • Register the format
  • Add the XSD schemas
  • Add functional tests for MODS XML.

[meta] Unit tests should reflect and cover the current API contract

The unit tests were written against an older API contract. The POST /v1/data call in DataControllerTest.php does not reflect the newer structure.

In the old contract, the body of the HTTP request is a JSON object with 2 parameters: "format" and "data". The new API call should just receive a serialized record as is (e.g. a LIDO record) and have the format specified as an HTTP "Content-Type" request header. The datahub instance itself has a global parameter which defines the type of format which will be accepted.

Unit tests should also test both valid and invalid input.

  • A test to assert if the API call returns an error if JSON was POST'ed when XML is expected by the datahub
  • A test to assert if a GET call returns a valid response body.
  • A test to assert if a GET call with an invalid id returns a correct error.
  • ....

This is a meta issue. Refer to this one through specific issues aimed at changing / fixing concrete unit tests.

[Users] Missing form fields for newly created user

When creating a new user - and editing this same user - several important fields are missing:

  • Password
  • Status

=> When you create a new user - or edit afterwards - it is not possible to set the password.
=> When you create a new user - or edit afterwards - it is not possible to set the status to 'enabled'.

Move to Bootstrap 4.0

Detailed description

Bootstrap 3.3.7 is currently used to drive the (primitive) interface. However, Bootstrap 4.0 is the new stable version. Since the interface is still fairly uncomplicated, now's the chance to upgrade to Bootstrap 4.0.

Context

This change is important to ensure the longevity and the maintenance of the project. While Bootstrap 3.3.7 certainly has its merits, it's better to go with the latest major stable release while there's still a window of opportunity to do so.

If we stay with 3.3.7 and develop a more advanced interface - pending certain requests - it will be harder to move to 4.x and later major versions.

Possible implementation

  • Switch out bootstrap-sass for the bootstrap package
  • Rebuild the templates using bootstrap 4.0 templates.

The latter change is due anyhow since the original templates were conceived within a very limited scope.

Fatal error FOS\OAuthServerBundle\Storage\OAuthStorage::__construct()

Problem

After installation, this error will pop up:

ContextErrorException in OAuthStorage.php line 80:
Catchable Fatal Error: Argument 5 passed to FOS\OAuthServerBundle\Storage\OAuthStorage::__construct() must implement interface Symfony\Component\Security\Core\User\UserProviderInterface, instance of FOS\UserBundle\Doctrine\UserManager given, called in /vagrant/app/cache/dev/appDevDebugProjectContainer.php on line 2171 and defined

Cause

fos_user.user_manager became deprecated in favor of fos_user.user_provider.

This was fixed in FriendsOfSymfony/FOSOAuthServerBundle#26

Solution

config.yml:

    service:
        user_provider: fos_user.user_provider.username

Deprecate the "navigation" sidebar in the User section

Detailed description

The "Users" section under /user now has this left hand "Navigation" sidebar with just one item: "Users" which links to /user/users. The navigation sidebar is confusing, not really needed and should be deprecated / removed.

Context

This sidebar is only needed because of the convoluted /user/users routing. When the routing gets fixed in #41, the navigation sidebar can be deprecated, making the interface less complicated and more straightforward to use.

[Users] To be added: functional tests for user & client administration

We need to add functional tests for:

  • The login form
  • Logging out
  • Adding users
  • Editing users (Changing password, email, setting status)
  • Deleting users
  • Adding clients
  • Editing clients (changing clientID, redirectURIs and secretpassphrase)
  • Deleting clients
  • Revoking access tokens
  • Revoking refresh tokens
  • Revoking connected applications
  • Revoking connected authentication codes

This is required to avoid inadvertently breaking existing functionality during development.

README.md is missing some required packages

In README, some required packages and caveats are missing:

  • php-mbstring is required
  • For php7, mongodb (and not mongo) must be installed at a version equal to or above 1.2.0.

Unit test of POST request uses body format

The POST request test uses the old JSON body format to push a record to the Datahub. Like this:

list($crawler, $response, $data) = $this->apiRequest('POST', '/v1/data', [
    'format' => $dataConverterId,
    'data'   => $testData,
]);

Later on, we decided to abandon this approach and go for a new way of POST'ing data to the API:

  • The body would contain the entire un-encapsulated record. Serialised in a target format (LIDO, MARC,...)
  • The type of data is defined in an HTTP request header.
  • The datahub configuration has a parameter which defines which data format is accepted.

TODO

  • The unit test for the POST request should be adapted to reflect the new changes
  • The apiRequest() method in OAuthTestCase currently only allows for posting JSON data and needs to be expanded / adapted.

OAI-PMH sets

The OAI-PMH protocol allows harvesting of "sets" of records. These are subsets of records in a repository. If the Datahub allows assigning records into sets, partial harvesting would also become possible.

When data is aggregated from multiple data providers (different institutions, etc.), it's not possible to harvest the data from one particular data provider without harvesting everything else. The onus of selecting the relevant records is completely on the client, while harvesting everything means expending extra resources on the side of the Datahub. Subsets would solve this use case.

This is a new feature which needs to be designed / worked out.

Questions:

  • How do we model subsets on the database level? MongoDB isn't a relational database.
  • What is the functional impact of extending the administration GUI to allow managers / administrators to manage sets?
  • Can we just provide an extra "set" resource as part of the REST API? Would that be enough?

Remove "Hello, world!" from dashboard

Replace current "Hello, World" message with a more suitable message. Short term idea: just display the total number of records stored in the database.

[Unit testing] Move internal bundle tests to the root tests folder

Detailed description

Currently, all the tests are spread out over the various internal bundles (a tests folder per bundle). It would be better to move the test classes to the more general tests folder which lives in the root of the Symfony project.

Context

This change is important because:

  • There's no need to separate tests into bundles since we do not intend to release the core bundles as standalone packages.
  • It would allow us to consolidate all the fixtures we use for testing (currently XML files all over the place)
  • Less need for all kinds of bash scripts to run tests across folders.
  • Simpler for contributors who are willing to write tests.

Possible implementation

  • Move tests to the tests folder but following the namespacing and directory structure of the bundles.
  • Consolidate the fixtures in a separate fixtures folder. Consider removing them altogether if possible.
  • Configure phpunit.xml properly.
  • Refactor the bash scripts in the scripts folder
  • Refactor .travis.yml accordingly
  • Document how tests should be run

Deprecate the SharedBundle bundle

Detailed description

The SharedBundle bundle contains a lot of "general" code which is called in the other bundles such as traits, templates, functions. It also contains the DefaultController which has the Dashboard action. And finally, it has the app:setup command and a few fixtures.

Context

The bundle itself doesn't serve a real purpose. It's just something other bundles depend upon. As such, it violates the Single Responsibility Principle. Deprecating the bundle would vastly improve maintainability and testability of code.

Possible implementation

Breaks down in these tasks:

  • Remove SharedBundle\Helper\SerializationHelper
  • Move EnvironmentSpecificDataFixture to UserBundle & OAuthBundle
  • Move DefaultController to DashboardBundle
  • Deprecate Traits and integrate them in bundles
  • Move Resources\views to App\Templates
  • Move views to App\templates (or Assets?)

Generate the 'admin' user on installation runtime

Currently, the administrator is loaded as a fixture via a separate console command. This is unsafe since a password needs to be reset manually after initial installation; an action which can be easily forgotten.

Generating the administrator user + password during installation runtime circumvents this.

Support Format: MARC XML

MARCXML is an XML schema based on the common MARC21 standards. MARCXML was developed by the Library of Congress and adopted by it and others as a means of facilitating the sharing of, and networked access to, bibliographic information. Being easy to parse by various systems allows it to be used as an aggregation format, as it is in software packages such as MetaLib, though that package merges it into a wider DTD specification.

See: https://en.wikipedia.org/wiki/MARC_standards
See: http://www.loc.gov/standards/marcxml/

Supporting MARC(XML) will open up the use case for Libraries to leverage the Datahub as an aggregator.

OAuth access token lifetime is too short

Problem:

A bulk ingest of records in the Datahub will fail if it takes longer than 1 hour to complete. It will fail with this error:

Response code: 401
Response headers:
Response body:
	{"error":"invalid_grant","error_description":"The access token provided has expired."}
2016/12/13 13:13:16 [80941] - ERROR main::catch {...}  bin/dh-factory.pl (80) time=3743487 : Error while adding item S-54: 401 Unauthorized

Cause

The access token lifetime is too short. By default, this is set to 3600 seconds by FOSOAuthServerBundle.

Solution

Try and make changes in the config.yml per the documentation
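Independently of the server-side lifetime, an ingest client could avoid the 401 by renewing its token shortly before it expires. The Python sketch below illustrates that idea as a client-side workaround; it is not part of the Datahub, TokenCache is a hypothetical name, and the fetch callback stands in for whatever call obtains a new token and its expires_in value.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Hand out a valid access token, fetching a new one shortly before expiry."""

    def __init__(self, fetch: Callable[[], tuple[str, int]],
                 margin: int = 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # returns (access_token, expires_in seconds)
        self._margin = margin    # renew this many seconds before expiry
        self._clock = clock      # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, renewing it when it is missing or near expiry."""
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, lifetime = self._fetch()
            self._expires_at = self._clock() + lifetime
        return self._token
```

A bulk ingest would call get() before each request instead of reusing one token for the whole run, so a job that takes longer than an hour keeps working with the default lifetime.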

PHP 7.2 testing fails on Travis

PHP 7.2 builds on Travis fail with authentication errors for MongoDB. We need to research what goes wrong and how it can be fixed.

Symfony 4.0

Symfony 4.0 was released on November 30th. Right now, the Datahub is built on top of Symfony 2.8.x. We should research if switching to 4.0 is a good proposition.

  • Security?
  • Performance?
  • Maintainability?
  • Long term support of Symfony 2.x?

Installation via an installer

Just like WordPress or Drupal, installation should be as easy as dropping the code onto a hosting platform: when you point the browser to the homepage, you get an installer wizard.

  • The wizard guides you through setting up all the options and generates a working parameters.yml.
  • The wizard should create an administrator user with a randomly generated strong password.

Once installed, you land on the homepage of a fully functional Datahub.

This assumes that:

  • We don't run composer install or app/console anymore from the CLI.

Refactor frontend tooling

The installation of Bootstrap seems to be unnecessary. If so, it can be deleted from the composer.json file.

Move templates to the app/templates folder

Detailed description

Currently, most templates live in the src/Datahub/<bundle> folders. However, these bundles are a core part of the application and the templates are thus available sitewide. Moreover, there's no intention to publish the bundles as separate, independent projects.

Per the documentation:

templates/
The application's views directory can contain application-wide base templates (i.e. your application's layouts and templates of the application bundle) as well as templates that override third party bundle templates (see How to Override Templates from Third-Party Bundles).
vendor/path/to/CoolBundle/Resources/views/
Each third party bundle houses its templates in its Resources/views/ directory (and subdirectories). When you plan to share your bundle, you should put the templates in the bundle instead of the templates/ directory.

We should move / restructure / consolidate the templates of the bundles into the templates/ folder.

Context

Right now, the templates are spread out across different bundles. Moreover, they reference each other, creating all kinds of weird nesting issues. This is particularly noticeable in the UsersBundle, which references templates in the SharedBundle while overriding templates in the FOSUser bundle.

Cleaning this up would greatly enhance maintainability and durability.

Add support for CS fixer

Detailed description

Add support for PHP Code Standards Sniffer. The idea is to perform linting of the app & src code against the PSR standards to pick up on potential optimizations.

This is an enhancement.

Context

Linting code improves readability and maintainability, and also helps pick up potentially hidden bugs which are tedious to fix.

Add MongoDB authentication

Currently, authentication at the level of mongodb databases isn't supported. You need to disable authentication on the entire mongodb instance in order to connect and perform actions on the databases.

To be added: support for database connection authentication. This should also include supporting authentication for tests since running PHPUnit tests will create a new datahub-test database.

Alternatively, we need to rethink how tests are performed since creating a separate database isn't always possible or even allowed.

[Meta] Support multiple data types in a single instance

It should be possible to store records in different data types (EAD, LIDO) in a single Datahub installation.

Version 1.x doesn't support multiple data types. A Datahub instance only supports one data type. If you want to aggregate and disseminate data from multiple sources / formats, then you need to install multiple instances, e.g. an instance for LIDO and an instance for EAD.

Motivation

  • Organisations don't store and manage one type of data (museal, archival, library,...) but multiple types.
  • Installing multiple instances implies extra costs and might be cumbersome (hosting, maintenance, security, workflows & processes,...)

Discussion / Impact

The APIs (OAI/REST) need to be able to differentiate between formats (EAD, LIDO,...). More accurately, the APIs need to differentiate between the data types of the records, or even between different types of collections of records independent of their formatting. Here, a collection references a specific context (e.g. records describing all digitised photos created by an organisation, the archival holdings of James Joyce, sheet music written by a composer, archived medical records from a hospital, etc.).

Content negotiation allows the same record to be delivered in different formats & data models. Since Dublin Core is a general purpose format / model, the same metadata record could be delivered as DC and LIDO. But it does not make sense to model the same record as LIDO and EAD since a record is either an archival description or an object record. Not both. Context matters. Formats such as MODS linger between the specificity of MARC (library data) and the general purpose flexibility of DC. It would be possible to describe museal objects in both MODS and LIDO since both models lend themselves to that end. But equally, we could store two discrete collections of records in the hub: MODS formatted library data & LIDO museal data.

The core principle of the Datahub is to avoid any internal data modelling (transformations, mappings, editing,...) of the records themselves though, since those add complexity and assumptions that decrease flexibility. Data modelling is always deferred as an external responsibility. This implies that records might have a DC, MODS or LIDO representation, or not. This makes formal content negotiation, where every record always has multiple representations, less feasible to implement.

Collections, on the other hand, could potentially be modelled through dynamic URL paths within both the REST and OAI endpoints. The hub would feature one REST API featuring collections, and multiple OAI endpoints (one per collection). Management would happen through the administrative interface.

This would also impact the OAuth implementation. Currently, protecting access to the data is all or nothing. But access is also a context driven aspect. One could differentiate between "digitized photos either accessible to the public or kept private because of copyright".

Manage navigation with KnpMenuBundle

Detailed description

Right now, navigation is hard wired throughout different templates. The topnav.html.twig template contains presentation and business logic to render the menu. Depending on the context (logged in/ logged out) different menu items are shown. Then there are the sidebar menu items on "users", "clients" and "tokens".

The result is hard to understand code which is difficult to maintain.

We should move to KnpMenuBundle for building and managing menus.

Context

  • Separates business from presentational layer which improves maintainability vastly.
  • Introduces flexibility and room to expand how the menu tree is generated while adhering to the SOLID principles.

Possible implementation

Follow the implementation guide to start.

This issue currently depends on #50 which deprecates much of the SharedBundle folder and turns it into a dedicated CoreBundle instead.

Invalid argument to ::getRecords() when issuing a PUT request

Issuing a PUT request causes a Fatal Error with an invalid argument to ::getRecords().

Error message

request.CRITICAL: Uncaught PHP Exception Symfony\Component\Debug\Exception\ContextErrorException: "Catchable Fatal Error: Argument 1 passed to DataHub\ResourceBundle\Data\Converter\CatmanduLidoXMLDataConverter::getRecords() must be of the type array, none given, called in /vagrant/src/DataHub/ResourceAPIBundle/Controller/DataController.php on line 282 and defined" at /vagrant/src/DataHub/ResourceBundle/Data/Converter/CatmanduLidoXMLDataConverter.php line 35 {"exception":"[object] (Symfony\\Component\\Debug\\Exception\\ContextErrorException(code: 0): Catchable Fatal Error: Argument 1 passed to DataHub\\ResourceBundle\\Data\\Converter\\CatmanduLidoXMLDataConverter::getRecords() must be of the type array, none given, called in /vagrant/src/DataHub/ResourceAPIBundle/Controller/DataController.php on line 282 and defined at /vagrant/src/DataHub/ResourceBundle/Data/Converter/CatmanduLidoXMLDataConverter.php:35)"} []

When an entity (user, client, token) is edited, a flash message should appear

Detailed description

When editing or updating an entity as an administrator (user, client,...) you don't get a message indicating whether or not the edit was successful. This leaves the administrator in the dark as to whether things went as expected.

When a form is submitted, it should always show a proper flash message reporting whether an action went as expected.

Context

Usability issue.

If no records are present, api/v1/data will trigger an error

The JSON API currently returns an error when requesting GET api/v1/data while there are no records present.

{
  "code": 500,
  "message": "You have requested a non-existent service \"serializer\". Did you mean one of these: \"fos_rest.serializer\", \"fos_rest.serializer.exception_normalizer.jms\", \"hateoas.serializer.exclusion_manager\", \"jms_serializer\", \"jms_serializer.accessor_strategy\", \"jms_serializer.array_collection_handler\", \"jms_serializer.constraint_violation_handler\", \"jms_serializer.datetime_handler\", \"jms_serializer.deserialization_context_factory\", \"jms_serializer.doctrine_proxy_subscriber\", \"jms_serializer.expression_evaluator\", \"jms_serializer.form_error_handler\", \"jms_serializer.handler_registry\", \"jms_serializer.json_deserialization_visitor\", \"jms_serializer.json_serialization_visitor\", \"jms_serializer.metadata_driver\", \"jms_serializer.naming_strategy\", \"jms_serializer.php_collection_handler\", \"jms_serializer.serialization_context_factory\", \"jms_serializer.stopwatch_subscriber\", \"jms_serializer.templating.helper.serializer\", \"jms_serializer.xml_deserialization_visitor\", \"jms_serializer.xml_serialization_visitor\", \"jms_serializer.yaml_serialization_visitor\", \"fos_rest.serializer.jms_handler_registry\", \"jms_serializer.metadata.file_locator\", \"jms_serializer.metadata_factory\"?"
}

Expected: if no records are present, an error should be thrown with an appropriate HTTP status message.
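A minimal sketch of the expected behaviour, written as plain PHP so it runs without Symfony. The function name `buildListResponse` and the choice of a 404 status for an empty collection are assumptions for illustration; the issue does not say which status code is wanted.

```php
<?php
// Hypothetical helper, not part of the Datahub codebase: maps a list of
// records to a status code and a JSON body.
function buildListResponse(array $records)
{
    if (count($records) === 0) {
        // Assumption: an empty collection is reported as 404 with a clear message.
        return [
            'status' => 404,
            'body' => json_encode(['code' => 404, 'message' => 'No records found.']),
        ];
    }

    return [
        'status' => 200,
        'body' => json_encode(['count' => count($records), 'records' => array_values($records)]),
    ];
}

$empty = buildListResponse([]);
echo $empty['status'], ' ', $empty['body'], PHP_EOL;
```

The point is only that the empty case is handled explicitly before any serialization happens, so the client sees a deliberate status instead of a 500.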

Unit tests all pass, but we know there are errors

When you run ./scripts/run_tests the unit tests will all pass, but we know that the API does throw PHP and other types of errors.

Cause:

The "LIDOXML" array key in $testDataList should be lower case in DataControllerTest:

        $testDataList = [
            'LIDOXML' => [
                file_get_contents(__DIR__.'/../Resources/LidoXML/LIDO-Example_FMobj00154983-LaPrimavera.xml'),
                file_get_contents(__DIR__.'/../Resources/LidoXML/LIDO-Example_FMobj20344012-Fontana_del_Moro.xml'),
            ],
        ];
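To illustrate why the case of the key matters: PHP string keys are case-sensitive. The sketch below assumes the test looks fixtures up by a lower-case format name; that lookup is an assumption, since the issue does not show how `$testDataList` is consumed.

```php
<?php
// 'LIDOXML' and 'lidoxml' are two different array keys in PHP.
$testDataList = [
    'LIDOXML' => ['<lido:lido/>'], // placeholder payload, not a real LIDO record
];

// Hypothetical lookup by a lower-case format name.
$format = 'lidoxml';
$records = isset($testDataList[$format]) ? $testDataList[$format] : [];

// The loop body never runs, so an assertion placed inside it could never fail.
$checked = 0;
foreach ($records as $record) {
    $checked++;
}

echo "records checked: {$checked}", PHP_EOL; // prints "records checked: 0"
```

Under that assumption, a test that iterates over the looked-up fixtures would pass without exercising the API at all.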

Remove Nelmio documentation support

Detailed description

Documentation for the REST API at /api/docs is generated through NelmioApiDocBundle. Right now, the REST API is extremely straightforward (1 controller, 5 actions).

We should remove this bundle and replace it with a self maintained static documentation page.

Context

The complexity of wiring up this bundle, generating the HTML documentation and then wiring up the generated documentation in the UI currently outweighs the benefit it creates: a documented API.

Removal would reduce the number of dependencies that need to be maintained and make the package more robust.

Possible implementation

  • Remove the NelmioApiDocBundle package
  • Remove references in the controllers and AppKernel.php
  • Create a new static page.
  • Manually document the REST API through static HTML.

Disentangle nested configuration files

Detailed description

The config_*.yml files are currently nested into each other.

  • There's a main config.yml file which is included in config_test.yml and config_dev.yml
  • config_dev.yml is loaded in config_test.yml
  • There's a main security.yml file which is loaded in config.yml
  • There's a routing.yml and a routing_dev.yml
  • etc.

This setup is brittle, confusing and error-prone. For instance, creating a separate security_test.yml and including it in config_test.yml doesn't correctly override the main firewall in security.yml higher up the include chain.

Context

This is important to allow clean functional and unit testing. It also matters because isolating concerns leaves less room for errors and bugs.

Possible implementation

  • The configuration files should be completely split up, and tailored to their specific environments: test, dev and production.
  • The configuration files need to be cleaned up and documented.

Fix the users controller routing

Detailed description

The users controller uses this routing:

/user/users
/user/users/new
/user/users//show

This is confusing and complicated. The routing should be made simpler:

/user
/user/new
/user/
/user//edit
/user//delete

Context

Refactoring the routing paths would simplify the way users are added and make it more understandable. Also, the Users section has a second-level "navigation" sidebar with just one menu item, "users", which leads to the second level (user/users). This sidebar could be deprecated, making the interface less cluttered.

[Records] The API only accepts XML formats

Detailed description

The Datahub only accepts XML-formatted documents as input, not JSON-formatted documents.

We should document this explicitly.

Context

This is not a bug, but a design choice since most of the traditional cultural formats (MODS, EAD, LIDO,...) are XML based. JSON itself is originally schema-less. Efforts such as JSON Schema are trying to change this.

Documenting this is important since future extensions (new formats) which are JSON based (outside the cultural domain) won't be supported by the Datahub.

Rename the internal bundles

Detailed description

There's a difference between the bundle (folder) name and the class name. This is very confusing:

OAuthBundle > DataHubOauthBundle
SharedBundle > DataHubSharedBundle
UserBundle > DatahubUserBundle
OAIBundle > DataHubOAIBundle
ResourceBundle > DataHubResourceBundle
ResourceAPIBundle > DataHubResourceAPIBundle

Context

Do we need the 'DataHub' prefix for each bundle? Omitting that would make things more clear and maintainable.

Remove the parameters.yml file

Currently, the parameters.yml file for the FlemishArtCollection instance is included in the project. This should be removed from the repository leaving only the parameters.yml.dist file.

The parameters.yml file should be generated during composer install.

Add 'soft' deletes for OAI-PMH

OAI-PMH models the concept of deleted records. These are records that are no longer available in the repository, but are still listed in the endpoint. When a client harvests a record which is denoted as "deleted", it can take an appropriate action based on this status.

Currently, the Datahub does not support this. Using DELETE on a record will perform a hard removal of the resource without any trace.

This issue involves:

  • Research if SoftDeletable is a good starting point to do the heavy lifting at the database facing side.
  • The REST controller should still return a 404 when requesting a soft deleted entity.
  • The OAI-PMH controller needs to include records which were soft deleted, expressed according to the OAI-PMH model (use proper XML elements).

Tangentially, this means that:

  • All existing tests should still remain green.
  • Add functional tests for OAI-PMH deletes.
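The two views described above can be sketched with a small in-memory model. All class and method names here (`Record`, `RecordStore`, `restStatus`, `oaiHeader`) are hypothetical and do not come from the Datahub codebase or from SoftDeletable; the only protocol detail relied on is that OAI-PMH marks a deleted record with `status="deleted"` on its `header` element.

```php
<?php
// Hypothetical in-memory model; names are illustrative only.
class Record
{
    public $id;
    public $updatedAt;
    public $deletedAt = null; // null means the record is live

    public function __construct($id, DateTimeImmutable $updatedAt)
    {
        $this->id = $id;
        $this->updatedAt = $updatedAt;
    }

    public function isDeleted()
    {
        return $this->deletedAt !== null;
    }
}

class RecordStore
{
    private $records = [];

    public function add(Record $record)
    {
        $this->records[$record->id] = $record;
    }

    // Soft delete: keep the record and only stamp the deletion time.
    public function softDelete($id, DateTimeImmutable $when)
    {
        if (isset($this->records[$id])) {
            $this->records[$id]->deletedAt = $when;
        }
    }

    // REST view: a soft-deleted record behaves as if it does not exist.
    public function restStatus($id)
    {
        $found = isset($this->records[$id]) && !$this->records[$id]->isDeleted();
        return $found ? 200 : 404;
    }

    // OAI-PMH view: deleted records stay listed, flagged in the header.
    public function oaiHeader($id)
    {
        $record = $this->records[$id];
        $stamp = $record->isDeleted() ? $record->deletedAt : $record->updatedAt;
        $status = $record->isDeleted() ? ' status="deleted"' : '';
        return sprintf(
            '<header%s><identifier>%s</identifier><datestamp>%s</datestamp></header>',
            $status,
            htmlspecialchars($id, ENT_XML1),
            $stamp->format('Y-m-d\TH:i:s\Z')
        );
    }
}

$utc = new DateTimeZone('UTC');
$store = new RecordStore();
$store->add(new Record('oai:example:1', new DateTimeImmutable('2017-01-01 00:00:00', $utc)));
$store->softDelete('oai:example:1', new DateTimeImmutable('2017-02-01 00:00:00', $utc));

echo $store->restStatus('oai:example:1'), PHP_EOL;
echo $store->oaiHeader('oai:example:1'), PHP_EOL;
```

In this sketch the datestamp of a deleted record is the time of deletion, so a harvester doing an incremental harvest would pick up the change. Whether the real implementation stores that in a dedicated field is left open.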

Clean up after running unit tests

When running tests, artefacts such as extra tables, etc. will be created. Since these are temporary, they should be purged when testing has finished.

When running tests, a datahub-* (datahub-test) database is automatically created. This database should be dropped after testing.

Remove the default Symfony templates

Detailed description

The default Symfony templates in app/resources/views weren't removed.

These templates need to be removed:

views/base.html.twig
views/default/index.html.twig

Context

Neither template is used in the application. These are placeholders which come with Symfony2 and don't serve any functional purpose.

Removing them makes things cleaner and more maintainable (less cruft).

Deprecate the scripts folder

Detailed description

The scripts folder is not part of the Symfony framework. It's a folder containing custom scripts for:

  • running unit tests with and without Docker
  • setting up a quick installation
  • building the REST API documentation with Nelmio API

These scripts (partly) encapsulate functionality that should reside outside the core package because they violate the single responsibility principle.

Context

While there is definitely a use case for these scripts, including them as is (undocumented, untested) creates confusion for users who have no direct need for them.

The scripts related to unit testing with Docker assume a preference for Docker as the 'go-to' tool for setting up a testing environment. From the point of view of the Datahub, however, it is of no concern which technology lower in the stack is used to run the unit tests. As such, Docker support should not be part of a generic package.

[Meta] Package according to the PHPLeague skeleton package

Package managers and tools such as Composer and packagist.org allow for extremely easy distribution and deployment of PHP packages. This includes applications too.

Before publishing the Datahub as a package on packagist.org, it's important to package the application according to best practices.

The League of Extraordinary Packages is a band of PHP developers who aim to deliver high quality PHP libraries according to best practices. They set the benchmark against which to build a quality software package.

We should aim to re-package this repo according to their Skeleton project:

https://github.com/thephpleague/skeleton

Support format: EAD

EAD or Encoded Archival Description is an XML format used to serialize archival records.

The EAD standard's XML schema specifies the elements to be used to describe a manuscript collection as well as the arrangement of those elements (for example, which elements are required, or which are permitted inside which other elements). The EAD tag set has 146 elements and is used both to describe a collection as a whole, and also to encode a detailed multi-level inventory of the collection. Many EAD elements have been, or can be, mapped to content standards (such as DACS and ISAD(G)) and other structural standards (such as MARC or Dublin Core), increasing the flexibility and interoperability of the data.

See: https://en.wikipedia.org/wiki/Encoded_Archival_Description

Support for this format will allow archives to use the Datahub as an aggregator.
