kbss-cvut / termit
An advanced SKOS terminology manager linking concepts to their definitions in documents
License: GNU General Public License v3.0
As a developer, I want to migrate the project configuration to Spring Boot.
This will allow easier configuration w.r.t. virtualized environments like Docker.
The term snapshot endpoint should return the following attributes in the properties element:
http://onto.fel.cvut.cz/ontologies/slovnik/slovník-datového-modelu-dtm/pojem/je-reálným-objektem
http://purl.org/dc/terms/references
http://www.w3.org/2004/02/skos/core#notation
http://www.w3.org/2004/02/skos/core#example
Here is an example of a term in the current data and in a version where the two should be identical data-wise:
If a term is a root term of a vocabulary and is removed, the skos:hasTopConcept statement referencing the term from the glossary remains in the repository.
When a vocabulary is imported, TermIt fails to generate a document for it. In contrast, when a vocabulary is created through the create vocabulary form rather than through import, a document is generated for it.
In order to facilitate compatibility with the SGoV assembly line, TermIt has to be able to use Keycloak as an authorization service.
However, to retain backwards compatibility, it also has to be able to run without it, using its internal authentication mechanisms for secure access to the application.
Note that this issue involves backend as well as frontend of TermIt.
Currently, the Docker Compose setup logs only to standard output, so the output is lost on restart. As a system admin, I need to be able to examine logs from before the last restart.
In order to support the new architecture of the SGoV Assembly Line, identifiers of vocabulary contexts need not coincide with the identifiers of the vocabularies they contain.
TermIt needs to adapt to this change. Also, since vocabularies (and thus their contexts) may be created externally by the assembly line, TermIt must be able to update whatever information it holds as to the contexts vocabularies are stored in.
The following exception is thrown when attempting to update a document:
cz.cvut.kbss.jsonld.exception.AmbiguousTargetTypeException: Object with types [http://onto.fel.cvut.cz/ontologies/slovník/agendový/popis-dat/pojem/zdroj, http://onto.fel.cvut.cz/ontologies/slovník/agendový/popis-dat/pojem/dokument] matches multiple equivalent target classes: [class cz.cvut.kbss.termit.dto.listing.DocumentDto, class cz.cvut.kbss.termit.model.resource.Document]
at cz.cvut.kbss.jsonld.deserialization.util.TargetClassResolver.ambiguousTargetType(TargetClassResolver.java:133)
at cz.cvut.kbss.jsonld.deserialization.util.TargetClassResolver.selectFinalTargetClass(TargetClassResolver.java:105)
at cz.cvut.kbss.jsonld.deserialization.util.TargetClassResolver.getTargetClass(TargetClassResolver.java:82)
at cz.cvut.kbss.jsonld.deserialization.expanded.Deserializer.resolveTargetClass(Deserializer.java:51)
at cz.cvut.kbss.jsonld.deserialization.expanded.ObjectDeserializer.openObject(ObjectDeserializer.java:79)
at cz.cvut.kbss.jsonld.deserialization.expanded.ObjectDeserializer.processValue(ObjectDeserializer.java:60)
at cz.cvut.kbss.jsonld.deserialization.expanded.ExpandedJsonLdDeserializer.deserialize(ExpandedJsonLdDeserializer.java:61)
at cz.cvut.kbss.jsonld.jackson.deserialization.JacksonJsonLdDeserializer.deserialize(JacksonJsonLdDeserializer.java:85)
at cz.cvut.kbss.jsonld.jackson.deserialization.JacksonJsonLdDeserializer.deserializeWithType(JacksonJsonLdDeserializer.java:120)
Endpoint: rest/resources/document
To facilitate collaborative creation and maintenance of multiple vocabularies, TermIt must be able to open only a selected set of vocabularies for editing and treat all other vocabularies as read-only. This should be session-based, so that multiple requests from the same user can work with the same set of vocabularies.
All vocabulary contexts are available for editing by default (this will ensure compatibility with the current behavior).
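The rule described above can be sketched in plain Java as follows. This is only an illustration, not TermIt's implementation: the class name EditableVocabularies and its methods are hypothetical, and in a Spring application an object like this would presumably be held in a session-scoped bean so that it is shared by all requests of one user session.

```java
import java.util.Set;

// Hypothetical holder of the vocabulary contexts opened for editing in one session.
final class EditableVocabularies {

    // An empty set means no selection was made, so everything stays editable
    // (this mirrors the default behavior described in the issue).
    private Set<String> editable = Set.of();

    // Restricts editing to the given contexts; all others become read-only.
    void open(Set<String> contexts) {
        editable = Set.copyOf(contexts);
    }

    // A context is editable if no selection exists or if it was explicitly opened.
    boolean isEditable(String context) {
        return editable.isEmpty() || editable.contains(context);
    }
}
```

Note that in this sketch opening an empty set is equivalent to making no selection at all; whether that is the desired semantics is a design decision the issue leaves open.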
The current implementation of vocabulary content history retrieval is extremely inefficient, as it retrieves all change records related to the repository. There can be thousands of those, so loading takes minutes and megabytes of data are sent to the client, which only needs the changes grouped per day (number of added/edited items each day).
This should be rewritten so that the backend immediately returns the aggregated changes.
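The shape of such an aggregated result can be sketched as follows. The types ChangeRecord and ChangeType are hypothetical stand-ins, not TermIt's actual model, and grouping by UTC day is an assumption. In a real fix the aggregation would more likely be pushed down into the repository query so that individual records are never loaded; the sketch only shows what the backend would return.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical types; names are illustrative only.
final class ChangeAggregation {

    enum ChangeType { ADDED, EDITED }

    // Minimal stand-in for a persisted change record.
    record ChangeRecord(Instant timestamp, ChangeType type) {}

    // Groups records by UTC day and change type and keeps only the counts,
    // so the client receives a few numbers per day instead of every record.
    static Map<LocalDate, Map<ChangeType, Long>> countPerDay(List<ChangeRecord> records) {
        return records.stream().collect(Collectors.groupingBy(
                r -> r.timestamp().atZone(ZoneOffset.UTC).toLocalDate(),
                TreeMap::new, // keeps days in chronological order
                Collectors.groupingBy(ChangeRecord::type, Collectors.counting())));
    }
}
```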
As a developer, I want to keep the TermIt ontology in a separate context (RDF graph) in the repository, so that it can be updated automatically (#227).
Currently, some of the existing deployments have the ontology in the default context, which makes the automated updates difficult (additions are fine, removals would be hard). If the ontology were in a dedicated context, we could just replace the context completely.
As a developer, I sometimes make changes to the TermIt ontology (occasionally, even changes to the popis dat (data description) ontology happen). These changes may influence the inference results or behavior of the application. As installations of TermIt are created that are not managed by the development team, there needs to be a mechanism of automatically updating these ontologies in the main application repository, so that when a new version of TermIt is deployed, the ontologies in the repository are up-to-date.
When using plain JSON, datetime values using the Java 8 datetime API (Instant in particular) are serialized as decimal numbers by Jackson. Instead, they should be serialized as ISO 8601 strings. This will ensure, among other things, consistency with the representation in JSON-LD.
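The difference between the two representations can be illustrated with the standard library alone. The class InstantFormats is hypothetical, and asDecimal only imitates the shape of a numeric timestamp (epoch seconds with a nanosecond fraction); it does not call Jackson. In Jackson itself, the switch to strings is typically made by disabling SerializationFeature.WRITE_DATES_AS_TIMESTAMPS on an ObjectMapper that has the Java time module registered, but how TermIt configures its mapper is not stated in the issue.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper contrasting the two representations of an Instant.
final class InstantFormats {

    // Numeric form: epoch seconds followed by a nine-digit nanosecond fraction.
    static String asDecimal(Instant instant) {
        return String.format(Locale.ROOT, "%d.%09d", instant.getEpochSecond(), instant.getNano());
    }

    // ISO 8601 form, which is what the issue asks for.
    static String asIso8601(Instant instant) {
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }
}
```

For the instant 2021-01-01T00:00:00Z, asDecimal yields 1609459200.000000000 while asIso8601 yields 2021-01-01T00:00:00Z.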
Currently, the REST API documentation is maintained manually at SwaggerHub. However, this is quite inefficient for two reasons:
Instead, the documentation should be a part of each deployment of TermIt so that it can be directly tested. Moreover, the documentation of the endpoints would be specified directly in code. Springdoc OpenAPI could be used for this purpose.
Follow-up to #163 and #164 - a repository may contain several copies of the same vocabulary: one is canonical, the others are working copies. Each user may open one of the working copies for editing.
TermIt has to be able to determine the correct context of the vocabulary and any other related vocabularies (vocabularies containing terms SKOS-related to the terms from the edited vocabulary).
Possibly problematic areas:
skos:exactMatch and skos:relatedMatch (it is not known whether they are inferred based on statements in someone else's workspace)
skos:exactMatch and skos:relatedMatch
TODOs:
When text analysis is invoked on an already annotated larger file (approx. 1 MB) containing many term occurrences, processing of its results can take minutes to finish. This makes it practically unusable, as the user cannot tell whether it is normal for the application to show Please wait... for several minutes, and may leave or attempt to refresh.
Analysis of repeated annotation of the metropolitan plan shows the following times:
The goal should be to get the total time under a minute at least, preferably lower.
As a TermIt administrator, I want to be able to specify a file containing the definition of types users can use to classify terms.
Currently, the types (based on the UFO ontology) are loaded from a file that is packed into the application archive. This means that any change to the types language requires rebuilding the project. Instead, it should at least be possible to specify the path to the language file as a parameter on startup, with the built-in file used as a default when no custom one is provided.
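A minimal sketch of the fallback described above, using only the standard library. The class name TypesLanguageLoader and the classpath location in BUILT_IN are hypothetical; the actual file name and the name of the startup parameter are not given in the issue.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical loader: prefers a custom file, falls back to the bundled one.
final class TypesLanguageLoader {

    // Assumed classpath location of the built-in types file (illustrative only).
    static final String BUILT_IN = "/languages/types.ttl";

    // customPath would come from a startup parameter; null or blank means "not set".
    static InputStream open(String customPath) throws IOException {
        if (customPath != null && !customPath.isBlank()) {
            return Files.newInputStream(Path.of(customPath));
        }
        InputStream builtIn = TypesLanguageLoader.class.getResourceAsStream(BUILT_IN);
        if (builtIn == null) {
            throw new IOException("Built-in types file not found on classpath: " + BUILT_IN);
        }
        return builtIn;
    }
}
```

With this arrangement, changing the types language only requires pointing the parameter at a different file and restarting, not rebuilding the archive.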
This is motivated by attempts to incorporate TermIt into the SGoV assembly line, which uses a different language to stereotype terms.
Following the migration to JOPA 2.0.0(-SNAPSHOT), AspectJ is no longer required to work with the object model. However, we are currently using aspects to notify certain components of selected events. This prevents the removal of the AspectJ Maven plugin from the build configuration.
We should replace the Spring aspects with application events and remove AspectJ altogether.
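The idea of explicit events replacing aspect-based notification can be sketched in plain Java as follows. All names here (EventSketch, VocabularyUpdated, Publisher, VocabularyService) are hypothetical. In Spring, the role of Publisher would be played by ApplicationEventPublisher, and listeners would be methods annotated with @EventListener; the sketch avoids the framework only to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of publishing an event instead of relying on an aspect.
final class EventSketch {

    // Hypothetical event carrying the identifier of the changed vocabulary.
    record VocabularyUpdated(String vocabularyIri) {}

    // Minimal in-process publisher standing in for the framework's event mechanism.
    static final class Publisher {
        private final List<Consumer<VocabularyUpdated>> listeners = new ArrayList<>();

        void subscribe(Consumer<VocabularyUpdated> listener) {
            listeners.add(listener);
        }

        void publish(VocabularyUpdated event) {
            listeners.forEach(listener -> listener.accept(event));
        }
    }

    // The service announces the change itself; no aspect needs to intercept the call.
    static final class VocabularyService {
        private final Publisher publisher;

        VocabularyService(Publisher publisher) {
            this.publisher = publisher;
        }

        void update(String vocabularyIri) {
            // Persisting the change would happen here.
            publisher.publish(new VocabularyUpdated(vocabularyIri));
        }
    }
}
```

Because the notification is an ordinary method call, no compile-time weaving is involved, which is what would allow the AspectJ plugin to be dropped from the build.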
As a developer, I want the TermIt build to be faster. The tests take too much time, which slows development down considerably (PRs, Jenkins build before deployment, local test build).