osdu-ontology's Issues

osdu:bbox is not properly defined

A bounding box is a structure of 2 points or 4 numbers.
But it is defined as a single number:

osdu:bbox rdf:type owl:DatatypeProperty ;
	rdfs:range xsd:decimal ;

Then it is used like this:

osdu:GeoJSONPolygon rdf:type owl:Class ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty osdu:bbox ;
		owl:minQualifiedCardinality "4"^^xsd:nonNegativeInteger ;
		owl:onClass xsd:decimal ;
	] ;

So in instance data you may have this:

<myGeoJSONPolygon> osdu:bbox 1, 2, 3, 4 .

But RDF multivalued properties don't keep order between the values. So if you try to fetch it with SPARQL:

select * {
  <myGeoJSONPolygon> osdu:bbox ?bbox
}

you'll get the coordinates in arbitrary order.
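One order-preserving alternative, in line with the GeoSPARQL issue in this list, is to keep the whole box in a single literal. The following is a minimal Python sketch, not the ontology's method: the function name `bbox_to_wkt` is hypothetical, and it assumes the four numbers mean (min x, min y, max x, max y), which the ontology does not state.

```python
def bbox_to_wkt(min_x, min_y, max_x, max_y):
    """Return a closed WKT polygon ring for an axis-aligned bounding box.

    Assumption: the arguments are (min x, min y, max x, max y).
    """
    corners = [
        (min_x, min_y),
        (max_x, min_y),
        (max_x, max_y),
        (min_x, max_y),
        (min_x, min_y),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{x} {y}" for x, y in corners)
    return f"POLYGON(({ring}))"


print(bbox_to_wkt(1, 2, 3, 4))
```

Because the corners live inside one string, their order survives any SPARQL query that returns the literal.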

many "strings" should become "things"

There are 866 data props with rdfs:range xsd:string ("strings").
However, many of them are candidates for converting to object props ("things"):

  1. "ID" props that point to existing classes, but do so indirectly, eg
    activityLevelID activityTemplateID ... consequenceCategoryID consequenceSubCategoryID ...
  2. Props where the target should perhaps be converted to a class, to capture richer info, eg
    acquisitionCompanyID acquisitionSite agency businessActivities attributionAuthority ...
  3. Enumerated props where the target could be converted to skos:Concept in its own ConceptScheme, eg
    acquisitionTypeID activityCodeID activityLevel activityOutcomeDetailID activityOutcomeID activityTypeID additiveTypeID agreementExternalSystem businessIntentionID ...

Note: in contrast, activityID and agreementExternalID are identifiers inside an object, so they should remain strings.

use GeoSPARQL don't invent your own classes

You define a lot of your own classes following GeoJSON, eg

			osdu:AnyCrsGeoJSONPoint
			osdu:AnyCrsGeoJSONLineString
			osdu:AnyCrsGeoJSONPolygon
			osdu:AnyCrsGeoJSONMultiPoint
			osdu:AnyCrsGeoJSONMultiLineString
			osdu:AnyCrsGeoJSONMultiPolygon
			osdu:AnyCrsGeoJSONGeometryCollection
			osdu:AnyCrsGeoJSONFeature
			osdu:AbstractAnyCrsFeatureCollection
			osdu:GeoJSONPoint
			osdu:GeoJSONLineString
			osdu:GeoJSONPolygon
			osdu:GeoJSONMultiPoint
			osdu:GeoJSONMultiLineString
			osdu:GeoJSONMultiPolygon
			osdu:GeoJSONGeometryCollection
			osdu:GeoJSONFeature
			osdu:AbstractFeatureCollection

However, the OGC GeoSPARQL standard defines how to represent all of this in RDF.

  • Geometries are represented as opaque literals with datatype geo:gmlLiteral or geo:wktLiteral
  • Any OGC CRS can be used (those in the EPSG collection, but not only those)
  • It defines spatial relations such as geo:ehContains, geo:rcc8ntpp (inside), geo:sfContains
  • The standard is widely supported by semantic repositories. Upon seeing the special datatypes, they pass the geo data to special components for geospatial indexing.

doubled restriction

Here's an example of a restriction that is stated twice, which is redundant and not useful:

osdu:AnyCrsGeoJSONLineStringCoordinatesArray rdf:type owl:Class ;
	rdfs:subClassOf osdu:Array ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty osdu:items ;
		owl:minQualifiedCardinality "2"^^xsd:nonNegativeInteger ;
		owl:onClass xsd:decimal ;
	] ;
	rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty osdu:items ;
		owl:minQualifiedCardinality "2"^^xsd:nonNegativeInteger ;
		owl:onClass xsd:decimal ;
	] ;

reduce the use of acronyms

You use many abbreviations specific to O&G, eg

cCLTopShotDistance rdf:type owl:DatatypeProperty ;
	rdfs:comment "Distance from CCL to Interval Top " ;

An internet search indicates this is "casing collar location" (or maybe "locator"?). IMHO it would be better to spell it out in full: casingCollarLocationToTopShotDistance.

In another case casingCollarLocatorMD you spell out "CCL" (which is inconsistent with the previous case), but abbreviate "measured depth". IMHO it would be better to spell it out in full: casingCollarLocatorMeasuredDepth.

duplicated restrictions

osdu:Point includes this restriction repeated 17 (!) times.

rdfs:subClassOf [
		a owl:Restriction ;
		owl:onProperty osdu:observationmeasureddepth ;
		owl:minCardinality "1"^^xsd:nonNegativeInteger ;

Repeating a restriction is pointless, so please diagnose what has caused this duplication. It's possible it occurs on other classes and restrictions as well.
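To look for other occurrences, a rough text-level check may be enough as a first pass. The sketch below is an assumption-laden heuristic, not a real RDF parser: it treats every non-nested `[ ... ]` block as a restriction, compares blocks after collapsing whitespace, and counts repeats across the whole text rather than per class. The function name is hypothetical; the sample is the doubled restriction quoted earlier in this list.

```python
import re
from collections import Counter

SAMPLE = """
osdu:AnyCrsGeoJSONLineStringCoordinatesArray rdf:type owl:Class ;
    rdfs:subClassOf osdu:Array ;
    rdfs:subClassOf [
        a owl:Restriction ;
        owl:onProperty osdu:items ;
        owl:minQualifiedCardinality "2"^^xsd:nonNegativeInteger ;
        owl:onClass xsd:decimal ;
    ] ;
    rdfs:subClassOf [
        a owl:Restriction ;
        owl:onProperty osdu:items ;
        owl:minQualifiedCardinality "2"^^xsd:nonNegativeInteger ;
        owl:onClass xsd:decimal ;
    ] ;
"""


def repeated_restrictions(turtle_text):
    """Map each repeated [ ... ] block (whitespace-normalized) to its count."""
    # Only innermost blocks are matched; brackets inside string literals
    # would confuse this pattern.
    blocks = re.findall(r"\[([^\[\]]*)\]", turtle_text)
    normalized = [" ".join(block.split()) for block in blocks]
    return {text: n for text, n in Counter(normalized).items() if n > 1}


for text, count in repeated_restrictions(SAMPLE).items():
    print(count, "x", text)
```

A more reliable check would parse the file with an RDF library and compare restrictions per class, since blank nodes that are textually different can still be logically identical.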

reuse QUDT or another UoM ontology

There are a number of classes/props that describe units of measure and their characteristics, eg:

UnitOfMeasure
UnitQuantity
ExternalUnitOfMeasure
ExternalUnitQuantity
baseForConversion
memberUnits

However, these don't capture all the complexity of UOMs, eg dimension vectors, conversion factors, systems of units, etc.

Reuse a well established ontology of UoM, eg QUDT, rather than making your own partial ontology.

`boundingBoxEastBoundLongitude` etc are improperly defined

osdu:boundingBoxEastBoundLongitude rdf:type owl:DatatypeProperty ;
	rdfs:comment "Eastern longitude limit of the bounding box in degrees based on WGS 84 " ;
	rdfs:domain osdu:Extent ;
	rdfs:range gn:Feature ;
	owl:sameAs gn:longitude ;

Several mistakes here:

  • The range should be xsd:decimal, not gn:Feature. This is a data prop, whereas gn:Feature is a named GeoNames object
  • It cannot be owl:sameAs gn:longitude: you're using this on 2 of your props, so by transitivity they would become owl:sameAs each other.
  • Furthermore, for props you should use owl:equivalentProperty or rdfs:subPropertyOf (in this case I think you want the latter)

improve the camel casing of abbreviations

Even if you disagree with #12, IMHO it's better to treat abbreviations as "words" and then apply the camelCase convention. Eg cCLTopShotDistance should become cclTopShotDistance, because it's easier to "parse out" the ccl part.
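The renaming could be scripted. Here is a minimal Python sketch of one possible heuristic (the function name is hypothetical): it restores the leading capital, splits the name into words while keeping each run of capitals together as one abbreviation, then re-applies lowerCamelCase. It assumes names contain only ASCII letters and digits, and it cannot tell where two adjacent abbreviations meet.

```python
import re

# A run of capitals followed by a capitalized word is one abbreviation
# ("CCL" in "CCLTop"); otherwise match an ordinary word, a trailing run
# of capitals, or a run of digits.
_WORD = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def normalize_camel(name):
    """Rewrite a property name so abbreviations are cased like words."""
    words = _WORD.findall(name[:1].upper() + name[1:])
    if not words:
        return name
    return words[0].lower() + "".join(w.capitalize() for w in words[1:])


for prop in ("cCLTopShotDistance", "casingCollarLocatorMD"):
    print(prop, "->", normalize_camel(prop))
```

All-lowercase names such as totalcostamount pass through unchanged, because finding the word boundaries there would need a dictionary.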

property naming convention

There are many properties that don't conform to the lowerCamelCase convention, eg:

totalcostamount
costcurrency

unreadable description

Several properties (osdu:workflowPersona, osdu:workflowUsage) have descriptions that do not parse:

rdfs:comment "<...> that the record is technical assurance value is valid for. " ;
rdfs:comment "<...> that the record is technical assurance value is not valid for. " ;

Even the single comments don't parse properly.
The two comments put together are truly puzzling.

Stale OSDU schema source

I saw recent activity - I am publishing the OSDU schemas for each milestone. The URL mentioned in the README is a copy.

  • The resources in that copy are based on local file references instead of schema ids
  • The resources to register the schemas in the OSDU core Schema service might be better, and reflect the latest version.
  • OSDU treats "status": "PUBLISHED" as read-only. As a consequence, there are a growing number of higher minor or patch schemas - for the ontology, I would assume only the latest version should be used. Such a list of the latest versions is available in reports, but could be generated as another artefact, easy to consume by the ontology creator.
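Picking the latest version per schema could look roughly like the Python sketch below. It is an illustration under assumptions, not an OSDU tool: the function name is hypothetical, the ids are made up for the example, and it assumes each id ends in `:<major>.<minor>.<patch>`.

```python
def latest_versions(schema_ids):
    """Return, sorted, the highest-versioned id for each schema."""
    latest = {}
    for schema_id in schema_ids:
        # Everything before the last colon names the schema.
        kind, _, version = schema_id.rpartition(":")
        # Compare versions numerically, so that 1.10.1 ranks above 1.2.0.
        key = tuple(int(part) for part in version.split("."))
        if kind not in latest or key > latest[kind][0]:
            latest[kind] = (key, schema_id)
    return sorted(schema_id for _, schema_id in latest.values())


# Illustrative ids only (assumed format).
ids = [
    "osdu:wks:master-data--Well:1.0.0",
    "osdu:wks:master-data--Well:1.2.0",
    "osdu:wks:master-data--Well:1.10.1",
    "osdu:wks:master-data--Wellbore:1.1.0",
]
for schema_id in latest_versions(ids):
    print(schema_id)
```

Whether the ontology should track the latest version overall or the latest within each major version is a separate decision; this sketch does the former.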

If you are interested in a collaboration, please submit an issue on the public OSDU GitLab and assign it to me (Thomas Gehrmann [slb] @Gehrmann).
