hdmf-dev / hdmf-schema-language

The specification language for HDMF
Home Page: https://hdmf-schema-language.readthedocs.io
License: Other
As far as we know, this is not used in NWB or HDMF; it does not quite make sense; and I do not think it is supported by NWB/HDMF or MatNWB. "value" is already not an option for datasets.
The 'linkable' key for group and dataset specifications is a holdover from NWB 1.0. It is not used by the NWB 2 schema, the HDMF common schema, or any extensions to my knowledge. The only known use in the official APIs is in the HDMF validator, which raises an IllegalLinkError if a link is made to a group/dataset with linkable=False. I also do not understand why a user would set linkable=False.
@oruebel and I think it could be safely deprecated.
Pro: Removes complexity from schema language. Reduces edge cases and required support in the official APIs.
Con: It is unlikely, but there may be some extensions in the wild that are using 'linkable', and this change would remove support for that functionality.
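For reference, a minimal sketch (hypothetical type name) of how the key appears in a group spec:

```yaml
# Hypothetical group spec using the 'linkable' key
- data_type_def: MyGroup
  doc: a group that may not be the target of a link
  linkable: false  # the HDMF validator raises IllegalLinkError for links to this group
```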
It looks like the builds for "latest" are passing but the builds for "stable" are not, see https://readthedocs.org/projects/schema-language/builds/
I'm not sure whether we need both stable and latest builds for this repo, but we should discuss a fix and versioning of the repo.
The schema language supports some flexibility in how data types are defined, and some methods are encouraged over others for clarity and consistency. These best practices should be added to the schema language documentation:

- Set the quantity key not in the data type definition but in the group/dataset spec where the type is included. When a data type is defined at the root of the schema (as opposed to nested), then in order to use the data type, a new group (subgroup) spec is defined where the quantity key is set to a value; if omitted, the default value of 1 is used. This makes a quantity defined in the data type definition meaningless and confusing. See also NeurodataWithoutBorders/nwb-schema#472
- Set the name key not in the data type definition but in the group/dataset spec where the type is included. A mismatch between the name defined in the data type definition and where it is included can lead to confusion about the expected behavior and may lead to errors in HDMF. See hdmf-dev/hdmf#582
- Changing the dtype, shape, or quantity of a data type when using data_type_inc should only restrict the values from their original definitions. For example, if type A has dtype: text and type B extends type A (data_type_def: B, data_type_inc: A), then type B should not redefine dtype to be int, which is incompatible with the dtype of type A. The same applies if type A is included and a new type is not defined (just data_type_inc: A). In other words, all child types should be valid against the parent type. This is not yet checked in HDMF, but see progress in hdmf-dev/hdmf#321
- The value and default_value keys are not yet supported by the official APIs, so these are discouraged until support is added.

@bendichter @oruebel @ajtritt Can you think of other best practices to add? Do you agree with the above?
We should make tagged releases of this repo corresponding to the schema language version tags used in HDMF when writing schema files.
HDMF does not support datasets that do not have a data type inc/def and contain either a reference dtype or a compound dtype. Such a situation should be noted as forbidden in the schema language documentation until support is added.
Supported dtypes were added to the documentation here:
https://github.com/NeurodataWithoutBorders/nwb-schema/pull/382/files
but these were not transferred here for some reason.
As new versions of schema A are released, schema A may no longer be compatible with schema B that includes schema A. Older versions of schema A may also not be compatible with schema B that includes schema A.
For example, if the hdmf-common schema has a type X but changes it in a compatibility breaking way from version 2 to version 3, then my extension will break if hdmf-common schema version 3 is loaded. It should be restricted to versions <3.
If the hdmf-common schema introduces a new type X in version 1.4 and my extension schema depends on that new type, then my extension will break if hdmf-common schema version 1.3 is loaded. It should be restricted to versions >=1.4.
These should be combinable and support >, >=, ==, !=, <, and <=, just like in pip requirements specs and conda requirements specs. See https://www.python.org/dev/peps/pep-0440/ and https://www.python.org/dev/peps/pep-0508/
This would be the value for a key "version" under namespaces[i] > schema[i] alongside "namespace" and "data_types", like the following:
namespaces:
- author:
  - ...
  contact:
  - ...
  doc: ...
  name: ndx-my-ext
  schema:
  - namespace: core
    version: ">=2,<3"
    neurodata_types:
    - NWBDataInterface
    - DynamicTable
    - VectorData
    - VectorIndex
  - source: ndx-my-ext.extensions.yaml
    version: 0.2.0
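To sketch how an API could enforce such bounds, here is a minimal example using the "packaging" library, which implements PEP 440 specifiers. This is an illustration of the proposed semantics, not an existing HDMF feature; the function name is hypothetical.

```python
# Sketch: checking a loaded schema version against a PEP 440-style requirement.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

def check_schema_version(loaded_version: str, required: str) -> bool:
    """Return True if the loaded schema version satisfies the requirement string."""
    return Version(loaded_version) in SpecifierSet(required)

# The extension above requires core >=2,<3
print(check_schema_version("2.5.0", ">=2,<3"))  # True
print(check_schema_version("3.0.0", ">=2,<3"))  # False
print(check_schema_version("1.3.0", ">=1.4"))   # False
```

Combined specifiers such as ">=2,<3" fall out of PEP 440 for free, which is one argument for reusing that grammar rather than inventing a new one.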
This comes up in NWB: starting_time and rate; pixel_mask, image_mask, and voxel_mask.
Raised in hdmf-dev/hdmf#542 in the context of how extra fields are treated.
It is not clear in the documentation whether the schema language allows new fields for groups/datasets defined with only a data_type_inc. We should clarify this in the documentation.
It is my understanding that it is allowed (though perhaps not recommended) for fields to be added to groups/datasets defined with only a data_type_inc.
For example: a new group may contain a dataset with:
- data_type_def: MyTable
  datasets:
  - name: new_column
    data_type_inc: VectorData
    attributes:
    - name: new_attr
      dtype: text
      doc: a new attribute
A concrete example is in the Units table in NWB: https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/core/nwb.misc.yaml#L187-L199
The resolution attribute is added to the base VectorData definition.
Or a new group may be defined as:
- data_type_def: MyContainer
  groups:
  - name: new_group
    data_type_inc: Container
    groups:
    - name: new_subgroup
      data_type_inc: Container
      doc: a new subgroup
    # also add new datasets, new links, and new attributes
A related but not fully matching example is the 'electrodes' table in the NWBFile: https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/core/nwb.file.yaml#L269
The 'electrodes' table lacks a data_type_def but specifies many named VectorData types. However, this differs from the above because DynamicTable explicitly allows quantity: * of data_type_inc: VectorData without a name specified.
Note that a group/dataset defined with only a data_type_inc may have a different doc, quantity, dims, shape, and sometimes even dtype than in the data_type_def definition, as long as these are compatible with the included spec (e.g., a square is a special case of a rectangle). This should also be clarified in the documentation and fully supported in HDMF.
If adding fields to types with only data_type_inc is not allowed, then the NWB schema should be amended.
If the above is allowed, then the HDMF validator should be amended to validate against the spec with any additions to the included type.
The build of the hdmf schema language docs on RTD currently fails with
Running Sphinx v1.8.5
Traceback (most recent call last):
File "/home/docs/checkouts/readthedocs.org/user_builds/hdmf-schema-language/envs/latest/lib/python3.7/site-packages/sphinx/config.py", line 368, in eval_config_file
execfile_(filename, namespace)
File "/home/docs/checkouts/readthedocs.org/user_builds/hdmf-schema-language/envs/latest/lib/python3.7/site-packages/sphinx/util/pycompat.py", line 150, in execfile_
exec_(code, _globals)
File "/home/docs/checkouts/readthedocs.org/user_builds/hdmf-schema-language/checkouts/latest/source/conf.py", line 16, in <module>
from ruamel import yaml
ModuleNotFoundError: No module named 'ruamel'
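A likely fix, assuming the Read the Docs build installs dependencies from a pip requirements file (the file name below is hypothetical), is to declare the missing package there:

```
# docs/requirements.txt (hypothetical path) -- installed by the RTD build
sphinx
ruamel.yaml
```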
HDF5 allows you to have vectors of up to 4 dimensions within compound datatypes (ref), but our DTypeSpec does not allow us to have a shape parameter, so we cannot use this feature. I propose that we extend our schema language so that we can put vectors inside of compound data types.
Useful for NWB: NeurodataWithoutBorders/nwb-schema#542
Current docs: https://hdmf-schema-language.readthedocs.io/en/latest/description.html#dtype
Proposal:
Alternative:
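Purely for illustration (the issue's actual proposal is not reproduced above), one hypothetical way to allow vectors inside a compound dtype would be a per-field shape key in DtypeSpec:

```yaml
# Hypothetical: a "shape" key inside a compound dtype field
- data_type_def: TimestampedVector
  dtype:
  - name: timestamp
    dtype: float64
    doc: time of the sample in seconds
  - name: vector
    dtype: float32
    shape:        # proposed new key; not currently allowed in DtypeSpec
    - 3
    doc: a fixed-length 3-element vector stored in each row
  doc: a compound dataset whose rows pair a timestamp with a 3-vector
```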