hdmf-dev / hdmf-common-schema
Specifications for pre-defined data structures provided by HDMF.
License: Other
The docs for the new experimental namespace are currently not being rendered on ReadTheDocs. This will likely require some changes to the setup of how the docs are generated and/or the hdmf-docutils.
Add a comment to all of the YAML files saying which version of the schema language is in use, e.g.,
# hdmf-schema-language version 2.0.2
First we should test to make sure everything works in HDMF.
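A minimal sketch of how the comment could be added to a schema file's text, assuming the comment belongs on the first line. The helper name `add_version_comment` and the sample YAML are illustrative only and are not part of HDMF.

```python
def add_version_comment(yaml_text: str, version: str = "2.0.2") -> str:
    """Prepend the schema-language version comment unless it is already there."""
    header = f"# hdmf-schema-language version {version}"
    lines = yaml_text.splitlines()
    if lines and lines[0].strip() == header:
        return yaml_text  # already tagged; leave the text unchanged
    return header + "\n" + yaml_text


# Illustrative schema snippet, not copied from the repository.
example = "groups:\n- data_type_def: Container\n"
tagged = add_version_comment(example)
print(tagged)
```

Because a YAML comment is ignored by parsers, this should not change how the schema loads, but that is worth confirming in HDMF as noted above.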
Branch: https://github.com/t-b/hdmf-common-schema/tree/use-text-as-encoding-for-dynamic-table-column-names
Good day
When creating NWBv2 files from Igor Pro I'm facing a general problem with the text encoding. I can only write files with UTF8 encoded strings.
The easy fix would be the following change
$ git diff .
diff --git a/common/table.yaml b/common/table.yaml
index 49c8b6c..73a5ad3 100644
--- a/common/table.yaml
+++ b/common/table.yaml
@@ -87,7 +87,7 @@ groups:
     of usability.
   attributes:
   - name: colnames
-    dtype: ascii
+    dtype: text
     dims:
     - num_columns
     shape:
That would undo the change from NeurodataWithoutBorders/pynwb@b0939429 but I don't know why that was done.
In https://github.com/hdmf-dev/hdmf-common-schema/blob/master/common/resources.yaml:
First:
I think the prefixes for these key names (except for "resources" > "name") are redundant and can be removed for readability and consistency with "resources" > "name". This would result in:
Would this be too confusing? If so, then for consistency, the resources table "name" field should be renamed "resource_name".
Second:
Either they should all have an underscore before "table" or none of them should. I personally prefer that they have an underscore before "table".
I think there is some lack of clarity here regarding the schema language. I do not see any documentation about the HDMF schema language. We do have the schema-language readthedocs, but that page states that it is for NWB and contains the NWB-specific keys neurodata_type_def and neurodata_type_inc. If we want to have a more general schema language we would need
Add a new dtype for external files, which is basically the same as dtype: text except that its value will get passed to a resolver, if present, during the build process, e.g., for externally stored images and video files in NWB.
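The proposed behavior can be sketched as follows. Everything here is hypothetical: the dtype name `external_file`, the function `build_value`, and the resolver signature are assumptions made for illustration and do not exist in HDMF.

```python
from typing import Callable, Optional


def build_value(
    value: str,
    dtype: str,
    resolver: Optional[Callable[[str], str]] = None,
) -> str:
    """Return the value to store, passing external-file values through a resolver."""
    # "external_file" is a placeholder name for the proposed dtype.
    if dtype == "external_file" and resolver is not None:
        return resolver(value)
    # Without a resolver (or for plain text) the value is stored unchanged.
    return value


def prefix_resolver(path: str) -> str:
    """Toy resolver that maps a relative path onto an assumed data root."""
    return "/data/" + path


print(build_value("movie.avi", "external_file", prefix_resolver))
print(build_value("movie.avi", "external_file"))
```

With no resolver the value behaves exactly like dtype: text, which matches the "basically the same" wording above.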
Based on the graphic in the NWB preprint which is also here:
https://github.com/hdmf-dev/hdmf-common-schema/blob/master/docs/source/figures/ragged-array.png
the docstring for VectorIndex is wrong:
hdmf-common-schema/common/table.yaml
Lines 11 to 13 in b22b352
I think I wrote this incorrectly months ago. It should say:
The first vector is at VectorData[0:VectorIndex[0]]. The second vector is at VectorData[VectorIndex[0]:VectorIndex[1]].
example:
>>> from hdmf.common import VectorData, VectorIndex
>>> foo = VectorData(name='foo', description='foo column', data=['a', 'b', 'c', 'd'])
>>> foo_ind = VectorIndex(name='foo_index', target=foo, data=[2, 4])
>>> foo_ind[0]
['a', 'b']
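The corrected indexing rule can also be checked without hdmf installed. The sketch below uses plain Python lists and a hypothetical helper `get_vector`; it follows the rule stated above and is not the library's own implementation.

```python
def get_vector(data, index, i):
    """Return the i-th ragged row: data[index[i-1]:index[i]], starting at 0 for i == 0."""
    start = 0 if i == 0 else index[i - 1]
    end = index[i]
    return data[start:end]


data = ['a', 'b', 'c', 'd']   # plays the role of VectorData
index = [2, 4]                # plays the role of VectorIndex (end offsets)

print(get_vector(data, index, 0))  # ['a', 'b']
print(get_vector(data, index, 1))  # ['c', 'd']
```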
When adding an object to the ExternalResources objects table, you supply a container object ID and a field. In most cases, the field is the name of a dataset or attribute, but it could be a little more complicated.
Let's say the container is a group data type (e.g., TimeSeries) and it has a dataset (e.g., data) without a data type, and that dataset has a string attribute (e.g., unit). The 'field' value then needs to signal that the field is the 'unit' attribute on the 'data' dataset. This could be done using '/' as a separator, e.g., field='data/unit'.
Now let's say the attribute is not a string but a compound data type with columns/fields 'x', 'y', and 'z', and each column/field is associated with different ontologies. The 'field' value also needs to account for this. This could also be done using '/' as a separator, e.g., field='data/unit/x'.
Whatever string formatting scheme we choose should be explicitly described in the docs and then handled by the API.
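As one possible reading of the '/' scheme, a field string could be split into its path components before the API resolves them. The helper `split_field` is hypothetical, and rejecting empty components is an assumption, not an agreed rule.

```python
from typing import List


def split_field(field: str) -> List[str]:
    """Split a field path such as 'data/unit/x' into its components."""
    parts = field.split("/")
    # Assumption: empty components (e.g. 'data//x' or a trailing '/') are invalid.
    if any(part == "" for part in parts):
        raise ValueError(f"invalid field path: {field!r}")
    return parts


print(split_field("data/unit"))    # ['data', 'unit']
print(split_field("data/unit/x"))  # ['data', 'unit', 'x']
```

A scheme like this only works if '/' can never appear in a dataset, attribute, or column name, which is one of the points the docs would need to state.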
This comes up from @oruebel and @rightbower when ontologizing a column/field of a compound data type column in an ICEPhys table.
The types Data and Container are much more general than their use in a DynamicTable.
Suggestion: extract them out into a base.yaml file.
According to #8, there should be figures in https://hdmf-common-schema.readthedocs.io/en/latest/format_description.html but they do not show up.
Based on feedback from the LinkML developers and ontology experts, the Resources table is redundant and would rarely be used by the community. It adds extra overhead for adding entries to the ExternalResources "database". So, we decided to remove it. cc @oruebel @mavaylon1
The HDMF API will also need to be updated. rly/ndx-external-resources#6 may also need to be updated.
Currently, with the new ExternalResources data type, each term is associated with an external resource (e.g., ontology), a unique identifier at the resource, and the URI for the resource entity.
Users would also like to associate a URI for the external resource. For example, the external resource "NCBITaxon" would have the URI "https://www.ncbi.nlm.nih.gov/taxonomy" associated with it. To normalize these data in the "resources" table, we should add a new table with fields (name, uri) and change the "resource_name" field in the "resources" table to be an index (foreign key) into the new table.
This change requires further feedback and coordination with ontology users.
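A rough sketch of the proposed normalization, using plain Python structures rather than the HDMF API. The class and field names (`ExternalResource`, `ResourceRow`, `resource_index`, `entity_id`, `entity_uri`) and the sample entity row are illustrative assumptions; only the NCBITaxon name and URI come from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExternalResource:
    name: str  # e.g. "NCBITaxon"
    uri: str   # URI of the resource itself


@dataclass
class ResourceRow:
    resource_index: int  # foreign key into the new (name, uri) table
    entity_id: str       # unique identifier at the resource
    entity_uri: str      # URI of the specific entity


# New table holding each external resource exactly once.
external_resources: List[ExternalResource] = [
    ExternalResource("NCBITaxon", "https://www.ncbi.nlm.nih.gov/taxonomy"),
]

# Illustrative entity row; it stores an index instead of repeating the name.
rows: List[ResourceRow] = [
    ResourceRow(
        0,
        "NCBITaxon:10090",
        "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10090",
    ),
]


def resource_for(row: ResourceRow) -> ExternalResource:
    """Follow the foreign key from a row to its external resource."""
    return external_resources[row.resource_index]


print(resource_for(rows[0]).name, resource_for(rows[0]).uri)
```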
VectorIndex has no dtype, but it should always be an unsigned integer type, since VectorIndex always holds indices into a VectorData type.
hdmf-common-schema/common/table.yaml
Lines 53 to 67 in 49e5fcb
The VectorData, VectorIndex, and DynamicTableRegion types are missing the 'shape' key and are thus interpreted as scalar when they should be 1-D/2-D/3-D/4-D, 1-D, and 1-D, respectively.
We added "object_type" in the objects table in ExternalResources to make queries easier.
But in DynamicTables, the "object_type" would be "VectorData" which is very generic and using that would pick up a lot of false positives, so it does not make queries for annotations of table columns any easier.
With the move of data types from NWB to hdmf-common, the hdmf-common types may no longer have documentation. They might get copied over into NWB, but it is not clear. Regardless, the hdmf-common types should have their own documentation on their own readthedocs page.