The dcc-metadata from faang

Link to livestock breed ontology doesn't work

In https://github.com/FAANG/faang-metadata/blob/master/docs/faang_sample_metadata.md, the link to http://www.ontobee.org/ontology/LBO?iri=http://purl.obolibrary.org/obo/LBO_0000000 gives an error.

Link to EFO site appears to be broken

The link on this page, specifically

Sex (any child term of EFO_0000695) {http://www.ebi.ac.uk/efo/EFO_0000695}

does not render.

FAANG metadata suggestions

These are some issues I picked up on when reading the final version of the documents. They are stylistic rather than content-related, but it would be nice to see them fixed before we push out the final PDFs on the website and to everyone in FAANG. I have also made some phrasing changes directly in the documents. This possibly gets a bit rambly, but I am happy to talk through what I put here when in the office.

These are based on my reading of the FAANG metadata documents found in

https://github.com/FAANG/faang-metadata/blob/master/docs/

In all docs

We use a very readable description for attributes in the form

attribute name (data type) a brief description

Not all fields follow this form. I think many could, especially when the field is a biosample id,

e.g. from the experiment metadata:

sample_id (biosample_id) the biosample id of the sample the experiment was run on
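
To check which attribute lines already follow the "attribute name (data type) a brief description" form, something like the sketch below could be run over the docs. This is only an illustration: the function name and the regular expression are my own assumptions and are not part of the FAANG documents or any existing tooling.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for "attribute name (data type) a brief description".
ATTRIBUTE_FORM = re.compile(
    r"^(?P<name>[^()]+?)\s*\((?P<dtype>[^()]+)\)\s+(?P<desc>.+)$"
)


def parse_attribute_line(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (name, data type, description), or None if the line
    does not follow the expected form."""
    match = ATTRIBUTE_FORM.match(line.strip())
    if match is None:
        return None
    return (
        match.group("name").strip(),
        match.group("dtype").strip(),
        match.group("desc").strip(),
    )


example = (
    "sample_id (biosample_id) the biosample id of the sample "
    "the experiment was run on"
)
parsed = parse_attribute_line(example)
```

Lines that return None would be the ones to review by hand.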

Also, in the list of attributes, some names are in backticks and some are not. This alters the formatting in the PDF in an inconsistent way, so it would be good to make this consistent.

In the sample doc

Do we want a controlled list of species codes for the field, something like

Cow BTA
Pig SSC
Sheep OAR
Chicken GGA
Horse ECA
Goat CHR
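
If we did adopt a controlled list, validation could be as simple as the sketch below. The codes are copied from the list above as proposed, not from any agreed FAANG vocabulary, and the names `SPECIES_CODES` and `is_valid_species_code` are hypothetical.

```python
# Proposed species codes, copied from the suggestion above (not an agreed list).
SPECIES_CODES = {
    "Cow": "BTA",
    "Pig": "SSC",
    "Sheep": "OAR",
    "Chicken": "GGA",
    "Horse": "ECA",
    "Goat": "CHR",
}


def is_valid_species_code(code: str) -> bool:
    """Return True if the code is in the proposed controlled list.

    Surrounding whitespace and letter case are ignored.
    """
    return code.strip().upper() in SPECIES_CODES.values()
```

Whether matching should be case-insensitive is a design choice; a stricter check would simply drop the `strip()` and `upper()` calls.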

In the experiment doc

The indent of bullets in the PDF seems a bit skewed. I have tried to improve it, but I am not sure that I have.

Is it reasonable to require the RNA purity info?

For the Hi-C experiments, is there any way to make these ontology terms or controlled vocabulary?

When archiving experimental protocols, do we want to talk to BioStudies about archiving the protocols there rather than just hosting them on the DCC FTP site? It might give us more power for versioning protocols.

In the analysis doc

Do we have a destination for this metadata yet?

This document seems to have no statement about what is or isn't required. If it does, it is much less obvious than in the sample or experiment documents.

In this list

Input data - a list of files used as input and references to the experiment records in a data archive
Reference data - genome assembly, gene set, etc
Analysis protocol

You mention references in the input data description (point 1) and reference data at point 2. What are the references in point 1, and how are they different from the reference data in point 2?

Why are we asking for an explicit statement of the percentage of reads mapped if we already get the total read and mapped read numbers? Surely this value can be derived.
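
As a minimal sketch of what I mean, assuming the two counts are submitted as integers (the function and parameter names here are hypothetical, not taken from the analysis doc):

```python
def percent_reads_mapped(total_reads: int, mapped_reads: int) -> float:
    """Derive the percentage of reads mapped from the two submitted counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be a positive integer")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must be between 0 and total_reads")
    return 100.0 * mapped_reads / total_reads
```

The range checks would also catch inconsistent submissions, which an independently entered percentage would not.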

This file lacks data types and submission info. There should be something here, even if it is just a statement that we will collect this information in the first instance.
