Comments (37)
One thing to consider. Having a prefix on each object reduces the chance of namespace collisions.
Developers will likely build on more than just the GA4GH libraries and use others as well. Those other libraries may have an object called e.g. ReadAlignment. In that case, you will need to differentiate between the two using the full-path.
from ga4gh-schemas.
-0, the advantage of removing the GA prefix is that you remove redundancy and simplify the class name, but the disadvantage is that you can run into overlapping class names from multiple packages the implement similar functionality (e.g., net.sourceforge.samtools.SAMRecord vs. org.bdgenomics.formats.ADAMRecord).
from ga4gh-schemas.
I see. In proto you have to fully qualify anything not in the same package. So you'd have to type out org.ga4gh.x no matter what.
from ga4gh-schemas.
It depends on how you're importing it in each language; if you imported the records in C++, you'd just namespace them, or in Scala you'd just rename on import:
import org.ga4gh.{AlignedRead => GAAlignedRead}
import someone.elses.api.{AlignedRead => OtherAlignedRead}
It's not an insurmountable issue, but it's a bit frustrating to have your interop code be inconsistent with the rest of your codebase because you need to use full namespace paths/etc to refer to classes.
from ga4gh-schemas.
btw - some of the API providers will still be outputting json - will you be able to load those into your Avro objects easily?
from ga4gh-schemas.
@cassiedoll I believe you can create Avro directly from JSON in both Java and C++. I'm not sure about the other avro language bindings.
from ga4gh-schemas.
excellent - thanks frank!
from ga4gh-schemas.
-1 because of 1) auto-generated code and 2) name collisions.
from ga4gh-schemas.
Avoiding namespace collisions is important, but wouldn't this be relatively rare? If the common case we expect is not to have conflicting namespaces in end-user code, then it seems a bit heavy-handed to force the GA prefix on everyone. How hard is it to work around namespace conflicts in any case? In Python, it would be easy to sort out using "import X as Y" or whatever.
from ga4gh-schemas.
IMHO adding prefixes to the names when there is a namespace collision will
probably hinder code readability. It's relatively hard to grasp that what's
a Record everywhere else is a GARecord in this particular file, and it's
worse if some files use Record for our record and some other files in the
same project use Record for the conflicting record type.
On Mon, Jul 7, 2014 at 1:30 AM, Jerome Kelleher [email protected]
wrote:
Avoiding namespace collisions is important, but wouldn't this be
relatively rare? If the common case we expect is not to have conflicting
namespaces in end-user code, then it seems a bit heavy-handed to force the
GA prefix on everyone. How hard is it to work around namespace conflicts in
any case? In Python, it would be easy to sort out using "import X as Y" or
whatever.—
Reply to this email directly or view it on GitHub
#87 (comment).
from ga4gh-schemas.
Maybe I'm just too used to Google's style - we never prefix anything :)
from ga4gh-schemas.
IMHO collisions should rarely happen with well-written code. In any case, there is @namespace() if one would like specify the namespace. Just my $0.02 :)
from ga4gh-schemas.
IMHO collisions should rarely happen with well-written code. In any case, there is @namespace() if one would like specify the namespace. Just my $0.02 :)
@pgrosu conceptually, I agree. However, if we're exporting data into/out of the API, we may need to interoperate with other libraries that also use the *Record name (e.g., net.sf.samtools.SAMRecord, org.bdgenomics.format.avro.ADAMRecord).
from ga4gh-schemas.
Thanks @fnothaft for the clarification - now that makes more sense regarding the direction we want to head toward.
from ga4gh-schemas.
@fnothaft wouldn't it be better to fully qualify the class references in the (surely relatively rare?) cases when there is a namespace conflict, than to force all code to use a prefix? I think the common case should be as easy and intuitive as possible - I certainly found the GA prefixes confusing and unnecessary when I first looked at the API. Perhaps I'm misunderstanding what the common case will be though?
from ga4gh-schemas.
from ga4gh-schemas.
For what it is worth, I am also in favor of this (although I don't have voting authority); see e.g. pull request #94.
from ga4gh-schemas.
+1 -- I think the benefits of a more readable API for everyone outweigh the costs of occasionally, in some programming contexts, needing to do explicit namespacing.
from ga4gh-schemas.
+1 -- agree benefits (far) outweigh the costs. This is not C, we have well
defined namespaces!
On Wed, Jul 16, 2014 at 6:28 PM, David Glazer [email protected]
wrote:
+1 -- I think the benefits of a more readable API for everyone outweigh
the costs of occasionally, in some programming contexts, needing to do
explicit namespacing.—
Reply to this email directly or view it on GitHub
#87 (comment).
from ga4gh-schemas.
We have no consensus here, so I'm closing this issue.
(We can always revisit it later if we like)
from ga4gh-schemas.
After discussing this with @fnothaft, @massie and @vadimzalunin it seems we might be a bit closer to consensus on this - perhaps we could have another vote? It would be good to discuss in person in San Diego if we are still divided on the topic.
I'm +1 for removing the prefix.
from ga4gh-schemas.
I would be OK with removing the prefix if that is the consensus. So a cautious +1.
Richard
On 15 Oct 2014, at 01:41, Jerome Kelleher [email protected] wrote:
After discussing this with @fnothaft, @massie and @vadimzalunin it seems we might be a bit closer to consensus on this - perhaps we could have another vote? It would be good to discuss in person in San Diego if we are still divided on the topic.
I'm +1 for removing the prefix.
—
Reply to this email directly or view it on GitHub.
The Wellcome Trust Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
from ga4gh-schemas.
@jeromekelleher - Wait, you convinced @fnothaft and @massie regarding this? :) So if we have multiple libraries with the same object names, then how do we differentiate between them?
from ga4gh-schemas.
I mildly prefer to keep the GA prefix when Avro itself does not have a concept of namespace (if I am right). When I read code, the prefix immediately tells me the source of the APIs, which to me is useful. Nonetheless, I don't mind if the consensus is to remove the prefix. -0
from ga4gh-schemas.
Conceptually, I like removing the prefix. There is a concerned
about how this will impact code generation for languages that
don't support namespaces, such as C.
from ga4gh-schemas.
There is some advantage for keeping it in metadata, where the GA will unanimously point to our interpretation of value range in this context (see the lengthy GAsex related exchanges)
I understand that this is different for reads/file formats, where the contributions are meant even more general (GA or not).
Michael
On 15 Oct 2014, at 15:54, Heng Li [email protected] wrote:
I mildly prefer to keep the GA prefix when Avro itself does not have a concept of namespace (if I am right). When I read code, the prefix immediately tells me the source of the APIs, which to me is useful. Nonetheless, I don't mind if the consensus is to remove the prefix. -0
—
Reply to this email directly or view it on GitHub.
from ga4gh-schemas.
+1
@jeromekelleher - Wait, you convinced @fnothaft and @massie regarding this? :) So if we have multiple libraries with the same object names, then how do we differentiate between them?
We've de-prefixed ADAM over the past few months: bigdatagenomics/adam#418, bigdatagenomics/adam#342...
from ga4gh-schemas.
+1 for removing most prefixes, but I do agree that some of the metadata records can stand to keep it so we stay out of developing and clashing with Ontologies.
from ga4gh-schemas.
0.1 and 0.5 proved prefixes less useful the I anticipated before, so changing mine -1 to +0.
from ga4gh-schemas.
It seems that a lot of people are in favour of removing the GA prefix, and no one is strongly against the idea. The prefix is certainly useful in some situations, but I think that our overriding concern should be for end-users writing client code. In this case, we need to ensure that the API is as easy as possible to learn and understand, and the GA prefix is a major obstacle to this in my view.
@mbaudis and @delagoya - I don't quite follow why metadata is different. Why is org.ga4gh.GASex
better than org.ga4gh.Sex
(for example)?
from ga4gh-schemas.
@jeromekelleher, I think it comes down to the idea that if a set of all possible attributes for a class in GA4GH becomes a superset - and its has specific properties in the way it was defined - where the majority of users might only use a subset of the same class. Then the superset sort of becomes its own class.
I think this discussion ( https://github.com/cassiedoll/schemas/commit/51cd6ece5c83fc8076959a3be997190303718170#commitcomment-7663742 ) - as one of several that @mbaudis alluded to - helps to illustrate that. I think with some classes, we're working on first defining the class, attributes and their interpretation (including their relation to other classes), in order to collect all the variations under those definitions, which strives to behave as sets of controlled vocabularies.
from ga4gh-schemas.
@pgrosu Well said. While the domain points to the technical definition of the attribute, the prefix makes it more obvious that the contained values may need some consideration.
That said, the "GA" here is of a more psychological nature and technically not necessary if the context is provided through the org.ga4gh domain. I am not against removing it; there always will be trade-offs ... And we still can change the names of the attributes (org.ga4gh.geneticSexAsWeSeeIt) ;-)
from ga4gh-schemas.
+1
Things that can be misunderstood as "GA4GH is developing their own Ontology" should be prefixed to make it clear that this we are not attempting to do so.
On Oct 22, 2014, at 3:57 AM, Paul Grosu [email protected] wrote:
@jeromekelleher, I think it comes down to the idea that if a set of all possible attributes for a class in GA4GH becomes a superset - and its has specific properties in the way it was defined - where the majority of users might only use a subset of the same class. Then the superset sort of becomes its own class.
I think this discussion ( cassiedoll@51cd6ec#commitcomment-7663742 ) - as one of several that @mbaudis alluded to - helps to illustrate that. I think with some classes, we're working on first defining the class, attributes and their interpretation (including their relation to other classes), in order to collect all the variations under those definitions, which strives to behave as sets of controlled vocabularies.
—
Reply to this email directly or view it on GitHub.
from ga4gh-schemas.
+1, @mbaudis, Thank you and I agree :)
+1, @delagoya, I think we should be fine as a long as we mention that the focus of GA4GH is to enable a standard for sharing sequencing and associated clinical information (meta)data, which requires accommodating a large number of possibilities.
from ga4gh-schemas.
@mbaudis, @delagoya, OK, this seems sensible. Perhaps we should make a PR without all the GAs and see how things look? We can add some back if it looks like we are trying to be too all-encompassing with the definitions.
This would be a pretty disruptive change, so perhaps we should wait until some of the outstanding PRs are merged?
from ga4gh-schemas.
Let's definitely make a PR - it looks like everybody is now on board. I think we're always going to have conflicting PRs, so we don't have to really wait. Whoever has free time can pick this up :)
from ga4gh-schemas.
Closing this since a fix has been merged.
from ga4gh-schemas.
Related Issues (20)
- Package for CRAN
- RNA expression data structure is inefficient HOT 7
- Rename repository HOT 2
- Update Release notes for the v0.6.0a10 release
- Remove created and updated timestamps from API HOT 4
- Add peer service human readable docs HOT 1
- Document maven release process HOT 1
- Move datamodel to its own repo
- Improve development.rst
- Content Type Negotiation
- Implement updated transcript effects protocol
- Deprecate reference ID in favor of reference name or accession ID HOT 1
- Recreate assay metadata HOT 2
- Update Java Protobuf Dependency to 3.1+
- protobuf java square write code-gen HOT 3
- Change booleans to enums
- Assay Metadata for Analysis object table is broken in documentation...
- GeoLocation attributes names HOT 1
- ListReferenceBasesRequest GET or POST HOT 1
- AnalysisResult scores
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ga4gh-schemas.