Comments (10)
I usually use the formula: work/users_num
I think it is something that everybody who publishes their results would need, so I think it is not so much work in the end. I'll keep it open and close when the system is in place
from gnverifier.
It is indeed a problem. And it is not only the code, because the database evolves as well, although it has mostly stayed backward compatible so far. However, nothing prevents a situation where an important feature would break that backward compatibility. So I guess a solution would be:
- Figure out how to monitor database versioning. The database is actually defined by this internal package, https://github.com/gnames/gnidump, which is the equivalent of walking around the house alone in pajamas (no docs, bad architecture, no versions), so it would need to be improved. It would need to get to v1, and every time there is a breaking change in the database, increase the major version number to v2, v3, etc.
- Add a version number to the SQL dump file at http://opendata.globalnames.org/dumps/
- gnames version should return its own version + version of gnmatcher
- Every major version of database dumps has one latest file (something like dump-v1.3.6, dump-v2.0.2)
That gives a theoretical possibility of putting together a verification system using a particular version of gnames + gnmatcher + database.
It does not solve the problem of data changing all the time, but I think that in most cases, for most data-sources, data change is cumulative, so the result should be close, albeit sometimes not identical.
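To make the "one latest file per major version" idea concrete, here is a minimal sketch in Python. It assumes dump names follow the `dump-v1.3.6` pattern suggested above; that naming, and the function names, are illustrative assumptions and not something the dumps site currently provides.

```python
import re
from typing import Dict, Iterable, Tuple

# Assumed naming scheme: dump-v<major>.<minor>.<patch>
DUMP_RE = re.compile(r"^dump-v(\d+)\.(\d+)\.(\d+)$")


def parse_version(name: str) -> Tuple[int, ...]:
    """Return (major, minor, patch) as integers for a versioned dump name."""
    match = DUMP_RE.match(name)
    if match is None:
        raise ValueError(f"not a versioned dump name: {name!r}")
    return tuple(int(part) for part in match.groups())


def latest_per_major(names: Iterable[str]) -> Dict[int, str]:
    """Keep only the newest dump for every major (breaking) version."""
    latest: Dict[int, Tuple[Tuple[int, ...], str]] = {}
    for name in names:
        version = parse_version(name)
        major = version[0]
        # Tuples of ints compare numerically, so 1.10.0 is newer than 1.3.7.
        if major not in latest or version > latest[major][0]:
            latest[major] = (version, name)
    return {major: name for major, (_, name) in sorted(latest.items())}


if __name__ == "__main__":
    dumps = ["dump-v1.3.6", "dump-v1.10.0", "dump-v2.0.2", "dump-v1.3.7"]
    print(latest_per_major(dumps))
```

Comparing versions as integer tuples rather than strings matters here, because a plain string sort would rank v1.3.7 above v1.10.0.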
from gnverifier.
Quite a lot of work.
So to be realistic, I think we are much closer to a day where I can create a replicable protocol using this combination:
- my own draft list of problematic names
- my own set of checklist datasources (i.e., dwc dump of my preferred sources, and extract the needed columns from them to feed gndiff/gnparser)
- gndiff+gnparser CLI
All these are versionable, downloadable, easily citable and standalone executable.
I will closely follow gndiff evolution ;)
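A replicable protocol along these lines could be recorded as a small manifest that pins the tool versions and fingerprints the input files. The sketch below is only an illustration in Python: the manifest layout and function names are my own assumptions, and the tool versions are passed in by hand rather than queried from gndiff or gnparser.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, Iterable


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large checklist dumps do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(tool_versions: Dict[str, str], input_files: Iterable[str]) -> dict:
    """Combine tool versions and input checksums into one citable record."""
    return {
        "tools": dict(sorted(tool_versions.items())),
        "inputs": {
            Path(name).name: sha256_of(Path(name)) for name in sorted(input_files)
        },
    }


def manifest_json(manifest: dict) -> str:
    """Serialize deterministically so two identical setups give identical text."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Publishing such a manifest next to the results lets a reviewer confirm they are running the same tool versions against byte-identical name lists and checklist extracts.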
from gnverifier.
I think it is something that everybody who publishes their results would need, so I think it is not so much work in the end. I'll keep it open and close when the system is in place
Great. I am not sure whether you now mean the gnverifier option, the gndiff option, or both. But any advance would be good for "theoretical" repeatability.
As for the really practical side, I think the gndiff approach is the only good one: it would be easy to replicate something as long as you use the same offline tools, but hardly anybody would take on the task of replicating the whole gnames services as they were at some time in the past just to review the quality of a small experiment or checklist.
from gnverifier.
for gndiff it should be easy, it has no remote dependencies, so just its version defines the result
from gnverifier.
Yes I agree.
Version plus a given combination of request parameters, since it would be best to give users the option to define as much as possible the matching behaviour (of course with default values for everything, to avoid undesired CLI complexity).
Either that or using an editable default config file, so users can see the default values and modify them as needed.
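Something like the following Python sketch is what I have in mind: built-in defaults, optionally overridden by an editable config file, and then by explicit request parameters. The parameter names (`fuzzy_matching`, `max_edit_distance`, `output_format`) and the JSON format are hypothetical and are not actual gndiff options.

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical matching parameters; every one has a default value.
DEFAULTS = {
    "fuzzy_matching": True,
    "max_edit_distance": 1,
    "output_format": "csv",
}


def load_settings(config_path: Optional[str] = None,
                  overrides: Optional[dict] = None) -> dict:
    """Layer settings: defaults, then config file, then explicit overrides."""
    settings = dict(DEFAULTS)
    if config_path is not None and Path(config_path).exists():
        text = Path(config_path).read_text(encoding="utf-8")
        settings.update(json.loads(text))
    if overrides:
        # A value of None means "not given on the command line".
        settings.update({k: v for k, v in overrides.items() if v is not None})
    unknown = set(settings) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")
    return settings
```

Saving the resolved settings together with the tool version would then fully describe a run.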
from gnverifier.
Somehow related, but a bit off-topic.
I have seen some Zenodo links related to your work (i.e. https://doi.org/10.5281/zenodo.5111543). A couple of questions:
- As that links back to github, I understand you prefer the Zenodo link to be cited. Correct?
- Does Zenodo contain a full backup of the github project files by that time?
Or did you need to upload them all to Zenodo yourself (perhaps there is some "auto-zenoding" tool for GitHub projects that you can tell me about)?
- When a project is not yet in Zenodo, which would you say is the best way to cite it on GitHub?
I am a bit lost because the above Zenodo URL links back to https://github.com/gnames/gnverifier/tree/v0.3.3, but I am not sure what "tree" and "v0.3.3" mean in this context. What's the difference between tree v0.3.3 and release v0.3.3 (https://github.com/gnames/gnverifier/releases/tag/v0.3.3)?
Just looking for advice so I might decide to use github and/or zenodo for versioning a checklist in the future.
Thanks a lot in advance
from gnverifier.
Someone wanted to cite gnames, so I created a Zenodo link for that purpose. Being lazy, I prefer to avoid unnecessary work, so I decided not to update these links until someone requests a change again :)
from gnverifier.
OK. I thought you used some kind of auto-backup from GitHub to Zenodo.
As for the difference between github tree v.xxx and release v.xxx, do you have any opinion?
from gnverifier.
I think these tree/vx.x.x and vx.x.x links mean the same. In the case of GitHub links I usually use something like
https://github.com/gnames/gnverifier/releases/tag/v0.8.2
from gnverifier.