poseidon-framework / poseidon-framework.github.io

Main landing page for getting information about the Poseidon project

Home Page: https://www.poseidon-adna.org

HTML 31.24% TeX 68.76%

ancient-dna genotype-data documentation research-data

poseidon-framework.github.io's Issues

Jupyter Tutorial for Bash and R backends

Now that we have a complete workflow using trident, we can aim for a nice Jupyter/Binder tutorial that covers everything from downloading Poseidon packages to exploring and analysing them.

For Bash, @nevrome suggested this workflow:

trident list --remote --packages
trident fetch -d . -f "*2020_Immel_Moldova*,*2020_Wang_subSaharanAfrica*"
trident list -d . --packages
trident summarise -d .
trident survey -d .
trident validate -d .
trident forge -d . -f "<HYR002>,<Gordinesti>" -o test -n Testpackage

We could go further than that and even run a quick PCA and plot the results in R (using a separate notebook with an R backend).

For Binder, we'd need to define an environment that includes trident as well as smartPCA. The latter is available as a conda package, so that's easy. Trident isn't on conda yet.

Poseidon<->trident version mapping table

We should add a table that shows clearly which versions of the Poseidon schema are compatible with which trident version.

Clarify Poseidon.yml details on Webpage

We could add a details page for the Poseidon.yml file, which could also explain the versioning rules.

We could possibly then also link to that page from within the trident docs.
See also poseidon-framework/poseidon-schema#69

Improve repo list on the homepage

The current public repo list is neither especially useful nor easy on the eyes. We should redesign it.

Poseidon version changelog

We should add a page to document the changes from one Poseidon version to the next and how one could update a package accordingly.

Improve per-package routing in the archive explorer

We would like to be able to use per-package URLs to refer to specific package views in the Archive Explorer. Currently, all per-package views are under the same URL https://www.poseidon-adna.org/#/archive_explorer.
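
A minimal sketch of one possible scheme, assuming the explorer could read the package name from the part of the URL fragment that follows #/archive_explorer. The route layout and both function names are hypothetical; the website's actual router may work differently.

```python
from urllib.parse import quote, unquote

# Base URL taken from the issue text; the "/<package>" suffix is an assumption.
BASE = "https://www.poseidon-adna.org/#/archive_explorer"


def package_url(package_name: str) -> str:
    """Build a hypothetical per-package URL by appending the encoded name."""
    return f"{BASE}/{quote(package_name, safe='')}"


def package_from_fragment(fragment: str):
    """Return the package name encoded in a URL fragment, or None if absent."""
    parts = [p for p in fragment.lstrip("#").split("/") if p]
    if len(parts) == 2 and parts[0] == "archive_explorer":
        return unquote(parts[1])
    return None


print(package_url("2020_Immel_Moldova"))
print(package_from_fragment("#/archive_explorer/2020_Immel_Moldova"))
```

With a layout like this, the plain #/archive_explorer fragment would still open the overview, while any longer fragment would select a package view.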

Get terms.rdf.json to be statically hosted

We need to statically serve the file terms.rdf.json, which gets automatically copied here from poseidon-schema (see Issue #3 in that repo). I think https://poseidon-framework.github.io/terms.rdf.json would be a good URL.

We can set up the redirect from w3id.org/poseidon (yet to be registered)

There will be more files to come. The schema files also need to be copied and statically served, so perhaps we should have a sub-directory called static_schema_files or something prettier?

Changes for new Poseidon schema version

Consequence of: poseidon-framework/poseidon-schema#35

Write documentation for the POSEIDON.yml snpSet field.
Explain how snpSet will be handled by trident forge and trident init.
Explain new janno column Genetic_Source_Accession_IDs.
Explain new column Data_Preparation_Pipeline_URL.
Change the documentation of Publication_Status to reflect that it now allows multiple values.

Tutorial pages

We decided today to bundle the existing Getting Started and Tutorials into one place called "Tutorials". Clemens will make a first step, and Stephan can add more sessions (for example the Comp-Book F-Stats session).

Dating confusion

Linking to this blog post, with its excellent figure, might help to make the dating columns clearer.

Nit: Maybe use example domains for documentation purposes

Currently, standard.md uses email addresses in domains such as institute.org. There are a couple of special domains reserved explicitly for documentation purposes. It's not much of an issue here, but using real domains may still inadvertently expose actual people to spam.
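
As an illustration, a small check like the following could flag example addresses that do not use a reserved name. The reserved names are those listed in RFC 2606; the function name is made up for this sketch.

```python
# Second-level domains and top-level domains reserved by RFC 2606
# for documentation and testing.
RESERVED_DOMAINS = {"example.com", "example.net", "example.org"}
RESERVED_TLDS = {"example", "test", "invalid", "localhost"}


def uses_reserved_domain(email: str) -> bool:
    """Return True if the address sits under a name reserved for examples."""
    domain = email.rsplit("@", 1)[-1].lower().rstrip(".")
    if domain.rsplit(".", 1)[-1] in RESERVED_TLDS:
        return True
    return any(domain == d or domain.endswith("." + d) for d in RESERVED_DOMAINS)


print(uses_reserved_domain("jane.doe@example.org"))
print(uses_reserved_domain("jane.doe@institute.org"))
```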

Add a figure to document internal update dependencies

This is only relevant from a developer's perspective, but maybe the homepage is still the right place for it.

I imagine a table or a graph that documents which of our software tools potentially have to be updated if certain parts of the Poseidon schema are modified. For example: A change to the .ssf file specification is very important for trident and poseidon-eager, but not for the janno R package. It's not trivial to think this through, but I think it will help us to estimate the cost of certain operations and not forget about critical updates to our infrastructure.
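
Such a table could start out as a simple lookup, sketched below. Only the .ssf entry is taken from the example above; the structure and the function name are assumptions, and all other entries would still have to be worked out.

```python
# Maps a schema component to the tools that may need an update when it changes.
# Only the ".ssf" entry comes from the issue text; further entries are open.
UPDATE_DEPENDENCIES = {
    ".ssf": {"trident", "poseidon-eager"},
}


def tools_to_update(changed_components):
    """Collect every tool affected by at least one changed schema component."""
    affected = set()
    for component in changed_components:
        affected |= UPDATE_DEPENDENCIES.get(component, set())
    return affected


print(sorted(tools_to_update([".ssf"])))
```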

Acknowledgements

The website should acknowledge the contributions of multiple people:

Michelle OReilly for designing the new logo (not yet in use!)

New "Getting Started" section?

I think it would be great to have a simple "Getting started" page that guides you through:

Downloading the tools
Downloading some packages
Listing some entities
Forging a package
Basic analyses with Xerxes

Archive explorer ToDos and potential enhancements

I just published the new archive explorer. Here are some ToDos for the future, ideally for @93Boy. Generally, each of them deserves its own pull request, but maybe the map-related ones can be solved in one go.

Write a little blog post at https://blog.poseidon-adna.org about the new archive explorer to introduce it to the world.

Currently possible changes for the map:

The per-sample popups in the map should only show the information we actually have for a given sample, so the empty fields should be omitted.
The sample popups should also have a button to open the relevant per-package page.
Modern and ancient samples should be distinguished by marker shape in the map (although the marker clusters generally omit the individual markers, admittedly...)
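
The first point in the list above, omitting empty fields, comes down to a filter over the sample record. The sketch below only shows the logic, in Python rather than in the language of the website; treating "n/a" as a missing-value marker and the field names in the example call are assumptions.

```python
# Values treated as "no information"; "n/a" is an assumed missing-value marker.
EMPTY_VALUES = (None, "", "n/a")


def popup_fields(sample: dict) -> dict:
    """Keep only the fields that actually carry information for the popup."""
    return {
        key: value
        for key, value in sample.items()
        if value not in EMPTY_VALUES and value != []
    }


print(popup_fields({"Poseidon_ID": "HYR002", "Country": "", "Site": None}))
```

Note that a numeric 0 is kept, since it is a real value rather than a gap.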

Currently possible changes beyond the map:

The per-package pages should include some more summary statistics for the package, e.g. how many modern/ancient samples are in the package, how many are male/female/unknown, and what the mean number of SNPs is.
We need a clever mechanism to give each per-package page its own URL, so that they can be referenced directly.
List old package versions on the per-package pages.
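
The summary statistics suggested in the list above could be computed along these lines. This is a sketch under assumptions: the samples are taken to be dictionaries with the keys Date_Type, Genetic_Sex and Nr_SNPs, and the Date_Type values "C14", "contextual" and "modern" are assumed to separate ancient from modern samples.

```python
from collections import Counter
from statistics import mean

# Assumed Date_Type values that mark a sample as ancient.
ANCIENT_DATE_TYPES = {"C14", "contextual"}


def summarise_package(samples):
    """Count modern/ancient samples and sexes, and average the SNP counts."""
    age_counts = Counter()
    for sample in samples:
        date_type = sample.get("Date_Type")
        if date_type == "modern":
            age_counts["modern"] += 1
        elif date_type in ANCIENT_DATE_TYPES:
            age_counts["ancient"] += 1
        else:
            age_counts["unknown"] += 1
    # Missing sex information is counted as "U" (unknown).
    sex_counts = Counter(sample.get("Genetic_Sex") or "U" for sample in samples)
    snp_counts = [s["Nr_SNPs"] for s in samples if s.get("Nr_SNPs") is not None]
    return {
        "age": dict(age_counts),
        "sex": dict(sex_counts),
        "mean_nr_snps": mean(snp_counts) if snp_counts else None,
    }


example = [
    {"Date_Type": "C14", "Genetic_Sex": "F", "Nr_SNPs": 600000},
    {"Date_Type": "modern", "Genetic_Sex": "M", "Nr_SNPs": 400000},
]
print(summarise_package(example))
```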

Changes that will be possible in the future with a more beefed-up version of the Web API:

Load detailed sample information for the per-package pages only per package to save bandwidth.
Display more general meta/context information (e.g. contributors) for each package on the per-package pages.
Render the package changelog on the per-package pages as we did for the old homepage.
Render the literature references on the per-package pages as we did for the old homepage.

@stschiff Do you agree with this? Feel free to add or change any of these suggestions.

Software documentation changes

We decided to provide PDF versions of the documentation for all major software tools and to offer them, including old versions, for download on the website. This would also make the tabs redundant. Ideally, an automatic action will create new PDFs whenever the documentation is updated. This applies to the documentation of existing versions; for new versions, we will still have to create new PDFs and add the old ones to the link list manually.

forge command

Hi!
I just used the forge command as in the documentation, with -f *package1*, *package2*, and got this error:

option -f: (line 1, column 27):
unexpected end of input
expecting white space, "*" or "<"

So I think the documentation needs to be adjusted to reflect that you cannot have spaces between the -f inputs. Thanks!
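
Until the documentation is adjusted, a wrapper script could normalise the selection string before passing it to -f. This sketch assumes, as the error message suggests, that the parser in this version rejects whitespace after a comma; the function name is made up.

```python
def normalize_forge_string(raw: str) -> str:
    """Remove whitespace around the comma-separated entities of a -f string."""
    parts = [part.strip() for part in raw.split(",")]
    # Drop empty entries left behind by stray or trailing commas.
    return ",".join(part for part in parts if part)


print(normalize_forge_string("*package1*, *package2*"))
```

The result has the same shape as the selection strings in the trident workflow quoted earlier on this page, where the entities are separated by commas only.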

Add a tutorial for a common workflow: A Western Eurasian PCA with projection

I was recently asked by @DivyaratanPopli how to prepare a simple Western Eurasian PCA for the projection of ancient samples with the tools and data provided by and for Poseidon. This is a very common application, and I suggest we add a tutorial for it on the website.

Maybe we could elevate such tutorials, including the Getting started section, to their own top-level category of the website.

Make persistent url at https://w3id.org

For linked data, we need a persistent URL to host our terms.rdf and various schema definitions. For this reason, I have provisionally assumed in all schema definitions so far that we'll eventually have such a persistent URL at https://w3id.org/poseidon. This seems to be easy to set up, following the instructions on https://w3id.org, with a redirect to our GitHub page. Later, if we decide to move away from GitHub, perhaps to an MPI-hosted website, we can just change the redirect at w3id.org and don't have to change any of the linked data RDF terms.

So we need to grab https://w3id.org/poseidon and set up that redirect.

Visibility

I made this repo private because the website was seriously outdated but still the most visible thing that came up for people coming from Google. We have to update this package and make it public again.

Update URL to www.poseidon-adna.org?

We could change the URL for our website to

www.poseidon-adna.org

Described here:
https://docs.github.com/en/pages/configuring-a-custom-domain-for-your-github-pages-site

What is unclear to me is whether the old URL will then still work and automatically redirect. That would be good; if it stops working, that would be bad.

Community leader responsible for enforcement of the Code of Conduct

To be as inclusive as possible, I added a solid standard template for a Code of Conduct to the website a while ago: https://poseidon-framework.github.io/#/conduct

I was now made aware that this includes naming a contact address to report concerns:

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to 
the community leaders responsible for enforcement at [INSERT CONTACT METHOD]. 
All complaints will be reviewed and investigated promptly and fairly.

What would be the best way to organize this?

Schema documentation changes

We would link to the schema in PDF format from the website.

move user-facing fstat documentation from trident to xerxes

We need a new help page for xerxes first

flag "--onlyGeno" not further described

On the webpage, the description of the forge command is missing an explanation of what the --onlyGeno flag does.

Column specifications and content

Concerning https://github.com/poseidon-framework/poseidon2-schema/blob/master/janno_columns.tsv

1. Publication Status

Publication_Status	bibtex key (e.g. "@AuthorJournalYear") or "unpublished"

It seems that if the Publication status actually starts with an @, either the bibtex validator doesn't accept the key containing the character, or the poseidon validator prints the following error:

! The .bib file does not contain the literature in the janno file or the bibtex keys are different
! This seems to be a valid package, but some things are fishy.

Removing the @ from the .janno file fixed the issue, but it seems like an update is needed, either to the validator (strip leading @) or to the content explanation for the field.
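
If the fix were made on the validator side, the normalisation could be as small as the sketch below. This only illustrates the idea and is not how trident handles the field; the function name is hypothetical.

```python
def normalize_publication_key(value: str) -> str:
    """Strip one leading "@" so the key can be compared with the .bib entries."""
    value = value.strip()
    return value[1:] if value.startswith("@") else value


print(normalize_publication_key("@AuthorJournalYear"))
print(normalize_publication_key("unpublished"))
```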

2. mtContamination error

mtContam_stderr	Standard error of ContamMix/Schmutzi estimate

ContamMix doesn't actually return a stderr, but a 95% confidence interval instead, making the error around the mode asymmetric. In my own package I have reported the largest difference between MAP and the edges of the 95% confidence interval, but that can be somewhat misleading. It would be good to either allow people to specify mtDNA contamination error as an interval or two fields with min and max of the CI (which can be done for both stderr and 95%CI), or give clear instructions on how one should report a 95%CI here.
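
To make the two options concrete, the sketch below derives both from a ContamMix-style result: the interval bounds as two separate values, and the single number described above (the largest difference between the MAP estimate and either edge of the 95% confidence interval). The function and the key names are hypothetical and not part of the schema.

```python
def summarise_contamination_ci(map_estimate, ci_low, ci_high):
    """Return the CI bounds and the largest deviation of the MAP from them."""
    if not ci_low <= map_estimate <= ci_high:
        raise ValueError("the MAP estimate must lie inside the confidence interval")
    return {
        "estimate": map_estimate,
        "ci_min": ci_low,
        "ci_max": ci_high,
        # Single-number fallback: hides the asymmetry of the interval.
        "largest_deviation": max(map_estimate - ci_low, ci_high - map_estimate),
    }


print(summarise_contamination_ci(0.25, 0.125, 0.5))
```

Reporting ci_min and ci_max keeps the asymmetry that the single value hides.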

Divide docs into subpages

Someone who knows Jekyll better than I do could structure that page more nicely. I would like to have three menu-linked pages leading to the three building blocks (format, tools, repo).

Move documentation from READMEs into this repository

I think embedding the existing READMEs for trident, the standard, and so forth, was very clever, and it certainly has the advantage that we can keep the documentation for trident, the standard and PoseidonR in their respective packages. At least for trident and the standard, however, I feel it also hampers our flexibility to maintain and develop this webpage further. For example, it could make sense to weave the individual docs together more closely to create a comprehensive overview or getting started page, or to document the server API together with trident.

I would suggest moving the individual docs directly into this website repository and replacing the original READMEs with some basic info and a link to this webpage. This would allow us to freely design this webpage to be maximally user-friendly and engaging, without the danger of compartmentalising the docs for the individual parts.

(Note that copying the READMEs over here with a gh-action won't help, I think, as it would still mean that we are forced to keep the docs as separate chapters.)
