poseidon-framework.github.io's Issues

Jupyter Tutorial for Bash and R backends

Now that we have a complete workflow using trident, we can aim at having a nice Jupyter/Binder tutorial, complete from downloading Poseidon packages to exploring and analysing them.

For Bash, @nevrome suggested this workflow:

trident list --remote --packages
trident fetch -d . -f "*2020_Immel_Moldova*,*2020_Wang_subSaharanAfrica*"
trident list -d . --packages
trident summarise -d .
trident survey -d .
trident validate -d .
trident forge -d . -f "<HYR002>,<Gordinesti>" -o test -n Testpackage

We could go further than that and even run a quick PCA and plot the results in R (using a separate R-backend notebook).

For Binder, we'd need to define an environment that includes trident as well as smartPCA. The latter is available as a conda package, so that's easy. Trident isn't on conda yet.

Poseidon version changelog

We should add a page to document the changes from one Poseidon version to the next and how one could update a package accordingly.

Get terms.rdf.json to be statically hosted

We need to statically serve the file terms.rdf.json, which gets automatically copied here from Poseidon-schema (see Issue #3 in that repo). I think just https://poseidon-framework.github.io/terms.rdf.json would be a good URL.

We can set up the redirect from w3id.org/poseidon (yet to be registered).

There will be more files to come. The schema files also need to be copied and statically served, so perhaps we should have a sub-directory called static_schema_files, or something prettier?

Changes for new Poseidon schema version

Consequence of: poseidon-framework/poseidon-schema#35

  • Write documentation for the POSEIDON.yml snpSet field.
  • Explain how snpSet will be handled by trident forge and trident init.
  • Explain new janno column Genetic_Source_Accession_IDs.
  • Explain new column Data_Preparation_Pipeline_URL.
  • Change the documentation of Publication_Status to reflect that it now allows multiple values.

Tutorial pages

We decided today to bundle the existing Getting Started and Tutorials into one place called "Tutorials". Clemens will make a first step, and Stephan can add more sessions (for example the Comp-Book F-Stats session).

Add a figure to document internal update dependencies

This is only relevant from a developer's perspective, but maybe the homepage is still the right place for it.

I imagine a table or a graph that documents which of our software tools potentially have to be updated if certain parts of the Poseidon schema are modified. For example: A change to the .ssf file specification is very important for trident and poseidon-eager, but not for the janno R package. It's not trivial to think this through, but I think it will help us to estimate the cost of certain operations and not forget about critical updates to our infrastructure.
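As a starting point, such a table could be kept as a plain lookup list. The following is only a sketch: the variable `schema_deps` and the function `affected_tools` are hypothetical names, and only the two .ssf rows are taken from the example above; everything else would still have to be worked out.

```shell
# Hypothetical lookup table: one "<schema part> <affected tool>" pair per line.
# Only the ssf rows come from the issue text; further rows would be added
# once the dependencies have been thought through.
schema_deps='ssf trident
ssf poseidon-eager'

# Print every tool that may need an update when the given schema part changes.
affected_tools() {
  printf '%s\n' "$schema_deps" | awk -v part="$1" '$1 == part { print $2 }'
}

affected_tools ssf
```

A graph rendering could be generated from the same list later, so the list would stay the single place to edit.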

Acknowledgements

The website should acknowledge the contributions of multiple people:

  • Michelle OReilly for designing the new logo (not yet in use!)

New "Getting Started" section?

I think it would be great to have some simple "Getting started" page, which guides you through

  • Downloading the tools
  • Downloading some packages
  • Listing some entities
  • Forging a package
  • Basic analyses with Xerxes

Archive explorer ToDos and potential enhancements

I just published the new archive explorer. Here are some ToDos for the future, ideally for @93Boy. Generally each of them deserves its own pull request, but maybe the map-related ones can be solved in one go.


Currently possible changes for the map:

  • The per-sample popups in the map should only show the information we actually have for a given sample, so the empty fields should be omitted.
  • The sample popups should also have a button to open the relevant per-package page.
  • Modern and ancient samples should be distinguished by marker shape in the map (although the marker clusters generally omit the individual markers, admittedly...)

Currently possible changes beyond the map:

  • The per-package pages should include some more summary statistics for the package. E.g. How many modern/ancient samples are in the package? How many are male/female/unknown? What is the mean number of SNPs?
  • We should find a clever mechanism to give each per-package page its own URL, so that the pages can be referenced directly.
  • List old package versions on the per-package pages.
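To illustrate the kind of summary statistic meant in the first point, here is a rough sketch that counts samples by genetic sex. It assumes the package's .janno file is tab-separated, has a header row, and contains a Genetic_Sex column with the values F, M and U; the function name `sex_counts` is made up for this example, and the explorer itself would compute this from whatever data it already loads.

```shell
# Count samples per genetic sex in a tab-separated .janno file.
# Assumption: the header row has a column named Genetic_Sex (values F, M, U).
sex_counts() {
  awk -F '\t' '
    # Find the index of the Genetic_Sex column in the header row.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "Genetic_Sex") col = i; next }
    # Tally the value of that column for every data row.
    col { n[$col]++ }
    END { printf "F=%d M=%d U=%d\n", n["F"], n["M"], n["U"] }
  ' "$1"
}
```

It would be called as `sex_counts path/to/package.janno`. The other statistics (modern/ancient counts, mean number of SNPs) could follow the same pattern with their respective columns.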

Changes that will be possible in the future with a more beefed-up version of the Web API:

  • Load detailed sample information for the per-package pages only per package to save bandwidth.
  • Display more general meta/context information (e.g. contributors) for each package on the per-package pages.
  • Render the package changelog on the per-package pages as we did for the old homepage.
  • Render the literature references on the per-package pages as we did for the old homepage.

@stschiff Do you agree with this? Feel free to add or change any of these suggestions.

Software documentation changes

We decided to provide PDF versions of the documentation for all major software tools and to offer them, including old versions, for download on the website. This would then also make the tabs redundant. Ideally we will have an automatic action that creates new PDFs whenever the documentation gets updated. This concerns the documentation of existing versions. For new versions, we will still have to create new PDFs and add the old ones to the link list manually.

forge command

Hi!
I just used the forge command as in the documentation with -f *package1*, *package2* and got this error:

option -f: (line 1, column 27):
unexpected end of input
expecting white space, "*" or "<"

So I think the documentation needs to be adjusted to reflect that you cannot have spaces between the -f inputs. Thanks!
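A possible explanation, offered here as an assumption rather than a confirmed diagnosis: the error message lists white space among the expected characters, which suggests that the shell, not trident, cut the string at the space because the -f value was not quoted. The sketch below does not call trident at all; it only shows how the shell splits the arguments, using a hypothetical helper `count_args`.

```shell
# Hypothetical helper: report how many arguments the shell passed in.
count_args() { echo "$#"; }

# Disable filename globbing so the asterisks stay literal in this demo.
set -f

# Unquoted: the shell splits at the space, so -f only receives "*package1*,".
unquoted=$(count_args -f *package1*, *package2*)

# Quoted: the whole selection reaches -f as a single argument.
quoted=$(count_args -f "*package1*, *package2*")

set +f

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
```

If this is indeed the cause, either quoting the whole -f value or leaving out the space would avoid the error, and the documentation could mention both.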

Add a tutorial for a common workflow: A Western Eurasian PCA with projection

I was recently approached by @DivyaratanPopli on how to prepare a simple Western Eurasian PCA for the projection of ancient samples with the tools and data provided by and for Poseidon. This is a very common application and I suggest we add a tutorial for that on the website.

Maybe we should elevate such tutorials, including the Getting Started section, to their own top-level category of the website.

Make persistent url at https://w3id.org

For linked data, we need a persistent URL to host our terms.rdf and various schema definitions. For this reason I have provisionally assumed in all schema definitions so far that we'll eventually have such a persistent URL at https://w3id.org/poseidon. This seems to be easy to set up, following the instructions on https://w3id.org, with a redirect to our GitHub page. Later, if we decide to move away from GitHub, perhaps to an MPI-hosted website, we can just change the redirect at w3id.org and don't have to change any of the linked data RDF terms.

So we need to grab https://w3id.org/poseidon and set up that redirect.

Visibility

I made this repo private, because the website was seriously outdated but was still the most visible result for people coming from Google. We have to update this package and make it public again.

Community leader responsible for enforcement of the Code of Conduct

To be as inclusive as possible, I added a solid standard template for a Code of Conduct to the website a while ago: https://poseidon-framework.github.io/#/conduct

I was now made aware that this includes naming a contact address to report concerns:

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to 
the community leaders responsible for enforcement at [INSERT CONTACT METHOD]. 
All complaints will be reviewed and investigated promptly and fairly.

What would be the best way to organize this?

Column specifications and content

Concerning https://github.com/poseidon-framework/poseidon2-schema/blob/master/janno_columns.tsv

1. Publication Status

Publication_Status bibtex key (e.g. "@AuthorJournalYear") or "unpublished"

It seems that if the Publication_Status value actually starts with an @, either the bibtex validator doesn't accept a key containing that character, or the poseidon validator prints the following error:

! The .bib file does not contain the literature in the janno file or the bibtex keys are different
! This seems to be a valid package, but some things are fishy.

Removing the @ from the .janno file fixed the issue, but it seems like an update is needed, either to the validator (strip leading @) or to the content explanation for the field.
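The validator-side option (strip a leading @) could look roughly like the following. This is just a sketch of the string handling in shell, not the validator's actual code, and the function name `strip_at` is made up for illustration.

```shell
# Remove one leading "@" from a bibtex key, if present.
# Values without a leading "@" (e.g. "unpublished") pass through unchanged.
strip_at() {
  printf '%s\n' "${1#@}"
}

strip_at '@AuthorJournalYear'
strip_at 'unpublished'
```

With this normalisation both spellings of a key would compare equal to the entry in the .bib file.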

2. mtContamination error

mtContam_stderr Standard error of ContamMix/Schmutzi estimate

ContamMix doesn't actually return a standard error, but a 95% confidence interval instead, which makes the error around the mode asymmetric. In my own package I have reported the largest difference between the MAP and the edges of the 95% confidence interval, but that can be somewhat misleading. It would be good to either allow people to specify the mtDNA contamination error as an interval, or as two fields with the min and max of the CI (which can be done for both stderr and 95% CI), or to give clear instructions on how one should report a 95% CI here.
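For reference, the workaround described above (reporting the largest distance between the MAP and the CI edges) amounts to the following calculation. It is a sketch only: the function name `ci_to_error` and the numbers in the usage line are illustrative, not real ContamMix output.

```shell
# Collapse an asymmetric 95% CI into one symmetric error value by taking
# the larger of the two distances between the MAP estimate and the CI edges.
# Arguments: MAP estimate, lower CI edge, upper CI edge.
ci_to_error() {
  LC_ALL=C awk -v map="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    below = map - lo
    above = hi - map
    printf "%.4f\n", (below > above ? below : above)
  }'
}

ci_to_error 0.01 0.005 0.03
```

The result overstates the uncertainty on the narrower side of the interval, which is the reason the single value can be misleading.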

Divide docs into subpages

Someone who knows Jekyll better than I do could structure that page more nicely. I would like to have three menu-linked pages leading to the three building blocks (format, tools, repo).

Move documentation from READMEs into this repository

I think embedding the existing READMEs for trident, the standard, and so forth, was very clever, and it certainly has the advantage that we can keep the documentation for trident, the standard and PoseidonR in their respective packages. At least for trident and the standard, however, I feel it also limits our flexibility to maintain and develop this webpage further. For example, it could make sense to weave the individual docs together more closely to create a comprehensive overview or getting started page, or to document the server API together with trident.

I suggest moving the individual pieces of documentation directly into this website repository, and replacing the original READMEs with some basic info and a link to this webpage. This would allow us to freely design this webpage to be maximally user-friendly and engaging, without the risk of compartmentalising the docs for the individual parts.

(Note that copying the READMEs over here with a gh-action won't help, I think, as it would still mean that we are forced to keep the documentation as separate chapters.)
