
Comments (7)

rossmounce commented on August 25, 2024

If that is implying that all trees have to be binary then no, that is not correct.

It is permitted in Newick to have a trifurcation, e.g. (A,(B,C,D)), or a larger polytomy, e.g. (A,(B,(C,D,E,F,G))).
There are unfortunately many slightly different ways of writing Newick.
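
To make the point about polytomies concrete, here is a minimal sketch that counts the children of each internal node in a Newick string. It is not the validator discussed in this thread; the function names `child_counts` and `is_binary` are made up for illustration, and it assumes a simplified Newick dialect (siblings separated by commas, unquoted labels, no comments).

```python
def child_counts(newick: str) -> list[int]:
    """Return the number of children of each internal node, in closing order.

    Assumes simplified Newick: unquoted labels, no comments, and siblings
    separated by commas. Raises ValueError on unbalanced parentheses.
    """
    counts = []  # child counts of internal nodes that have been closed
    stack = []   # commas seen so far at each currently open level
    for ch in newick.strip().rstrip(";"):
        if ch == "(":
            stack.append(0)
        elif ch == ",":
            if not stack:
                raise ValueError("comma outside any parentheses")
            stack[-1] += 1
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')'")
            counts.append(stack.pop() + 1)  # children = commas + 1
    if stack:
        raise ValueError("unbalanced '('")
    return counts


def is_binary(newick: str) -> bool:
    """True only if every internal node has exactly two children."""
    return all(n == 2 for n in child_counts(newick))


print(child_counts("(A,(B,C,D));"))          # [3, 2]: one trifurcation
print(child_counts("(A,(B,(C,D,E,F,G)));"))  # [5, 2, 2]: one 5-way polytomy
print(is_binary("((A,B),C);"))               # True
```

A check that rejects any tree where `is_binary` is false would be stricter than Newick itself, which is the concern raised above.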

from phylotree.

petermr commented on August 25, 2024

On Fri, Aug 7, 2015 at 4:49 PM, Ross Mounce wrote:

> If that is implying that all trees have to be binary then no, that is not
> correct.

That's what it implies, and it calls itself a Validator.

> It is permitted in Newick to have a trifurcation, e.g. (A,(B,C,D)), or a larger
> polytomy, e.g. (A,(B,(C,D,E,F,G))).
> There are unfortunately many slightly different ways of writing Newick
> (https://en.wikipedia.org/wiki/Newick_format).

That's exactly why it's a problem. It may mean that I will have to create a
STK2-specific Newick. In any case the transfer has to be validated.

So the likelihood is that we have a single file of 5000 lines with Newick
in? In which case we will at some stage need a tool to summarize the CTrees
and create one [1].

[1] Yes we can find/grep/cat to concatenate output, but ultimately
summarisation should be done in AMI using some form of map/reduce strategy.


Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069


petermr commented on August 25, 2024

Does this mean that we can try with a small number of trees to test whether the supertree workflow works (even if the answers are not meaningful)?


petermr commented on August 25, 2024

From Ross:

> ??? Do you mean input into STK2 from ami? We just need to concatenate all the *.nwk files into one big *.tre file for STK2. One nwk per line in the STK2 file. No additional re-shaping or reformatting (provided that the taxon names have already been standardised). At most it will entail the subtraction or addition of semicolons at the end of each line.

We haven't decided where the *.nwk files are in the CTree. Since there could be >1 image there will be >1 *.nwk.

> I have validated the Newick generated by AMI this morning. I used the command line mode of TreeGraph 2 to generate new images of the trees in .png & .svg from the .nwk files. 2195 / 2211 were successfully interpreted. Sorry I have not reported this sooner. I will put up details about the errors in the error folder on phylotree ASAP.

So you will flag 16 files as errors in an issue, explain what is wrong and assign them as an issue for me?

> I trust TreeGraph 2 as a validator. Some, like DendroPy (Python), are useful but too strict: they throw a fit at all the unlabelled taxa, so not so useful at this stage.

That's your shout. My point is that I have to know that AMI output is valid. It sounds like some of it isn't.
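
The concatenation step Ross describes (all *.nwk files into one *.tre file, one tree per line, with semicolons added or removed as needed) could be sketched roughly as follows. This is an illustration only: the function names, the recursive directory search, and the `terminator` parameter are assumptions, and whether STK2 expects a trailing semicolon is not settled in this thread.

```python
from pathlib import Path


def normalise_newick(text: str, terminator: str = ";") -> str:
    """Put one Newick tree on a single line with a consistent ending.

    Internal line breaks are collapsed to single spaces, and any trailing
    semicolons are stripped before `terminator` is appended.
    """
    one_line = " ".join(text.split())
    return one_line.rstrip("; ") + terminator


def concatenate(nwk_dir: Path, out_file: Path, terminator: str = ";") -> int:
    """Write every non-empty *.nwk file under nwk_dir to out_file, one per line.

    Returns the number of trees written. Files are visited in sorted path
    order so the output is reproducible.
    """
    lines = []
    for path in sorted(nwk_dir.rglob("*.nwk")):
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files rather than emitting a bare terminator
            lines.append(normalise_newick(text, terminator))
    out_file.write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)
```

Passing `terminator=""` gives the "subtraction of semicolons" variant. The sketch does no validation, so malformed trees would be copied through unchanged.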


rossmounce commented on August 25, 2024

Just uploaded it all to https://github.com/ContentMine/phylotree/tree/master/errors/TreeGraph2-validation-tests

I have now posted a separate issue (#17) for the specific files which appear to be erroneous.


petermr commented on August 25, 2024

There's an error in GitHub:

> Sorry, we had to truncate this directory to 1,000 files. 7,798 entries were
> omitted from the list.

On Fri, Aug 7, 2015 at 5:49 PM, Ross Mounce wrote:

> Just uploaded it all to
> https://github.com/ContentMine/phylotree/tree/master/errors/TreeGraph2-validation-tests
>
> I will post a separate issue for the specific files which appear to be
> erroneous



petermr commented on August 25, 2024

Are these all files in error? (https://github.com/ContentMine/phylotree/tree/master/errors/TreeGraph2-validation-tests)

We need a description of what these files are. They look like potential
input for tests, not errors.

On Fri, Aug 7, 2015 at 6:23 PM, Peter Murray-Rust wrote:

> There's an error in GitHub:
>
> Sorry, we had to truncate this directory to 1,000 files. 7,798 entries
> were omitted from the list.
