Provide a mechanism for validating *.nwk output.

validation of Newick output (phylotree)

petermr commented on August 25, 2024

validation of Newick output

from phylotree.

Comments (7)

rossmounce commented on August 25, 2024

If that is implying that all trees have to be binary then no, that is not correct.

It is permitted in Newick to have a tri-furcation, e.g. (A,(B,C,D)); or a larger polytomy, e.g. (A,(B,(C,D,E,F,G)));
There are unfortunately many slightly different ways of writing Newick.
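
To make the binary-versus-polytomy point concrete, here is a minimal sketch of a structural check that counts the children of each internal node. The function name `child_counts` is hypothetical, and the sketch assumes plain Newick without quoted labels or bracketed comments; it is not how TreeGraph 2, DendroPy or any other validator mentioned in this thread works.

```python
def child_counts(newick: str) -> list[int]:
    """Return the number of children of each internal node, in closing order.

    Raises ValueError on unbalanced parentheses or a missing final semicolon.
    Assumption: no quoted labels and no [comments] in the input.
    """
    text = newick.strip()
    if not text.endswith(";"):
        raise ValueError("missing terminating semicolon")
    stack: list[int] = []   # one comma counter per currently open parenthesis
    counts: list[int] = []
    for ch in text[:-1]:
        if ch == "(":
            stack.append(0)
        elif ch == ",":
            if not stack:
                raise ValueError("comma outside parentheses")
            stack[-1] += 1
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')'")
            counts.append(stack.pop() + 1)  # n commas separate n + 1 children
    if stack:
        raise ValueError("unbalanced '('")
    return counts


def is_strictly_binary(newick: str) -> bool:
    """True only if every internal node has exactly two children."""
    return all(n == 2 for n in child_counts(newick))
```

With this check, `(A,(B,C,D));` is well formed but not strictly binary, so a validator that insists on binary trees would reject a legitimate polytomy.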

petermr commented on August 25, 2024

On Fri, Aug 7, 2015 at 4:49 PM, Ross Mounce [email protected]
wrote:

If that is implying that all trees have to be binary then no, that is not
correct.

That's what it implies, and it calls itself a Validator.

It is permitted in Newick to have a tri-furcation, e.g. (A,(B,C,D)); or a larger
polytomy, e.g. (A,(B,(C,D,E,F,G)));
There are unfortunately many slightly different ways of writing Newick
(https://en.wikipedia.org/wiki/Newick_format).

That's exactly why it's a problem. It may mean that I will have to create a
STK2-specific Newick. In any case the transfer has to be validated.

So the likelihood is that we will have a single file of 5000 lines of
Newick? In which case we will at some stage need a tool to summarize the
CTrees and create that file [1].

[1] Yes we can find/grep/cat to concatenate output, but ultimately
summarisation should be done in AMI using some form of map/reduce strategy.
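
As a rough illustration of the map/reduce idea in [1], the sketch below maps each Newick string to a count of its leaf labels and reduces the per-tree counts into one summary. The thread does not say what the summary should contain, so leaf-label counts are only a stand-in, and the names `map_tree`, `reduce_counts` and `summarise` are hypothetical. This is not AMI code.

```python
import re
from collections import Counter
from functools import reduce

# A leaf label is a token that directly follows "(" or ",".
# Assumption: labels are unquoted and contain none of ( ) , : ; or whitespace.
LEAF = re.compile(r"[(,]\s*([^(),:;\s]+)")


def map_tree(newick: str) -> Counter:
    """Map step: one tree -> counts of its leaf labels."""
    return Counter(LEAF.findall(newick))


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge two partial summaries."""
    return left + right


def summarise(trees) -> Counter:
    """Run the map step over every tree, then fold the results together."""
    return reduce(reduce_counts, map(map_tree, trees), Counter())
```

Because the reduce step is associative, partial summaries could be computed per CTree and merged later in any order.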

Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069

petermr commented on August 25, 2024

Does this mean that we can try with a small number of trees to test whether the supertree workflow works (even if the answers are not meaningful)?

petermr commented on August 25, 2024

From Ross

??? Do you mean input into STK2 from AMI? We just need to concatenate all the *.nwk files into one big *.tre file for STK2, with one Newick tree per line. No additional re-shaping or reformatting is needed (provided that the taxon names have already been standardised). At most it will entail the subtraction or addition of semicolons at the end of each line.
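
The concatenation described above could look roughly like the following sketch. The function names, the recursive search for `*.nwk` under one directory, and the `semicolon` switch are all assumptions: the thread has not settled where the `*.nwk` files live in the CTree or whether STK2 wants the trailing semicolon.

```python
from pathlib import Path


def normalise(newick: str, semicolon: bool = True) -> str:
    """Collapse a tree onto one line with at most one trailing semicolon."""
    one_line = " ".join(newick.split())
    body = one_line.rstrip("; ")
    return body + ";" if semicolon else body


def concatenate(nwk_dir: Path, out_file: Path, semicolon: bool = True) -> int:
    """Write every non-empty *.nwk under nwk_dir to out_file, one tree per line.

    Returns the number of trees written.
    """
    trees = []
    for path in sorted(nwk_dir.rglob("*.nwk")):
        text = path.read_text(encoding="utf-8").strip()
        if text:
            trees.append(normalise(text, semicolon))
    out_file.write_text("".join(t + "\n" for t in trees), encoding="utf-8")
    return len(trees)
```

A call such as `concatenate(Path("ctrees"), Path("all.tre"))` would then produce the single file; both paths here are placeholders.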

We haven't decided where the *.nwk files are in the CTree. Since there could be >1 image there will be >1 *.nwk.

I have validated the Newick generated by AMI this morning. I used the command line mode of TreeGraph 2 to generate new images of the trees in .png & .svg from the .nwk files. 2195 / 2211 were successfully interpreted. Sorry I have not reported this sooner. I will put up details about the errors in the errors folder on phylotree ASAP.

So you will flag 16 files as errors in an issue, explain what is wrong and assign them as an issue for me?

I trust TreeGraph 2 as a validator. Some, like DendroPy (Python), are useful but too strict: they throw a fit at all the unlabelled taxa, so they are not so useful at this stage.

That's your shout. My point is that I have to know that AMI output is valid. It sounds like some of it isn't.
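
The difference between a tolerant and a strict validator, and the flagging of failing files, can be sketched as below. This is only an illustration of the trade-off discussed above: the checks (balanced parentheses, unlabelled leaves) and the names `count_unlabelled_leaves` and `triage` are hypothetical and do not reproduce what TreeGraph 2 or DendroPy actually test.

```python
import re

# An unlabelled leaf: "(" or "," followed directly by ",", ")" or ":".
EMPTY_LEAF = re.compile(r"[(,]\s*(?=[,):])")


def count_unlabelled_leaves(newick: str) -> int:
    return len(EMPTY_LEAF.findall(newick))


def triage(trees: dict[str, str], strict: bool = False):
    """Split named Newick strings into accepted names and rejected problems.

    In strict mode unlabelled leaves are treated as errors; otherwise they
    are tolerated and only structural problems cause a rejection.
    """
    accepted: list[str] = []
    rejected: dict[str, list[str]] = {}
    for name, text in trees.items():
        problems = []
        if text.count("(") != text.count(")"):
            problems.append("unbalanced parentheses")
        unlabelled = count_unlabelled_leaves(text)
        if strict and unlabelled:
            problems.append(f"{unlabelled} unlabelled leaves")
        if problems:
            rejected[name] = problems
        else:
            accepted.append(name)
    return accepted, rejected
```

The `rejected` mapping gives one list of problems per file, which is the kind of detail needed to raise the failing files as a separate issue.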

rossmounce commented on August 25, 2024

Just uploaded it all to https://github.com/ContentMine/phylotree/tree/master/errors/TreeGraph2-validation-tests

I have now posted a separate issue, #17, for the specific files which appear to be erroneous.

petermr commented on August 25, 2024

There's an error in GitHub:

Sorry, we had to truncate this directory to 1,000 files. 7,798 entries were
omitted from the list.


petermr commented on August 25, 2024

Are these all files in error? (https://github.com/ContentMine/phylotree/tree/master/errors/TreeGraph2-validation-tests)

We need a description of what these files are. They look like potential
input for tests, not errors.

