The mktests_jy from jayoung

fix plotting when seqs contain Ns

some plots do NOT show anything in gray for the polymorphisms.

it seems to fail when the alignment contains Ns, but works fine once the Ns are removed

Add methods tab to output

Add methods tab to spreadsheet to capture parameters and warnings, perhaps also a tab to name the seqs in each group

it could get really cluttered. one methods tab PER analysis? one seqs tab PER analysis? (we already have one sites tab per analysis)

check case-sensitivity

double-check that the code can handle both uppercase and lowercase alignments

I should probably just convert everything to uppercase before I even start
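A minimal sketch of the "uppercase everything first" idea, written in Python purely for illustration (the project itself is in R). The function name `normalize_alignment` and the dict-of-strings alignment layout are assumptions, not part of the existing code.

```python
def normalize_alignment(seqs):
    """Return a copy of the alignment with every sequence uppercased.

    `seqs` maps sequence name -> aligned sequence string (hypothetical layout).
    """
    return {name: seq.upper() for name, seq in seqs.items()}


# Mixed-case input; every downstream step would then see only uppercase.
alignment = {"mel_1": "atgAAAccc", "sim_1": "ATGaaaCCn"}
print(normalize_alignment(alignment))
```

Doing this once at the point where the alignment is read in means no later comparison needs to be case-aware.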

add trim option

To help with troubleshooting, make a trim function that considers only some portions of the alignment. It might also be useful in general, e.g. to look only at some domains while retaining the original alignment coordinates.
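One way the trim idea could look, sketched in Python for illustration only (the project is in R). The name `trim_alignment`, the 1-based inclusive coordinates, and the assumption that the reading frame starts at alignment position 1 are all hypothetical choices, not taken from the existing code.

```python
def trim_alignment(seqs, start, end, require_codon_frame=True):
    """Keep alignment columns start..end (1-based, inclusive).

    Returns the trimmed sequences plus the ORIGINAL alignment position of
    each retained column, so results can still be reported in the original
    coordinates. Assumes the reading frame begins at position 1.
    """
    if start < 1 or end < start:
        raise ValueError(f"invalid range {start}-{end}")
    if require_codon_frame and ((start - 1) % 3 != 0 or end % 3 != 0):
        raise ValueError(
            f"range {start}-{end} does not start and end on codon boundaries"
        )
    trimmed = {name: seq[start - 1:end] for name, seq in seqs.items()}
    longest = max((len(seq) for seq in seqs.values()), default=0)
    original_positions = list(range(start, min(end, longest) + 1))
    return trimmed, original_positions


alignment = {"mel_1": "ATGAAACCCTGA", "sim_1": "ATGAAGCCCTGA"}
trimmed, positions = trim_alignment(alignment, 4, 9)
print(trimmed)
print(positions)
```

Returning the original positions alongside the trimmed sequences is what lets a sites table keep reporting coordinates from the untrimmed alignment.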

clearer message when one population has entirely stop codons

the function stops with an error when all seqs from one population have a stop codon. It would be nice to say WHICH POPULATION the error comes from
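A sketch of an error that names the offending population, in Python for illustration only (the project is in R). The function names, the nested-dict layout, and the choice to scan every in-frame codon (including the last one) are assumptions; the real check may treat terminal stops differently.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_stop_codon(seq):
    """True if any in-frame codon (frame starting at position 1) is a stop."""
    seq = seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(0, len(seq) - 2, 3))


def check_populations_for_stops(populations):
    """Raise an error naming the population in which every seq has a stop.

    `populations` maps population name -> {seq name -> sequence}.
    """
    for pop_name, seqs in populations.items():
        if seqs and all(has_stop_codon(s) for s in seqs.values()):
            raise ValueError(
                f"all {len(seqs)} sequences in population '{pop_name}' "
                "contain a stop codon"
            )


populations = {
    "pop1": {"mel_1": "ATGAAACCC", "mel_2": "ATGAAACCA"},
    "pop2": {"sim_1": "ATGTAACCC", "sim_2": "ATGTGACCC"},
}
try:
    check_populations_for_stops(populations)
except ValueError as err:
    print(err)
```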

in combineMKresults add ability to combine polarized and unpolarized results

they have different numbers of columns so we cannot rbind
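One possible approach is to take the union of the column names and fill the gaps with missing values. The sketch below shows the idea in Python for illustration only (the project is in R); the function name `combine_results` and the column names are hypothetical, and results are modeled as lists of dicts rather than data frames.

```python
def combine_results(*tables):
    """Stack result tables that do not share the same columns.

    Each table is a list of dicts (one dict per row). Columns missing from
    a row are filled with None, and column order follows first appearance.
    """
    columns = []
    for table in tables:
        for row in table:
            for col in row:
                if col not in columns:
                    columns.append(col)
    return [
        {col: row.get(col) for col in columns}
        for table in tables
        for row in table
    ]


# Hypothetical column names, for illustration only.
unpolarized = [{"gene": "geneA", "Dn": 5, "Ds": 9}]
polarized = [{"gene": "geneA", "Dn": 5, "Ds": 9, "pop1_Dn": 3, "pop1_Ds": 4}]
for row in combine_results(unpolarized, polarized):
    print(row)
```

In R, `dplyr::bind_rows()` behaves this way for data frames with differing columns, filling the missing ones with `NA`.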

Excel export rounding / NA / numbers stored as text issue

would be nice to round more columns.

I did it for the unpolarized results (asked Excel to SHOW the rounded values, without actually rounding them), but I didn't do it for the polarized results

how am I using the pop1alias and pop2alias values? I thought those would go in colnames of the excel export, but maybe not

check for overlapping seqnames between populations

add a check for situations where pop1 / pop2 / outgroups have seqnames in common. That's a user error.
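A sketch of such a check, in Python for illustration only (the project is in R). The function name `find_shared_seqnames` and the dict-of-lists input are assumptions.

```python
from itertools import combinations


def find_shared_seqnames(groups):
    """Report sequence names that appear in more than one group.

    `groups` maps group name (e.g. pop1, pop2, outgroup) -> list of seqnames.
    Returns {(group_a, group_b): [shared names]} for every overlapping pair.
    """
    shared = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(groups.items(), 2):
        overlap = set(ids_a) & set(ids_b)
        if overlap:
            shared[(name_a, name_b)] = sorted(overlap)
    return shared


groups = {
    "pop1": ["mel_1", "mel_2"],
    "pop2": ["sim_1", "mel_2"],
    "outgroup": ["yak_1"],
}
problems = find_shared_seqnames(groups)
if problems:
    print("seqnames shared between groups:", problems)
```

Reporting every overlapping pair at once, rather than stopping at the first, lets the user fix all the group definitions in one pass.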

pop aliases aren't always used in Excel output

clarify outgroup treatment

I wrote this in MKfunctions.R: "for now we are treating all outgroups (e.g. yak, sec) as one big group"

is that true? I thought I allowed the possibility of a hierarchical outgroup list

in any case, clarify in the README.md how to deal with outgroups

check seqlengths are the same for all seqs in an alignment
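A sketch of what this check might look like, in Python for illustration only (the project is in R). The function name `check_equal_lengths` and the wording of the error message are assumptions.

```python
def check_equal_lengths(seqs):
    """Raise an error if aligned sequences differ in length.

    `seqs` maps sequence name -> aligned sequence string. Returns the
    shared alignment length (0 for an empty alignment).
    """
    lengths = {}
    for name, seq in seqs.items():
        lengths.setdefault(len(seq), []).append(name)
    if len(lengths) > 1:
        details = "; ".join(
            f"{length} bp: {', '.join(names)}"
            for length, names in sorted(lengths.items())
        )
        raise ValueError(f"sequences differ in length ({details})")
    return next(iter(lengths), 0)


alignment = {"mel_1": "ATGAAACCC", "sim_1": "ATGAAACCC", "yak_1": "ATGAAACC"}
try:
    check_equal_lengths(alignment)
except ValueError as err:
    print(err)
```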

bug in polarizing when ancestor is ambiguous

still figuring it out.

see scripts/doMKtests_exampleProblemCodonPolarizing.R

add discussion of reconstructing ancestors

add this discussion to README.md

Ching-Ho takes a different approach when polarizing:
use multiple outgroup species (about 5? includes eugracilis) to reconstruct the mel-sim ancestor, then use that ancestor instead of sim (or instead of mel) in my R scripts to run the unpolarized code. This is equivalent to getting the mel-branch-only MK result.

he uses MEGA for this: make a tree from the alignment (nucleotide level, not codon level), reconstruct ancestors, and it will spit out the most likely nucleotide at each site. Alternatively, you could also ask it to tell you probabilities, and perhaps replace any very uncertain positions with N.

he's also curious about using IQ-TREE to reconstruct ancestors. PAML is another possibility.
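The idea above of replacing very uncertain reconstructed positions with N could be sketched as follows, in Python for illustration only (the project is in R). The function name `mask_uncertain_sites`, the per-site probability list, and the 0.9 cutoff are all assumptions; how the probabilities are exported from MEGA (or another tool) is not covered here.

```python
def mask_uncertain_sites(ancestor, probabilities, min_prob=0.9):
    """Replace low-confidence ancestral bases with N.

    `ancestor` is the most likely reconstructed sequence, and
    `probabilities` holds the probability of the chosen base at each site.
    The 0.9 default cutoff is an arbitrary illustrative value.
    """
    if len(ancestor) != len(probabilities):
        raise ValueError("need exactly one probability per site")
    return "".join(
        base if prob >= min_prob else "N"
        for base, prob in zip(ancestor, probabilities)
    )


ancestor = "ATGAAACCC"
probs = [0.99, 0.98, 0.97, 0.60, 0.99, 0.95, 0.99, 0.99, 0.85]
print(mask_uncertain_sites(ancestor, probs))
```

Because the masked ancestor contains Ns, the downstream code would need to handle Ns in the alignment correctly for this to be useful.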
