Comments (3)
Can you share the parameter file you used for Admixtools 1, the exact commands you used in Admixtools 2, and the outputs you get?
If the results are substantially different, the most likely reason is that the settings don't match. If the settings match, there can still be a small difference; I never tracked down the reason for these very small differences. They seemed too small to be of much relevance. It's possible that it has to do with how the error matrix is estimated. If the boundaries of the jackknife blocks don't match perfectly between Admixtools 1 and 2, that can be enough to make the estimated variances slightly different.
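To illustrate the last point, here is a minimal sketch in base R. It uses simulated data and a simple unweighted delete-one-block jackknife, which is an assumption for illustration and not the exact estimator used by either Admixtools version. The helper block_jackknife_se() and the block size are hypothetical. It only shows that shifting block boundaries on the same data changes the estimated standard error a little.

```r
set.seed(42)
n <- 10000
# Simulated per-SNP statistic with autocorrelation along the genome (made-up data).
x <- as.numeric(arima.sim(list(ar = 0.9), n = n))

# Hypothetical helper: delete-one-block jackknife standard error of the mean.
block_jackknife_se <- function(x, block_id) {
  sums   <- tapply(x, block_id, sum)                 # per-block sums
  counts <- tapply(x, block_id, length)              # per-block SNP counts
  loo    <- (sum(x) - sums) / (length(x) - counts)   # leave-one-block-out means
  g      <- length(loo)
  sqrt((g - 1) / g * sum((loo - mean(loo))^2))
}

pos      <- seq_len(n) - 1
blocks_a <- pos %/% 500          # boundaries at 0, 500, 1000, ...
blocks_b <- (pos + 137) %/% 500  # same block size, boundaries shifted by 137 SNPs

se_a <- block_jackknife_se(x, blocks_a)
se_b <- block_jackknife_se(x, blocks_b)
c(se_a = se_a, se_b = se_b)  # similar, but generally not identical
```

Because chi-squared values depend on the inverse of the estimated covariance matrix, small differences of this kind can carry over into the chi-squared and p-values.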
Thank you for the quick answer. Here are the commands and results.
I don't understand why the chisq values differ so much while the weights do not.
Admixtools 1
Command
library(admixtools)
fn1 <- "test"  # genotype file prefix
left <- c('Muturu', 'Sahiwal')
right <- c('EuropeanBison', 'AmericanBison', 'Simmental', 'Hanwoo')
target <- 'Afar'
# Run the original qpAdm binary through the R wrapper
a <- qpadm_wrapper(fn1, left, right, target, '/opt/ohpc/pub/apps/AdmixTools/bin/qpAdm', parfile = "./parfile", outdir = "./")
Results
weights
  target left    weight mean  se    z
1 Afar   Muturu  0.273  0.273 0.005 54.5
2 Afar   Sahiwal 0.727  0.727 0.005 145.
popdrop
  pat wt dof chisq p     Muturu Sahiwal feasible
1 00  0  2   2.65  0.266 0.273  0.727   TRUE
2 01  1  3   8971. 0     1      0       TRUE
3 10  1  3   2751. 0     0      1       TRUE
Admixtools 2
Commands
library(admixtools)
fn1 <- "test"  # genotype file prefix
left <- c('Muturu', 'Sahiwal')
right <- c('EuropeanBison', 'AmericanBison', 'Simmental', 'Hanwoo')
target <- 'Afar'
mypops <- c(left, right, target)
# Precompute f2 statistics in SNP blocks, then fit the model on them
dt <- f2_from_geno(fn1, pops = mypops, auto_only = FALSE, maxmem = 10000, poly_only = FALSE, blgsize = 5000000)
b <- qpadm(dt, left, right, target)
Results
weights
  target left    weight se      z
1 Afar   Muturu  0.273  0.00488 55.8
2 Afar   Sahiwal 0.727  0.00488 149.
popdrop
  pat wt dof chisq  p     f4rank Muturu Sahiwal feasible best dofdiff
1 00  0  2   4.19   0.123 1      0.273  0.727   TRUE     NA   NA
2 01  1  3   10135. 0     0      1      NA      TRUE     TRUE 0
3 10  1  3   3098.  0     0      NA     1       TRUE     TRUE NA
Thank you for posting the commands!
I had another look at this and found a few reasons why qpadm chi-squared and p-values can differ between Admixtools 1 and Admixtools 2.
1. SNP block boundaries
In Admixtools 1, missing SNPs are not excluded when SNP block boundaries are calculated, but in Admixtools 2 they are excluded first when running f2_from_geno() or extract_f2() (but not when reading the data from genotype files directly). This can result in slightly different SNP blocks, which can affect the p-values a bit. I could change that so the SNP blocks are always computed based on all SNPs to make things more consistent, but the change is not straightforward, and the risk of introducing new bugs may outweigh the benefit.
2. SNP block lengths
Admixtools 1 calculates SNP block boundaries based on all SNPs, but the calculated block lengths do not include missing SNPs. Admixtools 2, when reading the data from genotype files directly, calculated the block lengths based on all SNPs, which is wrong. I changed that in the latest version. So in the latest version, if you run your example again, not with f2_from_geno() but reading the data from genotype files directly, you should see a smaller difference in p-values compared to Admixtools 1.
3. Covariance matrix regularization
Before the covariance matrix is inverted, a regularization term is added to the diagonal elements, which is proportional to the fudge parameter times the trace of the matrix. In Admixtools 1, this is done twice. I have now added a parameter fudge_twice to the qpadm function so that this behavior can be imitated.
4. Very large chi-squared values
After fixing the bug in 2. and setting fudge_twice = TRUE, I now get very close to identical chi-squared and p-values in my tests for qpadm in Admixtools 1 and 2, except sometimes when the chi-squared values are very large. I might get 100 vs 1000 in Admixtools 1 vs 2, for example. I haven't looked into this further, because I don't think differences in that range matter for qpadm.
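As a rough illustration of the regularization point, the following base R sketch adds a term to the diagonal once versus twice. It assumes the term is exactly fudge times the trace and that the test statistic is a plain quadratic form in the inverse covariance matrix. The helper names (regularize, chisq), the simulated matrix, and the est vector are all made up for the example and are not taken from either code base.

```r
# Hypothetical helper: add fudge * trace(m) to every diagonal element of m.
regularize <- function(m, fudge = 1e-4) {
  m + diag(fudge * sum(diag(m)), nrow(m))
}

# Hypothetical helper: quadratic form est' q^{-1} est.
chisq <- function(est, q) sum(est * solve(q, est))

set.seed(1)
x <- matrix(rnorm(200), ncol = 4)
x[, 4] <- x[, 3] + rnorm(50, sd = 1e-3)  # nearly collinear columns, ill-conditioned covariance
qmat <- cov(x)
est  <- c(0.1, -0.2, 0.05, 0.05)         # made-up estimates

once  <- regularize(qmat)                # regularization applied once
twice <- regularize(regularize(qmat))    # applied twice, as described for Admixtools 1

chisq_once  <- chisq(est, once)
chisq_twice <- chisq(est, twice)
c(once = chisq_once, twice = chisq_twice)
```

Applying the step twice adds a slightly more than doubled amount to the diagonal, so the quadratic form can only get smaller. The effect is largest when the covariance matrix is close to singular.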
Please let me know if you still see differences after setting fudge_twice = TRUE and reading the data from genotype files directly. If so, the best way for me to look into this would be to replicate it on your data, if you can share it!
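For reference, a call along these lines is what I have in mind. This is a sketch that reuses the prefix and population names from the commands posted above. The file existence check is only there so the snippet does nothing when the data are absent, and the assumed file extensions (.geno or .bed) are a guess about the data format.

```r
# Values taken from the commands posted earlier in this thread.
fn1    <- "test"
left   <- c("Muturu", "Sahiwal")
right  <- c("EuropeanBison", "AmericanBison", "Simmental", "Hanwoo")
target <- "Afar"

# Assumption: genotype data are in EIGENSTRAT (.geno) or PLINK (.bed) format.
have_data <- any(file.exists(paste0(fn1, c(".geno", ".bed"))))

if (requireNamespace("admixtools", quietly = TRUE) && have_data) {
  # Passing the prefix (not f2_from_geno() output) makes qpadm() read
  # the genotype files directly.
  b <- admixtools::qpadm(fn1, left, right, target, fudge_twice = TRUE)
  print(b$weights)
  print(b$popdrop)
}
```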