uqrmaie1 commented on September 2, 2024

Hi Ben,

thanks for the great suggestion!
I have incorporated your changes into the latest version. This added foreach and doParallel to the list of required packages, but that list is already so long that it barely matters.

Another thing you can do to speed up extract_f2 is to set the fst option to FALSE if you don't need it, because unlike the precomputed f2-statistics, precomputed Fst is not used by any function other than fst.

The fact that it takes so long to run extract_f2 makes me think that you are probably running it for a large number of populations. And I assume that, in order not to lose all SNPs which are missing in one or more populations, you have set maxmiss to a value greater than 0. In most cases that's fine, but in some cases this can give you biased f4-statistics (this happens when SNP missingness status is correlated with allele frequency differences).
I think it can be useful to use maxmiss > 0 for many populations to screen a large number of f-statistics, but it's probably a good idea to later get precise estimates either through the default maxmiss = 0 setting, or (if too many SNPs are lost that way) by skipping precomputed f2-statistics altogether, so that different SNPs can be used for each f3- or f4-statistic.

Another option for using precomputed f2-statistics for many populations without losing most SNPs is to use the qpfstats option which I added to extract_f2 in the latest version. It's described here. It basically calculates smoothed f2-statistics by regressing all possible f2-, f3-, and f4-statistics on a design matrix that describes which f2-statistics contribute to which f2-, f3-, and f4-statistic. The implementation of this in extract_f2 doesn't overlap much with the standard calculation of f2-statistics, so none of the options for making the computation faster or more memory efficient apply here, and I doubt it will currently work for a large number of populations. But I can look into options for getting this to work for many populations, if this turns out to be useful!
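To make the regression idea concrete, here is a toy sketch in plain Python. It is not the admixtools implementation: the four population names, the "true" f2 values, the noise level, and all helper names are made up for illustration. It only relies on the standard identities f3(A; B, C) = (f2(A,B) + f2(A,C) - f2(B,C)) / 2 and f4(A, B; C, D) = (f2(A,D) + f2(B,C) - f2(A,C) - f2(B,D)) / 2 to build the design matrix, and then solves an ordinary least-squares problem for the f2 values.

```python
from itertools import combinations
import random

pops = ["A", "B", "C", "D"]                    # hypothetical population labels
pairs = list(combinations(pops, 2))            # the 6 unknown f2-statistics
col = {frozenset(p): i for i, p in enumerate(pairs)}

def row(terms):
    """One design-matrix row from {(pop1, pop2): coefficient on f2(pop1, pop2)}."""
    r = [0.0] * len(pairs)
    for (p, q), c in terms.items():
        r[col[frozenset((p, q))]] += c
    return r

def f3_row(a, b, c):
    # f3(a; b, c) = (f2(a,b) + f2(a,c) - f2(b,c)) / 2
    return row({(a, b): 0.5, (a, c): 0.5, (b, c): -0.5})

def f4_row(a, b, c, d):
    # f4(a, b; c, d) = (f2(a,d) + f2(b,c) - f2(a,c) - f2(b,d)) / 2
    return row({(a, d): 0.5, (b, c): 0.5, (a, c): -0.5, (b, d): -0.5})

# Stack rows for all f2-, f3-, and f4-statistics of the four populations.
X = [row({p: 1.0}) for p in pairs]
for a in pops:
    others = [p for p in pops if p != a]
    X += [f3_row(a, b, c) for b, c in combinations(others, 2)]
X += [f4_row("A", "B", "C", "D"), f4_row("A", "C", "B", "D"), f4_row("A", "D", "B", "C")]

def least_squares(X, y):
    """Solve the normal equations (X'X) beta = X'y by Gauss-Jordan elimination."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(n)]
    for i in range(n):
        piv = max(range(i, n), key=lambda k: abs(A[k][i]))
        A[i], A[piv] = A[piv], A[i]
        for k in range(n):
            if k != i:
                f = A[k][i] / A[i][i]
                A[k] = [x - f * z for x, z in zip(A[k], A[i])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Made-up "true" f2 values, plus independent noise on every observed statistic
# (standing in for each statistic being estimated from a different set of SNPs).
true_f2 = [0.010, 0.020, 0.030, 0.015, 0.025, 0.012]
rng = random.Random(0)
observed = [sum(c * t for c, t in zip(r, true_f2)) + rng.gauss(0, 0.001) for r in X]

smoothed = least_squares(X, observed)
for p, est, raw in zip(pairs, smoothed, observed):
    print(p, "raw f2:", round(raw, 5), "smoothed f2:", round(est, 5))
```

With four populations this gives 21 observed statistics (6 f2, 12 f3, 3 f4) for only 6 unknowns, which is why the fitted f2 values are "smoothed". The number of rows grows quickly with the number of populations, which fits the concern above about scaling to many populations.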

Robert

from admixtools.
