uqrmaie1 avatar uqrmaie1 commented on September 2, 2024

Thanks for bringing this to my attention!

The likely reason why you see the warning about the discarded blocks is that in these blocks both the numerator and the denominator of FST are 0 for one or more population pairs. This can happen if all SNPs in a block have the same genotype in two populations. It will happen more often if you read data for many populations, since every block where at least one pair is missing will be discarded. If you don't want those blocks to be discarded, you can pass the option remove_na = FALSE to fst(). This will still trigger a warning about blocks having missing data, but now they shouldn't be discarded.
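To make the discarding rule concrete, here is a minimal Python sketch. It is not admixtools code: the helper names (ratio, block_fst, drop_incomplete) and the numbers are made up for illustration. It only shows how a single 0/0 pair turns into NaN and takes the whole block with it when incomplete blocks are dropped.

```python
import math

def ratio(num, den):
    """Return num / den, or NaN when the denominator is 0 (e.g. the 0/0 case)."""
    if den == 0:
        return math.nan
    return num / den

def block_fst(blocks):
    """blocks: list of dicts mapping a population pair to (numerator, denominator)."""
    return [{pair: ratio(n, d) for pair, (n, d) in block.items()} for block in blocks]

def drop_incomplete(per_block):
    """Keep only blocks in which every population pair has a defined estimate."""
    return [b for b in per_block if not any(math.isnan(v) for v in b.values())]

# Hypothetical per-block numerators and denominators for two population pairs.
blocks = [
    {("A", "B"): (0.02, 0.5), ("A", "C"): (0.03, 0.6)},
    {("A", "B"): (0.0, 0.0), ("A", "C"): (0.01, 0.4)},  # A and B identical in this block
    {("A", "B"): (0.01, 0.25), ("A", "C"): (0.02, 0.5)},
]

per_block = block_fst(blocks)
kept = drop_incomplete(per_block)
print(len(per_block), len(kept))
```

The second block is dropped even though the pair (A, C) is well defined there, which is why the number of discarded blocks tends to grow with the number of populations.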

I added the option to compute FST relatively late when making the package, and because of that some FST-related functions don't always behave as expected. In particular, fst(f2_blocks) doesn't compute FST; it just turns a 3d-array of per-block f2-statistics into a data frame of f2-statistics with standard errors. I changed the documentation of the fst() function to make this clearer. FST can't actually be computed from f2 alone: f2 and FST are calculated separately and are stored in different files. You can still pass a 3d-array of per-block numbers to the fst() function, because those numbers could be per-block FST estimates, which would then be turned into FST estimates plus standard errors. You can get per-block FST estimates by running f2_from_precomp() with fst = TRUE. Nothing in that 3d-array records whether f2 or FST was read by f2_from_precomp() (the function defaults to reading f2), so the fst() function doesn't complain about getting the wrong input when you pass it an array of per-block f2 estimates. And it doesn't complain about missing data, because in places where the FST estimates are 0/0 = NaN, the f2 estimates are just 0.
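As a rough illustration of why the array's contents are interchangeable from the function's point of view, here is a sketch of a plain delete-one block jackknife in Python. This is an assumption about the general technique, not a description of what fst() does internally; the actual procedure may differ, for example by weighting blocks by their SNP count. The function name jackknife is hypothetical.

```python
import math

def jackknife(block_values):
    """Unweighted delete-one jackknife over per-block estimates.

    Returns (estimate, standard_error). The values may be per-block f2 or
    per-block FST; the arithmetic is identical and cannot tell them apart.
    """
    n = len(block_values)
    if n < 2:
        raise ValueError("need at least two blocks")
    total = sum(block_values)
    # Leave-one-out means: the estimate with block i removed.
    loo = [(total - x) / (n - 1) for x in block_values]
    center = sum(loo) / n
    estimate = total / n
    se = math.sqrt((n - 1) / n * sum((t - center) ** 2 for t in loo))
    return estimate, se

est, se = jackknife([1.0, 2.0, 3.0, 4.0])
print(est, se)
```

Because the routine only ever sees numbers, any check for "wrong" input would have to come from metadata that the array does not carry.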

Another thing I noticed is that you use the option apply_corr = FALSE. This option only affects f2-statistics, not FST (where this correction is always applied), but I don't know if there are any cases where you want to not apply the correction factor. Without it, your estimates might be biased upwards. I just included this option to make debugging easier for myself!

Hope this makes a bit more sense now, and sorry that the documentation was misleading here!

from admixtools.
