Comments (5)

danking commented on June 3, 2024

Hi @EugeneEA, if you want to convert every missing GT to homozygous reference, try this:

mt = mt.annotate_entries(GT = hl.coalesce(mt.GT, hl.call(0, 0)))

from hail.

EugeneEA commented on June 3, 2024

Thanks a lot!

ag14774 commented on June 3, 2024

Unfortunately, this does not work for me. I started with a table in coordinate format and converted it to a matrix table. If I run mt.show(), some values are missing, as expected. But if I then run mt = mt.annotate_entries(GT = hl.coalesce(mt.GT, hl.call(0, 0))), nothing is filled: the NA values remain NA. If I run mt.entry.take(5) or mt.GT.take(5), there are no None values in the list. It's as if mt.GT only contains the non-missing data. Similarly, if I run mt.annotate_entries(test=1), the new field is added only to the coordinates where GT is non-missing, and test is NA everywhere else. Any help would be appreciated.

danking commented on June 3, 2024

Hey @ag14774!

Short answer: call mt = mt.unfilter_entries() before you call annotate_entries.

This is a bit confusing, but there are two distinct ideas in Hail:

  1. filteredness
  2. missingness

A row can be filtered, a column can be filtered, and an entry can be filtered. When rows or columns are filtered, all their entries are also filtered and the row or column key is excluded entirely. When an entry is filtered, it's just that individual entry that is removed.

In contrast, any particular value can be "missing". A field, like GT, can be missing, but so can a particular element of an array, or a particular field of an hl.struct.

We think of "filtered" data as not even existing. It's not included in the denominator of, say, hl.agg.mean.

We think of "missing" data as something that exists but is hidden from us. When we do statistics, we need to decide how to treat that data: sometimes we mean-impute, sometimes we convert to hom-ref, and sometimes we use a sophisticated model that handles missingness directly.
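To make the distinction concrete, here is a toy model in plain Python. It is not Hail code and not how Hail is implemented: the dict, the helper names (annotate_gt, unfilter, coalesce) and the sample keys are all made up for illustration. An absent key stands for a filtered entry, and a value of None stands for a missing one.

```python
# Toy model only: a dict stands in for a matrix table's entries.
# Absent key = "filtered" entry; value None = "missing" entry.
ROWS = ["v1", "v2"]
COLS = ["s1", "s2"]
HOM_REF = (0, 0)


def annotate_gt(entries, fn):
    """Apply fn to every entry that exists; filtered entries are untouched."""
    return {key: fn(gt) for key, gt in entries.items()}


def unfilter(entries):
    """Bring filtered entries back as existing-but-missing (None)."""
    return {(r, c): entries.get((r, c)) for r in ROWS for c in COLS}


def coalesce(value, default):
    """Return value unless it is missing, in which case return default."""
    return default if value is None else value


# Built from sparse coordinate data: only two of the four entries exist.
entries = {("v1", "s1"): (0, 1), ("v2", "s2"): (1, 1)}

# Coalescing alone changes nothing, because no existing entry is missing.
filled_too_early = annotate_gt(entries, lambda gt: coalesce(gt, HOM_REF))

# Unfiltering first turns absent entries into missing ones that coalesce can fill.
filled = annotate_gt(unfilter(entries), lambda gt: coalesce(gt, HOM_REF))

print(len(filled_too_early), len(filled))  # prints: 2 4
```

In this model the first attempt behaves like the report above: the annotation only ever sees entries that exist, so there is nothing for coalesce to replace. Once the filtered entries are restored as missing, the same annotation fills them with hom-ref.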

ag14774 commented on June 3, 2024

Thanks @danking that worked!
