
immerge's People

Contributors

pettyalex, wanying-zhu

immerge's Issues

IMPUTED/TYPED flag not being passed through to the merged file

I am having an issue where the final merged output VCF file does not carry the correct flag for whether each SNP was IMPUTED or TYPED. The flags are all listed correctly in the input info file and the individual input VCFs, but every SNP is marked IMPUTED in the final output VCF. I have tried this with '--mixed_genotype_status' turned off (to output IMPUTED/TYPED based on the first input VCF) and with it turned on (to output ALL/SOME/NONE). Either way, IMMerge reports every SNP as IMPUTED or NONE.
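
A minimal sketch of the behaviour expected from the description above. It is not IMMerge's code: the function name `merged_status` is hypothetical, and it assumes that ALL/SOME/NONE count how many input files list the SNP as TYPED.

```python
def merged_status(flags, mixed_genotype_status=False):
    """Return the flag expected in the merged VCF for one SNP.

    flags: the per-input flags ("TYPED" or "IMPUTED"), in input-file order.
    Hypothetical helper; the ALL/SOME/NONE meaning is an assumption.
    """
    if not flags:
        raise ValueError("need at least one input flag")
    if not mixed_genotype_status:
        # Option off: take the flag from the first input VCF.
        return flags[0]
    # Option on: summarise how many inputs genotyped the SNP.
    typed = sum(flag == "TYPED" for flag in flags)
    if typed == len(flags):
        return "ALL"
    return "SOME" if typed else "NONE"
```

Under these assumptions, a SNP that is TYPED in the first input and IMPUTED in the second should come out as TYPED with the option off and SOME with it on, rather than IMPUTED or NONE.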

IMMerge Missing SNPs because of Sorting Method

This follows up on my post about the updated TOPMed. In addition to the changes in the format of the info files, the TOPMed dosage files contain duplicate SNPs that IMMerge mishandles.

Our attempt to merge VCF files from TOPMed crashed with the following error: "reached end of file...but SNP chr9:205964:G:A is not found". In fact, chr9:205964:G:A and chr9:205964:A:G are both present in the info files and dosage files. Here is the order of the SNPs in the info and dosage files:

23519 chr9 205964 9:205964  G  A
23520 chr9 205964 rs478882  A  G

But in the variants retained file, the order of the SNPs is reversed.

                SNP      REF.0.  ALT.1. 
175 chr9:205964:A:G      A      G       
176 chr9:205964:G:A      G      A   

The order of the SNPs is also reversed in the index file.

I believe this occurred because the SNPs are sorted by position and then by SNP ID when the retained and excluded lists are created. As a result, when IMMerge walked down the retained SNP list, it found the A:G version on line 23520 of the dosage file. It then started searching for the next SNP in the retained list, the G:A version, from line 23521 of the dosage file and searched to the bottom of the file. Of course, it missed the SNP, since that SNP sits on line 23519, above where the search started.
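
The failure described above can be reproduced in a few lines. This is a sketch under stated assumptions, not IMMerge's actual code: `forward_scan` is a hypothetical stand-in for a reader that only moves forward through the dosage file, and the two SNP IDs are the ones from this report.

```python
def forward_scan(file_order, wanted):
    """Look up each wanted SNP by scanning file_order forward only.

    Hypothetical stand-in for a reader that never rewinds the dosage file.
    """
    found, pos = [], 0
    for snp in wanted:
        # Advance until the wanted SNP is reached or the file ends.
        while pos < len(file_order) and file_order[pos] != snp:
            pos += 1
        if pos == len(file_order):
            raise LookupError(f"reached end of file but SNP {snp} is not found")
        found.append(snp)
        pos += 1
    return found


# Order of the two SNPs in the info and dosage files.
file_order = ["chr9:205964:G:A", "chr9:205964:A:G"]

# Sorting by SNP ID puts A:G first, which reverses the file order.
retained_sorted = sorted(file_order)

# Possible workaround: order the retained list by each SNP's rank in the file.
rank = {snp: i for i, snp in enumerate(file_order)}
retained_in_file_order = sorted(retained_sorted, key=rank.__getitem__)
```

Calling `forward_scan(file_order, retained_sorted)` raises the lookup error, while `forward_scan(file_order, retained_in_file_order)` finds both SNPs. Keeping the retained list in file order (or breaking position ties by file line number) is one possible way to avoid the miss without rewinding the file.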

Do you have any suggestions for a quick fix of this problem? IMMerge has been very useful to us despite this glitch. We would like to continue to use it with the new TOPMed files.

--mixed_genotype_status

Hi there,
Has the optional argument '--mixed_genotype_status' been disabled? I tried to include '--mixed_genotype_status true' in the command and got the error message 'merge_files.py: error: unrecognized arguments: --mixed_genotype_status true'.
I am using IMMerge version 0.0.3.
Thank you!

Do you plan to update IMMerge for use with TOPMed r3?

It appears that TOPMed has changed the format of the info files it produces. For example, rather than a single Genotyped field, there are now two fields, IMPUTED and TYPED. Do you plan to update IMMerge to be compatible with the new info file format?
