tbrunetti / gthack
Code to hack through, manipulate, and extract information from GTC files
License: MIT License
My understanding is that the manipulateGTC option currently updates the genotype calls and the base calls (the 1002 and 1003 byte arrays, respectively) based on the input alleles in the updates file and the allele combination in the manifest ([A/B] in the manifest). However, the logic in manipulateGTC currently doesn't work in two scenarios:
When indel updates with an AA/BB combination (II, DD) are specified in the updates.txt file, snpUpdate ignores them. I assume this is because of the conditional check in this line of the snpUpdate function: the condition only checks for A, T, G, C in the input line, so indels are skipped.
Currently, the base calls in the GTC file (the 1003 byte array) are updated from the alleles given in the input updates file (ref). This sometimes produces wrong base calls, because GTC files use the TOP strand allele combination to generate the base calls for a SNP, and that combination can differ from the allele combination listed for the SNP in the BeadPoolManifest.
GTC file documentation reference: https://github.com/Illumina/BeadArrayFiles/blob/develop/docs/GTC_File_Format_v5.pdf (TOC Entry table)
I've managed to find a workaround for this by updating the base calls using the TOP strand combination found in the CSV format of the BeadPoolManifest.
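To make the idea concrete, here is a minimal sketch of deriving a base call from a TOP strand allele pair. It is not the code in the fork. The helper names `parse_alleles` and `base_call` are hypothetical, and it assumes that the TOP strand alleles appear in the manifest as a bracketed pair such as `[A/G]`, that genotype codes are 0 = no call, 1 = AA, 2 = AB, 3 = BB, and that a no call is written as `--`. Please check these against the GTC format document before relying on them.

```python
import re

# Bracketed allele pair such as [A/G] or [I/D]; field is upper-cased before matching.
ALLELE_PATTERN = re.compile(r"\[([ACGTID])/([ACGTID])\]")


def parse_alleles(field):
    """Return (allele_a, allele_b) from a manifest field containing [A/B].

    Hypothetical helper: which manifest column holds the TOP strand
    alleles is an assumption and is left to the caller.
    """
    match = ALLELE_PATTERN.search(field.upper())
    if match is None:
        raise ValueError("no [A/B] allele pair found in: %r" % field)
    return match.group(1), match.group(2)


def base_call(genotype_code, allele_a, allele_b):
    """Map a genotype code to a two-byte base call.

    Assumed codes: 0 = no call, 1 = AA, 2 = AB, 3 = BB.
    Assumed no-call representation: b"--".
    """
    calls = {
        0: "--",
        1: allele_a * 2,
        2: allele_a + allele_b,
        3: allele_b * 2,
    }
    if genotype_code not in calls:
        raise ValueError("unsupported genotype code: %r" % genotype_code)
    return calls[genotype_code].encode("ascii")
```

For example, `base_call(2, *parse_alleles("acgt[A/G]ttga"))` builds the heterozygous call from the TOP strand pair rather than from the alleles in the updates file.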
BeadPoolManifest file documentation for reference: https://knowledge.illumina.com/microarray/general/microarray-general-reference_material-list/000001565
I've temporarily addressed both scenarios and pushed the changes to a fork of this repo: https://github.com/sgopalan98/GThaCk/tree/fixing-bug-manipulate-gtc
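For the indel scenario, the check could be relaxed roughly as follows. This is only a sketch, not the actual condition in snpUpdate or in the fork. The function name `is_supported_update` is hypothetical, and it assumes that indel alleles are written as `I` and `D` and are never mixed with nucleotide alleles in a single update.

```python
NUCLEOTIDES = frozenset("ATGC")
INDEL_ALLELES = frozenset("ID")


def is_supported_update(allele1, allele2):
    """Return True if both alleles are nucleotides or both are indel codes.

    Hypothetical replacement for a check that only accepts A, T, G, C.
    Mixed pairs such as ("A", "I") are rejected by assumption.
    """
    pair = {allele1.upper(), allele2.upper()}
    return pair <= NUCLEOTIDES or pair <= INDEL_ALLELES
```

With a check like this, updates such as II and DD pass through instead of being skipped, while anything outside the two allele sets is still rejected.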
It would be really helpful if you could look at these bugs and see whether there is a better fix. Thank you!