Comments (5)
RAxML considers those sites invariant because the N could be an A, T, G, or C. My script should remove those two sites, which is what it needs to do, because RAxML will crash otherwise. That's not the normal definition of monomorphic sites, but it's how RAxML handles them.
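To make that rule concrete, here is a minimal Python sketch of the behavior described above. It is not code from the raxml_ascbias script; `looks_invariant` and `variable_columns` are hypothetical names, and treating gaps and other ambiguity codes the same way as N is an assumption.

```python
# Hypothetical helpers, not taken from the raxml_ascbias script.
UNAMBIGUOUS = set("ACGT")


def looks_invariant(column):
    """Return True if the column has at most one distinct unambiguous base.

    Anything outside A/C/G/T (N, gaps, other ambiguity codes) is treated as
    undetermined and ignored. That is the assumption described above.
    """
    observed = {base.upper() for base in column} & UNAMBIGUOUS
    return len(observed) <= 1


def variable_columns(sequences):
    """Yield the indices of columns that still count as variable."""
    for index, column in enumerate(zip(*sequences)):
        if not looks_invariant(column):
            yield index
```

Under this rule a column such as A, A, N is dropped even though it contains two different characters, because the N could stand for an A.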
from raxml_ascbias.
I'm trying to think what else could be causing the inflated counts, because their sum (the A, C, G, and T counts added together) is basically the number of sites in the entire bacterial genome: roughly 2.2M. If all the sites were invariant, that would imply the isolates are clonal, which they are not.
from raxml_ascbias.
It could be gaps. I'm not sure about that, but if you want I can take a look this weekend. I seem to remember that this script might not be perfect on all datasets, and I think gaps may throw it off. I'm normally very thorough with my coding and testing practices, but this was one of my early scripts from when I was learning Python, and I've noticed it sometimes has issues. I also wrote it as a quick and dirty script for a specific need that I had.
If it's the same dataset you sent me the other day, I'll go ahead and look at it. I'm not super speedy about getting to this stuff lately because I have a newborn baby at home, but I'll try to work on it this weekend.
from raxml_ascbias.
Congrats on the baby! I totally understand. I can already tell you that the whole-genome SNP alignment was generated in snippy and that it contains gaps and Ns. I'm awful with Python, so kudos to you. I might need to send you a Dropbox link to another file, though. I'll probably send that later tonight. Thanks!
from raxml_ascbias.
Okay, I am closing this for real now. You have been a big help. After thinking about this more, I found the solution. I used snp-sites -cbp outputprefix input.phylip to extract only the monomorphic sites made up of A, C, G, and T, and to export them to a phylip file. Then I used that phylip file as input to your script. The results are exactly the same as those from the internal script, so everything is fine. Enjoy your time with the new baby. You can ignore the .aln file I sent. Take care!
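For comparison, the stricter definition used in that check (every sequence carries the same A, C, G, or T, with no N or gap anywhere in the column) can be sketched in Python as follows. This is only an illustration of the idea, not what snp-sites or the raxml_ascbias script runs internally; `count_monomorphic_acgt` is a hypothetical name, and the sequences are assumed to be already loaded as equal-length strings.

```python
def count_monomorphic_acgt(sequences):
    """Count columns in which every sequence has the same A, C, G, or T.

    Columns containing N, gaps, or any other character are skipped, so this
    is the strict definition rather than the N-tolerant one.
    """
    counts = {base: 0 for base in "ACGT"}
    for column in zip(*sequences):
        bases = {base.upper() for base in column}
        if len(bases) == 1:
            (base,) = bases
            if base in counts:
                counts[base] += 1
    return counts
```

If the four counts add up to nearly the full alignment length for isolates that are known to differ, that suggests columns with gaps or Ns are being counted as invariant somewhere upstream.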
from raxml_ascbias.