niu-lab / gclust

12.0 12.0 3.0 27.66 MB

genome sized sequences clustering

License: Apache License 2.0

Makefile 0.02% C++ 99.06% C 0.77% Perl 0.15%

bacterial-genomes clustering-methods metagenomics parallel-computing

gclust's People

daichuang liruilin7086 guozihuaa

gclust's Issues

The coverage problem and (maybe) wrong cluster problem

Thanks for your wonderful tool!

My question is whether there are any parameters related to alignment coverage.
For example:

As the picture shows, the query genome and the target genome share 99.46% identity but only 84% coverage. When I set "-memiden" to 99, they are assigned to the same cluster.

So, are there any parameters for filtering by a "coverage" threshold?
In my experiment, there are two highly similar genomes; their identity and coverage are displayed in the picture below:

However, when I set "-memiden" to 99, they are assigned to different clusters, which really confuses me. I am not sure what is going on.

(All the alignments in the pictures were done with the online megablast alignment tool.)
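
To make the identity versus coverage distinction concrete, here is a minimal Python sketch of how query coverage could be computed from alignment intervals and checked alongside identity as a post-filter. It is not part of gclust and says nothing about how "-memiden" works internally; the function names, the 1-based inclusive interval convention, and the 0.95 coverage cutoff are all assumptions made for illustration.

```python
def merged_coverage(intervals, query_length):
    """Return the fraction of the query covered by at least one interval.

    intervals: iterable of (start, end) pairs, 1-based and inclusive
    (an assumed convention, similar to tabular alignment output).
    Overlapping or adjacent intervals are merged so no base counts twice.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    covered = 0
    current_start = current_end = None
    # Normalise reversed pairs, then sweep the intervals in sorted order.
    for start, end in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if current_end is None or start > current_end + 1:
            # A gap: close the previous merged block, start a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / query_length


def passes_thresholds(identity, coverage, min_identity=99.0, min_coverage=0.95):
    """Hypothetical post-filter: require both identity (%) and coverage (0-1)."""
    return identity >= min_identity and coverage >= min_coverage
```

With the numbers from the first example, `passes_thresholds(99.46, 0.84)` returns `False`: the pair clears a 99% identity cutoff but fails the assumed 95% coverage cutoff, which is the case an identity-only threshold cannot separate.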

No output with the actual genome/contigs clusters sequence

Dear developers,

Thank you for this useful tool. I followed your instruction manual, but running gclust exactly as you did produces no output file with the actual nucleotide sequences of the clusters, only a list of the clusters with the genomes/contigs in each.
It would be a much better and easier-to-use tool if it produced output similar to cd-hit's, with a FASTA file of representative cluster sequences.

Thank you and best regards
Vadim
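
As a possible workaround until such an output exists, representative sequences could be pulled out of the original input FASTA once their IDs are known. The Python sketch below assumes you have already collected the representative IDs from the cluster list yourself (it does not parse gclust's output, whose format is not described in this thread); `read_fasta` and `write_representatives` are hypothetical helper names, and a sequence ID is assumed to be the first whitespace-delimited token of the header.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if line.startswith(">"):
                # A new record starts: emit the previous one first.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def write_representatives(fasta_path, representative_ids, out_path):
    """Write only the records whose ID is in representative_ids.

    Returns the number of records written. The ID is assumed to be
    the first whitespace-delimited token of the FASTA header.
    """
    wanted = set(representative_ids)
    written = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(fasta_path):
            tokens = header.split()
            seq_id = tokens[0] if tokens else ""
            if seq_id in wanted:
                out.write(f">{header}\n{seq}\n")
                written += 1
    return written
```

Comparing the returned count with the number of IDs supplied is a quick way to notice representatives that were not found in the input file.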

Update makefile

Heya, great program, really convenient. I just finished using it. I just wanted to note that you might want to update the line in the README on how to find representatives, as I think it points to an old file. It should be:
make -f gclust/script/makefile_createreps
instead of
make -f gclust/script/makefile

Thank you for writing this!

multiple fasta as input

Hi.

The examples show the input as a single FASTA file. I have several files I need to cluster; is there a way to do this w/o changing the source code?

thanks.

-ricardo
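
One approach that needs no source changes, offered as a sketch and not as a confirmed gclust feature, is to merge the files into a single FASTA before running the tool. The Python below does that and also rejects duplicate sequence IDs, since clashing IDs across files would make the cluster list ambiguous. `concatenate_fasta` is a hypothetical helper name, and the ID is assumed to be the first whitespace-delimited token of each header. Any preprocessing the README asks for on the input would still need to be applied to the merged file.

```python
def concatenate_fasta(input_paths, out_path):
    """Merge several FASTA files into one and return the record count.

    Blank lines are dropped and every line is rewritten with a trailing
    newline, so a file that lacks a final newline cannot glue its last
    sequence line onto the next file's header.
    Raises ValueError if the same sequence ID appears more than once.
    """
    seen = set()
    count = 0
    with open(out_path, "w") as out:
        for path in input_paths:
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue
                    if line.startswith(">"):
                        tokens = line[1:].split()
                        seq_id = tokens[0] if tokens else ""
                        if seq_id in seen:
                            raise ValueError(f"duplicate sequence ID: {seq_id!r}")
                        seen.add(seq_id)
                        count += 1
                    out.write(line + "\n")
    return count
```

For example, `concatenate_fasta(["a.fa", "b.fa"], "all.fa")` writes `all.fa`, which can then be used as the single input file.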
