
cobs's People

Contributors

bingmann, devgg, giang-nghg, iqbal-lab, jnalanko, leoisl, simongog, zhicheng-liu

Forkers

wook2014

cobs's Issues

Result output truncates reference files at first dot

Hello,

We built an index over RefSeq genomes. The downloaded files are named like this:

/path/GCF_000019125.1_ASM1912v1_genomic.fna.gz
/path/GCF_000019165.1_ASM1916v1_genomic.fna.gz
...

When searching the index, the result looks as follows:

*query1 XXX
GCF_000019125 XXX
GCF_000019165 XXX
...

Luckily for us, the truncated names are still unique, so with some effort we should be able to map the output back to the full reference names.

However, this output format is lossy: if the names were not unique before the first dot, distinct references would become indistinguishable, which might even lead to severe false negatives if the user does not notice.

Best,
Svenja
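
To illustrate the concern, here is a minimal sketch, assuming the truncation is equivalent to cutting the file name at its first dot (the issue does not show the actual COBS code). Both helper functions and the suffix list are hypothetical; the second helper is just one possible alternative that strips only known trailing extensions.

```python
from pathlib import PurePosixPath

# Hypothetical list of extensions to strip; not taken from COBS.
KNOWN_SUFFIXES = (".gz", ".fna", ".fasta", ".fa")


def name_cut_at_first_dot(path):
    """Mimic the reported behaviour: keep only the text before the first dot."""
    return PurePosixPath(path).name.split(".", 1)[0]


def name_strip_known_suffixes(path):
    """Alternative sketch: remove only known trailing extensions."""
    name = PurePosixPath(path).name
    stripped = True
    while stripped:
        stripped = False
        for suffix in KNOWN_SUFFIXES:
            if name.endswith(suffix):
                name = name[: -len(suffix)]
                stripped = True
    return name


path = "/path/GCF_000019125.1_ASM1912v1_genomic.fna.gz"
print(name_cut_at_first_dot(path))      # GCF_000019125
print(name_strip_known_suffixes(path))  # GCF_000019125.1_ASM1912v1_genomic
```

With the first helper, two files that differ only after the first dot (for example, two assembly versions of the same accession) would collapse to the same name, which is the lossy case described above.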

Query reads from fastq.gz files

Hi,
It seems queries are only supported when they are stored in FASTA format.
Do you have any plans to support querying with reads from fastq.gz files?

Thanks,
-Giulio
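
Until such support exists, one possible workaround is to convert the reads to FASTA before querying. The following is a minimal sketch using only the Python standard library; it assumes simple four-line FASTQ records (no wrapped sequences), and the function name and file names are hypothetical.

```python
import gzip


def fastq_gz_to_fasta(fastq_gz_path, fasta_path):
    """Convert four-line FASTQ records to FASTA; return the number of reads."""
    count = 0
    with gzip.open(fastq_gz_path, "rt") as fin, open(fasta_path, "w") as fout:
        while True:
            header = fin.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between records
            if not header.startswith("@"):
                raise ValueError("unexpected FASTQ header: " + header.rstrip())
            sequence = fin.readline().rstrip("\n")
            fin.readline()  # '+' separator line, ignored
            fin.readline()  # quality line, ignored
            fout.write(">" + header[1:].rstrip("\n") + "\n" + sequence + "\n")
            count += 1
    return count


# Example call with hypothetical file names:
# fastq_gz_to_fasta("reads.fastq.gz", "reads.fasta")
```

The quality scores are discarded, which seems acceptable here because only the sequence content is used for k-mer lookups.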

Format of output

Could the percentage/proportion of query k-mers present in each sample be reported, rather than the number of k-mers present?
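
In the meantime, the proportion can be derived from the existing output in a post-processing step. This is a minimal sketch, assuming result lines of the form "name count" separated by whitespace and a query header line starting with "*", as in the outputs quoted elsewhere on this page. The function name is hypothetical, and the number of k-mers in the query must be supplied by the caller.

```python
def kmer_fractions(result_lines, num_query_kmers):
    """Map each document name to (matching k-mers) / (k-mers in the query)."""
    if num_query_kmers <= 0:
        raise ValueError("num_query_kmers must be positive")
    fractions = {}
    for line in result_lines:
        line = line.strip()
        if not line or line.startswith("*"):
            continue  # skip blank lines and the query header line
        name, count = line.rsplit(None, 1)
        fractions[name] = int(count) / num_query_kmers
    return fractions


# Illustrative input; the header value is made up for this example.
lines = ["*query1 3", "SRR18349609 24", "SRR18349610 12"]
for name, fraction in kmer_fractions(lines, 25).items():
    print(name, format(fraction, ".0%"))
```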

COBS for 600K genomes

Hi,
I am in charge of finding the presence of specific genes in 600,000 Salmonella genomes.
I tried COBS on a few genomes as a trial run,
but I don't really understand the output.
I copied a subsequence (55 bp) from one of my genomes and ran COBS to see whether it would find it.
In the output I got 24 (see below).
And when I choose a longer subsequence, sometimes it doesn't find it at all.

Another question: how can I tell whether my query matches fully or only partially?

I ran these commands:
cobs compact-construct index.cobs_compact
cobs query -i index.cobs_compact

--- end of document list (5 entries) ---
documents: 5
minimum 31-mers: 2811023
maximum 31-mers: 2874904
average 31-mers: 2834688
total 31-mers: 14173442
DIE: Output file exists, will not overwrite without --clobber @ /opt/conda/conda-bld/cobs_1646087618998/work/cobs/construction/compact_index.cpp:213
terminate called without an active exception

SRR18349609 24
SRR18349610 24
SRR18349611 24

TIMER info=search hashes=9.929e-06 io=0.000567883 total=0.000577812

Query length 55

I'd really appreciate your help.

Thank you!
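
One possible reading of the score, sketched under assumptions: the log above reports 31-mers, and a sequence of length L contains L - k + 1 overlapping k-mers, so a 55 bp query has 25 of them. A score of 24 would then mean that 24 of the 25 query k-mers were found. Whether COBS counts query k-mers in exactly this way (for example, how it treats canonical k-mers or ambiguous bases) is not confirmed here, and the function below is a hypothetical illustration.

```python
def kmers(sequence, k=31):
    """Return all overlapping k-mers of a sequence (empty if it is shorter than k)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Placeholder 55-base sequence; only its length matters for the count.
query = "ACGTA" * 11
total = len(kmers(query))
print(total)       # 25
print(24 / total)  # 0.96
```

Under this reading, a full match of a 55 bp query would score 25, and lower scores would indicate a partial match.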

Improve Mac OS X delivery - make COBS compilable with clang

Right now, users have to compile COBS themselves on Mac. We can't provide a conda recipe because COBS requires gcc, and we can't use containers because Singularity is not supported on Mac. The real solution is to make COBS compile with clang; it would then be straightforward to provide a COBS bioconda recipe for Mac.
