repeatseq's People

Contributors

leecbaker

repeatseq's Issues

confusing output in .calls file: nL:50 as genotype

Thank you for developing repeatseq. It's a very useful tool for my current research project.
I have a question about the .calls output file.
According to your documentation, there are three types of genotype in the .calls output:
NA, NhM or N (e.g. NA, 7h6, 17)

Example .calls output from your documentation:
[region] [TRF string] [Genotype] [Confidence]
2L:6146-6162 3.8_4_78_21_20_52_0_0_47_1.00_ATTA 17 39.3627
2L:7006-7017 4.0_3_100_0_24_66_0_0_33_0.92_AAT NA NA
2L:10589-10595 7.0_1_100_0_14_0_0_0_100_0.00_T 7h6 17.5857

But my .calls output file also contains many results of the form nL:50 with no confidence value (e.g. 15L:50, 20L:50).

Example lines from my .calls file:
1:879055-879069 4_3.8_4_879055_879069_30_0_80_0_20_0.72_CCCT 15L:50
1:879887-879906 5_4.0_5_879887_879906_40_20_60_20_0_1.37_CCAGC 20L:50

I checked the .vcf output, and the nL:50 results do not appear in that file, so I guess I should discard them.
However, I also checked the .repeatseq output, and in that format 15L:50 would mean genotype 15 with a likelihood of 50. By that reading I should include them, since 50 is a high likelihood.
So I am confused. Does an nL:50 result mean the genotype is n, the same as the reference, or is it an invalid result that I should discard?

I am using RepeatSeq v0.8.2, with the command:

repeatseq -calls input.bam \
                Homo_sapiens_assembly19.fasta \
                /repeatseq/regions/hg19.2014.nochr.regions

Thanks!
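
To see how many calls fall into each form before deciding what to keep, the genotype column can be sorted into the three documented shapes (NA, N, NhM) and the undocumented nL:value shape. The sketch below is only an illustration based on the example lines in this issue: the names CallKind and classifyCall are hypothetical, and it makes no claim about what nL:50 means inside repeatseq.

```cpp
#include <cctype>
#include <string>

// Shapes a genotype token can take, judging only from the examples in this
// issue. The enumerator names are descriptive labels, not repeatseq terms.
enum class CallKind { NoCall, SingleLength, TwoLengths, LengthValue, Unknown };

// True when s is a non-empty run of decimal digits.
static bool allDigits(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char c : s) {
        if (!std::isdigit(c)) return false;
    }
    return true;
}

// Classify one genotype token from a .calls line by its shape alone.
CallKind classifyCall(const std::string& tok) {
    if (tok == "NA") return CallKind::NoCall;            // e.g. "NA"
    if (allDigits(tok)) return CallKind::SingleLength;   // e.g. "17"

    std::size_t h = tok.find('h');
    if (h != std::string::npos &&
        allDigits(tok.substr(0, h)) && allDigits(tok.substr(h + 1))) {
        return CallKind::TwoLengths;                      // e.g. "7h6"
    }

    std::size_t l = tok.find("L:");
    if (l != std::string::npos &&
        allDigits(tok.substr(0, l)) && allDigits(tok.substr(l + 2))) {
        return CallKind::LengthValue;                     // e.g. "15L:50"
    }
    return CallKind::Unknown;
}
```

Counting tokens per CallKind over the whole file shows how much of the output is in the nL:value form, which is useful context when asking the maintainers how to treat it.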

format for regions file?

Hi,
I can get useful results using this command:
repeatseq A47294.bam GRCh37-lite.fa hg19.2014.noChr.regions

But when I try to make my own regions file from a subset of the lines in hg19.2014.noChr.regions, the .vcf only reports one of the regions I specified. I'm trying to keep the same sort order, but I'm not having much luck getting results beyond a region or two from my list. Any ideas?

can't install

I followed all the steps:

git clone repeatseq
cd repeatseq
git clone bamtools
git clone fastahack

cd bamtools
mkdir build
cd build
cmake ..
make
cd ../fastahack
make
cd ..
make

But it returns the following error:

repeatseq.cpp:1398:9: error: cannot convert ‘std::ifstream {aka std::basic_ifstream<char>}’ to ‘bool’ in return
return ifile;
^~~~~
makefile:13: recipe for target 'repeatseq.o' failed
make: *** [repeatseq.o] Error 1

Any suggestions?

SegFault when I run repeatseq

I am a novice in bioinformatics. I want to get the 555 error matrix from your paper.
I downloaded an exome BAM file from the 1000 Genomes Project and split it into one BAM file per chromosome. Then I downloaded the reference FASTA file for chr1 so that I could generate the .repeatseq file for chr1. However, when I ran
repeatseq chr1.bam chr1.fa chr1.region
I got a segfault. I would like to know where I made a mistake.

I have a problem when building with GCC 6.3

Hi,
I have a problem when building with GCC 6.3
g++ -c -O3 -Ibamtools/src repeatseq.cpp
repeatseq.cpp: In function 'bool fileCheck(std::__cxx11::string)':
repeatseq.cpp:1398:9: error: cannot convert 'std::ifstream {aka std::basic_ifstream<char>}' to 'bool' in return
return ifile;
^~~~~
How can I fix the error?
Thanks
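
The error in this report comes from returning a std::ifstream where a bool is expected. Since C++11 the stream's conversion to bool is explicit, so it no longer applies implicitly in a return statement, and GCC 6 compiles as gnu++14 by default. A minimal sketch of one possible fix follows. Only the function name fileCheck and the statement "return ifile;" come from the compiler output above; the parameter name and the rest of the body are assumptions, not the project's actual code.

```cpp
#include <fstream>
#include <string>

// Returns true when the named file can be opened for reading.
// The body is a guess at the intent of the original function.
bool fileCheck(const std::string& path) {
    std::ifstream ifile(path.c_str());
    // Ask the stream for its state explicitly instead of relying on an
    // implicit ifstream-to-bool conversion, which C++11 and later reject.
    return ifile.good();
}
```

Another way around the error, without touching the source, is to build with an older language standard, for example by adding -std=gnu++98 to the g++ command line.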

format region file

Hi,

I am trying to use the repeatseq tool, but I get the following error when I supply a region file:

improper column two or
terminate called after throwing an instance of 'char const*'
Aborted

How should the region file be formatted, especially column two?
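
For reference, the example lines quoted in another issue on this page (such as "2L:6146-6162 3.8_4_78_21_20_52_0_0_47_1.00_ATTA") suggest that each line holds a region written as chrom:start-stop, followed by a second column of underscore-separated values that ends with the repeat unit. The sketch below checks a line against that inferred layout before it is passed to repeatseq. The layout is an assumption drawn from those examples only, RegionLine and parseRegionLine are hypothetical names, and the check is not a reproduction of repeatseq's own validation.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One parsed line of a regions file, under the layout assumed above.
struct RegionLine {
    std::string chrom;
    long start = 0;
    long stop = 0;
    std::vector<std::string> fields;  // column two, split on '_'
};

// Convert a whole string to long; rejects empty input and trailing junk.
static bool toLong(const std::string& s, long& value) {
    if (s.empty()) return false;
    try {
        std::size_t pos = 0;
        value = std::stol(s, &pos);
        return pos == s.size();
    } catch (const std::exception&) {
        return false;
    }
}

// Parse "chrom:start-stop <column two>"; returns false on any mismatch.
bool parseRegionLine(const std::string& line, RegionLine& out) {
    std::istringstream in(line);
    std::string region, col2;
    if (!(in >> region >> col2)) return false;  // need two columns

    std::size_t colon = region.rfind(':');
    if (colon == std::string::npos || colon == 0) return false;
    std::size_t dash = region.find('-', colon + 1);
    if (dash == std::string::npos) return false;

    out.chrom = region.substr(0, colon);
    if (!toLong(region.substr(colon + 1, dash - colon - 1), out.start)) return false;
    if (!toLong(region.substr(dash + 1), out.stop)) return false;
    if (out.start > out.stop) return false;

    out.fields.clear();
    std::istringstream cols(col2);
    std::string field;
    while (std::getline(cols, field, '_')) out.fields.push_back(field);
    return out.fields.size() >= 2;  // at least one value plus the repeat unit
}
```

Running every line of a hand-made regions file through a check like this, and printing the ones that fail, narrows down which line is rejected.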

Segfault

Hi, I ran repeatseq on the following sorted BAM file (with its BAM index), which was aligned to hg19 using bowtie2.
http://wren.omrf.org/data/repeatseq/SRR057346.bam
http://wren.omrf.org/data/repeatseq/SRR057346.bam.bai

I also used UCSC's hg19 chromFa.zip concatenated into one large hg19.fa file and the provided hg19.max5.regions.

The program runs for about 20 minutes and then segfaults. It creates the VCF file and writes a header into it, but no other rows. If I specify -counts, it does seem to output a complete counts file, but it still segfaults. This is all on 64-bit Ubuntu, compiled with g++ 4.6.1.

Also a few other miscellaneous things:

  • The program fails with a cryptic error unless an appropriate BAM index file is present.
  • The README refers to PDF documentation, but I could not find any in the repository or on the Mittelman lab website. It would be especially nice to know exactly what the .counts file contains.

Thanks!

can't install

Hi, I have a problem installing RepeatSeq. The error is:
/usr/bin/ld: cannot find -lbamtools
collect2: error: ld returned 1 exit status
make: *** [repeatseq] Error 1
Please help!
