Comments (20)
OK, about the lex_count. Sorina found when she ran it, she had left==right for a call to lex_count.
When I ran it through gdb, I got
Program received signal SIGSEGV, Segmentation fault.
0x0000000000438e7c in sdsl::rank_support_v5<1u, 1u>::rank (this=0x7fffffffcd60, idx=249406034312) at /home/sm0179/include/sdsl/rank_support_v5.hpp:123
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.149.el6_6.9.x86_64 libgcc-4.4.7-11.el6.x86_64 libstdc++-4.4.7-11.el6.x86_64
(gdb) bt
#0 0x0000000000438e7c in sdsl::rank_support_v5<1u, 1u>::rank (this=0x7fffffffcd60, idx=249406034312) at /home/sm0179/include/sdsl/rank_support_v5.hpp:123
#1 0x0000000000438fba in sdsl::rank_support_v5<1u, 1u>::operator() (this=0x7fffffffcd60, idx=249406034312) at /home/sm0179/include/sdsl/rank_support_v5.hpp:136
#2 0x0000000000407298 in sdsl::wt_intsdsl::int_vector<1u, sdsl::rank_support_v5<1u, 1u>, sdsl::select_support_mcl<1u, 1u>, sdsl::select_support_mcl<0u, 1u> >::lex_count<std::tupl
e<unsigned long, unsigned long, unsigned long> > (this=0x7fffffffcd30, i=249406034312, j=72198331389279224, c=4) at /home/sm0179/include/sdsl/wt_int.hpp:551
#3 0x0000000000406946 in bidir_search (csa=..., left=@0x883bcf0, right=@0x883bcf8, left_rev=@0x883f5f0, right_rev=@0x883f5f8, c=4 '\004') at ./bidir_search.cpp:43
#4 0x0000000000409f54 in bidir_search_bwd (csa=..., left=249406034312, right=72198331389279224, left_rev=140737488343504, right_rev=140737488343504, pat_begin=1 '\001', pat_end=4 '
\004', sa_intervals=std::list = {...}, sa_intervals_rev=std::list = {...}, sites=std::list = {...}, mask_a=std::vector of length 3141174, capacity 4194304 = {...}, maxx=185, first_d
el=@0x7fffffffd1ef) at ./bidir_search_bwd.cpp:143
#5 0x0000000000440b61 in main (argc=12, argv=0x7fffffffd538) at ./map.cpp:101
and
(gdb) frame 3
#3 0x0000000000406946 in bidir_search (csa=..., left=@0x883bcf0, right=@0x883bcf8, left_rev=@0x883f5f0, right_rev=@0x883f5f8, c=4 '\004') at ./bidir_search.cpp:43
(gdb) p left
$1 = (uint64_t &) @0x883bcf0: 249406034312
(gdb) p right
$2 = (uint64_t &) @0x883bcf8: 72198331389279224
(gdb) frame 4
#4 0x0000000000409f54 in bidir_search_bwd (csa=..., left=249406034312, right=72198331389279224, left_rev=140737488343504, right_rev=140737488343504, pat_begin=1 '\001', pat_end=4 '
\004', sa_intervals=std::list = {...}, sa_intervals_rev=std::list = {...}, sites=std::list = {...}, mask_a=std::vector of length 3141174, capacity 4194304 = {...}, maxx=185, first_d
el=@0x7fffffffd1ef) at ./bidir_search_bwd.cpp:143
(gdb) p (_it).first
$3 = 249406034312
(gdb) p (_it.second)
$4 = 72198331389279224
(gdb) p (*it_rev).first
$5 = 140737488343504
(gdb) p sa_intervals
$6 = std::list = {
[0] = {first = 249406034312, second = 72198331389279224}
(gdb) frame 5
#5 0x0000000000440b61 in main (argc=12, argv=0x7fffffffd538) at ./map.cpp:101
(gdb) p (*it)
$7 = (std::pair<unsigned long, unsigned long> &) @0x7fffffffd1e0: {first = 249406034312, second = 72198331389279224}
So what the hell?
Where are these values coming from? From sa_intervals. Where is this set?
if (kmer_idx.find(kmer)!=kmer_idx.end() && kmer_idx_rev.find(kmer)!=kmer_idx_rev.end() && kmer_sites.find(kmer)!=kmer_sites.end()) {
sa_intervals=kmer_idx[kmer];
sa_intervals_rev=kmer_idx_rev[kmer];
sites=kmer_sites[kmer];
it=sa_intervals.begin();
it_rev=sa_intervals_rev.begin();
Ah. So - are we reading stuff in right from the precalc?
from gramtools.
Assign to @coelias briefly, so he sees this
from gramtools.
Also for the record this happens after the 84th read is processed ok:
vBWT /data3/projects/pf3k/pilot_paper_analysis/prg-constr/new_format/Chr10_div_in_msp34_prg_letter_added_between_adj_sites.in csa_file /data3/projects/pf3k/pilot_paper_analysis/prg-constr/simulate_reads_from_alt_ref/sim_reads_from_alt.fq /data3/projects/pf3k/pilot_paper_analysis/prg-constr/new_format/Mask_sites /data3/projects/pf3k/pilot_paper_analysis/prg-constr/new_format/Mask_alleles Coverages Reads_processed int_alphabet_file csa_memory 9 /data3/projects/gramtools/wabi/precalc-kmers/kmersandrev
Tue Apr 26 00:14:54 2016
CSA construction
PRG size: 3413540
Alphabet size: 185
Tue Apr 26 00:15:38 2016
Reading KMERS...
DONE!
Tue Apr 26 00:15:43 2016
completed zam 0
completed zam 1
completed zam 2
completed zam 3
...
completed zam 83
completed zam 84
Segmentation fault (core dumped)
from gramtools.
Here;s that read
completed zam 84
read is ATTTATATATTTATCATGTAAAATATATTATTATTCATACATAAAATTTATATGAAATGTATGTACAAAATTTATATTATTTCTCCCTTATTACAGGTCG
Segmentation fault (core dumped)
bash-4.1$
from gramtools.
When I
- Move that 85th read to the top of the fastq
- rerun
It still segfaults during the (new) 85th read
from gramtools.
If I step into the iteration of the loop that will segfault, i have
read is ATTTATATATTTATCATGTAAAATATATTATTATTCATACATAAAATTTATATGAAATGTATGTACAAAATTTATATTATTTCTCCCTTATTACAGGTCG
(gdb) p kmer_idx[kmer]
$9 = empty std::list
here in map.cpp
if (kmer_idx.find(kmer)!=kmer_idx.end() && kmer_idx_rev.find(kmer)!=kmer_idx_rev.end() && kmer_sites.find(kmer)!=kmer_sites.end()) {
sa_intervals=kmer_idx[kmer]; <<<<<
That should never happen. What is the kmer in question
p kmer
$10 = std::vector of length 9, capacity 9 = {4 '\004', 1 '\001', 2 '\002', 1 '\001', 3 '\003', 3 '\003', 4 '\004', 2 '\002', 3 '\003'}
ie TACAGGTCG
ie the last kmer in the read....
Are we failing to store the sa_intervals for last kmer in read?
from gramtools.
Ok so I guess that’s the problem. That kier should never be in the hash if its list is empty. Need to check precalc function
From: Zamin Iqbal <[email protected]mailto:[email protected]>
Reply-To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Date: Tuesday, 26 April 2016 at 00:49
To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Subject: Re: [iqbal-lab/gramtools] valgrind issues in parse_masks (#8)
If I step into the iteration of the loop that will segfault, i have
(gdb) p kmer_idx[kmer]
$9 = empty std::list
here in map.cpp
if (kmer_idx.find(kmer)!=kmer_idx.end() && kmer_idx_rev.find(kmer)!=kmer_idx_rev.end() && kmer_sites.find(kmer)!=kmer_sites.end()) {
sa_intervals=kmer_idx[kmer]; <<<<<
That should never happen
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHubhttps://github.com//issues/8#issuecomment-214565524
from gramtools.
Right so we need to put a condition in precalc to remove the khmers with empty values from the hash table.
From: Zamin Iqbal <[email protected]mailto:[email protected]>
Reply-To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Date: Tuesday, 26 April 2016 at 00:49
To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Subject: Re: [iqbal-lab/gramtools] valgrind issues in parse_masks (#8)
If I step into the iteration of the loop that will segfault, i have
(gdb) p kmer_idx[kmer]
$9 = empty std::list
here in map.cpp
if (kmer_idx.find(kmer)!=kmer_idx.end() && kmer_idx_rev.find(kmer)!=kmer_idx_rev.end() && kmer_sites.find(kmer)!=kmer_sites.end()) {
sa_intervals=kmer_idx[kmer]; <<<<<
That should never happen
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHubhttps://github.com//issues/8#issuecomment-214565524
from gramtools.
I've just deleted my last two comments as they were confusing. So we are finding the following
- Carlos' script makes a list of kmers in the graph
- We store kmer->interval in the hash. Since all these kmers come from the graph, they must have nonzero intervals
- Some time later, we query the hash and get a zero-length interval from a kmer.
This can happen because
a) carlos' script accidentally puts in a wrong kmer into the list
b) precalc puts in a wrong kmer
c) precalc correctly finds a non-empty interval for a kmer, but somehow we fail to push it onto the list - not impossible as there are a bunch of if's around there
d) precalc does everything right, but somehow the memory gets corrupted.
I'm digging
from gramtools.
OK, have found that the reason we were not hitting Sorina's assert()'s is that NDEBUG was set to 1.
Once we remove that, we get this
vBWT /data3/projects/pf3k/pilot_paper_analysis/prg-constr/new_format/Chr10_div_in_msp34_prg_letter_added_between_adj_sites.in csa_file zorro /data3/projects/pf3k/pilot_paper_analysis/prg-constr/new_format/Mask_sites /data3/projects/pf3k/pilot_paper_analysis/prg-constr/new_format/Mask_alleles Coverages Reads_processed int_alphabet_file csa_memory 9 bof
Tue Apr 26 11:04:47 2016
CSA construction
PRG size: 3413540
Alphabet size: 185
Tue Apr 26 11:05:20 2016
Reading KMERS...
DONE!
Tue Apr 26 11:05:25 2016
vBWT: ./bidir_search.cpp:70: uint64_t bidir_search(sdsl::csa_wtsdsl::wt_int<sdsl::int_vector<1u, sdsl::rank_support_v5<> >, 2u, 2u>&, uint64_t&, uint64_t&, uint64_t&, uint64_t&, uint8_t): Assertion `right_rev-left_rev == right-left' failed.
Aborted (core dumped)
bash-4.1$
Will run through gdb
from gramtools.
(gdb) p left_rev
$4 = (uint64_t &) @0x12b8ce0: 3393726
(gdb) p right_rev
$5 = (uint64_t &) @0x12b8ce8: 3406648
(gdb) p right
$6 = (uint64_t &) @0x88455f8: 3409822
(gdb) p left
$7 = (uint64_t &) @0x88455f0: 3396910
(gdb) p right-left
$8 = 12912
(gdb) p right_rev-left_rev
$9 = 12922
(gdb)
from gramtools.
Oh god, I know what the problem is. Search doesn’t cope with adjacent numbers, so K-mers overlapping adjacent numbers won’t map and will have an empty interval. Quick fix would be to put a condition in precalc, after biddersearchbwd is called, to remove the kmer from hash is the list is empty. When the kmer file is correct and the PRG is correct, this condition will never be applied. But Probably good to have it anyway, in case there are wrong kmers in the file
From: Zamin Iqbal <[email protected]mailto:[email protected]>
Reply-To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Date: Tuesday, 26 April 2016 at 10:13
To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Cc: Sorina Maciuca <[email protected]mailto:[email protected]>, Comment <[email protected]mailto:[email protected]>
Subject: Re: [iqbal-lab/gramtools] valgrind issues in parse_masks (#8)
I've just deleted my last two comments as they were confusing. So we are finding the following
- Carlos' script makes a list of kmers in the graph
- We store kmer->interval in the hash. Since all these kmers come from the graph, they must have nonzero intervals
- Some time later, we query the hash and get a zero-length interval from a kmer.
This can happen because
a) carlos' script accidentally puts in a wrong kmer into the list
b) precalc puts in a wrong kmer
c) precalc correctly finds a non-empty interval for a kmer, but somehow we fail to push it onto the list - not impossible as there are a bunch of if's around there
d) precalc does everything right, but somehow the memory gets corrupted.
—
You are receiving this because you commented.
Reply to this email directly or view it on GitHubhttps://github.com//issues/8#issuecomment-214679064
from gramtools.
I would never have figured that out!
from gramtools.
OK, tried this fix in precalc_kmer_matches.cpp
kmer_idx[kmer]=temp;
kmer_idx_rev[kmer]=temp;
kmer_sites[kmer]=temp2;
first_del=false;
std::vector<uint8_t>::iterator res_it=bidir_search_bwd(csa,0,csa.size(),0,csa.size(),(kmer).begin(),(kmer).end(),kmer_idx[kmer],kmer_idx_rev[kmer],kmer_sites[kmer],m\
ask_a,maxx,first_del);
if ( !((kmer_idx[kmer]).empty()) && !((kmer_idx_rev[kmer]).empty()) ) <<<<<<<<<
{
if (!first_del) kmers_in_ref.insert(kmer);
}
This fixed that problem, and we now go a bit further before hitting this assert just after the 84th read (as above):
completed zam 84
read is ATTTATATATTTATCATGTAAAATATATTATTATTCATACATAAAATTTATATGAAATGTATGTACAAAATTTATATTATTTCTCCCTTATTACAGGTCG
vBWT: ./bidir_search_bwd.cpp:32: std::vector::iterator bidir_search_bwd(sdsl::csa_wtsdsl::wt_int<sdsl::int_vector<1u, sdsl::rank_support_v5<> >, 2u, 2u>&, uint64_t, uint64_t, uint64_t, uint64_t, std::vector::iterator, std::vector::iterator, std::list<std::pair<long unsigned int, long unsigned int> >&, std::list<std::pair<long unsigned int, long unsigned int> >&, std::list<std::vector<std::pair<unsigned int, std::vector > > >&, std::vector&, uint64_t, bool&): Assertion `right<=csa.size()' failed.
Aborted (core dumped)
from gramtools.
Run in gdb:
(gdb) bt
#0 0x0000003a11e32625 in raise () from /lib64/libc.so.6
#1 0x0000003a11e33e05 in abort () from /lib64/libc.so.6
#2 0x0000003a11e2b74e in __assert_fail_base () from /lib64/libc.so.6
#3 0x0000003a11e2b810 in __assert_fail () from /lib64/libc.so.6
#4 0x00000000004099bc in bidir_search_bwd (csa=..., left=249406034312, right=72198331389279224, left_rev=140737488343216, right_rev=140737488343216, pat_begin=1 '\001', pat_end=4 '
\004', sa_intervals=empty std::list, sa_intervals_rev=empty std::list, sites=empty std::list, mask_a=std::vector of length 3141174, capacity 4194304 = {...}, maxx=185, first_del=@0x
7fffffffd0cf: true) at ./bidir_search_bwd.cpp:32
#5 0x0000000000441d70 in main (argc=12, argv=0x7fffffffd418) at ./map.cpp:111
(gdb) frame 4
#4 0x00000000004099bc in bidir_search_bwd (csa=..., left=249406034312, right=72198331389279224, left_rev=140737488343216, right_rev=140737488343216, pat_begin=1 '\001', pat_end=4 '
\004', sa_intervals=empty std::list, sa_intervals_rev=empty std::list, sites=empty std::list, mask_a=std::vector of length 3141174, capacity 4194304 = {...}, maxx=185, first_del=@0x
7fffffffd0cf: true) at ./bidir_search_bwd.cpp:32
(gdb) p right
$1 = 72198331389279224
from gramtools.
????????
What does <<<<<<<<<<<< mean after the if
From: Zamin Iqbal <[email protected]mailto:[email protected]>
Reply-To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Date: Tuesday, 26 April 2016 at 11:47
To: iqbal-lab/gramtools <[email protected]mailto:[email protected]>
Cc: Sorina Maciuca <[email protected]mailto:[email protected]>, Comment <[email protected]mailto:[email protected]>
Subject: Re: [iqbal-lab/gramtools] valgrind issues in parse_masks (#8)
OK, tried this fix in precalc_kmer_matches.cpp
kmer_idx[kmer]=temp;
kmer_idx_rev[kmer]=temp;
kmer_sites[kmer]=temp2;
first_del=false;
std::vector<uint8_t>::iterator res_it=bidir_search_bwd(csa,0,csa.size(),0,csa.size(),(kmer).begin(),(kmer).end(),kmer_idx[kmer],kmer_idx_rev[kmer],kmer_sites[kmer],m\
ask_a,maxx,first_del);
if ( !((kmer_idx[kmer]).empty()) && !((kmer_idx_rev[kmer]).empty()) ) <<<<<<<<<
{
if (!first_del) kmers_in_ref.insert(kmer);
}
This fixed that problem, and we now go a bit further before hitting this assert just after the 84th read (as above):
completed zam 84
read is ATTTATATATTTATCATGTAAAATATATTATTATTCATACATAAAATTTATATGAAATGTATGTACAAAATTTATATTATTTCTCCCTTATTACAGGTCG
vBWT: ./bidir_search_bwd.cpp:32: std::vector::iterator bidir_search_bwd(sdsl::csa_wtsdsl::wt_int<sdsl::int_vector<1u, sdsl::rank_support_v5<> >, 2u, 2u>&, uint64_t, uint64_t, uint64_t, uint64_t, std::vector::iterator, std::vector::iterator, std::list >&, std::list >&, std::list > > >&, std::vector&, uint64_t, bool&): Assertion `right<=csa.size()' failed.
Aborted (core dumped)
—
You are receiving this because you commented.
Reply to this email directly or view it on GitHubhttps://github.com//issues/8#issuecomment-214699838
from gramtools.
(gdb) frame 5
#5 0x0000000000441d70 in main (argc=12, argv=0x7fffffffd418) at ./map.cpp:111
(gdb) p (*it)
$2 = (std::pair<unsigned long, unsigned long> &) @0x7fffffffd0c0: {first = 249406034312, second = 72198331389279224}
(gdb) p sa_intervals
$3 = empty std::list
(gdb) p it
$4 = {first = 249406034312, second = 72198331389279224}
What happens if you take an empty list and call .begin()? - looks like we have garbage in the iterator
from gramtools.
I've just rerun after modifying/fixing my fix so it erases kmers from the kmerhash if they have zero interval. I still end up in the above assert
from gramtools.
The <<<< is just meant to draw your attention to that line - its not there in the real code
from gramtools.
I think this issue has been solved.
from gramtools.
Related Issues (20)
- Per base coverage recording error
- Pf benchmarking HOT 1
- Handle no variants after clustering HOT 8
- Release 1.6.0 HOT 5
- Release 1.7 HOT 1
- 'N' in reference genome fasta not supported
- Error Installing gramtools HOT 8
- Boost download link is dead, breaks gramtools compilation process HOT 4
- Retire variant-aware kmer indexing code
- Unable to run the geamtools discover command successfully HOT 4
- Install error HOT 3
- Stop installing py-cortex-api by default
- Ubuntu 18.04 install woes HOT 3
- Migrate to pyproject.toml
- Error while running gramtools HOT 2
- gramtools build error HOT 1
- install error HOT 1
- gramtools gramtools backend expected at: /home/tools/anaconda3/lib/python3.7/site-packages/gramtools/bin/gram but not found HOT 2
- make[1]: *** [CMakeFiles/Makefile2:260: libgramtools/CMakeFiles/gram.dir/rule] Error 2 make: *** [Makefile:186: gram] Error 2 HOT 6
- How to use Gramtools build based on multiple VCF files? HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from gramtools.