Comments (15)

mahulchak avatar mahulchak commented on July 19, 2024

To answer your question in short: yes. Quickmerge now produces a file called "residuals.fasta". This file contains sequences that are alternate haplotypes (what you called 'genomic equivalent') of some sequences or sequence segments present in the hybrid assembly. However, this is an experimental feature, so I can't guarantee its performance.

from quickmerge.

danshu avatar danshu commented on July 19, 2024

Thanks for your quick reply. By 'genomic equivalent' I mean the exact same sequence rather than alternate haplotypes. It is difficult to determine whether something is an alternate haplotype by sequence alignment alone. To use the previous example again: when contig A in the reference assembly is used to improve contig B in the hybrid assembly, will quickmerge remove any sequences in the hybrid assembly that are contained in contig A?
The reason why I have this question is that the contig number remains unchanged while some contigs get longer after quickmerge.

danshu avatar danshu commented on July 19, 2024

In the paper it is mentioned that "Query contigs that are completely contained within a reference contig are also removed from the final merged assembly to prevent sequence duplication in the merged assembly". But I have noticed that some merged contigs have sequences highly similar to other contigs in the hybrid assembly.

mahulchak avatar mahulchak commented on July 19, 2024

danshu avatar danshu commented on July 19, 2024

Yes, and here are alignment results using minimap.
Seg994 2248806 1149769 1628589 - merged_Seg1 2658806 1189716 1666912 451180 478820 255 cm:i:81833
Seg994 2248806 1663017 2049238 - merged_Seg1 2658806 744945 1132078 368850 387133 255 cm:i:67327
Seg994 2248806 2195015 2248800 - merged_Seg1 2658806 544989 597384 49643 53785 255 cm:i:9069
Seg994 2248806 966275 1151975 - merged_Seg1 2658806 1662404 1848049 152864 185700 255 cm:i:27154
Seg994 2248806 2049279 2225766 - merged_Seg1 2658806 566818 741515 141384 176487 255 cm:i:25835
Seg994 2248806 888703 977868 - merged_Seg1 2658806 1836692 1925811 84519 89165 255 cm:i:15664
Seg994 2248806 1634098 1664804 - merged_Seg1 2658806 1158593 1189715 27331 31122 255 cm:i:4779
Seg994 2248806 1663679 1683033 - merged_Seg1 2658806 1139520 1159670 11662 20150 255 cm:i:1748

danshu avatar danshu commented on July 19, 2024

┌────┬────────┬─────────────────────────────────────────────────────────────┐
│Col │ Type │ Description │
├────┼────────┼─────────────────────────────────────────────────────────────┤
│ 1 │ string │ Query sequence name │
│ 2 │ int │ Query sequence length │
│ 3 │ int │ Query start coordinate (0-based) │
│ 4 │ int │ Query end coordinate (0-based) │
│ 5 │ char │ '+' if query and target on the same strand; '-' if opposite │
│ 6 │ string │ Target sequence name │
│ 7 │ int │ Target sequence length │
│ 8 │ int │ Target start coordinate on the original strand │
│ 9 │ int │ Target end coordinate on the original strand │
│ 10 │ int │ Number of matching bases in the mapping │
│ 11 │ int │ Number bases, including gaps, in the mapping │
│ 12 │ int │ Mapping quality (0-255 with 255 for missing) │
└────┴────────┴─────────────────────────────────────────────────────────────┘
Alignment format
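
For anyone who wants to check the overlap from the minimap output rather than by eye, here is a minimal sketch that reads the 12 columns described in the table and reports what fraction of each query contig is covered by alignments to a given target. The names `PafRecord`, `parse_paf_line`, `covered_bases` and `query_coverage` are made up for this illustration and are not part of minimap or quickmerge. It assumes that query end coordinates are exclusive (BED-like), and it splits on any whitespace so it works on both tab-separated files and lines pasted into a thread like this one.

```python
from collections import defaultdict
from typing import NamedTuple


class PafRecord(NamedTuple):
    """The first 12 columns of one alignment line (hypothetical helper type)."""
    query: str
    query_len: int
    query_start: int
    query_end: int
    strand: str
    target: str
    target_len: int
    target_start: int
    target_end: int
    matches: int
    block_len: int
    mapq: int


def parse_paf_line(line: str) -> PafRecord:
    # Split on any whitespace; optional tags such as cm:i:81833 are ignored.
    f = line.split()
    return PafRecord(f[0], int(f[1]), int(f[2]), int(f[3]), f[4],
                     f[5], int(f[6]), int(f[7]), int(f[8]),
                     int(f[9]), int(f[10]), int(f[11]))


def covered_bases(intervals):
    """Length of the union of (start, end) intervals, end assumed exclusive."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def query_coverage(records):
    """Fraction of each query covered by alignments, per (query, target) pair."""
    by_pair = defaultdict(list)
    query_len = {}
    for r in records:
        by_pair[(r.query, r.target)].append((r.query_start, r.query_end))
        query_len[r.query] = r.query_len
    return {pair: covered_bases(iv) / query_len[pair[0]]
            for pair, iv in by_pair.items()}


# Two of the alignment lines posted earlier in this thread.
SAMPLE = """\
Seg994 2248806 1149769 1628589 - merged_Seg1 2658806 1189716 1666912 451180 478820 255 cm:i:81833
Seg994 2248806 966275 1151975 - merged_Seg1 2658806 1662404 1848049 152864 185700 255 cm:i:27154
"""

if __name__ == "__main__":
    records = [parse_paf_line(ln) for ln in SAMPLE.splitlines() if ln.strip()]
    for (query, target), frac in query_coverage(records).items():
        print(f"{query} vs {target}: {frac:.1%} of query covered")
    for r in records:
        # Column 10 / column 11 gives a rough per-alignment identity.
        print(f"{r.query}:{r.query_start}-{r.query_end} identity ~{r.matches / r.block_len:.3f}")
```

Overlapping alignments are merged before counting, so a region hit twice is not counted twice. A coverage close to 1.0 would suggest the query contig is almost entirely represented in the target; it does not by itself say whether that is a duplicate or an alternate haplotype.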

mahulchak avatar mahulchak commented on July 19, 2024

mahulchak avatar mahulchak commented on July 19, 2024

danshu avatar danshu commented on July 19, 2024

nucmer -l 100 -prefix test merged_Seg1.fasta Seg994.fasta

merged_Seg1.fasta Seg994.fasta
NUCMER

>merged_Seg1 Seg994 2658806 2248806
544984 555139 2248806 2238651 5 5 0
-692
16
-16
492
0
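
In case it helps to inspect a delta file without reading it by hand, below is a rough sketch of a parser for the layout shown above: two header lines, a sequence-pair line, then an alignment line with seven integers followed by indel offsets terminated by `0`. The names `DeltaAlignment` and `parse_delta` are invented for this example and are not part of MUMmer. It assumes the sequence-pair line starts with `>` as in a standard delta file (the `>` may be lost when the file is pasted into a comment), and it only counts the indel entries without interpreting their sign.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DeltaAlignment:
    """One alignment block from a nucmer delta file (hypothetical helper type)."""
    ref: str
    qry: str
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    errors: int
    indels: List[int] = field(default_factory=list)

    @property
    def ref_span(self) -> int:
        # Delta coordinates are 1-based and inclusive.
        return abs(self.ref_end - self.ref_start) + 1

    @property
    def reverse(self) -> bool:
        # A reverse-strand match is reported with query start > query end.
        return self.qry_start > self.qry_end


def parse_delta(text: str) -> List[DeltaAlignment]:
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    alignments: List[DeltaAlignment] = []
    ref = qry = None
    current = None
    for ln in lines[2:]:  # skip the file-path line and the NUCMER line
        if ln.startswith(">"):
            # Sequence-pair header: >ref qry ref_len qry_len
            ref, qry = ln[1:].split()[:2]
            continue
        fields = ln.split()
        if len(fields) == 7:
            # Alignment header: coordinates, then errors (last two fields unused here).
            current = DeltaAlignment(ref, qry, *map(int, fields[:5]))
            alignments.append(current)
        elif len(fields) == 1 and current is not None:
            value = int(fields[0])
            if value == 0:
                current = None  # 0 terminates the indel list of this alignment
            else:
                current.indels.append(value)
    return alignments


# The delta content posted above, with the '>' restored on the pair line.
SAMPLE_DELTA = """\
merged_Seg1.fasta Seg994.fasta
NUCMER
>merged_Seg1 Seg994 2658806 2248806
544984 555139 2248806 2238651 5 5 0
-692
16
-16
492
0
"""

if __name__ == "__main__":
    for aln in parse_delta(SAMPLE_DELTA):
        strand = "reverse" if aln.reverse else "forward"
        print(f"{aln.ref}:{aln.ref_start}-{aln.ref_end} vs {aln.qry} "
              f"({strand}), span {aln.ref_span} bp, "
              f"{aln.errors} errors, {len(aln.indels)} indel entries")
```

Summing `ref_span` over all alignments for a contig pair gives a quick sense of how much of the reference contig nucmer actually aligned, which is easier to compare against the minimap result than the raw delta numbers.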

mahulchak avatar mahulchak commented on July 19, 2024

Is that all of the delta file?

danshu avatar danshu commented on July 19, 2024

Yes.

This may not be a good example. I just found another hybrid contig in merged.fasta that is highly similar to a merged contig.
Here is the alignment using minimap:
Seg481 51259 1 51256 - merged_Seg1 2658806 1910244 1961531 47437 51287 255 cm:i:8774

Alignment using nucmer:

>merged_Seg1 Seg481 2658806 51259
1910242 1920391 51259 41139 118 118 0
151
370
115
109
240
-96
237
29
14
74
353
120
25
227
112
999
115
1
406
1
299
19
23
90
24
180
94
137
19
118
15
-103
19
23
-11
19
221
9
-34
45
151
35
77
36
5
53
79
88
-13
36
24
9
-109
165
378
10
-213
-245
-69
-16
-12
-5
-7
-18
-4
-7
-9
12
-116
-43
-368
-9
-35
0

mahulchak avatar mahulchak commented on July 19, 2024

danshu avatar danshu commented on July 19, 2024

Seg994.txt
Seg481.txt

mahulchak avatar mahulchak commented on July 19, 2024

danshu avatar danshu commented on July 19, 2024

Thanks!
