Comments (12)

zhangrengang avatar zhangrengang commented on July 20, 2024

Yes, it is because the preceding subgenome assignment failed. Can you post the clustering heatmap?

from subphaser.

zhangrengang avatar zhangrengang commented on July 20, 2024

@mrmrwinter The immediate cause is probably that there are no subgenome-specific LTRs for building the tree. You can skip this step with -disable_ltrtree, and check the results according to the Supplementary Material:

On the clustering heatmap (Fig. 2B) and PCA plot (Fig. 2C), a subgenome is defined as well-phased if it has clearly distinguishable patterns of both differential k-mers and homeologous chromosomes, indicating that each subgenome shares subgenome-specific features as expected. If the subgenomes are not well-phased, the following analyses are meaningless and should be ignored.

mrmrwinter avatar mrmrwinter commented on July 20, 2024

Hi @zhangrengang, thanks for the fast reply. I had a look at the subgenome assignment results and only one scaffold was assigned to subgenome 2.

image

I double-checked the content of this scaffold and it is definitely not a contaminant or a particularly difficult region: it is around 120 kbp and contains a ribosomal subunit. When I remove this scaffold from the assembly and from the configuration file, I get the following error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/Jellyfish.py", line 641, in _filter_kmer
    return kmer, [c/l for c,l in zip(counts, lengths)], tot
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/Jellyfish.py", line 641, in <listcomp>
    return kmer, [c/l for c,l in zip(counts, lengths)], tot
ZeroDivisionError: division by zero
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/531734/.conda/envs/SubPhaser/bin/subphaser", line 33, in <module>
    sys.exit(load_entry_point('subphaser==1.2.5', 'console_scripts', 'subphaser')())
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 779, in main
    pipeline.run()
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/__main__.py", line 412, in run
    d_mat = dumps.filter(d_mat, lengths, self.sgs, outfig=histfig, #d_targets=d_targets, 
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/Jellyfish.py", line 487, in filter
    for kmer, freqs, tot_freq in pool_func(_filter_kmer, args, self.ncpu, 
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/site-packages/subphaser-1.2.5-py3.8.egg/subphaser/RunCmdsMP.py", line 346, in pool_func
    for returned in pool_map(func, iterable, **kargs):
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/531734/.conda/envs/SubPhaser/lib/python3.8/multiprocessing/pool.py", line 771, in get
    raise self._value
ZeroDivisionError: division by zero

I really hope I can get this working, as the software looks very promising.

Many thanks

zhangrengang avatar zhangrengang commented on July 20, 2024

@mrmrwinter How about the plot *.kmer.mat.pdf? Your genome may not be a neoallopolyploid. Do you have a dot plot to show the homoeologs?
By the way, you do not need to remove the sequence from the assembly.
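For context on the traceback above: the failing expression is `[c/l for c,l in zip(counts, lengths)]`, so at least one entry in `lengths` must have been zero. This presumably happened because a group was left without any sequence once the scaffold was removed, although the thread does not confirm that. The sketch below is not SubPhaser code; `normalize_counts` is a hypothetical helper that shows the same normalization with an explicit check, so that an empty group produces a readable message instead of a bare ZeroDivisionError.

```python
from typing import List, Sequence


def normalize_counts(counts: Sequence[int], lengths: Sequence[int]) -> List[float]:
    """Divide each k-mer count by the matching sequence length.

    Hypothetical helper (not part of SubPhaser): it mirrors the expression
    from the traceback, but rejects zero lengths up front.
    """
    if len(counts) != len(lengths):
        raise ValueError("counts and lengths must have the same number of entries")

    # Positions whose total length is zero would trigger ZeroDivisionError below.
    empty = [i for i, length in enumerate(lengths) if length == 0]
    if empty:
        raise ValueError(
            f"zero total length at position(s) {empty}; "
            "check that every group in the configuration still contains sequences"
        )

    return [c / l for c, l in zip(counts, lengths)]


if __name__ == "__main__":
    # Two groups with non-zero lengths normalize without error.
    print(normalize_counts([2, 4], [4, 8]))
```

Checking the lengths before dividing points directly at the empty group, which is easier to act on than an exception raised inside a worker process.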

mrmrwinter avatar mrmrwinter commented on July 20, 2024

@zhangrengang Here is the plot you requested:
image

Though it is definitely allopolyploid, I'm not sure how recent the hybridisation was, but certainly nearer the half-a-million-year mark than last month. This species is also a mitotic parthenogen, with potential for homeologous recombination. I do not have a dot plot, but I have CDS homology data shown in the circos plot below, and links in the chromatin contact map between suspected pairings.

image

image

Hope this helps.

zhangrengang avatar zhangrengang commented on July 20, 2024

@mrmrwinter According to the heatmap and PCA plot, SubPhaser has failed on your assembly.
From the Hi-C map, there are many assembly errors. Some contigs are mis-joined into one chromosome, and some contigs that belong to the same chromosome are placed in different chromosomes. These massive errors will disrupt SubPhaser. You need to reassemble your chromosomes first.
The homoeologous relationships in the circos plot are also not clear (maybe due to the assembly errors). After reassembly, if they are still not clear, you may need to make a dot plot (to show clear relationships) and a Ks plot (to show divergence time) using MCSCAN and/or wgdi. According to the Hi-C map, it is probably a neopolyploid.

mrmrwinter avatar mrmrwinter commented on July 20, 2024

Thank you for your feedback. The Hi-C scaffolding was performed by a third-party company, so I had little control over the process. I will get in touch with them and see what has happened. What features indicate assembly errors? This is their second attempt at Hi-C scaffolding.

zhangrengang avatar zhangrengang commented on July 20, 2024

image (screenshot)
@mrmrwinter There are three cases, marked by arrows: green indicates a clear boundary between chromosomes that should be split; yellow indicates contacts between sequences that should be joined; and blue indicates signals that may come from homoeologs.

mrmrwinter avatar mrmrwinter commented on July 20, 2024

@zhangrengang thank you for taking the time to address this for me. I had put my faith in the third party, but some things have clearly been overlooked. I will fix these things and rerun my analyses. Hopefully it will clear up the multiple hits in the CDS pairings and allow SubPhaser to run correctly. Could you please recommend any resources for learning to interpret contact maps? I read all I could but still did not feel I could confidently call out features like misassemblies.

Again, thank you @zhangrengang!

zhangrengang avatar zhangrengang commented on July 20, 2024

@mrmrwinter You can have a look at this paper:

Dudchenko O, Batra S S, Omer A D, et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds [J]. Science, 2017, 356(6333): 92

And if possible, get some hands-on practice with Hi-C assembly.

mrmrwinter avatar mrmrwinter commented on July 20, 2024

Many thanks @zhangrengang.
I will hopefully be in touch when I've fixed these things and have SubPhaser running!

zhangrengang avatar zhangrengang commented on July 20, 2024

@mrmrwinter OK. Looking forward to your feedback.
