Comments (7)

oushujun commented on August 27, 2024

@wensulin93 This seems to be a generic task; you may try AGAT: https://www.biostars.org/p/9465973/

from tesorter.

zhangrengang commented on August 27, 2024

You may just use the concatenated domain CDS sequences for each LTR, or the region that starts at the first domain and ends at the last domain (CM030788.1:10392696..10396473), or the entire LTR sequence (CM030788.1:10391873..10397114).
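The first two options can be sketched in Python as below. This is only an illustration, not part of TEsorter: the function names are hypothetical, the input is assumed to be standard nine-column GFF3 with 1-based inclusive coordinates, strand is ignored, and the inner domain coordinates in the example are made up (only the outer start and end come from this thread).

```python
from typing import Iterable, List, Tuple

# (seqid, start, end), 1-based inclusive as in GFF3
Interval = Tuple[str, int, int]


def parse_gff3_intervals(lines: Iterable[str]) -> List[Interval]:
    """Collect (seqid, start, end) from tab-separated GFF3 feature lines."""
    intervals = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a well-formed feature line
        intervals.append((cols[0], int(cols[3]), int(cols[4])))
    return intervals


def domain_span(intervals: List[Interval]) -> Interval:
    """Option 2: region from the first domain start to the last domain end."""
    if not intervals:
        raise ValueError("no domain intervals given")
    seqids = {seqid for seqid, _, _ in intervals}
    if len(seqids) != 1:
        raise ValueError("domains must lie on a single sequence")
    start = min(s for _, s, _ in intervals)
    end = max(e for _, _, e in intervals)
    return seqids.pop(), start, end


def concat_domain_sequences(chrom_seq: str, intervals: List[Interval]) -> str:
    """Option 1: join the domain sub-sequences in genomic order.

    Strand is ignored here; minus-strand elements would still need
    reverse-complementing.
    """
    ordered = sorted(intervals, key=lambda iv: iv[1])
    return "".join(chrom_seq[start - 1:end] for _, start, end in ordered)


# Hypothetical domain records for one element; the source/type columns and
# the inner coordinates are invented for illustration.
EXAMPLE_GFF3 = [
    "CM030788.1\texample\tCDS\t10392696\t10393500\t.\t+\t.\tID=dom1",
    "CM030788.1\texample\tCDS\t10394000\t10394900\t.\t+\t.\tID=dom2",
    "CM030788.1\texample\tCDS\t10395600\t10396473\t.\t+\t.\tID=dom3",
]

if __name__ == "__main__":
    print(domain_span(parse_gff3_intervals(EXAMPLE_GFF3)))
```

The third option (the entire LTR) needs no domain information at all, only the element's own start and end.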

Wenwen012345 commented on August 27, 2024

Excellent advice, and I'm glad you saw this post! @oushujun

Wenwen012345 commented on August 27, 2024

Hello, I thought about this question carefully yesterday. It might help you optimize your software.
Because the genome of the selected species is incomplete (much information is missing from the genomic GFF3 file), extracting the CDS does not seem to be an easy task.
I looked at the gene structure of one LTR element. The sequences produced from the corresponding .dom.gff3 file do not form a complete CDS when spliced together, although they cover most of it. The parts not covered are mainly the beginning (including the ATG start codon) or the end. However, for the LTR element I examined, the genome GFF3 file does not give the exact CDS boundaries, and in some cases it skips that region entirely (it is not annotated in the GFF file). This suggests that the GFF file for the genome is incomplete.
Therefore, unless there is new progress, I will splice the CDS sequences of all domains together and run the synteny analysis with MCScanX. Synteny analysis is the ultimate goal, and I think the concatenated domain CDS sequences should also give me suitable synteny data. @zhangrengang

Wenwen012345 commented on August 27, 2024

I tried the concatenate_domains.py script, but I got an error message like this:

Traceback (most recent call last):
  File "/home/manager/miniconda3/bin/concatenate_domains.py", line 33, in <module>
    sys.exit(load_entry_point('TEsorter==1.4.1', 'console_scripts', 'concatenate_domains.py')())
  File "/home/manager/miniconda3/bin/concatenate_domains.py", line 25, in importlib_load_entry_point
    return next(matches).load()
StopIteration

@zhangrengang

zhangrengang commented on August 27, 2024

@wensulin93
Yes, TEsorter does not define the EXACT start, end, or division positions of domains. But these are not necessary for synteny analyses, so previously I gave you three solutions: the first contains only the domains, the second contains almost the full GAG-POL except that the start and end regions may be incomplete, and the third contains the full CDS but also includes non-coding regions.

Regarding the issue with concatenate_domains.py, how did you run the script? It should be used like this:

concatenate_domains.py rice6.9.5.liban.rexdb.cls.pep RH RT INT > rice6.9.5.liban.rexdb.cls.pep_RT_RH_INT.aln

In this example, the RH, RT and INT domains are aligned separately and then concatenated together. LTR-RTs that do not contain all three domains are excluded.
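The concatenation and exclusion logic described here can be illustrated with a small Python sketch. It is not the actual code of concatenate_domains.py: the function name, the dictionary layout, and the toy alignments are assumptions made for illustration, and the per-domain alignment step itself is taken as already done.

```python
from typing import Dict, List


def concatenate_alignments(
    per_domain: Dict[str, Dict[str, str]],
    order: List[str],
) -> Dict[str, str]:
    """Join per-domain aligned sequences for elements present in every domain.

    per_domain maps a domain name (e.g. "RT") to {element_id: aligned_sequence}.
    Elements missing from any requested domain are dropped.
    """
    if not order:
        raise ValueError("at least one domain is required")
    # Keep only element IDs that occur in all requested domains.
    shared = set(per_domain[order[0]])
    for domain in order[1:]:
        shared &= set(per_domain[domain])
    # Concatenate the aligned blocks in the requested domain order.
    return {
        element: "".join(per_domain[domain][element] for domain in order)
        for element in sorted(shared)
    }


# Toy alignments (made-up residues); te2 lacks INT and te3 lacks RH.
TOY = {
    "RH": {"te1": "AB-", "te2": "A-C"},
    "RT": {"te1": "DE", "te2": "DF", "te3": "GG"},
    "INT": {"te1": "K", "te3": "L"},
}

if __name__ == "__main__":
    for name, seq in concatenate_alignments(TOY, ["RH", "RT", "INT"]).items():
        print(f">{name}\n{seq}")
```

With a single domain, as in a run that passes only RT, every element carrying that domain is kept.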

Wenwen012345 commented on August 27, 2024

Hello,
I think I may have solved the problem. TEsorter was previously installed with conda (version 1.3.0, via "conda install tesorter"). I then looked for the "concatenate_domains.py" script and found it in a "python-scripts" folder inside the conda folder. So I ran the command:
~/miniconda3/pkgs/tesorter-1.3.0-py_0/python-scripts/concatenate_domains.py 2_oute.rexdb-plant.cls.pep RT > rt2.aln

The result was:
"zsh: /home/manager/miniconda3/pkgs/tesorter-1.3.0-py_0/python-scripts/concatenate_domains.py: bad interpreter: /opt/conda/conda-bld/tesorter_1604435924377/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place: no such file or directory"

Then I realized that the first line of the script (the shebang) seemed to be wrong, so I changed it to
"#! /usr/bin/env python"

Then came the next error:
"Traceback (most recent call last):
  File "/home/manager/miniconda3/pkgs/tesorter-1.3.0-py_0/python-scripts/concatenate_domains.py", line 7, in <module>
    from .RunCmdsMP import run_cmd
ImportError: attempted relative import with no known parent package"

At first I couldn't find "RunCmdsMP", but I eventually found it in the "modules" folder. I also found that only the copy of "concatenate_domains.py" in the modules folder works. So I ran:
~/miniconda3/pkgs/tesorter-1.3.0-py_0/site-packages/TEsorter/modules/concatenate_domains.py 2_oute.rexdb-plant.cls.pep RT > rt2.aln

The result file was produced successfully. The conda installation caused the problem. I hope this can serve as a reference. Thank you anyway!
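For anyone who hits the same ImportError: this is Python's general behaviour when a file containing a relative import (such as "from .RunCmdsMP import run_cmd") is run directly instead of being loaded as part of its package. The sketch below reproduces it with a throwaway package; the names demo_pkg, helper and tool are hypothetical and have nothing to do with TEsorter's real layout.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_both_ways():
    """Run the same file as a plain script and as a package module."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "helper.py").write_text("def greet():\n    return 'ok'\n")
        # tool.py uses a relative import, like the script in this thread.
        (pkg / "tool.py").write_text("from .helper import greet\nprint(greet())\n")

        # 1) Run the file directly: Python does not know its parent package.
        as_script = subprocess.run(
            [sys.executable, str(pkg / "tool.py")],
            capture_output=True, text=True,
        )
        # 2) Run it as a module from the directory that contains the package.
        as_module = subprocess.run(
            [sys.executable, "-m", "demo_pkg.tool"],
            cwd=tmp, capture_output=True, text=True,
        )
        return as_script, as_module


if __name__ == "__main__":
    script, module = run_both_ways()
    print("direct run exit code:", script.returncode)
    print(script.stderr.strip().splitlines()[-1])
    print("module run output:", module.stdout.strip())
```

The direct run fails with the relative-import error, while the module run succeeds. Whether TEsorter's own scripts can be started with "python -m" in the same way is not confirmed in this thread.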
