
Comments (3)

ksahlin commented on August 26, 2024

Hi @unique379r,

uLTRA requires an annotation, so it offers no advantage over tools such as minimap2 for annotation-free alignment.

I recommend that you use minimap2 or deSALT for annotation-free alignment.

Best,
K


unique379r commented on August 26, 2024

Hi again @ksahlin,
Thank you for your suggestion, but I really wanted to try uLTRA alongside minimap2 and deSALT. I therefore obtained the GTF (https://tinyurl.com/5e8emmf2), but when I tried to build the index, one of uLTRA's utility scripts raised an error. I'm not sure why, as the file looks fine to me. Please have a look; I look forward to your help.

uLTRA index chm13v2.0.fasta CHM13.v1.v2.gtf .

ERROR

Traceback (most recent call last):
  File "/Apps/envs/ultra/bin/uLTRA", line 714, in <module>
    prep_splicing(args, refs_lengths)
  File "/Apps/envs/ultra/bin/uLTRA", line 80, in prep_splicing
    max_intron_chr, exon_choordinates_to_id, chr_to_id, id_to_chr = augmented_gene.create_graph_from_exon_parts(db, args.flank_size, args.small_exon_threshold, args.min_segm, refs_lengths)
  File "/Apps/envs/ultra/lib/python3.9/site-packages/modules/create_augmented_gene.py", line 323, in create_graph_from_exon_parts
    exon_gene_ids = exon.attributes["gene_id"] # is a list of strings
  File "/Apps/envs/ultra/lib/python3.9/site-packages/gffutils/attributes.py", line 63, in __getitem__
    v = self._d[k]
KeyError: 'gene_id'

My GTF (first lines):

chr1	Liftoff	exon	11136	11635	.	-	.	gene_id "LOFF_G0000001"; transcript_id "LOFF_T0000001"; gene_name "AL627309.3";
chr1	Liftoff	exon	11630	11831	.	+	.	gene_id "LOFF_G0000002"; transcript_id "LOFF_T0000002"; gene_name "AP006222.2";
chr1	Liftoff	exon	11639	12457	.	-	.	gene_id "LOFF_G0000001"; transcript_id "LOFF_T0000001"; gene_name "AL627309.3";
chr1	Liftoff	exon	12900	13433	.	+	.	gene_id "LOFF_G0000002"; transcript_id "LOFF_T0000002"; gene_name "AP006222.2";
chr1	CAT	exon	14253	14325	.	+	.	gene_id "CHM13_G0000001"; transcript_id "CHM13_T0000001"; gene_name "CHM13_G0000001";
chr1	CAT	exon	14292	14353	.	+	.	gene_id "CHM13_G0000001"; transcript_id "CHM13_T0000002"; gene_name "CHM13_G0000001";
chr1	CAT	exon	20566	20905	.	+	.	gene_id "CHM13_G0000001"; transcript_id "CHM13_T0000002"; gene_name "CHM13_G0000001";
chr1	CAT	exon	20566	21099	.	+	.	gene_id "CHM13_G0000001"; transcript_id "CHM13_T0000001"; gene_name "CHM13_G0000001";
chr1	Liftoff	exon	52976	53422	.	-	.	gene_id "LOFF_G0000003"; transcript_id "LOFF_T0000003"; gene_name "AL731661.1";
chr1	Liftoff	exon	53560	53826	.	-	.	gene_id "LOFF_G0000003"; transcript_id "LOFF_T0000003"; gene_name "AL731661.1";


ksahlin commented on August 26, 2024

Hi @unique379r,

I tested it on your dataset. On some lines of your file the attribute key is written as geneID instead of gene_id. The key should be gene_id on every line, so I believe this is a file format error. It is easily fixed by replacing geneID with gene_id on the lines where it occurs in your GTF file.
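For readers who want to script the replacement, here is a minimal sketch in Python. It is not part of uLTRA; the function names `fix_gene_id` and `fix_gtf` are hypothetical, and it assumes a tab-delimited, 9-column GTF in which the misnamed key appears as `geneID "..."` in the attribute column. It renames only the key, not any value that happens to contain the same text.

```python
import re

# Matches the attribute key geneID at the start of the attribute column
# or right after a ';' separator, followed by the opening quote of its value.
_GENEID_KEY = re.compile(r'(^|;\s*)geneID(\s+")')


def fix_gene_id(line: str) -> str:
    """Return the GTF line with the attribute key geneID renamed to gene_id."""
    if line.startswith("#"):
        return line  # header/comment line
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return line  # not a standard 9-column record; leave untouched
    fields[8] = _GENEID_KEY.sub(r"\g<1>gene_id\g<2>", fields[8])
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline


def fix_gtf(in_path: str, out_path: str) -> int:
    """Write a corrected copy of the GTF; return the number of lines changed."""
    changed = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            fixed = fix_gene_id(line)
            changed += fixed != line
            dst.write(fixed)
    return changed
```

Writing to a new file (for example, a hypothetical `CHM13.v1.v2.fixed.gtf`) rather than editing in place keeps the original available for comparison before rebuilding the index.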

The first offending line in the GTF file is pasted below (there are several lines where this happens):

chrY    Gnomon  exon    19436164        19436323        .       +       .       geneID "gene-PRYP5"; transcript_id "rna-XM_017030085.2"; gene_name "HSFY1";
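Because the first ten lines of a GTF can look fine while the problem sits much further down, it can help to scan the whole file for records that lack a gene_id attribute. The following is a rough sketch, not uLTRA code: `attribute_keys` and `missing_gene_id` are hypothetical helpers, the file is assumed to be tab-delimited with the attributes in column 9, and the attribute parsing is deliberately naive (it splits on `;` and would be confused by semicolons inside quoted values).

```python
from typing import Iterable, Iterator, Set, Tuple


def attribute_keys(attr_column: str) -> Set[str]:
    """Collect the attribute keys from a GTF attribute column (naive split on ';')."""
    keys = set()
    for item in attr_column.split(";"):
        item = item.strip()
        if item:
            # The key is the first whitespace-delimited token, e.g. gene_id "X".
            keys.add(item.split(None, 1)[0])
    return keys


def missing_gene_id(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for each record without a gene_id attribute."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a full GTF record
        if "gene_id" not in attribute_keys(fields[8]):
            yield number, line.rstrip("\n")
```

Passing an open file handle for the GTF to `missing_gene_id` and printing each result lists every record that would trigger the `KeyError: 'gene_id'` lookup shown in the traceback.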

Best,
K
