Comments (2)
Hi, @crazyhottommy
I just took a look at the GTF file you used, and it doesn't seem like it is in a format supported by kb. kb is only able to parse GTF files formatted as so:
1 transcribed_unprocessed_pseudogene gene 11869 14409 . + . gene_id "ENSG00000223972"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";
1 processed_transcript transcript 11869 14409 . + . gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-002"; transcript_source "havana";
Specifically, kb looks for the gene_id, gene_name, and transcript_id GTF attributes, and the attribute column must be a semicolon-separated list of tag-value pairs, with each tag separated from its value by a single space.
The GTF file you used, by contrast, has lines like these (notice that the last part of each line is different):
NC_013896.1 RefSeq region 1 210400635 . + . ID=NC_013896.1:1..210400635;Dbxref=taxon:9483;Name=1;chromosome=1;gbkey=Src;genome=chromosome;mol_type=genomic DNA
NC_013896.1 Gnomon gene 166187 208374 . - . ID=gene-LOC108590668;Dbxref=GeneID:108590668;Name=LOC108590668;gbkey=Gene;gene=LOC108590668;gene_biotype=lncRNA
If there is any way you can get a GTF file in the supported format, or manually modify the entries to match it, it should work.
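As a rough way to check a file before running kb, the attribute rules described above can be sketched in Python. This is not kb's actual parser: the names `REQUIRED_KEYS`, `parse_gtf_attributes`, and `missing_keys` are hypothetical, the list of required attributes is taken only from this comment, and the sketch assumes the nine columns are tab-separated as in a standard GTF file.

```python
import re

# Attributes this comment says kb looks for (an assumption taken from the
# text above, not from kb's source code).
REQUIRED_KEYS = ("gene_id", "gene_name", "transcript_id")

# GTF-style pair: tag, one space, double-quoted value,
# e.g. gene_id "ENSG00000223972"
GTF_PAIR = re.compile(r'^(\w+) "([^"]*)"$')


def parse_gtf_attributes(line):
    """Return the 9th column as a dict, or {} if it is not GTF-style."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return {}
    attributes = {}
    for chunk in fields[8].split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        match = GTF_PAIR.match(chunk)
        if match is None:
            # Not a tag "value" pair, e.g. GFF3-style ID=gene-LOC108590668
            return {}
        attributes[match.group(1)] = match.group(2)
    return attributes


def missing_keys(line):
    """List the required attributes that are absent from one record."""
    attrs = parse_gtf_attributes(line)
    return [key for key in REQUIRED_KEYS if key not in attrs]


# A transcript record in the supported style (attributes shortened).
gtf_line = "\t".join([
    "1", "processed_transcript", "transcript", "11869", "14409", ".", "+", ".",
    'gene_id "ENSG00000223972"; transcript_id "ENST00000456328"; gene_name "DDX11L1";',
])

# A record in the GFF3 style of the file from this issue (attributes shortened).
gff_line = "\t".join([
    "NC_013896.1", "Gnomon", "gene", "166187", "208374", ".", "-", ".",
    "ID=gene-LOC108590668;Dbxref=GeneID:108590668;Name=LOC108590668",
])

print(missing_keys(gtf_line))  # []
print(missing_keys(gff_line))  # ['gene_id', 'gene_name', 'transcript_id']
```

Gene-level records, like the first example line above, carry no transcript_id, so a check like this is only meaningful on transcript and exon records.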
from kb_python.
Thanks! The file I downloaded is a GFF file, so I converted it to GTF with gffread:
gffread GCF_000004665.1_Callithrix_jacchus-3.2_genomic.gff -T -o GCF_000004665.1_Callithrix_jacchus-3.2_genomic.gtf
This still does not work. The converted file has lines like these:
NC_013896.1 Gnomon exon 166187 166804 . - . transcript_id "rna0"; gene_id "gene0"; gene_name "LOC108590668";
NC_013896.1 Gnomon exon 198620 198707 . - . transcript_id "rna0"; gene_id "gene0"; gene_name "LOC108590668";
NC_013896.1 Gnomon exon 208220 208374 . - . transcript_id "rna0"; gene_id "gene0"; gene_name "LOC108590668";
I guess the problem is the placeholder gene IDs (gene0, gene1, ...) that this conversion assigns to each gene. I will hack around it. Thanks!