The gramene from warelab

Rice multi-gene track caption is misleading

In summary, the track is made up of multiple gene sets, but the track name shows only one of them. I initially proposed deleting the obsolete gene sets, but EG proposed keeping them and only making them non-displayable by switching off the displayable flag in analysis_description. This was clever, as Pankaj later argued for keeping the old low-quality genes. Finally, I found that we can separate these gene sets into different tracks by changing the key in their analysis_description web_data field.
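EG's suggestion of hiding the obsolete gene sets rather than deleting them could look roughly like the sketch below. It runs against an in-memory SQLite table as a stand-in for the real MySQL database; the column name `displayable` is assumed from the flag mentioned above, and analysis_id 99 is a hypothetical placeholder for the msu_gene analysis, whose real id is not given here.

```python
import sqlite3

# Toy stand-in for the otherfeatures analysis_description table.
# analysis_id 3 and 4 come from the issue; 99 is a hypothetical placeholder
# for the msu_gene analysis, whose real id is not given.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analysis_description ("
    "analysis_id INTEGER PRIMARY KEY, displayable INTEGER NOT NULL DEFAULT 1)"
)
conn.executemany(
    "INSERT INTO analysis_description (analysis_id) VALUES (?)",
    [(3,), (4,), (99,)],
)

# Hide the two obsolete gene sets without deleting any rows.
conn.execute(
    "UPDATE analysis_description SET displayable = 0 WHERE analysis_id IN (3, 4)"
)

def ids_with_flag(flag):
    """Return the analysis ids whose displayable flag equals `flag`."""
    rows = conn.execute(
        "SELECT analysis_id FROM analysis_description "
        "WHERE displayable = ? ORDER BY analysis_id",
        (flag,),
    )
    return [r[0] for r in rows]

hidden = ids_with_flag(0)
visible = ids_with_flag(1)
```

The gene rows themselves stay in the database, so the decision can be reversed by setting the flag back to 1.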

In the web_data hash (shown below), 'key' => 'ensembl' dictates which analyses go into the same gene track in the browser. Analyses sharing the same key are put in the same track.

'caption' => 'IRGSP gene' dictates the track name shown on the left side of the track.

'name' => 'IRGSP Genes' dictates the label shown in the configuration pop-up window, where you select which gene tracks to load on the page.

{'multi_name' => 'IRGSP Genes','colour_key' => '[biotype]','caption' => 'IRGSP gene','name' => 'IRGSP Genes','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
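To illustrate the grouping rule described above, here is a minimal sketch that models it in Python. It is not the Ensembl web code: the dicts are a simplified, already-parsed form of web_data, the pairing of each caption with a logic name is an assumption, and `group_into_tracks` is a hypothetical helper.

```python
from collections import defaultdict

# Toy stand-ins for analysis rows: (logic_name, web_data as a dict).
# Logic names and captions appear in the issue; which caption belongs to
# which analysis, and the dict form itself, are assumptions.
analyses = [
    ("msu_gene", {"key": "ensembl", "caption": "MSU gene"}),
    ("irgspv1.0-20170804-predicted-genes",
     {"key": "ensembl", "caption": "IRGSP gene prediction"}),
    ("irgspv1.0-20170804-noncoding-genes",
     {"key": "ensembl", "caption": "IRGSP nonCoding gene prediction"}),
]

def group_into_tracks(rows):
    """Group analyses by their web_data key; each distinct key is one track."""
    tracks = defaultdict(list)
    for logic_name, web_data in rows:
        tracks[web_data["key"]].append(logic_name)
    return dict(tracks)

# All three analyses share the key 'ensembl', so they land in one track.
before = group_into_tracks(analyses)

# Give the two RAP-DB prediction sets their own keys.
analyses[1][1]["key"] = "rappred"
analyses[2][1]["key"] = "rapnc"
after = group_into_tracks(analyses)
```

With a shared key, one track carries three captions, which would explain why the displayed name seemed to rotate. After the change, each gene set sits in its own track with its own caption.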

For core database:

update analysis_description set web_data= replace(web_data, 'ensembl', 'ena') where analysis_id=355;

update analysis_description set web_data= replace(web_data, 'ensembl', 'ena_ncrna') where analysis_id=358;

update analysis_description set web_data= replace(web_data, 'ensembl', 'rfam') where analysis_id=467;

update analysis_description set web_data= replace(web_data, 'ensembl', 'trnascan') where analysis_id=468;

update analysis_description set web_data=replace(web_data, '\'name\' => \'Genes\'', '\'name\' => \'ncRNA Genes\'') where analysis_id=358;
update analysis_description set web_data=replace(web_data, '\'name\' => \'Genes\'', '\'name\' => \'Rfam RNA Genes\'') where analysis_id=467;
update analysis_description set web_data=replace(web_data, '\'name\' => \'Genes\'', '\'name\' => \'tRNA Genes\'') where analysis_id=468;

For otherfeatures:

update analysis_description set web_data= replace(web_data, 'ensembl', 'rappred') where analysis_id=3;

update analysis_description set web_data= replace(web_data, 'ensembl', 'rapnc') where analysis_id=4;
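One thing worth checking before running updates like these: replace() rewrites every occurrence of the substring, not just the value of 'key'. The sketch below shows the difference between an unanchored and an anchored replacement. It uses an in-memory SQLite table as a stand-in for MySQL (plain-substring replace() behaves the same way in both), and the shortened web_data value, including the 'ensembl_extra' entry, is invented purely to make the pitfall visible.

```python
import sqlite3

# Shortened, hypothetical web_data value; 'ensembl_extra' is invented
# to show what an unanchored replace would also touch.
WEB_DATA = (
    "{'caption' => 'Genes','name' => 'Genes',"
    "'colour_key' => 'ensembl_extra','key' => 'ensembl'}"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analysis_description "
    "(analysis_id INTEGER PRIMARY KEY, web_data TEXT)"
)
conn.executemany(
    "INSERT INTO analysis_description VALUES (?, ?)",
    [(3, WEB_DATA), (4, WEB_DATA)],
)

# Unanchored: every occurrence of 'ensembl' in the string is rewritten.
conn.execute(
    "UPDATE analysis_description "
    "SET web_data = replace(web_data, 'ensembl', 'rappred') "
    "WHERE analysis_id = 3"
)

# Anchored: only the "'key' => 'ensembl'" pair is rewritten.
conn.execute(
    "UPDATE analysis_description "
    "SET web_data = replace(web_data, ?, ?) WHERE analysis_id = 4",
    ("'key' => 'ensembl'", "'key' => 'rapnc'"),
)

def web_data_for(analysis_id):
    """Fetch the web_data string for one analysis."""
    row = conn.execute(
        "SELECT web_data FROM analysis_description WHERE analysis_id = ?",
        (analysis_id,),
    ).fetchone()
    return row[0]

unanchored = web_data_for(3)
anchored = web_data_for(4)
```

If the real web_data contains 'ensembl' only in the key entry, both forms give the same result. A quick select of web_data before and after the update would confirm that.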

This is the email to EG:

For rice Japonica, the issue was that the track name rotates
among the three analyses' caption names. Sometimes it is 'MSU gene',
sometimes it is 'IRGSP gene prediction', and sometimes it is 'IRGSP
nonCoding gene prediction'. See the following screenshot.

I examined the three analyses; they all come from the otherfeatures
database. Except for the MSU genes, the other two are obsolete 2017
gene predictions from RAP-DB. I suggest we drop these two gene sets
for good.

*** oryza_sativa_otherfeatures_61_96_7

msu_gene: 55,799 protein coding genes (keep)
irgspv1.0-20170804-predicted-genes: 8,115 ab-initio predicted genes (delete, analysis_id=3)
irgspv1.0-20170804-noncoding-genes: 2,387 predicted noncoding genes (delete, analysis_id=4)

In the core database, we also have four gene sets.
The RAP-DB genes were just updated in this release from the RAP-DB
2018-11 release. The other three are noncoding gene sets; except for
the tRNA genes, the other two RNA gene sets are also said to be not
very reliable and can be dropped too.

*** oryza_sativa_core_61_96_7

gff3_genes: 37,849 protein coding genes from RAP-DB 2018-11 release (analysis_id=531)
trnascan_gene: 191 tRNAscan genes (analysis_id=468)
rfam_12.2_gene: 758 Rfam RNA genes (delete, analysis_id=467)
ena_rna: 68 ncRNA genes from ENA (delete, analysis_id=358)

In summary, I am proposing deleting the following gene sets:

oryza_sativa_otherfeatures:

analysis_id=3, irgspv1.0-20170804-predicted-genes
analysis_id=4, irgspv1.0-20170804-noncoding-genes

oryza_sativa_core:

analysis_id=467, rfam_12.2_gene
analysis_id=358, ena_rna
