
broaderpredicate_uberon's Introduction

Overview

The RIKEN BioResource Research Center (BRC) is exploring model organisms that are expected to be available for medical science research. To do so, it executes SPARQL queries against the RIKEN bioresource knowledge graph, which is integrated with Bgee (a gene expression database), the Orthologous MAtrix (OMA, an orthology database), DisGeNET (a human gene-disease association database), Mouse Genome Informatics (MGI), UniProt, and four disease ontologies stored in the RIKEN BioResource MetaDatabase. This page shares the SPARQL query examples and the related data.

Reference

https://github.com/tarcisiotmf/swat4hcls
Querying the Bgee Knowledge Graph with SPARQL

SPARQL endpoint


SPARQL query examples, the query results, and the related data

Additional file 1 (SPARQL query Example 1-1)

Example1-1_Additional_file_1.rq
Description: A federated SPARQL query for Alzheimer's disease using DisGeNET and one subquery
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

  • > 600 sec / all rows [Execution date: 6 June 2024]
  • 112 sec / 100 rows [Execution date: 6 June 2024]
  • 44 sec / all rows, Expression Score: > 99 [Execution date: 10 August 2023]
  • 59 sec / all rows, Expression Score: > 99 [Execution date: 4 August 2023]
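The runtimes above are averages over 10 executions. A minimal sketch of how such an average could be measured is shown below; the helper name `average_runtime` and the stand-in workload are assumptions for illustration, not the procedure actually used for these figures.

```python
import statistics
import time
from typing import Callable


def average_runtime(run_query: Callable[[], object], repeats: int = 10) -> float:
    """Return the mean wall-clock time in seconds over `repeats` calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()  # the result is discarded; only elapsed time is recorded
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


# Stand-in workload (hypothetical); replace it with a function that
# submits the .rq file to the SPARQL endpoint of interest.
print(f"{average_runtime(lambda: sum(range(10_000))):.6f} sec")
```

In practice `run_query` would wrap the HTTP request to the endpoint, and a timeout would be needed for queries that exceed several hundred seconds.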

Statistics of results:

Query approach: Example 1-1: Federated query for AD
No. of RIKEN mice: 55
No. of genes: 14
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)
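One way to derive counts such as "No. of RIKEN mice" and "No. of genes" from a result set is to tally distinct values per variable. The sketch below works on the standard SPARQL JSON results layout; the variable names `mouse` and `gene` and the toy rows are assumptions, not the variables or output of the actual query.

```python
from typing import Dict, List, Set


def count_distinct(results: dict, variables: List[str]) -> Dict[str, int]:
    """Count distinct bound values per variable in SPARQL JSON results."""
    seen: Dict[str, Set[str]] = {name: set() for name in variables}
    for binding in results["results"]["bindings"]:
        for name in variables:
            if name in binding:  # unbound variables are absent from a row
                seen[name].add(binding[name]["value"])
    return {name: len(values) for name, values in seen.items()}


# Toy result set with made-up rows (not real query output).
toy = {
    "head": {"vars": ["mouse", "gene"]},
    "results": {"bindings": [
        {"mouse": {"type": "uri", "value": "urn:example:mouse1"},
         "gene": {"type": "literal", "value": "ENSG00000142192"}},
        {"mouse": {"type": "uri", "value": "urn:example:mouse2"},
         "gene": {"type": "literal", "value": "ENSG00000142192"}},
        {"mouse": {"type": "uri", "value": "urn:example:mouse2"},
         "gene": {"type": "literal", "value": "ENSG00000130203"}},
    ]},
}
print(count_distinct(toy, ["mouse", "gene"]))
```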

Additional file 2 (SPARQL query Example 2-1)

Example2-1_Additional_file_2.rq
Description: A centralized SPARQL query for Alzheimer's disease using DisGeNET and one subquery
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 2-1: Centralized query for AD
No. of RIKEN mice: 55
No. of genes: 14
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 3 (SPARQL query Example 3-1)

Example3-1_Additional_file_3.rq
Description: A federated SPARQL query for melanoma using DisGeNET and one subquery
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

  • > 600 sec / all rows [Execution date: 6 June 2024]
  • > 300 sec / 100 rows [Execution date: 6 June 2024]
  • 21 sec / all rows, Expression Score: > 99 [Execution date: 10 August 2023]
  • 38 sec / all rows, Expression Score: > 99 [Execution date: 4 August 2023]

Statistics of results:

Query approach: Example 3-1: Federated query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 4 (SPARQL query Example 4-1)

Example4-1_ Additional_file_4.rq
Description: A centralized SPARQL query for melanoma using DisGeNET and one subquery
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 4-1: Centralized query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 5 (SPARQL query Example 1-2)

Example1-2_Additional_file_5.rq
Description: A federated SPARQL query for Alzheimer's disease using DisGeNET and two subqueries
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 2
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 1-2: Federated query for AD
No. of RIKEN mice: 55
No. of genes: 14
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 6 (SPARQL query Example 2-2)

Example2-2_Additional_file_6.rq
Description: A centralized SPARQL query for Alzheimer's disease using DisGeNET and two subqueries
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 2
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 2-2: Centralized query for AD
No. of RIKEN mice: 55
No. of genes: 14
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 7 (SPARQL query Example 3-2)

Example3-2_Additional_file_7.rq
Description: A federated SPARQL query for melanoma using DisGeNET and two subqueries
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 2
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 3-2: Federated query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 8 (SPARQL query Example 4-2)

Example4-2_Additional_file_8.rq
Description: A centralized SPARQL query for melanoma using DisGeNET and two subqueries
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 2
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 4-2: Centralized query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 9 (SPARQL query Example 1-3)

Example1-3_Additional_file_9.rq
Description: A federated SPARQL query for Alzheimer's disease using DisGeNET and three subqueries
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 3
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 1-3: Federated query for AD
No. of RIKEN mice: 55
No. of genes: 14
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 10 (SPARQL query Example 2-3)

Example2-3_Additional_file_10.rq
Description: A centralized SPARQL query for Alzheimer's disease using DisGeNET and three subqueries
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 3
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 2-3: Centralized query for AD
No. of RIKEN mice: 55
No. of genes: 14
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 11 (SPARQL query Example 3-3)

Example3-3_Additional_file_11.rq
Description: A federated SPARQL query for melanoma using DisGeNET and three subqueries
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 3
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 3-3: Federated query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 12 (SPARQL query Example 4-3)

Example4-3_Additional_file_12.rq
Description: A centralized SPARQL query for melanoma using DisGeNET and three subqueries
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 3
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 4-3: Centralized query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 13 (SPARQL query Example 5-0)

Example5-0_Additional_file_13.rq
Description: A centralized SPARQL query of Query subparts 1 to 3 for Alzheimer's disease using DisGeNET and no subquery.
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: Not specified
  • Confidence Level: Not specified
  • Expression Level: Not specified
  • No. of subqueries: 0
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 5-0: Centralized query for AD
No. of RIKEN mice: 56
No. of genes: 15
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
LEP (ENSG00000174697)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 14 (SPARQL query Example 5-1)

Example5-1_Additional_file_14.rq
Description: A centralized SPARQL query of Query subparts 1 to 3 for Alzheimer's disease using DisGeNET and one subquery.
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: Not specified
  • Confidence Level: Not specified
  • Expression Level: Not specified
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 5-1: Centralized query for AD
No. of RIKEN mice: 56
No. of genes: 15
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
LEP (ENSG00000174697)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 15 (SPARQL query Example 5-2)

Example5-2_Additional_file_15.rq
Description: A centralized SPARQL query of Query subparts 1 to 3 for Alzheimer's disease using DisGeNET and two subqueries.
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: Not specified
  • Confidence Level: Not specified
  • Expression Level: Not specified
  • No. of subqueries: 2
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 5-2: Centralized query for AD
No. of RIKEN mice: 56
No. of genes: 15
Gene labels (Ensembl Gene IDs):
PICALM (ENSG00000073921)
PSEN1 (ENSG00000080815)
NPY (ENSG00000122585)
APOE (ENSG00000130203)
APP (ENSG00000142192)
PSEN2 (ENSG00000143801)
ACE (ENSG00000159640)
INSR (ENSG00000171105)
BCL2 (ENSG00000171791)
LEP (ENSG00000174697)
BDNF (ENSG00000176697)
MAPT (ENSG00000186868)
CD2AP (ENSG00000198087)
INS (ENSG00000254647)
Novel protein (ENSG00000288674)

Additional file 16 (SPARQL query Example 6-0)

Example6-0_Additional_file_16.rq
Description: A centralized SPARQL query of Query subparts 1 to 3 for melanoma using DisGeNET and no subquery.
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: Not specified
  • Confidence Level: Not specified
  • Expression Level: Not specified
  • No. of subqueries: 0
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 6-0: Centralized query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 17 (SPARQL query Example 6-1)

Example6-1_Additional_file_17.rq
Description: A centralized SPARQL query of Query subparts 1 to 3 for melanoma using DisGeNET and one subquery.
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: Not specified
  • Confidence Level: Not specified
  • Expression Level: Not specified
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 6-1: Centralized query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 18 (SPARQL query Example 6-2)

Example6-2_Additional_file_18.rq
Description: A centralized SPARQL query of Query subparts 1 to 3 for melanoma using DisGeNET and two subqueries.
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: Not specified
  • Confidence Level: Not specified
  • Expression Level: Not specified
  • No. of subqueries: 2
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 6-2: Centralized query for melanoma
No. of RIKEN mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)

Additional file 19 (SPARQL query Example 7)

Example7_Additional_file_19.rq
Description: A federated SPARQL query of Query subpart 4 (Bgee) for the "prefrontal cortex".
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 4 (Bgee)
  • Disease: Not used
  • GDA: Not specified
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 0
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 7: Federated query of Query subpart 4 (Bgee) for the "prefrontal cortex"
No. of genes: 42,448

Additional file 20 (SPARQL query Example 8)

Example8_Additional_file_20.rq
Description: A federated SPARQL query of Query subpart 4 (Bgee) for the "skin of body".
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 4 (Bgee)
  • Disease: Not used
  • GDA: Not specified
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 0
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

  • 58 sec / all rows [Execution date: 6 June 2024]
  • 3 sec / 100 rows [Execution date: 6 June 2024]
  • 29 sec / all rows, Expression Score: > 99 [Execution date: 4 August 2023]

Statistics of results:

Query approach: Example 8: Federated query of Query subpart 4 (Bgee) for the "skin of body"
No. of genes: 45,724

Additional file 21 (SPARQL query Example 9)

Example9_Additional_file_21.rq
Description: A centralized SPARQL query of Query subpart 4 (Bgee) for the "prefrontal cortex".
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 4 (Bgee)
  • Disease: Not used
  • GDA: Not specified
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 0
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 9: Centralized query of Query subpart 4 (Bgee) for the "prefrontal cortex"
No. of genes: 42,448

Additional file 22 (SPARQL query Example 10)

Example10_Additional_file_22.rq
Description: A centralized SPARQL query of Query subpart 4 (Bgee) for the "skin of body".
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 4 (Bgee)
  • Disease: Not used
  • GDA: Not specified
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 0
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

  • 16 sec / all rows [Execution date: 6 June 2024]
  • 3 sec / 100 rows [Execution date: 6 June 2024]
  • 11 sec / all rows, Expression Score: > 99 [Execution date: 4 August 2023]

Statistics of results:

Query approach: Example 10: Centralized query of Query subpart 4 (Bgee) for the "skin of body"
No. of genes: 45,724

Additional file 23 (Python script)

tsv2rdf_uberonKgx_to_uberon_broader20230723.py
Description: This Python script converts the latest uberon_kgx_tsv_edge.tsv, obtained from the kg-uberon webpage of the KG-OBO project in KG-Hub, into two Turtle (ttl) files: subject_broader_object_from_BFO_0000050.ttl (Additional file 24) and subject_broader_object_from_subClassOf.ttl (Additional file 25). We loaded both Turtle files into the RIKEN BioResource MetaDatabase as the graph http://metadb.riken.jp/db/uberonRDF_broader_fromKGX to enable transitive searches over the Uberon terms.
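The conversion described above can be sketched roughly as follows. This is not the published script: the column names (`subject`, `object`, and especially `relation`) are assumptions about the KGX edge file layout, and the identifiers in the sample rows are placeholders rather than real Uberon terms.

```python
import csv
import io
from typing import Iterable, Iterator

OBO = "http://purl.obolibrary.org/obo/"
BROADER = "http://purl.org/rbrc/resource/broader"


def curie_to_iri(curie: str) -> str:
    """Expand an OBO CURIE such as UBERON:0001419 to its PURL."""
    prefix, local = curie.split(":", 1)
    return f"{OBO}{prefix}_{local}"


def edges_to_broader(rows: Iterable[dict], relation: str,
                     relation_column: str = "relation") -> Iterator[str]:
    """Yield one triple line (valid Turtle) per matching Uberon-to-Uberon edge."""
    for row in rows:
        # `relation_column` is an assumed column name; adjust it to the real file.
        if row.get(relation_column) != relation:
            continue
        subj, obj = row["subject"], row["object"]
        if subj.startswith("UBERON:") and obj.startswith("UBERON:"):
            yield f"<{curie_to_iri(subj)}> <{BROADER}> <{curie_to_iri(obj)}> ."


# Placeholder edge table (IDs A, B, C are not real Uberon terms).
sample_tsv = (
    "subject\tpredicate\tobject\trelation\n"
    "UBERON:A\tbiolink:part_of\tUBERON:B\tBFO:0000050\n"
    "UBERON:A\tbiolink:subclass_of\tUBERON:C\trdfs:subClassOf\n"
)
rows = list(csv.DictReader(io.StringIO(sample_tsv), delimiter="\t"))
part_of_ttl = list(edges_to_broader(rows, "BFO:0000050"))
sub_class_ttl = list(edges_to_broader(rows, "rdfs:subClassOf"))
print("\n".join(part_of_ttl + sub_class_ttl))
```

Calling the function once per relation mirrors the split into two output files, one for BFO:0000050 and one for rdfs:subClassOf.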

a_partOf_image_additional_file_5

Additional file 24 (RDF data)

subject_broader_object_from_BFO_0000050.ttl
Description: This is a Turtle file converted from the latest uberon_kgx_tsv_edge.tsv from the kg-uberon webpage of the KG-OBO project in KG-Hub. In this file, the BFO:0000050 (partOf) relations among the Uberon terms were converted to http://purl.org/rbrc/resource/broader relations among them.

Sample:
sample_subject_broader_object_from_BFO_0000050_02


Additional file 25 (RDF data)

subject_broader_object_from_subClassOf.ttl
Description: This is a Turtle file converted from the latest uberon_kgx_tsv_edge.tsv from the kg-uberon webpage of the KG-OBO project in KG-Hub. In this file, the rdfs:subClassOf relations among the Uberon terms were converted to http://purl.org/rbrc/resource/broader relations among them.

Sample:
sample_subject_broader_object_from_subClassOf_02


Overview of Converting the uberon.owl to the turtle file using the “broader” predicate. (Image file)

image_additional_file_5.tiff
Title: Overview of Converting the uberon.owl to the turtle file using the “broader” predicate.
Description: A path from the “skin of limb” (UBERON_0001419) to the “skin of body” (UBERON_0002097) in the uberon.owl (A) and the corresponding path in the graph http://purl.org/rbrc/resource/broader within the RIKEN BioResource MetaDatabase (B). The diagrams were created using https://www.kanzaki.com/works/2009/pub/graph-draw.
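To illustrate what a transitive search over "broader" relations amounts to, here is a small in-memory sketch. The two edges are assumptions chosen only to connect the terms named above; they are not copied from the ttl files, and the function name `broader_closure` is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple


def broader_closure(edges: Iterable[Tuple[str, str]], start: str) -> Set[str]:
    """Return every term reachable from `start` via zero or more broader edges."""
    graph: Dict[str, Set[str]] = {}
    for narrower, broader in edges:
        graph.setdefault(narrower, set()).add(broader)
    reached = {start}
    queue = deque([start])
    while queue:
        term = queue.popleft()
        for parent in graph.get(term, ()):
            if parent not in reached:  # guards against cycles and repeats
                reached.add(parent)
                queue.append(parent)
    return reached


# Illustrative edges only (assumed, not taken from the actual data).
toy_edges = [
    ("UBERON_0001419", "UBERON_0000014"),  # skin of limb -> zone of skin
    ("UBERON_0000014", "UBERON_0002097"),  # zone of skin -> skin of body
]
print(sorted(broader_closure(toy_edges, "UBERON_0001419")))
```

Because both partOf and subClassOf edges are mapped to the same predicate, a single traversal can cross both kinds of relation.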

image_additional_file_5


Additional file 26 (SPARQL query Example 11-1)

Example11-1_Additional_file_26.rq
Description: A centralized query for melanoma using DisGeNET and the uberonRDF-KGX (skin of body (UBERON:0002097))

Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: Yes
  • Limit on the number of rows returned: 100 and No (all rows)
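With "Property paths: Yes", the anatomical part can be expanded to the term itself plus everything beneath it. The sketch below builds such a pattern as a query string; the graph and predicate IRIs come from this page, while the function name, the variable `?anat`, and the overall query shape are assumptions and may differ from Example11-1_Additional_file_26.rq.

```python
import re

BROADER = "http://purl.org/rbrc/resource/broader"
BROADER_GRAPH = "http://metadb.riken.jp/db/uberonRDF_broader_fromKGX"
OBO = "http://purl.obolibrary.org/obo/"


def narrower_terms_query(uberon_curie: str) -> str:
    """Build a SPARQL query listing a term and every term linked to it by broader*."""
    if not re.fullmatch(r"UBERON:\d{7}", uberon_curie):
        raise ValueError(f"not an Uberon CURIE: {uberon_curie!r}")
    target = OBO + uberon_curie.replace(":", "_")
    return (
        "SELECT DISTINCT ?anat\n"
        f"FROM <{BROADER_GRAPH}>\n"
        "WHERE {\n"
        # The * operator follows zero or more broader links, so the
        # target term itself is also returned.
        f"  ?anat <{BROADER}>* <{target}> .\n"
        "}"
    )


print(narrower_terms_query("UBERON:0002097"))
```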

The average runtime (10 times) and the query results:

  • 627 sec / all rows [Execution date: 6 June 2024]
  • 143 sec / 100 rows [Execution date: 6 June 2024]
  • 499 sec / all rows, Expression Score: > 99 [Execution date: 6 June 2024]
  • 67 sec / all rows, Expression Score: > 99 [Execution date: 4 August 2023]

Statistics of results:

Query approach: Example 11-1: Centralized query for melanoma using the broader predicate
No. of mice: 102
No. of genes: 14
Gene labels (Ensembl Gene IDs):
TYR (ENSG00000077498)
PPP6C (ENSG00000119414)
PIK3CA (ENSG00000121879)
BRCA2 (ENSG00000139618)
TP53 (ENSG00000141510)
AKT1 (ENSG00000142208)
ATM (ENSG00000149311)
KIT (ENSG00000157404)
TERT (ENSG00000164362)
CTNNB1 (ENSG00000168036)
PTEN (ENSG00000171862)
HRAS (ENSG00000174775)
MITF (ENSG00000187098)
NRAS (ENSG00000213281)
No. of anatomical entities: 13
Anatomical entity labels (Uberon IDs):
zone of skin (UBERON_0000014)
skin epidermis (UBERON_0001003)
skin of abdomen (UBERON_0001416)
skin of limb (UBERON_0001419)
skin of leg (UBERON_0001511)
skin of hip (UBERON_0001554)
hair follicle (UBERON_0002073)
skin of body (UBERON_0002097)
forelimb skin (UBERON_0003531)
hindlimb skin (UBERON_0003532)
upper leg skin (UBERON_0004262)
upper arm skin (UBERON_0004263)

Additional file 27 (SPARQL query Example 11-2)

Example11-2_Additional_file_27.rq
Description: A federated query for melanoma using DisGeNET and the Ubergraph (Endpoint: https://ubergraph.apps.renci.org/sparql), skin of body (UBERON:0002097)
Search parameters:

  • Federated (F) or Centralized (C): F
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: melanoma (umls:C0025202)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: skin of body (UBERON:0002097)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: Yes
  • Limit on the number of rows returned: 100 and No (all rows)
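The difference between the federated and centralized variants can be pictured as whether a graph pattern is sent to a remote endpoint through a SPARQL 1.1 SERVICE clause or evaluated against locally loaded data. The helper below is a sketch under that assumption; `place_pattern` is a hypothetical name and the sample pattern is illustrative, not the pattern used in Example11-2_Additional_file_27.rq.

```python
from typing import Optional


def place_pattern(pattern: str, endpoint: Optional[str] = None) -> str:
    """Return the pattern as-is (centralized) or wrapped in SERVICE (federated)."""
    body = pattern.strip()
    if endpoint is None:
        return body
    indented = "\n".join("  " + line for line in body.splitlines())
    return f"SERVICE <{endpoint}> {{\n{indented}\n}}"


UBERGRAPH = "https://ubergraph.apps.renci.org/sparql"
# Illustrative pattern only; prefixes rdfs: and obo: are assumed to be declared.
pattern = "?anat rdfs:subClassOf* obo:UBERON_0002097 ."
print(place_pattern(pattern, UBERGRAPH))
```

Keeping the pattern separate from its placement makes it straightforward to time the same pattern in both modes.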

The average runtime (10 times) and the query results:

  • Transaction timed out (over 3600 sec) [Execution date: 6 June 2024]
    • Results: ND
  • 3604 sec / 100 rows [Execution date: 6 June 2024]
  • 1875 sec / all rows, Expression Score: > 99 [Execution date: 6 June 2024]
  • Transaction timed out (over 600 sec) [Execution date: 4 August 2023]

SPARQL query Example 12

Example12.rq
Description: A centralized query for Alzheimer's disease type 1 (AD1) using MedGen
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease type 1 (AD1) (umls:C1863052, medgen:C1863052)
  • GDA: MedGen (https://www.ncbi.nlm.nih.gov/medgen/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

Query approach: Example 12: Centralized query for AD1
No. of RIKEN mice: 9
No. of genes: 1
Gene labels (Ensembl Gene IDs):
APP (ENSG00000142192)

SPARQL query Example 13

Example13.rq
Description: A centralized query for Alzheimer's disease using MGI
Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: MGI (https://www.informatics.jax.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: No
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

Statistics of results:

| Query approach | No. of RIKEN mice | No. of genes | Gene labels (Ensembl Gene IDs) |
|---|---|---|---|
| Example 13: Centralized query for AD | 13 | 4 | PSEN1 (ENSG00000080815), PIN1 (ENSG00000127445), IL33 (ENSG00000137033), APP (ENSG00000142192) |

SPARQL query Example 14

Example14.rq
Description: A centralized query for Alzheimer's disease using DisGeNET and uberonRDF-KGX, with prefrontal cortex (UBERON:0000451) as the anatomical part

Search parameters:

  • Federated (F) or Centralized (C): C
  • Query subparts:
    • 1 (DisGeNET)
    • 2 (OMA)
    • 3 (Bioresource)
    • 4 (Bgee)
  • Disease: Alzheimer's disease (umls:C0002395)
  • GDA: DisGeNET (https://www.disgenet.org/)
  • Anatomical parts: prefrontal cortex (UBERON:0000451)
  • Confidence Level: high
  • Expression Score: 0 - 100
  • Sex: any
  • No. of subqueries: 1
  • Property paths: Yes
  • Limit on the number of rows returned: 100 and No (all rows)

The average runtime (10 times) and the query results:

  • 358 sec / all rows [Execution date: 9 June 2024]
  • 357 sec / 100 rows [Execution date: 9 June 2024]
  • 318 sec / all rows, Expression Score: > 99 [Execution date: 9 June 2024]

Statistics of results:

| Query approach | No. of mice | No. of genes | Gene labels (Ensembl Gene IDs) | No. of anatomical entities | Anatomical entity labels (Uberon IDs) |
|---|---|---|---|---|---|
| Example 14: Centralized query for AD using the broader predicate | 55 | 14 | PICALM (ENSG00000073921), PSEN1 (ENSG00000080815), NPY (ENSG00000122585), APOE (ENSG00000130203), APP (ENSG00000142192), PSEN2 (ENSG00000143801), ACE (ENSG00000159640), INSR (ENSG00000171105), BCL2 (ENSG00000171791), BDNF (ENSG00000176697), MAPT (ENSG00000186868), CD2AP (ENSG00000198087), INS (ENSG00000254647), Novel protein (ENSG00000288674) | 2 | prefrontal cortex (UBERON_0000451), Brodmann (1909) area 10 (UBERON_0013541) |
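
The effect of following a "broader" relation transitively, which is how a query for one anatomical term can return more than one anatomical entity, can be illustrated with a small in-memory traversal. This is a sketch under stated assumptions rather than the logic of Example14.rq: `narrower_or_self` is a hypothetical helper, and the single toy edge is chosen only to be consistent with the two entities reported for Example 14, not copied from uberonRDF-KGX.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def narrower_or_self(root: str,
                     broader_edges: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return `root` plus every term reachable by following
    (narrower_term, broader_term) edges downwards from it.

    This is the set a zero-or-more property path over a "broader"
    relation would match when it ends at `root`.
    """
    # Index the edges by their broader term for fast downward lookup.
    children: Dict[str, List[str]] = {}
    for narrower, broader in broader_edges:
        children.setdefault(broader, []).append(narrower)

    seen = {root}
    queue = deque([root])
    while queue:  # breadth-first search; `seen` guards against cycles
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# ASSUMPTION: illustrative edge only, not taken from uberonRDF-KGX.
toy_edges = [("UBERON_0013541", "UBERON_0000451")]
print(sorted(narrower_or_self("UBERON_0000451", toy_edges)))
# prints ['UBERON_0000451', 'UBERON_0013541']
```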

Licence

BioResource MetaDatabase by RIKEN BRC is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
If you use data from this database, please be sure to attribute it as follows:
"BioResource Metadatabase (https://knowledge.brc.riken.jp/) © RIKEN BRC licensed under CC Attribution 4.0 International".

The BioResource MetaDatabase integrates the BRC's research results with the following external datasets:

  • OMA (Orthologs), licensed under CC Attribution-ShareAlike 2.5 (CC BY-SA 2.5).
  • Bgee (Gene expression), licensed under CC0.
  • DisGeNET (Disease-gene interaction), licensed under CC Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
