go-pombase

Code for generating process-centric GO-CAM models from GAFs.

Working towards creating GO-CAMs from a list of annotations (either a GAF or ontobio association objects directly). Right now, this takes a GO biological process (BP) term as input, does some heuristic gene set calculation, and generates a GO-CAM ttl for the BP term’s gene set. Separating this gene set logic from the annotation-to-GO-CAM logic is another goal.

Running

pip install -r requirements.txt

As this is currently coded for a specific use case, it can be run simply by providing a GO BP term, a source GAF filename, and a destination filename:

python3 generate_rdf.py -t "GO:0010971" -g "gene_association.pombase" -f "filename.ttl"

Because the source GAF filename is an argument, the library can create GO-CAM models from any set of GAFs, not just ones pertaining to S. pombe. The example GAF can be downloaded from ftp://ftp.geneontology.org/pub/go/gene-associations/.
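For readers who want to wrap their own script around the same three inputs, here is a minimal sketch of how such flags could be parsed with argparse. Only the short flags (-t, -g, -f) come from the command above; the long option names, help strings, and the `build_parser` function are assumptions for illustration and are not taken from generate_rdf.py.

```python
import argparse


def build_parser():
    """Build a parser for the three inputs described above (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Generate a GO-CAM ttl for the gene set of a GO BP term."
    )
    # Long option names below are assumed; only -t, -g and -f appear in the README.
    parser.add_argument("-t", "--bp_term", required=True,
                        help="GO biological process term, e.g. GO:0010971")
    parser.add_argument("-g", "--gaf_source", required=True,
                        help="path to the source GAF file")
    parser.add_argument("-f", "--filename", required=True,
                        help="destination ttl filename")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.bp_term, args.gaf_source, args.filename)
```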

Running for generating PomBase GO-CAM models

For my purpose right now I'm running generate_pombase_model.py specifying BP term (-t), output filename (-f), and GAF input file (-g):

python3 generate_pombase_model.py -t 'GO:0031929' -f 'TOR signaling.ttl' -g 'gene_association.pombase'

Reusing computed gene-to-BP term dictionary data

You can also specify the data (-j) to use in the first step in order to speed up processing during repeated runs (~1.5 min -> 10 sec):

python3 generate_pombase_model.py -j 'tad_go_gafs.json' -t 'GO:0031929' -f 'TOR signaling.ttl' -g 'gene_association.pombase'

To dump out this data into a reusable JSON, you can run:

python3 pombase_direct_bp_annots_query.py -j 'json_outfile.json' -g 'gene_association.pombase'

Here, -j specifies the JSON output path.
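The reuse pattern described in this section amounts to a load-or-compute cache. The sketch below shows the general idea only; `load_or_compute` and `slow_gene_to_bp_terms` are hypothetical names, the gene ID is a placeholder, and the real scripts may structure their JSON differently.

```python
import json
import os
import tempfile


def load_or_compute(json_path, compute):
    """Return data cached at json_path; on a miss, call compute() and save the result."""
    if json_path and os.path.exists(json_path):
        with open(json_path) as handle:
            return json.load(handle)
    data = compute()
    if json_path:
        with open(json_path, "w") as handle:
            json.dump(data, handle)
    return data


def slow_gene_to_bp_terms():
    # Hypothetical stand-in for the slow step that queries the GAF.
    # "example_gene" is a placeholder, not a real PomBase identifier.
    return {"example_gene": ["GO:0031929"]}


if __name__ == "__main__":
    cache_path = os.path.join(tempfile.mkdtemp(), "json_outfile.json")
    first = load_or_compute(cache_path, slow_gene_to_bp_terms)   # computes and writes
    second = load_or_compute(cache_path, slow_gene_to_bp_terms)  # reads the JSON file
    print(first == second)
```

The cache is keyed only by file path in this sketch, so the JSON should be regenerated whenever the source GAF changes.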

Dependencies

Requires ontobio.

Contributors

dougli1sqrd, dustine32

go-pombase's Issues

Standardize model titles

Follow something like:

PomBase_[BP term]_[BP term label]

Also be aware that Minerva, when dumping out the imported model, names the file [UUID].ttl. This can differ from the name of the initially imported file and thus result in duplicate models. I should figure out how to assign the correct UUID filename before import.
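One way the proposed title pattern could be applied is sketched below. The `model_title` function is a hypothetical helper, not part of the repository, and it does not address the UUID filename problem.

```python
def model_title(bp_term, bp_label):
    """Build a title of the form PomBase_[BP term]_[BP term label]."""
    return "PomBase_{}_{}".format(bp_term, bp_label)


if __name__ == "__main__":
    print(model_title("GO:0031929", "TOR signaling"))
    # prints "PomBase_GO:0031929_TOR signaling"
```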

Only use annotations with experimental evidence codes (or ND)

Need to add a filter to annotations for only experimental evidence codes or ND:
EXP
IDA
IPI
IMP
IGI
IEP
ND

Example: in model generated for GO:0031929 - TOR signaling, “enzyme regulator activity” is used for ste20 even though its evidence code is "IBA".
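A minimal sketch of such a filter, working on raw GAF lines rather than ontobio association objects. It assumes the standard GAF 2.x layout, where the evidence code is the seventh tab-separated column and comment lines start with "!". `filter_gaf_lines` is a hypothetical helper, not existing code in this repository.

```python
# Evidence codes listed in this issue: experimental codes plus ND.
ALLOWED_EVIDENCE = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP", "ND"}

EVIDENCE_COLUMN = 6  # zero-based index of the evidence code column in GAF 2.x


def filter_gaf_lines(lines, allowed=ALLOWED_EVIDENCE):
    """Keep only annotation lines whose evidence code is in `allowed`."""
    kept = []
    for line in lines:
        if line.startswith("!"):
            continue  # skip header and comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) > EVIDENCE_COLUMN and cols[EVIDENCE_COLUMN] in allowed:
            kept.append(line)
    return kept


if __name__ == "__main__":
    with open("gene_association.pombase") as gaf:
        experimental = filter_gaf_lines(gaf)
    print(len(experimental), "annotations kept")
```

With this filter in place, an annotation carrying "IBA", like the ste20 example, would be dropped before model generation.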

Should only use protein binding annotations if there’s no extension that already links the two gene products (via an MF node)

The code that generates models in generate_pombase_model.py currently adds "protein binding with" connections between two genes even if there's already an extension-derived connection (e.g. has_direct_input) for those two genes. For example:
[image]
In this image, there are redundant "with" connections (displayed as two triples: gene1-enabled by-"protein binding" activity and "protein binding"-has_input-gene2) between sty1-atf1 and sty1-wis1.
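The intended check could look roughly like the sketch below. It assumes connections are available as simple gene-pair tuples and that a link in either direction counts as "already linked"; both are assumptions, and `drop_redundant_binding` and "geneX" are hypothetical, not taken from generate_pombase_model.py.

```python
def drop_redundant_binding(binding_pairs, extension_pairs):
    """Drop protein binding pairs that an extension-derived connection already covers.

    Pairs are compared without regard to direction (an assumption of this sketch).
    """
    linked = {frozenset(pair) for pair in extension_pairs}
    return [pair for pair in binding_pairs if frozenset(pair) not in linked]


if __name__ == "__main__":
    binding = [("sty1", "atf1"), ("sty1", "wis1"), ("sty1", "geneX")]
    extensions = [("sty1", "atf1"), ("wis1", "sty1")]
    print(drop_redundant_binding(binding, extensions))
    # prints [('sty1', 'geneX')]
```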
