
dreamteam's People

Contributors

alishaang, anastasia-pav, hannz88, mo105, sheridans97


Forkers

nehakhairnar7

dreamteam's Issues

Database

-document how the schema was designed
-explain why it was designed that way
-record what was used to create the database

Possible alternate phosphosite search page

I've drafted a possible alternative phosphosite search page for everyone to see, to make sure we're all on the same page. Please let me know if I misunderstood anything.

User Data Input: parameters

For the user-defined parameters, I think it would be good to have this functionality:

  • P-Value Threshold: the user chooses a significance level of either 0.05 or 0.01
  • Coefficient of Variation Threshold (%): the user inputs a value between 0 and 100
  • Fold Change Significance Threshold: the user inputs a value between 0 and 5

Do these sound okay?
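
As a rough illustration of the three thresholds above, here is a minimal validation sketch. The names `AnalysisParameters` and `validate_parameters` are hypothetical and not part of the existing app; the ranges are taken directly from the list above.

```python
from dataclasses import dataclass

# Significance levels the user may pick, as proposed in the list above.
ALLOWED_P_VALUES = (0.05, 0.01)


@dataclass(frozen=True)
class AnalysisParameters:
    """Hypothetical container for the three user-defined thresholds."""
    p_value: float
    cv_percent: float
    fold_change: float


def validate_parameters(p_value, cv_percent, fold_change):
    """Return AnalysisParameters, or raise ValueError if a value is out of range."""
    if p_value not in ALLOWED_P_VALUES:
        raise ValueError("P-value threshold must be 0.05 or 0.01")
    if not 0 <= cv_percent <= 100:
        raise ValueError("CV threshold must be between 0 and 100 (%)")
    if not 0 <= fold_change <= 5:
        raise ValueError("Fold change threshold must be between 0 and 5")
    return AnalysisParameters(p_value, cv_percent, fold_change)


# Example usage with values inside the proposed ranges.
params = validate_parameters(0.05, 20, 1.5)
```

Rejecting out-of-range values early would let the form show an error message before any uploaded data is processed.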

Issues with running app.py on Windows

Following my update on Sunday, the file app.py could not be run on Windows. I've made a few amendments:

  • Removed the auto-generated __init__.py from the app directory
  • Removed the '.' from the Database and forms imports in app.py
  • Made a copy of Database/kinase_database.db in app (replacing the symlink, because Windows can't see it)

I've then tested on both Windows and Linux, using PowerShell, Ubuntu (WSL 2), Ubuntu (Linux) and PyCharm. The following methods all worked when I tested them:

  • On PowerShell:
    python3 app.py

  • On Ubuntu within Windows, both of these worked:
    python3 app.py
    and
    flask run

  • On Linux, both of these worked:
    python3 app.py
    and
    flask run

  • On PyCharm, it works on Windows and Linux using the run button (provided the working directory is set).

Sorry for the inconvenience

Change function code to return protein name and gene aliases

from sqlalchemy import or_

# Assumes `s` (a SQLAlchemy session) and the KinaseGeneMeta / KinaseGeneName
# models are defined elsewhere in the project.
def get_gene_protein_name(kinase_input):
    """
    Return a list of dictionaries.
    Each dictionary holds a gene name and a protein name.
    Returns an empty list when no match is found.
    >> kin = "AKT"
    >> get_gene_protein_name(kin)
    [{'Gene_Name': 'AKT', 'Protein_Name': 'RAC-alpha serine/threonine-protein kinase'},
    {'Gene_Name': 'AKT', 'Protein_Name': 'RAC-beta serine/threonine-protein kinase'},
    {'Gene_Name': 'AKT', 'Protein_Name': 'RAC-gamma serine/threonine-protein kinase'}]
    """
    # Wrap the input in wildcards so LIKE matches it anywhere in the column.
    like_kin = "%{}%".format(kinase_input)
    tmp = []
    # Match the input against the gene alias, UniProt entry, UniProt number or protein name.
    kinase_query = s.query(KinaseGeneMeta).join(KinaseGeneName).filter(or_(
        KinaseGeneName.gene_alias.like(like_kin),
        KinaseGeneMeta.uniprot_entry.like(like_kin),
        KinaseGeneMeta.uniprot_number.like(like_kin),
        KinaseGeneMeta.protein_name.like(like_kin))).all()
    for row in kinase_query:
        row_dict = row.to_dict()  # convert the row once and reuse it
        results = {}
        results["Gene_Name"] = row_dict["gene_name"]
        results["Protein_Name"] = row_dict["protein_name"]
        tmp.append(results)
    return tmp

kin = "AKT"
get_gene_protein_name(kin)

Input software creation

-the part where the user uploads a file and we display the relative activity visually

Database

So I had a look at the sites that we talked about yesterday and at UniProt again. I have a few points which I'd like some input on.

  1. Most sites do not disclose their databases openly.
  2. Most sites do not have an API that lets users retrieve data. There are other ways to retrieve it, but they're very complicated.
  3. UniProt has an API as well as all the information on point 1 (i.e. name, gene, where in the cell, phosphosites).
  4. Even if we all use UniProt, each of us could still work on getting something different, and we could then merge the results using SQL afterwards (e.g. kinase & gene, kinase & location, kinase & family).
  5. UniProt has a lot of data. We should decide on what else to include.
  6. The inhibitor sites allow download of the database, so that's not a problem.

TL;DR: I think we should use UniProt for the kinase info part, but we'll need to discuss which parts we need. The inhibitor websites are fine for exporting the data.
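
To illustrate point 4, here is a minimal sketch of how separately mined tables could be merged with SQL, assuming each table carries a shared accession column. The table names, column names and the function `merge_kinase_tables` are hypothetical, and the values in the usage example are placeholders rather than real data.

```python
import sqlite3


def merge_kinase_tables(gene_rows, location_rows, family_rows):
    """Load three separately mined tables and join them on a shared accession."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE kinase_gene (accession TEXT PRIMARY KEY, gene_name TEXT);
        CREATE TABLE kinase_location (accession TEXT, location TEXT);
        CREATE TABLE kinase_family (accession TEXT, family TEXT);
        """
    )
    conn.executemany("INSERT INTO kinase_gene VALUES (?, ?)", gene_rows)
    conn.executemany("INSERT INTO kinase_location VALUES (?, ?)", location_rows)
    conn.executemany("INSERT INTO kinase_family VALUES (?, ?)", family_rows)
    # LEFT JOIN keeps every kinase even when one team's table has no row for it.
    merged = conn.execute(
        """
        SELECT g.accession, g.gene_name, l.location, f.family
        FROM kinase_gene AS g
        LEFT JOIN kinase_location AS l ON l.accession = g.accession
        LEFT JOIN kinase_family AS f ON f.accession = g.accession
        ORDER BY g.accession
        """
    ).fetchall()
    conn.close()
    return merged


# Placeholder values only; ACC_2 deliberately has no location row.
rows = merge_kinase_tables(
    [("ACC_1", "GENE_A"), ("ACC_2", "GENE_B")],
    [("ACC_1", "LOC_X")],
    [("ACC_1", "FAM_Y"), ("ACC_2", "FAM_Z")],
)
```

Missing values come back as None, which makes gaps in any one person's data easy to spot after the merge.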

Data mining: subcellular location

-document the how's, why's, what's and where's of getting the subcellular location for the kinase
-include a brief README.md to say which file is what

Task of controller

-when the user searches for something, the website needs to pull the information from the database
-more info on this: need to read around the subject

Inhibitor search bar

Question: In the search bar on the inhibitor page, are we allowing the user to search for an inhibitor or a gene? If we're allowing both, can we have a drop-down box next to it to specify inhibitor or gene?

Data mining: inhibitors

-document the how's, what's, why's and where's of getting the information on inhibitors
-include a brief README.md in the folder to say what file is what

Inhibitor csv MK II

Getting additional information for inhibitors so the final individual inhibitor page contains:
-inhibitor name
-inhibitor aliases
-empirical formula
-molecular weight
-SMILES
-PubChem ID
-InChI
-image URL
-target

Project Plan

I am in the process of making the project plan at the moment, so I was wondering: does this look OK?

Week 1:
Research (everyone)
Data mining (everyone)
Web Site Creation (Mo)

Week 2:
Data mining continue (everyone)
Database creation with SQL and organisation (Han)
Input software creation (Sheridan)

Week 3:
Data mining continued (everyone)
Database creation with SQL and organisation continued (Han)
Website creation (Mo)
- Anastasia : Help Page
- Task of Controller : Alisha
Documentation (everyone)

Phosphosite pages

So I've drafted what we said about the phosphosite search pages. Let me know if this was what everyone had in mind.

image

Issue with 'Function to return the inhibitors from a kinase'.

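# Assumes `s` (a SQLAlchemy session) and the KinaseGeneName model are defined
# elsewhere in the project.
def get_inhibitors_from_gene(kinase_gene):
    """
    Take a gene name string and return a list of inhibitor names.
    Returns an empty list if there are no inhibitors.
    >> get_inhibitors_from_gene("SGK1")
    ['GSK650394A', 'SGK-Sanofi-14i', 'SGK1-Sanofi-14g', 'SGK1-Sanofi-14h', 'SGK1-Sanofi-14n']
    """
    results = []
    # Filter on the function argument. first() returns None when no alias
    # matches, so the function can return an empty list instead of raising.
    kinase_query = s.query(KinaseGeneName).filter(KinaseGeneName.gene_alias == kinase_gene).first()
    if kinase_query is None:
        return results
    for inhibitor in kinase_query.inhibitors:
        results.append(inhibitor.inhibitor)
    return results

kinase = "SGK1"
get_inhibitors_from_gene(kinase)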

Genomic location data

The final genomic location file has duplicated results and some data loss, so I'm looking into fixing this.

SQL

Please go over SQL.

Column names

Hey, could everyone give me their respective column names? If you decide to change them later, that's fine, just send me a message.

Inhibitor csv

Create the inhibitor csv.
Should contain the following columns:

  • gene_name
  • structure (as jpg or png in folder + filename in csv)
  • molecular_weight
  • empirical_formula
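
A minimal sketch of writing a file with these four columns, using Python's standard csv module. The function name `write_inhibitor_csv` is hypothetical, and the row in the usage example is a placeholder, not real inhibitor data; the structure column holds the image filename, as described above.

```python
import csv
import io

# Column order taken from the list above.
INHIBITOR_COLUMNS = ["gene_name", "structure", "molecular_weight", "empirical_formula"]


def write_inhibitor_csv(records, handle):
    """Write an iterable of dicts to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=INHIBITOR_COLUMNS)
    writer.writeheader()
    for record in records:
        # DictWriter raises ValueError if a record has a key outside the columns.
        writer.writerow(record)


# Example usage with a placeholder row, written to an in-memory buffer.
buffer = io.StringIO()
write_inhibitor_csv(
    [{
        "gene_name": "GENE_A",
        "structure": "structures/example.png",
        "molecular_weight": "0.0",
        "empirical_formula": "PLACEHOLDER",
    }],
    buffer,
)
csv_lines = buffer.getvalue().splitlines()
```

To write to disk instead, pass a file opened with open(path, "w", newline="") as the handle.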

Data mining: Phosphosites

-document the where's, when's, why's and how's of getting the details of phosphosites and substrates
-include a brief README.md to say what file is what

SQL - database creation

-using sqlite and sqlalchemy to create the database
-creating the functions to parse the info
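
As a rough sketch of the two tasks above, the following creates a table and parses a CSV into it. To stay dependency-free it uses the standard-library sqlite3 module directly rather than SQLAlchemy, so it is not the project's actual implementation. The table name `kinase_gene_meta` and both function names are assumptions; the column names follow the attributes used in the query code elsewhere in these issues.

```python
import csv
import io
import sqlite3


def create_database(conn):
    """Create the (hypothetical) kinase metadata table if it does not exist."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS kinase_gene_meta (
            uniprot_number TEXT PRIMARY KEY,
            uniprot_entry TEXT,
            protein_name TEXT
        )
        """
    )


def load_kinase_csv(conn, handle):
    """Parse a CSV with a header row and insert its rows; return the row count."""
    reader = csv.DictReader(handle)
    rows = [
        (r["uniprot_number"], r["uniprot_entry"], r["protein_name"])
        for r in reader
    ]
    # INSERT OR REPLACE keeps re-runs of the loader from failing on duplicates.
    conn.executemany("INSERT OR REPLACE INTO kinase_gene_meta VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Example usage with an in-memory database and one placeholder row.
conn = sqlite3.connect(":memory:")
create_database(conn)
sample = io.StringIO(
    "uniprot_number,uniprot_entry,protein_name\n"
    "ACC_1,ENTRY_1,Placeholder kinase\n"
)
inserted = load_kinase_csv(conn, sample)
```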
