Comments (18)

saramsey avatar saramsey commented on September 24, 2024

PTGS2 is now expanded in the new KG

from rtx.

saramsey avatar saramsey commented on September 24, 2024

So, there are a couple of ways we could fix this, depending on how one defines "fix".

We could specifically seed uniprot protein IDs for genes that are in the COP spreadsheet; that would take care of the COP spreadsheet, at least.

More generally, here is the thing. If we are expanding 3X, then the third hop will always reach some proteins, and these by definition won't themselves be expanded. Does that make sense?

I guess asymptotically if you did enough rounds of expansion, eventually all nodes would be expanded (as there are a finite number of nodes). I have no idea how many rounds it would take to reach this limit. Do you want to explore it?
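
A minimal sketch of the point above, on a made-up toy graph (the node names, `toy_graph`, `expand_rounds`, and `rounds_until_fully_expanded` are all hypothetical and are not taken from the RTX code). It assumes one "round" expands every node discovered in the previous round. Under that assumption, the nodes discovered in the last round are known but not expanded, and the number of rounds needed to expand everything can be found by brute force:

```python
def expand_rounds(graph, seeds, max_rounds):
    """Run up to max_rounds rounds of expansion; return (expanded, known) node sets."""
    expanded = set()
    known = set(seeds)
    frontier = set(seeds)
    for _ in range(max_rounds):
        next_frontier = set()
        for node in frontier:
            expanded.add(node)  # expanding a node reveals its neighbors
            for neighbor in graph.get(node, ()):
                if neighbor not in known:
                    known.add(neighbor)
                    next_frontier.add(neighbor)
        frontier = next_frontier
        if not frontier:  # nothing new was discovered, so we are done
            break
    return expanded, known


def rounds_until_fully_expanded(graph, seeds):
    """Smallest number of rounds after which every known node is expanded."""
    rounds = 0
    while True:
        expanded, known = expand_rounds(graph, seeds, rounds)
        if expanded == known:
            return rounds
        rounds += 1


# Hypothetical toy graph: a single chain of five nodes.
toy_graph = {
    "drug": ["protein_A"],
    "protein_A": ["pathway"],
    "pathway": ["protein_B"],
    "protein_B": ["disease"],
    "disease": [],
}

expanded, known = expand_rounds(toy_graph, ["drug"], 3)
unexpanded = known - expanded  # {'protein_B'}: reached in the third hop, not expanded
total_rounds = rounds_until_fully_expanded(toy_graph, ["drug"])  # 5
```

On a real KG the brute-force loop would be far too slow, since every expansion involves queries to external sources; it is only meant to show why a fixed number of rounds always leaves an unexpanded frontier until the whole reachable graph has been covered.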

dkoslicki avatar dkoslicki commented on September 24, 2024

@saramsey I'm beginning to think that more expansion might be useful. In particular, for the COP questions, I am noticing that some of the "correct" protein targets do not connect to the desired disease (while other sub-optimal protein targets do connect). Note that COPs consist of 5 hops.

saramsey avatar saramsey commented on September 24, 2024

dkoslicki avatar dkoslicki commented on September 24, 2024

Yeah, more seeding might work. I'm thinking that seeding the protein nodes would help (it effectively gets us one more hop for the COPs), and perhaps the anatomy nodes too. I don't have much use for GO nodes yet.

jaredroach avatar jaredroach commented on September 24, 2024

I wonder whether trying to connect primarily molecular/cellular processes to particular pieces of anatomy is a fundamentally flawed strategy. Once naproxen reaches "generalized anti-inflammatory effect on cells" (assuming that is a node), going one hop forward to the target anatomy is trivial and uninteresting, because it hits ANY anatomy that can be influenced by inflammation (which is almost all anatomy).

dkoslicki avatar dkoslicki commented on September 24, 2024

@saramsey Just to close the loop on this one: PTGS1 still shows as not expanded in the new KG at rtx.ncats.io:

match (n:protein{description:"PTGS1"}) return n.expanded
False

However you want to do it, can we get this (and other protein nodes) expanded?

dkoslicki avatar dkoslicki commented on September 24, 2024

@jaredroach Yes, it may be a flawed strategy, but that's what NCATS gave us to work with. Feel free to propose a different definition of a COP and I'd be happy to discuss it or put it on the team meeting agenda. That should go in a separate issue, of course, since this issue (#21) is about nodes not being expanded in the KG.

saramsey avatar saramsey commented on September 24, 2024

saramsey avatar saramsey commented on September 24, 2024

Making progress on this. I am seeing that for 395 proteins in the KG, the "symbol" and "name" fields are inexplicably swapped in Neo4j.

[Screenshot: 2018-05-02, 9:56:07 AM]

Wondering if it is a bug in this code in Orangeboard, on lines 301-302:
[Screenshot: 2018-05-02, 9:56:57 AM]

saramsey avatar saramsey commented on September 24, 2024

Updated the code so that this code branch also sets the "symbol" field correctly when the node is a microRNA or protein node:
[Screenshot: 2018-05-02, 9:57:51 AM]

edeutsch avatar edeutsch commented on September 24, 2024

Do we ever use symbol for other node types, like "ALS" in diseases?

saramsey avatar saramsey commented on September 24, 2024

No, as far as I know that is not part of the spec.

edeutsch avatar edeutsch commented on September 24, 2024

What do you mean, "not part of the spec"? "symbol" is a field in the spec, isn't it?

saramsey avatar saramsey commented on September 24, 2024

Yes, but it is an optional field, and I think it is intended only for genes or proteins?

saramsey avatar saramsey commented on September 24, 2024

I'm beginning to think this is a bug. Neo4j weirdness delayed my testing, but I am running the tests now; they should be done in a couple of hours. Provisionally reclassifying this issue as a bug.

saramsey avatar saramsey commented on September 24, 2024

I believe I have identified the root cause of this bug. Attempted code fix committed. Testing now on rtxsteve.

saramsey avatar saramsey commented on September 24, 2024

OK, it looks like with the current code in master, the resulting KG still has proteins that are not "expanded". However, they are not human, so I'm hoping they are not problematic for reasoning.
I want to get these non-human proteins out of the KG (their presence is a bug by definition), but I also want to stay mindful of the priority list for KG enhancements.
