Comments (18)
PTGS2 is now expanded in the new KG
from rtx.
So, there are a couple of ways we could fix this, depending on how one defines "fix".
We could specifically seed uniprot protein IDs for genes that are in the COP spreadsheet; that would take care of the COP spreadsheet, at least.
More generally, here is the thing. If we are expanding 3X, then some proteins are always first reached in the third hop, and these by definition won't be expanded, because nodes added in the last round never get a turn to be expanded themselves. Does that make sense?
I guess asymptotically if you did enough rounds of expansion, eventually all nodes would be expanded (as there are a finite number of nodes). I have no idea how many rounds it would take to reach this limit. Do you want to explore it?
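To make the "last hop is never expanded" point concrete, here is a minimal sketch on a toy graph. The function name `expand_rounds`, the adjacency-dict representation, and the five-node chain are assumptions made for illustration; this is not the Orangeboard expansion code.

```python
def expand_rounds(neighbors, seeds, max_rounds=None):
    """Simulate round-based KG expansion on a toy adjacency dict.

    Each round expands every node that is in the KG but not yet expanded;
    expanding a node adds its neighbors to the KG as unexpanded nodes.
    Returns (rounds_run, unexpanded_nodes, unexpanded_count_after_each_round).
    """
    in_kg = set(seeds)
    expanded = set()
    history = []
    rounds = 0
    while in_kg - expanded and (max_rounds is None or rounds < max_rounds):
        frontier = in_kg - expanded  # snapshot, so in_kg can grow safely below
        for node in frontier:
            in_kg.update(neighbors.get(node, ()))
            expanded.add(node)
        rounds += 1
        history.append(len(in_kg - expanded))
    return rounds, in_kg - expanded, history


# Hypothetical chain A -> B -> C -> D -> E, seeded at A.
toy = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": []}

three = expand_rounds(toy, ["A"], max_rounds=3)
full = expand_rounds(toy, ["A"])
print(three)  # node D is in the KG after 3 rounds but was never expanded
print(full)   # running to the limit leaves no unexpanded nodes
```

On a finite graph the loop always terminates: it stops one round after the farthest reachable node has been added, since that final round adds nothing new.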
from rtx.
@saramsey I'm beginning to think that more expanding might be useful. In particular, for the COP questions, I am noticing that some of the "correct" protein targets do not connect to the desired disease (while other sub-optimal protein targets do connect). Note that COPs consist of 5 hops.
from rtx.
Yeah, more seeding might work. I'm thinking that seeding the protein nodes would help (it effectively gets us one more hop for the COPs), and perhaps the anatomy nodes too. I don't have much use for GO nodes yet.
from rtx.
I wonder whether trying to connect primarily molecular/cellular processes to particular pieces of anatomy is a fundamentally flawed strategy. Once naproxen reaches "generalized anti-inflammatory effect on cells" (assuming that is a node), going one hop forward to the target anatomy is trivial and uninteresting, because it hits ANY anatomy that can be influenced by inflammation (which is almost all anatomy).
from rtx.
@saramsey Just to close the loop on this one: I'm still getting PTGS1 not expanded in the new KG at rtx.ncats.io:
match (n:protein{description:"PTGS1"}) return n.expanded
False
However you want to do it, can we get this (and other protein nodes) expanded?
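For checking several proteins at once rather than one query per symbol, something like the sketch below might help. The helper names and the use of a `$symbols` query parameter are my assumptions, and it only builds the query text and filters rows that have already been fetched; it does not talk to Neo4j. It follows the query above in matching on `description`, and it tolerates `expanded` being stored either as a boolean or as the string "False".

```python
def build_expanded_query(label="protein"):
    """Build a Cypher query reporting the expanded flag for a list of symbols.

    Assumes, as in the query above, that the symbol is stored in the
    `description` property. `$symbols` is a query parameter to be supplied
    by whatever client runs the query.
    """
    return (
        f"MATCH (n:{label}) WHERE n.description IN $symbols "
        "RETURN n.description AS symbol, n.expanded AS expanded"
    )


def is_expanded(value):
    """Treat True or the strings 'true'/'True' as expanded; anything else as not."""
    return str(value).strip().lower() == "true"


def unexpanded_symbols(rows):
    """Return the sorted symbols whose expanded flag is not set."""
    return sorted(row["symbol"] for row in rows if not is_expanded(row["expanded"]))


# Hand-written rows mirroring what this thread reports for the two proteins.
rows = [
    {"symbol": "PTGS2", "expanded": True},
    {"symbol": "PTGS1", "expanded": "False"},
]
query = build_expanded_query()
missing = unexpanded_symbols(rows)
print(query)
print(missing)
```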
from rtx.
@jaredroach Yes, it may be a flawed strategy, but that's what NCATS gave us to work with. Feel free to propose a different definition of a COP and I'd be happy to discuss/put it on the team meeting agenda. In a separate issue of course, since this issue (#21) is for nodes not being expanded in the KG.
from rtx.
Making progress on this. I am seeing that for 395 proteins in the KG, the "symbol" and "name" fields are inexplicably swapped in Neo4j.
Wondering if it is a bug in this code in Orangeboard, on lines 301-302:
from rtx.
Updated the code to correctly also set the "symbol" field in this code branch, if it is a microRNA or protein node:
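The committed snippet is not reproduced in this thread. Purely as an illustration of the kind of change described, and not the actual Orangeboard code, here is a sketch in which every name (`Node`, `fill_in_description`, `SYMBOL_NODE_TYPES`) is hypothetical. It assumes the branch in question is the one that fills in a missing description on an existing node, and that for proteins the description holds the gene symbol (as the PTGS1 query earlier in the thread suggests).

```python
from dataclasses import dataclass
from typing import Optional

# Assumed type labels for nodes that carry a "symbol" field.
SYMBOL_NODE_TYPES = {"protein", "microRNA"}


@dataclass
class Node:
    """Hypothetical stand-in for a KG node; not the Orangeboard Node class."""
    nodetype: str
    name: str
    desc: str = ""
    symbol: Optional[str] = None


def fill_in_description(node, desc):
    """Fill a missing description; keep symbol in sync for symbol-bearing types."""
    if desc and not node.desc:
        node.desc = desc
        # The step that is easy to forget: without it, desc and symbol diverge.
        if node.nodetype in SYMBOL_NODE_TYPES:
            node.symbol = desc
    return node


protein = fill_in_description(Node("protein", "SOME_UNIPROT_ID"), "PTGS1")
disease = fill_in_description(Node("disease", "SOME_DISEASE_ID"), "some disease")
print(protein)
print(disease)
```

In this sketch, nodes of other types (diseases, for example) keep `symbol` unset, which matches the view later in the thread that the field is optional and intended for genes or proteins.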
from rtx.
Do we ever use symbol for other node types, like "ALS" in diseases?
from rtx.
no, as far as I know that is not part of the spec.
from rtx.
What do you mean, "not part of the spec"? "symbol" is a field in the spec, isn't it?
from rtx.
yes but it is an optional field. and I think it is intended only for genes or proteins?
from rtx.
I'm beginning to think this is a bug. Neo4j weirdness delayed my testing but I am running the tests now, should be done in a couple of hours. Provisionally reclassifying this issue as a bug.
from rtx.
I believe I have identified the root cause of this bug. Attempted code fix committed. Testing now on rtxsteve.
from rtx.
OK, it looks like with the current code in master, the resulting KG still has proteins that are not "expanded". However, they are not human, so I'm hoping they are not problematic for reasoning.
I want to get these non-human proteins out of the KG (their presence is a bug by definition), but I also want to stay mindful of the priority list for KG enhancements.
from rtx.