Comments (6)
In a previous project, I clustered the proteins with Foldseek (all vs. all) and built a graph with every protein as a vertex, adding edges between paired proteins (receptor and ligand) and between proteins in the same cluster. I then used the biggest clusters for the training set and the smallest for validation and testing (90-5-5).
Another option could be to characterize the binding pocket and split the data according to that characterization, but I lack the knowledge to do that kind of thing.
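For anyone wanting to try the graph-based split, here is a minimal sketch in plain Python. It is not the code from that project: the function names, the `cluster_of` mapping (protein ID to cluster label, e.g. parsed from a Foldseek clustering output), and the greedy "largest groups first" assignment are all assumptions made for illustration. It links receptor with ligand and proteins sharing a cluster, takes the connected components of that graph, and fills train, validation, and test from the largest component down.

```python
from collections import defaultdict


def find(parent, x):
    """Return the representative of x's group (union-find with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the groups containing a and b."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_b] = root_a


def split_complexes(pairs, cluster_of, fractions=(0.9, 0.05, 0.05)):
    """Split (receptor, ligand) pairs so no connected component spans two sets.

    pairs      -- list of (receptor_id, ligand_id) tuples (hypothetical IDs)
    cluster_of -- dict mapping every protein ID to a cluster label
    fractions  -- target share of complexes for train and validation;
                  the test set receives whatever remains
    """
    proteins = {p for pair in pairs for p in pair}
    parent = {p: p for p in proteins}

    # Edge between the two partners of each complex.
    for receptor, ligand in pairs:
        union(parent, receptor, ligand)

    # Edges between proteins that fall in the same structural cluster.
    members = defaultdict(list)
    for p in proteins:
        members[cluster_of[p]].append(p)
    for group in members.values():
        for other in group[1:]:
            union(parent, group[0], other)

    # Collect complexes per connected component, largest component first.
    components = defaultdict(list)
    for pair in pairs:
        components[find(parent, pair[0])].append(pair)
    ordered = sorted(components.values(), key=len, reverse=True)

    train_target = round(fractions[0] * len(pairs))
    val_target = round(fractions[1] * len(pairs))
    train, val, test = [], [], []
    for component in ordered:
        if len(train) < train_target:
            train.extend(component)
        elif len(val) < val_target:
            val.extend(component)
        else:
            test.extend(component)
    return train, val, test
```

Because whole components are assigned at once, the resulting proportions only approximate the targets; a single very large component can push the training share well above 90%.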
from equidock_public.
Thank you for sharing!
Yes, I am also considering creating a split based on interface similarity using a tool like this.
Hey !
I don't remember finding any code for the split, but you can certainly write a simple script that clusters your proteins with Foldseek or something similar and then uses dgl, networkx, or any other graph library you like. The only thing you then need to output is the list of files, in the same format as the original split definitions.
Sincerely meow!
Hi, @AxelGiottonini!
Thank you very much for your response. Foldseek looks perfect; I did not know about it. What exactly do you mean by using a graph library? Clustering PPIs using graph metrics based on their EquiDock graph representations? Also, I am still curious how exactly PPIs were split based on the folds of the individual interacting partners. If PPI1 has partners with folds A and B, and PPI2 has partners with folds C and D, are they separated whenever {A, B} != {C, D}, or, more strictly, only when {A, B} and {C, D} are disjoint?
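To make the two candidate criteria concrete, here is a small sketch. The function names are made up for illustration, and nothing in this thread confirms which rule (if either) was used for the original splits.

```python
def fold_sets_differ(ppi_1, ppi_2):
    """Weaker rule: separate the PPIs if their fold sets are not identical."""
    return set(ppi_1) != set(ppi_2)


def fold_sets_disjoint(ppi_1, ppi_2):
    """Stricter rule: separate the PPIs only if they share no fold at all."""
    return set(ppi_1).isdisjoint(ppi_2)


# Folds {A, B} vs. {A, C}: the sets differ, but they still share fold A.
print(fold_sets_differ(("A", "B"), ("A", "C")))    # True
print(fold_sets_disjoint(("A", "B"), ("A", "C")))  # False
```

Under the weaker rule these two PPIs could land in different sets even though fold A appears in both, which is exactly the kind of leakage the stricter rule would prevent.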
You're welcome! I did not look for such a tool, but that seems promising!
Also, when I was working with EquiDock, I got poor accuracy when considering only the ligand RMSD (the receptor RMSD is always 0). I'll share my code and results in the next few days; could you share your results if something similar occurred for you?
Hi! I do not use EquiDock; I was mainly interested in the data split. I am working on a related problem: predicting the change in binding affinity upon mutation (based on the SKEMPI2 data). It is about learning from already bound structures, so it's a bit different.