
Comments (6)

AxelGiottonini commented on May 24, 2024

What I did in a previous project was to cluster the proteins using foldseek (all vs all) and to create a graph with all the proteins as vertices, putting edges between paired proteins (receptor - ligand) and between proteins in the same cluster. Then I used the biggest clusters to create the training set and the smallest for validation and testing (90-5-5).
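A minimal sketch of that graph-based split, under a few assumptions: the "clusters" used for splitting are taken to be the connected components of the graph, `pairs` and `clusters` are hypothetical inputs (receptor-ligand ID pairs and foldseek cluster memberships), and a small union-find stands in for a graph library such as networkx. It is not the code from the original project.

```python
from collections import defaultdict


def find(parent, x):
    # Follow parent pointers to the root, shortening the path on the way.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def split_by_components(pairs, clusters, fractions=(0.9, 0.05, 0.05)):
    """Split receptor-ligand pairs so that no connected component is shared.

    pairs:    list of (receptor_id, ligand_id) tuples (hypothetical format)
    clusters: list of lists of protein IDs, one list per structural cluster
    """
    parent = {}
    # Edge between the two partners of each pair.
    for a, b in pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        union(parent, a, b)
    # Edges between all members of the same structural cluster.
    for members in clusters:
        for m in members:
            parent.setdefault(m, m)
        for m in members[1:]:
            union(parent, members[0], m)

    # Group the pairs by the connected component they belong to.
    components = defaultdict(list)
    for a, b in pairs:
        components[find(parent, a)].append((a, b))

    # Biggest components go to training, the smaller ones to validation/test.
    ordered = sorted(components.values(), key=len, reverse=True)
    train_target = fractions[0] * len(pairs)
    val_target = fractions[1] * len(pairs)
    train, val, test = [], [], []
    for comp in ordered:
        if len(train) < train_target:
            train.extend(comp)
        elif len(val) < val_target:
            val.extend(comp)
        else:
            test.extend(comp)
    return train, val, test
```

Because whole components are assigned at once, the resulting proportions only approximate 90-5-5; a single very large component can push the training share well above the target.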

Another option could be to characterize the binding pocket and split the data according to that characterization, but I lack the knowledge to do that kind of thing.

from equidock_public.

anton-bushuiev commented on May 24, 2024

Thank you for sharing!

Yes, I am also considering creating a split based on interface similarity using a tool like this.


AxelGiottonini commented on May 24, 2024

Hey!

I don't remember finding any code for the split, but you can certainly create a simple script to cluster your proteins using foldseek or something similar, together with dgl, networkx or any other graph library you want. The only thing you then need to output is the list of files, in the same format as in the original splits definition.

Sincerely, meow!
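As a rough illustration of the input and output ends of such a script, here is a sketch that reads cluster assignments and formats a split list. It assumes the clustering result is a two-column, tab-separated file (representative, member), as foldseek's `easy-cluster` is expected to write to its `_cluster.tsv` file, and it assumes the split definition is one identifier per line; neither format is confirmed in this thread, and both function names are hypothetical.

```python
from collections import defaultdict


def read_clusters(tsv_lines):
    """Group members by representative from 'representative<TAB>member' lines."""
    clusters = defaultdict(list)
    for line in tsv_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rep, member = line.split("\t")[:2]
        clusters[rep].append(member)
    return list(clusters.values())


def format_split(ids):
    """Return the split definition as text, one identifier per line (assumed format)."""
    return "".join(f"{i}\n" for i in ids)
```

The output of `read_clusters` can be fed to whichever graph library is used to connect cluster members, and the text returned by `format_split` can be written to the train, validation and test list files.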


anton-bushuiev commented on May 24, 2024

Hi, @AxelGiottonini!

Thank you very much for your response. Foldseek looks perfect, I did not know about it. What exactly do you mean by using a graph library? To cluster PPIs using graph metrics based on their EquiDock graph representations? Also, I am still curious how exactly PPIs were split based on the folds of the individual interacting partners. If PPI1 has partners with folds A and B and PPI2 has partners with folds C and D, are they allowed to go into different splits whenever {A, B} != {C, D}, or, more strictly, only when {A, B} and {C, D} are disjoint 🤔? It may be important from the perspective of data leakage.
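To make the two candidate criteria concrete, here is a small sketch. Both function names are hypothetical, and it is not known from this thread which rule (if either) was used for the original splits.

```python
def may_separate_loose(ppi1_folds, ppi2_folds):
    """Loose rule: two PPIs may go into different splits if their fold sets differ."""
    return set(ppi1_folds) != set(ppi2_folds)


def may_separate_strict(ppi1_folds, ppi2_folds):
    """Strict rule: two PPIs may go into different splits only if they share no fold."""
    return set(ppi1_folds).isdisjoint(ppi2_folds)


# PPI1 has folds {A, B}, PPI2 has folds {A, C}: they share fold A.
loose = may_separate_loose(("A", "B"), ("A", "C"))
strict = may_separate_strict(("A", "B"), ("A", "C"))
```

For this example the loose rule lets the two PPIs land in different splits while the strict rule keeps them together, which is the case where fold A could leak between training and evaluation.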


AxelGiottonini commented on May 24, 2024

You're welcome! I did not look for such a tool, but that seems promising!

Also, when I was working with EquiDock, I got poor accuracy when considering only the ligand RMSD (as the receptor RMSD is always 0). I'll share my code and results in the next few days, but could you consider sharing your results if something similar occurred?


anton-bushuiev commented on May 24, 2024

Hi! I do not use EquiDock; I was mainly interested in the data split. I am working on a related problem of predicting binding affinity change upon mutation (based on the SKEMPI2 data). It is about learning from already bound structures, so it's a bit different.

