high-ppi's Introduction

zqgao22

high-ppi's People

Contributors

zqgao22

high-ppi's Issues

Questions about the Metrictor_PPI function

I would like to express my gratitude for making the code open-source. I have a question regarding the Metrictor_PPI function in HIGH-PPI/utils.py and I was hoping you could assist me.

  • It seems that the arguments to sklearn's precision_recall_curve function are reversed. Could you clarify whether this is the case?
  • self.pre holds hard (0, 1) labels instead of the predicted probability $P(y=1)$ (line 150 in HIGH-PPI/model_train.py). I am concerned that this affects the correctness of the AUPRC.
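To make the second concern concrete, here is a minimal sketch that is not taken from the repository. scikit-learn documents the argument order of precision_recall_curve as true labels first and scores second. The pure-Python `average_precision` helper below is a hypothetical stand-in that follows the same step-wise definition, and the toy labels and scores are invented for illustration. It shows that thresholding probabilities into 0/1 labels before computing the metric leaves only two thresholds and changes the result.

```python
def average_precision(y_true, y_score):
    """Step-wise average precision: sum over thresholds of (R_n - R_{n-1}) * P_n."""
    total_pos = sum(y_true)
    ap, prev_recall = 0.0, 0.0
    # Visit each distinct score as a decision threshold, highest first.
    for thr in sorted(set(y_score), reverse=True):
        selected = [t for t, s in zip(y_true, y_score) if s >= thr]
        tp = sum(selected)
        precision = tp / len(selected)
        recall = tp / total_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Invented toy data: true labels and predicted probabilities P(y=1).
y_true = [1, 1, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
# Hard labels obtained by thresholding at 0.5, as described in the issue.
y_hard = [1 if p > 0.5 else 0 for p in y_prob]

ap_prob = average_precision(y_true, y_prob)
ap_hard = average_precision(y_true, y_hard)
print(ap_prob, ap_hard)
```

On this toy input the two values differ, which is why the metric should be computed from probabilities rather than from thresholded predictions.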

How to generate Fig. 2a?

I am primarily focused on a specific PR-curve. PPI presents a multi-label problem, and to the best of my knowledge, it's standard to draw a PR-curve for each class.
However, in Fig.2a, a model is represented by a single line. How is the transition made from a multi-label PR-curve to a single curve?
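One common way to reduce a multi-label problem to a single PR curve is micro-averaging: flatten the (samples x classes) label and score matrices and treat every (sample, class) entry as one binary decision. Whether Fig. 2a was produced this way is not stated here, so the following is only a sketch under that assumption; `micro_pr_curve` and the toy matrices are hypothetical.

```python
def micro_pr_curve(y_true, y_score):
    """Return (recall, precision) points after flattening all classes together."""
    flat_true = [t for row in y_true for t in row]
    flat_score = [s for row in y_score for s in row]
    total_pos = sum(flat_true)
    points = []
    # Each distinct score acts as a threshold, from strictest to loosest.
    for thr in sorted(set(flat_score), reverse=True):
        selected = [t for t, s in zip(flat_true, flat_score) if s >= thr]
        tp = sum(selected)
        points.append((tp / total_pos, tp / len(selected)))
    return points


# Invented example: 3 samples, 2 interaction classes.
y_true = [[1, 0], [0, 1], [1, 1]]
y_score = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4]]

curve = micro_pr_curve(y_true, y_score)
for recall, precision in curve:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Macro-averaging (computing one curve per class and averaging them) is another option, and it generally gives a different line.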

F1

Hello! I am wondering why the experiment turns out differently on every run even when a random seed is set, so that I fail to reproduce your best metrics.
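Run-to-run variation despite a fixed seed often comes from seeding only one of several random number generators, or from non-deterministic GPU kernels. The helper below is a generic sketch, not code from this repository; `seed_everything` is a hypothetical name, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def seed_everything(seed):
    """Seed the Python, NumPy, and PyTorch RNGs where available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # Trade speed for repeatable cuDNN kernel selection.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same random draw.
seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)
```

Even with these settings, some GPU operations may remain non-deterministic, so small differences between runs are still possible.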

Environment not working

Hi, Ziqi. I've been trying to use the environment.yml file to create my Conda environment. However, it fails with an error saying that some packages conflict. I was wondering if you would mind providing another, more basic environment.yml file.

Questions about the Data Processing for New Datasets

Hello Ziqi, we have some questions about the data processing for new datasets.

  1. We used the IDs in ensp_uniprot.txt to download native protein structures, but we discovered that over 600 native protein structures cannot be found. Could you please provide the source of the native protein structures, or describe the data processing in detail? Thanks.
  2. We used "generate_adj.py" and "generate_feat.py" to process the downloaded native protein structures, but found that their outputs did not match. Is there a problem that needs to be fixed? Thank you.

Question about how to obtain the PDB files.

Hi~

Thank you for releasing this great work! I need some help with obtaining the PDB files that are required.

Because I am not familiar with the PDB website, any suggestions would be helpful. Taking the SHS27k dataset as an example, I typed "SHS27k" into the search box, and it returned nothing.

Could you please suggest how to get the right PDB files for the SHS27k dataset? Thank you in advance!

Questions about the gnn_models_sag.py

Hello Ziqi, I checked the gnn_models_sag.py file and observed that the model architectures are quite different from the models depicted in Supplementary Fig. 1 of the paper.

Regarding BGNN, several discrepancies are present, including the feature dimension, activation function type, GCN block count, the order of activation function and batch normalization layers, and additional GCNConv modules following SAG pooling, which contradicts Supplementary Fig. 1. Similarly, in TGNN, replacing the feature_fusion argument with 'concat' (i.e., concatenation used in the paper) is insufficient, and I must modify the feature dimension of fc2. I may have overlooked other discrepancies.

Based on the above findings, it appears that there may be a final version of the model structure. Would it be possible for you to release the final version of the gnn_models_sag.py file so that we can directly implement the best model architecture as described in the paper? Thank you!

About vec5_CTC.txt: what is the basis for determining these vectors?

Thank you for making this project public.
I have some questions about it:

In the vec5_CTC.txt file, each amino acid has a corresponding vector. What is the basis for determining these vectors? Or do they follow the conventions of a previous project?
Why not use protein language models to get the amino acid embeddings?

Error when running generate_adj.py and generate_feat.py

Hello! I have a problem when running generate_adj.py and generate_feat.py. It shows:
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.
Thanks for your reply.
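This message is what NumPy (version 1.24 and later) raises when np.array is given a list whose elements have different shapes; older versions only emitted a deprecation warning. Which call in generate_adj.py or generate_feat.py triggers it is not shown in the report, so the sketch below uses invented arrays to show the pattern and one possible workaround: filling an explicit object array.

```python
import numpy as np

# Two per-protein matrices with different numbers of rows (made-up sizes).
ragged = [np.zeros((3, 2)), np.zeros((5, 2))]

try:
    np.array(ragged)  # raises ValueError on NumPy >= 1.24
except ValueError as exc:
    print("ValueError:", exc)

# Workaround: allocate an object array and fill it element by element.
packed = np.empty(len(ragged), dtype=object)
for i, item in enumerate(ragged):
    packed[i] = item

print(packed.shape, packed[0].shape, packed[1].shape)
```

Pinning an older NumPy release is another possible workaround, though whether that suits this project's other dependencies is not confirmed here.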

environment.yml

Hi! I ran conda env create -f environment.yml and found that version incompatibilities prevent the virtual environment from being created. Can you suggest a solution?

Question about the PPI label

Thanks for your fantastic work. I have a question.

In Suppl. Data 2, you show that the ground truth label of the PPI "9606.ENSP00000254722 9606.ENSP00000261349" (index 748) is "Reaction Binding Ptmod Activation Inhibition Catalysis Expression = 1 1 0 0 0 1 0". However, the interaction type of "9606.ENSP00000254722 9606.ENSP00000261349" found in the linked file (https://drive.google.com/file/d/1CtS2V52lCG0bEjss0MguesJq19ZZ2LCZ/view?usp=drive_link), which you also provided, is "9606.ENSP00000254722 9606.ENSP00000261349 inhibition inhibition f f 800". The two labels for the same PPI, taken from two files that you both provided, are not the same. Is there some other definition of the PPI label?

Thanks.
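One way to check such a discrepancy is to collect every row listed for the pair in the interaction file and merge the interaction types into a single multi-hot vector, using the column order quoted from Suppl. Data 2. The sketch below assumes that a pair may appear in several rows, one per interaction type; that assumption, the `multi_hot_labels` helper, and the example rows are illustrative and not taken from the dataset.

```python
from collections import defaultdict

# Column order as quoted from Suppl. Data 2 in the question above.
TYPES = ["reaction", "binding", "ptmod", "activation",
         "inhibition", "catalysis", "expression"]


def multi_hot_labels(rows):
    """Merge (protein_a, protein_b, mode) rows into one 7-dim vector per pair."""
    labels = defaultdict(lambda: [0] * len(TYPES))
    for a, b, mode in rows:
        pair = tuple(sorted((a, b)))  # treat A-B and B-A as the same pair
        labels[pair][TYPES.index(mode.lower())] = 1
    return dict(labels)


# Hypothetical rows for placeholder proteins P1 and P2.
rows = [
    ("P1", "P2", "reaction"),
    ("P2", "P1", "binding"),
    ("P1", "P2", "catalysis"),
]
labels = multi_hot_labels(rows)
print(labels[("P1", "P2")])
```

If the merged vector for the real pair still differs from the supplementary label, the two files may use different label definitions or versions, which only the authors can confirm.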
