pipgcn's Issues

How to convert a .pdb file into a graph

I am looking for the script to preprocess my protein data, i.e. a '.pdb' file. Could you point me to the exact file in the GitHub repo that performs this preprocessing?
Thank you for your help.

Order of node features

I am trying to determine how removing certain node features changes the predictive performance. Are the 70 node features in the array in the same order in which they are presented in Appendix A.1 (e.g. first 20 being PSSM features)?

Stacking dropout layers?

I just wanted to check with you about the design of some of the layers in nn_components.py.

In that file, the dense() function both starts and ends with a call to the dropout layer, i.e.:

def dense(input, params, out_dims=None, dropout_keep_prob=1.0, nonlin=True, trainable=True, **kwargs):

   input = tf.nn.dropout(input, dropout_keep_prob)

   # some other code...

   Z = tf.nn.dropout(Z, dropout_keep_prob)

   return Z, params

This means that when two of these layers are stacked, the output of dense layer 1 (which already has dropout applied to it) is passed through another dropout at the start of dense layer 2. A similar situation also occurs in the no_conv layer.

Was this what was intended?
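For reference, a minimal NumPy sketch (not the repository's code) of what stacking two such layers does: applying dropout with keep probability p twice in a row keeps each unit with probability roughly p².

```python
import numpy as np

rng = np.random.default_rng(0)
keep_prob = 0.8
x = np.ones(100_000)

def dropout(x, keep_prob, rng):
    # Inverted dropout: zero a unit with probability 1 - keep_prob and
    # scale survivors by 1 / keep_prob (same convention as tf.nn.dropout).
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

# dense() applies dropout on entry and on exit, so between two stacked
# dense layers every activation passes through dropout twice.
once = dropout(x, keep_prob, rng)
twice = dropout(once, keep_prob, rng)

frac_once = (once > 0).mean()    # close to keep_prob
frac_twice = (twice > 0).mean()  # close to keep_prob ** 2
```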

How to compute node features

Dear authors,

Thank you for the code! Could you point to a reference or a script describing how the amino acid features are computed? It's not clear from the .pkl files how to do that.

Question about counting nonzero nh_indices in node_edge_average()

In line 59 of the nn_components.py file, there is this line:

nh_sizes = tf.expand_dims(tf.count_nonzero(nh_indices + 1, axis=1, dtype=tf.float32), -1) # for fixed number of neighbors, -1 is a pad value

What is the purpose of counting the non-zero elements in nh_indices + 1? The number of non-zero elements in each row of nh_indices + 1 is always 20.
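For what it's worth, here is a toy NumPy reconstruction of that line (with 4 neighbor slots instead of 20, and made-up indices): if the -1 pad value ever appears, adding 1 turns the pads into zeros, so the non-zero count per row recovers the true neighborhood size; if every row is full, the count is indeed always the row width.

```python
import numpy as np

# Toy neighbor-index array: each row lists a node's neighbors, padded
# with -1 when there are fewer real neighbors than slots (hypothetical
# values; the real array has 20 slots per node).
nh_indices = np.array([[3, 7, 1, -1],   # 3 real neighbors, 1 pad
                       [2, 5, 8, 9]])   # 4 real neighbors, no pads

# -1 + 1 == 0, so pads become zeros while real indices (including 0,
# which becomes 1) stay non-zero; counting non-zeros per row gives
# the actual neighborhood sizes.
nh_sizes = np.count_nonzero(nh_indices + 1, axis=1)
# nh_sizes -> array([3, 4])
```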

tensorflow placeholder for vertex and transition

self.in_vertex1 = tf.placeholder(tf.float32, [None, self.in_nv_dims], "vertex1")
self.in_vertex2 = tf.placeholder(tf.float32, [None, self.in_nv_dims], "vertex2")
if self.diffusion:
    self.power_transition1 = tf.placeholder(tf.float32, [None, self.maxpower, None], name="power_transition_matrices")
    self.power_transition2 = tf.placeholder(tf.float32, [None, self.maxpower, None], name="power_transition_matrices")
    input1 = self.in_vertex1, self.power_transition1
    input2 = self.in_vertex2, self.power_transition2

Could you explain why None is used here instead of an actual value?
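As an aside, a plain NumPy sketch (toy shapes, not the repository's code) of why one dimension can stay unspecified: the learned weights depend only on the fixed feature width, so matrices with different numbers of residues flow through the same parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
in_nv_dims = 70  # node feature width, fixed across proteins

# Two hypothetical proteins with different residue counts: this is the
# axis declared as None in the placeholder, because it varies per example.
protein_a = rng.normal(size=(185, in_nv_dims))
protein_b = rng.normal(size=(312, in_nv_dims))

# The weights only depend on the feature dimension, so both shapes work.
W = rng.normal(size=(in_nv_dims, 32))
za = protein_a @ W  # shape (185, 32)
zb = protein_b @ W  # shape (312, 32)
```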

Application to additional proteins

Hi,
Thanks for posting the code. Is there a script to generate the features for a custom protein from a PDB file or sequence? Maybe I missed it somewhere.
Cheers,

about node features

Hello, thank you for sharing!
I am trying to reproduce your code, but I have some questions about the node features.
I have computed the rASA of each residue, but I don't understand how you normalized the data.
Could you describe it?
Thank you!

What does each node feature represent?

Thanks for sharing your code!

The node features have 70 elements for each node, but what does each element represent?
The paper says that 20 of them encode the amino acid identity and the others are conservation score, accessible surface area, etc., but I can't figure out which is which.
Could you please specify the order of the node features?

Some questions about the meaning of the data

In line 59 of pw_classifier.py, there is this line:

self.in_nhood_size = train_data[0]["l_hood_indices"].shape[1]

I know that the shape of train_data[0]["l_hood_indices"] is (185, 20, 1), but I don't know the meaning of each dimension. Could you tell me what each dimension of train_data[0]["l_hood_indices"] means, or how to build the matrix? I want to apply your method to my own graphs.
Thanks!

What does pipgcn really learn?

For the node average case (Equation 1), the features of the centre node i and of its neighbor nodes are used to learn a representation z_i at each node of the graph, obtained by passing the combination through a non-linear activation function.

For each input pair of ligand and receptor proteins with different numbers of residues, the node-averaged graphs will have different numbers of nodes. What does pipgcn really learn, and what is the role of the activation function?

By the way, how do you handle graphs with different numbers of nodes, i.e. proteins with different numbers of residues?

Looking forward to your explanation. Thanks.
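For concreteness, a minimal NumPy sketch of a node-average layer as described above (toy sizes, random weights and made-up neighborhoods, not the paper's implementation): the same centre and neighbor weight matrices are reused at every node, which is why the layer applies to graphs of any size, and the non-linearity is what lets stacked layers represent more than a linear map.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, f_in, f_out = 5, 70, 32          # toy sizes

x = rng.normal(size=(n_nodes, f_in))      # node features
nh = np.array([[1, 2], [0, 2], [0, 1],    # toy fixed-size neighborhoods
               [4, 1], [3, 0]])

Wc = rng.normal(size=(f_in, f_out))       # centre-node weights
Wn = rng.normal(size=(f_in, f_out))       # shared neighbor weights
b = np.zeros(f_out)

# Node average: combine each node's own features with the mean of its
# neighbors' features; Wc and Wn are shared across all nodes, so the
# parameter count is independent of the number of residues.
neigh_mean = x[nh].mean(axis=1)                    # (n_nodes, f_in)
z = np.maximum(0.0, x @ Wc + neigh_mean @ Wn + b)  # ReLU activation
```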

The sequence information

Hi! I am running some experiments with your code and data, but I cannot find any residue-type information for the sequences (I mean something like FASTA files). Could you share the sequences for the proteins in the data?

training and testing data

What tools were used to generate the matrices of structural information from the PDB files? I'd like to add other complexes to train the model.

data processing part

Hello,

In your data processing, why does each node have exactly 20 neighbors? I computed the distance between each node and all other nodes in a graph and treated pairs closer than 6 Å as connected by an edge. With this approach a node may have only 12 neighbors, or some other number, but not always 20. How do you arrive at exactly 20?
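A toy NumPy comparison of the two constructions (random coordinates and a hypothetical cutoff): a distance cutoff yields variable-size neighborhoods, while taking the k closest residues gives every node exactly k neighbors, which is what a fixed-width neighbor-index array requires.

```python
import numpy as np

rng = np.random.default_rng(0)
coords = rng.uniform(0, 20, size=(30, 3))  # toy residue coordinates (angstroms)

# Pairwise distances; a residue is not its own neighbor.
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
np.fill_diagonal(d, np.inf)

# Distance cutoff: neighborhood sizes vary from residue to residue.
cutoff_sizes = (d < 6.0).sum(axis=1)

# k nearest neighbors: every residue gets exactly k neighbors, so the
# indices fit in a fixed (n, k) array.
k = 20
knn = np.argsort(d, axis=1)[:, :k]
```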

Datasets

Hello, this work is very interesting and I would like to build on it. Could you share the training dataset and validation dataset separately, as used in your experiments? Thanks!

Having a hard time understanding the input to the GCN

Thanks for publishing your well-organized code and the data.

I visualized the computation graph of the GCN. Besides vertex1 and vertex2, there are four other inputs: edge1, edge2, hood_indices1 and hood_indices2, but these four are not connected to the computation graph. In the corresponding code, only the first element of the input tuple is used, so I am wondering why the edge and hood_indices data are included at all.

The node_edge_average method does use the edges and hood_indices, but that method is never called. Did I miss anything?

