Comments (2)
Hi, thank you for your interest and great question!
This is not a bug, but rather intentional. Specifically, we use the column-wise softmax to make the overall architecture as powerful as the WL test. Consider the example of the Graph Isomorphism Network (GIN), which uses summation over all node representations to approximate the WL test. You can check the details in Section 3.3, Section 3.4, and the proof in Appendix A of our paper.
You can also use the row-wise softmax by not using the cluster option in Line #29 of layers.py, but we found that this row-wise GMT mostly underperforms our proposed column-wise GMT.
Finally, the behavior you observed occurs only in the final layer, where we reduce all remaining nodes into a single node with one seed vector; in that case, you are correct that it is the same as sum pooling. However, please note that when we reduce n nodes into k different nodes, the column-wise softmax correctly assigns k cluster values (summing to one) to each node, so the matrix is not the all-ones matrix.
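To illustrate the point above (a minimal NumPy sketch, not the actual GMT code): with a k x n attention-score matrix, the column-wise softmax normalizes over the k seeds so that each node's cluster assignment sums to one, and with a single seed vector (k = 1) every entry becomes exactly one, which is why the final layer reduces to sum pooling:

```python
import numpy as np

def col_softmax(scores):
    # column-wise softmax: normalize over the k seed (cluster) axis,
    # so each node's assignment across the k clusters sums to one
    e = np.exp(scores - scores.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
scores = rng.normal(size=(3, 5))     # k = 3 seeds x n = 5 nodes
A = col_softmax(scores)
print(A.sum(axis=0))                 # each column sums to one; not all-ones

scores1 = rng.normal(size=(1, 5))    # a single seed vector (k = 1)
print(col_softmax(scores1))          # all-ones row, so A @ X is sum pooling
```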
from gmt.
Got it, thanks for your reply. When reducing n nodes into k different nodes, the column-wise softmax is necessary (the same as in DiffPool). But to my knowledge and in my experiments on the HIV dataset, a mean or sum global pooling does not perform better than GlobalAttentionPool, which assigns different weights to nodes. Maybe the way the row-wise GMT obtains node weights is not good enough. I think GMPool_l + SelfAtt + some better global pooling may work. I'll try it.
Thanks again.
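For reference, the kind of global attention pooling mentioned above can be sketched as follows (a minimal NumPy illustration; `w_gate` is a hypothetical stand-in for a learned gating network, not part of the GMT code):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention_pool(X, w_gate):
    # X: (n, d) node feature matrix; w_gate: (d,) hypothetical gate vector.
    # Score each node, normalize the scores across nodes so they sum to one,
    # then return the attention-weighted sum of node features.
    alpha = softmax(X @ w_gate)   # one weight per node
    return alpha @ X              # (d,) pooled graph representation

X = np.arange(12.0).reshape(4, 3)          # toy graph: 4 nodes, 3 features
pooled = global_attention_pool(X, np.zeros(3))
# with a zero gate every node gets equal weight, recovering mean pooling
```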