A GCN layer is sensitive to the distribution of node features. When graphs belonging to different classes share few similar node features, GCN is as powerful as the WL test. This explains why GCN outperformed the GIN model in our case (see this paper).
Given that, and since every node carries an identical element in its feature vector (the GPE value along the depth dimension, which is fixed to 0.0), we suspect this element reduces the performance of GCN when the dataset with GPE is used as input. This hypothesis has yet to be checked.
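Before testing the hypothesis, it may help to confirm its premise: that the depth element really is identical across all nodes. Below is a minimal sketch in plain Python, assuming node features are stored as one list of floats per node. The function name `constant_feature_columns` and the toy feature layout are illustrative assumptions, not part of our pipeline.

```python
def constant_feature_columns(node_features, tol=1e-12):
    """Return indices of feature columns whose value is the same for every node."""
    if not node_features:
        return []
    n_cols = len(node_features[0])
    constant = []
    for col in range(n_cols):
        values = [row[col] for row in node_features]
        # A column is treated as constant when its value range is within tol.
        if max(values) - min(values) <= tol:
            constant.append(col)
    return constant


# Toy example (hypothetical layout): [feature, gpe_x, gpe_y, gpe_z],
# with gpe_z fixed to 0.0 as in our dataset.
toy_features = [
    [0.3, 1.0, 2.0, 0.0],
    [0.7, 4.0, 5.0, 0.0],
    [0.1, 7.0, 8.0, 0.0],
]
constant_cols = constant_feature_columns(toy_features)
print(constant_cols)  # prints [3]
```

If the only constant column reported is the depth element of the GPE, that matches the premise above. It does not by itself show that the element hurts GCN.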
To do: check whether GPE without the z coordinate (denoted GPE-xy) is better suited than GPE with the z coordinate (GPE-xyz).
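One way to set up this comparison is to derive the GPE-xy features from the GPE-xyz features by removing the z column, then train the same GCN on both variants with identical splits and seeds. Below is a minimal sketch in plain Python. The helper `drop_columns`, the constant `GPE_Z_INDEX`, and the feature layout are assumptions made for illustration, so the index must be adapted to the actual feature ordering.

```python
def drop_columns(node_features, columns_to_drop):
    """Return a copy of the node features without the given column indices."""
    drop = set(columns_to_drop)
    return [
        [value for idx, value in enumerate(row) if idx not in drop]
        for row in node_features
    ]


# Assumed layout per node: [feature, gpe_x, gpe_y, gpe_z].
GPE_Z_INDEX = 3  # hypothetical position of the z element

features_xyz = [
    [0.3, 1.0, 2.0, 0.0],
    [0.7, 4.0, 5.0, 0.0],
]
features_xy = drop_columns(features_xyz, [GPE_Z_INDEX])
print(features_xy)  # prints [[0.3, 1.0, 2.0], [0.7, 4.0, 5.0]]
```

Because the input dimension changes from GPE-xyz to GPE-xy, the first GCN layer has to be re-instantiated with the smaller input size for the second run.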