tum-daml / gemnet_pytorch Goto Github PK

GemNet model in PyTorch, as proposed in "GemNet: Universal Directional Graph Neural Networks for Molecules" (NeurIPS 2021)

Home Page: https://www.daml.in.tum.de/gemnet

License: Other

Python 85.93% Jupyter Notebook 14.07%

paper gnn pytorch graph-neural-networks

gemnet_pytorch's People

Stargazers

Watchers

gemnet_pytorch's Issues

Gradients from analytical differentiation

Dear developers,
I apologize in advance for my question, which may be trivial. I would like to train a differentiable NN model on energies, with gradient from analytic differentiation. I don't want to train directly on gradients, but I would like to get them as energy derivative. Basically I will have my own dataset to train the model on energies, get gradients from the ML model differentiation, and then compare them with a few gradients calculated from very expensive QM calculations. That's indeed why I cannot train directly on gradients, it would be impossible for me to create a sizeable training set.
In case this is possible, could you provide a minimal example or reference?

Thank you in advance.

Best regards,
Leo

Question about pretrained weights

Hi there,

Thank you for providing the codebase.
I am using GemNet-T pertained weights, and I get better performance in the downstream task.
But I can't find how to get the GemNet-T pretrained weights.

Can I know what database and target is used in pretraining?

QM9 dataset

Hi, I notice you implemented experiments in extensive datasets. However, most of your baseline models show their performance in the popular QM9 dataset. Can you please provide this sort of information so that we can have a more clear understanding of how well your model is? Thanks.

Error at line 287 in basis_layers.py

Traceback (most recent call last):
File "", line 1, in
File "/Users/ngoccuongnguyen/GitHub/gemnet_pytorch/gemnet/model/gemnet.py", line 14, in
from .layers.basis_layers import BesselBasisLayer, SphericalBasisLayer, TensorBasisLayer
File "/Users/ngoccuongnguyen/GitHub/gemnet_pytorch/gemnet/model/layers/basis_layers.py", line 287
Kmax = if sph.shape[0]==0 else torch.max(torch.max(Kidx + 1), torch.tensor(0))
^
SyntaxError: invalid syntax

A potential bug in data_provider.py

Thank you for your awesome work! When using the proposed model in another downstream task, I found that there is a potential bug in https://github.com/TUM-DAML/gemnet_pytorch/blob/master/gemnet/training/data_provider.py#L31-L49, it could causes evaluating the model on the training data rather than the testing subset.

For example, our indices are [n_train: n_train+n_val] when using split="val". As the shuffle=False, the idx sampler is SequentialSampler(Subset(data_container, indices)). Notice that this sampler produces indices in range [0, n_val].

However,

        super().__init__(
            data_container, ## here is the full set.
            sampler=batch_sampler,
            collate_fn=lambda x: collate(x, data_container),
            pin_memory=True,  # load on CPU push to GPU
            **kwargs
        )

as shown in this code snippet, the "dataset" passed to the DataLoader is the full dataset. Then, the iterator of data loader would take sample according to the indices provided by the sampler. As illustrated above, the sampler produces indices in range [0, n_val]. Therefore, it actually takes data from a subset of training part.

I noticed that such a problem is avoid in train.ipynb by using two data containers. However, this part would be ambiguous for users who want to generalize this model to other datasets.

I tried to fix it as:

class CustomDataLoader(DataLoader):
    def __init__(
        self, data_container, batch_size, indices, shuffle, seed=None, **kwargs
    ):

        if shuffle:
            generator = torch.Generator()
            if seed is not None:
                generator.manual_seed(seed)
            idx_sampler = SubsetRandomSampler(indices, generator)
        else:
            idx_sampler = SequentialSampler(Subset(data_container, indices))

        batch_sampler = BatchSampler(
            idx_sampler, batch_size=batch_size, drop_last=False
        )
        # Note: a bug here if we do not use subset.
        # Sequential sampler on subset returns index like (0, 1, 2, 3...)
        # However, the returned index is on the full data. 
        # If we do not take Subset here, it uses data from training subset. 
        dataset = data_container if shuffle else Subset(data_container, indices)

        super().__init__(
            dataset ,
            sampler=batch_sampler,
            collate_fn=data_container.collate_fn,
            pin_memory=True,  # load on CPU push to GPU
            **kwargs
        )

Some questions regarding reducing the size of both input and model?

Hi, thanks for sharing the code of GemNet, wonderful work for the prediction of energy.

However, I intend to adopt your model to macromolecules such as proteins instead of small molecules. However, as you know, proteins have far more atoms, which unavoidably leads to much more GPU memories. In order to prevent the explosion of GPU memories, I had to control the size of model inputs. Thus, I am here to ask for some advice on how to reduce the size of your model input.

Specifically, a straightforward way is to decrease the cutoff distance. So that the edges become less. But I believe that is not a good practice. Can you give me some other solutions (for other hyperparameters)?

the dihedral angle difinition

GemNet is a great work for molecules representation learning.I have a question about dihedral angle "cabd" definition,the definition in your paper seems not consider the direction of dihedral angle , what i means is that the dihedral angles in the flowing picture is the same in your definition, but we konw the dihedral angles is different .The difference may have a influence on message passing and your model.

Question about basis functions used

Your series of studies are excellent! I really enjoy studying them. However, I have a question about this paper:

For these three basis equations. My understanding is you trying to use the spherical solution form of the Schrodinger equation which split the solution into the radial part and the angular part as the inductive bias of the machine learning model. This can ensure the input feature information has certain invariants. However, I have two major questions.

In my understanding we shall have a fixed origin thus a fixed axis system to have such a solution form. However, as shown in the figure the three equations used geometrical information do not seem like traditional quantum physics. They seem like from different origins. For the dimenet works, I think I can find an origin so I can understand. However, for Gemnet it seems difficult to understand.
As you know, such an invariant form of information will be transformed by learnable weight matrices, will the symmetrical form as well as invariant property still be maintained after the transformation? I think this may be destroyed by the transformation, then will it still be that significant to have these kinds of preprocessing?

Thank you so much for such inspiring studies! I am really curious about the above two questions.

Kindly Regards,

Jiali

Question about the gradient of positions

Hi, great work, and thanks for sharing the code. I have a small question regarding the gradient of inputs["R"].
As far as I know, inputs["R"] represents the positions of atoms. Why do you make its requires_grad True?

        if not self.direct_forces:
            inputs["R"].requires_grad = True

Package installation throws errors

There were several issues prohibiting the installation of gemnet as package:

The setup.py file referred to a gemnet_pytorch package even though the folder name is gemnet
python, cudatoolkit, pytorch and torch_geometric are invalid pip packages.

Most of these should be fixed by 1922774

The code that predicts on the test set?

The predict.py file you submitted is about predictions for a molecule, can you upload the code that predicts on the test set? I would be very grateful.

Question about creating graph

Dear developers,

For datasets with chemical bonds, there are generally two ways to build a molecule graph:

Based on the radius graph
Using chemical bonds as edges directly

Which way do you think is better? If there is a large-scale dataset with 1M samples, using the radius graph model will become very heavy and difficult to train.

Thanks:)

Reproducing Questions

Hi there,

Thank you for providing the codebase. I am trying to reproduce GemNet on COLL and MD17, and have two questions below:

For COLL, I'm wondering what is the total training time? On my end, it seems to be quite large (300-400 hours), so just want to double-check with you.
For MD17, can you help provide the config.yml and scaling_factors.json files on each task?

Any help is appreciated.

Coll dataset

Hi,

Thanks for sharing this GemNet implementation.
Where can I download coll dataset in npz format? Or how it can be created?

Thanks,
Boris

tum-daml / gemnet_pytorch Goto Github PK

gemnet_pytorch's People

Stargazers

Watchers

Forkers

gemnet_pytorch's Issues

Recommend Projects

Recommend Topics

Recommend Org