
schnetpack's Introduction

SchNetPack - Deep Neural Networks for Atomistic Systems


SchNetPack is a toolbox for the development and application of deep neural networks to the prediction of potential energy surfaces and other quantum-chemical properties of molecules and materials. It contains basic building blocks of atomistic neural networks, manages their training and provides simple access to common benchmark datasets. This allows for an easy implementation and evaluation of new models.

The documentation can be found here.

Features
  • SchNet - an end-to-end continuous-filter CNN for molecules and materials [1-3]
  • PaiNN - equivariant message-passing for molecules and materials [4]
  • Output modules for dipole moments, polarizability, stress, and general response properties
  • Modules for electrostatics, Ewald summation, ZBL repulsion
  • GPU-accelerated molecular dynamics code incl. path-integral MD, thermostats, barostats

Installation

Install with pip

The simplest way to install SchNetPack is through pip, which will automatically fetch the package from PyPI:

pip install schnetpack

Install from source

You can also install the most recent code from our repository:

git clone https://github.com/atomistic-machine-learning/schnetpack.git
cd schnetpack
pip install .

Visualization with Tensorboard

SchNetPack supports multiple logging backends via PyTorch Lightning. The default logger is Tensorboard. SchNetPack also supports TensorboardX.

Getting started

The best place to get started is training a SchNetPack model on a common benchmark dataset via the command line interface (CLI). When installing SchNetPack, the training script spktrain is added to your PATH. The CLI uses Hydra and is based on the PyTorch Lightning/Hydra template that can be found here. This enables a flexible configuration of the model, data and training process. To fully take advantage of these features, it might be helpful to have a look at the Hydra and PyTorch Lightning docs.

Example 1: QM9

In the following, we focus on using the CLI to train on the QM9 dataset, but the same procedure applies to the other benchmark datasets as well. First, create a working directory where all data and runs will be stored:

mkdir spk_workdir
cd spk_workdir

Then, the training of a SchNet model with default settings for QM9 can be started by:

spktrain experiment=qm9_atomwise

The script prints the defaults for the experiment config qm9_atomwise. The dataset will be downloaded automatically to spk_workdir/data if it does not exist yet. Then the training will be started.

All values of the config can be changed from the command line, including the directories for run and data. By default, the model is stored in a directory with a unique run id hash as a subdirectory of spk_workdir/runs. This can be changed as follows:

spktrain experiment=qm9_atomwise run.data_dir=/my/data/dir run.path=~/all_my_runs run.id=this_run

If you call spktrain experiment=qm9_atomwise --help, you can see the full config with all the parameters that can be changed. Nested parameters can be changed as follows:

spktrain experiment=qm9_atomwise run.data_dir=<path> data.batch_size=64

Hydra organizes parameters in config groups, which allows for hierarchical configurations consisting of multiple yaml files. This makes it easy to swap the whole dataset, model, or representation. For instance, to change from the default SchNet representation to PaiNN, use:

spktrain experiment=qm9_atomwise run.data_dir=<path> model/representation=painn

It can be a bit confusing at first when to use "." and when to use "/". The slash is used when loading a preconfigured config group, while the dot is used when changing individual values. For example, the config group "model/representation" corresponds to the following part of the config:

    model:
      representation:
        _target_: schnetpack.representation.PaiNN
        n_atom_basis: 128
        n_interactions: 3
        shared_interactions: false
        shared_filters: false
        radial_basis:
          _target_: schnetpack.nn.radial.GaussianRBF
          n_rbf: 20
          cutoff: ${globals.cutoff}
        cutoff_fn:
          _target_: schnetpack.nn.cutoff.CosineCutoff
          cutoff: ${globals.cutoff}

If you additionally want to change a value within this group, you can use:

spktrain experiment=qm9_atomwise run.data_dir=<path> model/representation=painn model.representation.n_interactions=5

For more details on config groups, have a look at the Hydra docs.

Example 2: Potential energy surfaces

The example above internally uses AtomisticModel, a pytorch_lightning.LightningModule, to predict single properties. The following example uses the same class to predict potential energy surfaces, in particular energies together with the appropriate derivatives to obtain forces and stress tensors. This works because the pre-defined configuration for the MD17 dataset, selected on the command line with experiment=md17, chooses the representation and output modules that AtomisticModel uses. A more detailed description of the configuration and how to build your custom configs can be found here.
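Conceptually, the response properties follow from differentiating the predicted energy: the forces are the negative gradient with respect to the atomic positions, and the stress is obtained from the gradient with respect to the cell. The following is a minimal autograd sketch of the force part only; it is illustrative and does not use SchNetPack's actual classes or argument names.

    import torch

    def energy_and_forces(model, positions, **inputs):
        # positions: (n_atoms, 3) tensor; the model is assumed to return a scalar energy
        positions.requires_grad_(True)
        energy = model(positions, **inputs)
        # forces are the negative gradient of the energy w.r.t. the positions
        forces = -torch.autograd.grad(energy.sum(), positions, create_graph=True)[0]
        return energy, forces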

The spktrain script can be used to train a model for a molecule from the MD17 dataset:

spktrain experiment=md17 data.molecule=uracil

In the case of MD17, reference calculations of energies and forces are available. Therefore, one needs to set weights for the losses of those properties. The losses are defined as part of output definitions in the task config group:

    task:
      outputs:
        - _target_: schnetpack.task.ModelOutput
          name: ${globals.energy_key}
          loss_fn:
            _target_: torch.nn.MSELoss
          metrics:
            mae:
              _target_: torchmetrics.regression.MeanAbsoluteError
            mse:
              _target_: torchmetrics.regression.MeanSquaredError
          loss_weight: 0.005
        - _target_: schnetpack.task.ModelOutput
          name: ${globals.forces_key}
          loss_fn:
            _target_: torch.nn.MSELoss
          metrics:
            mae:
              _target_: torchmetrics.regression.MeanAbsoluteError
            mse:
              _target_: torchmetrics.regression.MeanSquaredError
          loss_weight: 0.995

For training on energies and forces, we recommend putting a stronger weight on the loss of the force prediction during training. By default, the loss weights are set to 0.005 for the energy and 0.995 for the forces. This can be changed as follows:

spktrain experiment=md17 data.molecule=uracil task.outputs.0.loss_weight=0.005 task.outputs.1.loss_weight=0.995

Logging

Beyond the output on the command line, SchNetPack supports multiple logging backends via PyTorch Lightning. By default, the Tensorboard logger is activated. If TensorBoard is installed, the results can be shown by calling:

tensorboard --logdir=<rundir>

Furthermore, SchNetPack comes with configs for a CSV logger and Aim. These can be selected as follows:

spktrain experiment=md17 logger=csv

LAMMPS interface

SchNetPack comes with an interface to LAMMPS. A detailed installation guide is linked in the How-To section of our documentation.

Extensions

SchNetPack can be used as a base for implementations of advanced atomistic neural networks and training tasks. For example, there exists an extension package called schnetpack-gschnet for the most recent version of cG-SchNet [5], a conditional generative model for molecules. It demonstrates how a complex training task can be implemented in a few custom classes while leveraging the hierarchical configuration and automated training procedure of the SchNetPack framework.

Citation

If you are using SchNetPack in your research, please cite:

K.T. Schütt, S.S.P. Hessmann, N.W.A. Gebauer, J. Lederer, M. Gastegger. SchNetPack 2.0: A neural network toolbox for atomistic machine learning. J. Chem. Phys. 2023, 158 (14): 144801. 10.1063/5.0138367.

K.T. Schütt, P. Kessel, M. Gastegger, K. Nicoli, A. Tkatchenko, K.-R. Müller. SchNetPack: A Deep Learning Toolbox For Atomistic Systems. J. Chem. Theory Comput. 2019, 15 (1): 448-455. 10.1021/acs.jctc.8b00908.

@article{schutt2023schnetpack,
    author = {Sch{\"u}tt, Kristof T. and Hessmann, Stefaan S. P. and Gebauer, Niklas W. A. and Lederer, Jonas and Gastegger, Michael},
    title = "{SchNetPack 2.0: A neural network toolbox for atomistic machine learning}",
    journal = {The Journal of Chemical Physics},
    volume = {158},
    number = {14},
    pages = {144801},
    year = {2023},
    month = {04},
    issn = {0021-9606},
    doi = {10.1063/5.0138367},
    url = {https://doi.org/10.1063/5.0138367},
    eprint = {https://pubs.aip.org/aip/jcp/article-pdf/doi/10.1063/5.0138367/16825487/144801\_1\_5.0138367.pdf},
}
@article{schutt2019schnetpack,
    author = {Sch{\"u}tt, Kristof T. and Kessel, Pan and Gastegger, Michael and Nicoli, Kim A. and Tkatchenko, Alexandre and Müller, Klaus-Robert},
    title = "{SchNetPack: A Deep Learning Toolbox For Atomistic Systems}",
    journal = {Journal of Chemical Theory and Computation},
    volume = {15},
    number = {1},
    pages = {448-455},
    year = {2019},
    doi = {10.1021/acs.jctc.8b00908},
    URL = {https://doi.org/10.1021/acs.jctc.8b00908},
    eprint = {https://doi.org/10.1021/acs.jctc.8b00908},
}

Acknowledgements

CLI and Hydra configs for PyTorch Lightning are adapted from this template.

References

  • [1] K.T. Schütt, F. Arbabzadah, S. Chmiela, K.-R. Müller, A. Tkatchenko. Quantum-chemical insights from deep tensor neural networks. Nature Communications 8, 13890 (2017). 10.1038/ncomms13890

  • [2] K.T. Schütt, P.-J. Kindermans, H. E. Sauceda, S. Chmiela, A. Tkatchenko, K.-R. Müller. SchNet: A continuous-filter convolutional neural network for modeling quantum interactions. Advances in Neural Information Processing Systems 30, pp. 992-1002 (2017). Paper

  • [3] K.T. Schütt, P.-J. Kindermans, H. E. Sauceda, S. Chmiela, A. Tkatchenko, K.-R. Müller. SchNet - a deep learning architecture for molecules and materials. The Journal of Chemical Physics 148(24), 241722 (2018). 10.1063/1.5019779

  • [4] K.T. Schütt, O.T. Unke, M. Gastegger. Equivariant message passing for the prediction of tensorial properties and molecular spectra. International Conference on Machine Learning (pp. 9377-9388). PMLR. Paper

  • [5] N.W.A. Gebauer, M. Gastegger, S.S.P. Hessmann, K.-R. Müller, K.T. Schütt. Inverse design of 3d molecular structures with conditional generative neural networks. Nature Communications 13, 973 (2022). 10.1038/s41467-022-28526-y

schnetpack's People

Contributors

bartolsthoorn, chgaul, dependabot[bot], divide-by-0, dom1l, dumkar, epens94, farnazh, giadefa, jan-janssen, jduerholt, jhrmnn, jnsls, khaledkah, ktschuett, maltimore, mgastegger, niklasgebauer, nzhan, p16i, pankessel, robertnf, rsaite, sgugler, sirmarcel, stefaanhessmann, vosatorp, wardlt, zyt0y

schnetpack's Issues

Share trained model

Hi,

I am really interested in SchNet's performance on crystals. I wonder if you could share the best trained models, especially the one trained with Materials Project data?

Thanks!

Weike

cleaning up the parser arguments in schnetpack_qm9.py

I was wondering: couldn't the call of schnetpack_qm9 be further simplified, at least for the evaluation? Some of the arguments, like property, are general arguments, not training arguments; however, if the model was trained for a certain property, it makes little sense to evaluate it on another, right?

Then in lines 308 and 310, the train_args are set to args or loaded from the JSON file created when training the model; however, at some points it seems a bit arbitrary whether train_args or args is used. For example, in line 314, when the qm9 dataset is loaded, train_args.property is used, but below, when the atomref is loaded, args.property is used instead.

I also noticed that pool_mode has disappeared from the arguments, but I assume it is still relevant to set this to 'avg' instead of 'sum' for properties like LUMO, so could that be added back in (or set automatically depending on the property)?

Force training

I was wondering what the proper way to add forces to the loss function is.
Currently I have my loss function defined like this:

def loss(batch, result):
	
	N = torch.sum((batch['_atomic_numbers'] != 0).float(), 1, keepdim=True) # Nr of atoms per image
	
	gamma = (result['y']/N - batch['energy']/N)**2
	
	return torch.sum(gamma)

which works fine.
However, if I try to add forces to the loss function like this:

def loss(batch, result):
	
	N = torch.sum((batch['_atomic_numbers'] != 0).float(), 1, keepdim=True) # Nr of atoms per image
	
	gammaE = (result['y']/N - batch['energy']/N)**2
	gammaF = torch.sum(torch.sum((result['dydx'] - batch['forces'])**2, 2), 1, keepdim=True)/(3*N)
	
	return torch.sum(gammaE + gammaF)

it stops working: I get a value of the loss function for the first image, and then the function just returns NaN. I am not sure what the problem is. Is there a better way to include forces in the loss function, or is there a bug somewhere?
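For reference, one common way of combining the two terms is sketched below. This is only an illustration: it assumes that result['dydx'] holds the predicted forces, that padded atoms should be excluded from the force error, and rho is an illustrative trade-off weight analogous to the loss weights used by the current CLI.

import torch

def combined_loss(batch, result, rho=0.99):
    # number of real (non-padded) atoms per structure
    atom_mask = (batch['_atomic_numbers'] != 0).float()              # (B, N)
    n_atoms = torch.sum(atom_mask, 1, keepdim=True)                  # (B, 1)

    # per-atom energy error
    energy_err = ((result['y'] - batch['energy']) / n_atoms) ** 2    # (B, 1)

    # force error averaged over real atoms only; padded rows are masked out
    sq = torch.sum((result['dydx'] - batch['forces']) ** 2, dim=2)   # (B, N)
    force_err = torch.sum(sq * atom_mask, dim=1, keepdim=True) / (3.0 * n_atoms)

    return torch.mean((1.0 - rho) * energy_err + rho * force_err)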

Reproducing paper result

Hi,

Can you let me know what settings of parameters, optimizer, split, and seed are needed to reproduce the results of
https://arxiv.org/pdf/1712.06113.pdf?

For homo, I tried:
python3 src/scripts/schnetpack_qm9_new.py train schnet data/seed_2000/homo/data/qm9.db data/seed_2000/homo/model --split 109000 1000 --cuda --property homo --seed 2000 --parallel

I got train/val/test mae as 0.00497/0.04701/0.04877, but the test mae stated in the paper is 0.041. What should I do to get this result?

Should I change the split to "--split 110000 1000" and batch size to "--batch_size 32" following the arxiv paper?

Restarting training in CUDA mode

Hi!
I am still working with the src/scripts/schnetpack_qm9.py script.
When the training is stopped, I suppose I should be able to restart it from the checkpoint-xx.pth.tar files.
However, if the training is done with CUDA, an error occurs the second time:

INFO:root:training...
Traceback (most recent call last):
  File "/home/olimt/projects/rrg-cotemich-ac/olimt/programs/schnetpack/src/scripts/schnetpack_qm9.py", line 134, in <module>
    train(args, model, train_loader, val_loader, device, metrics=metrics)
  File "/home/olimt/miniconda3/lib/python3.7/site-packages/scripts/script_utils/training.py", line 54, in train
    trainer.train(device, n_epochs=args.n_epochs)
  File "/home/olimt/miniconda3/lib/python3.7/site-packages/schnetpack/train/trainer.py", line 245, in train
    raise e
  File "/home/olimt/miniconda3/lib/python3.7/site-packages/schnetpack/train/trainer.py", line 175, in train
    self.optimizer.step()
  File "/home/olimt/miniconda3/lib/python3.7/site-packages/torch/optim/adam.py", line 93, in step
    exp_avg.mul_(beta1).add_(1 - beta1, grad)
RuntimeError: expected backend CPU and dtype Float but got backend CUDA and dtype Float

There's probably a missing .to('cuda') somewhere.
I'll try to look into it more, but I wanted to let you know.
Thanks!
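A common workaround for this class of device-mismatch error (my assumption, not a confirmed fix for this exact report) is to move the restored optimizer state onto the GPU after loading the checkpoint. The helper below is purely illustrative:

import torch

def optimizer_state_to(optimizer, device):
    # Adam keeps running averages (exp_avg, exp_avg_sq) per parameter; after a
    # checkpoint restore these may still live on the CPU while the model is on the GPU.
    for state in optimizer.state.values():
        for key, value in state.items():
            if torch.is_tensor(value):
                state[key] = value.to(device)

# usage after restoring a checkpoint:
# optimizer_state_to(optimizer, torch.device("cuda"))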

GDML from the sgdml package

I talked to @stefanch, and maybe it makes more sense to have the GDML code in the sgdml package, and just import it in schnetpack and have sgdml as a dependency. What do you think about that? If you agree, should I make sgdml a hard or an optional "extras" dependency of schnetpack?

sorry, but the master branch still does not work...

python spk_run.py train schnet qm9 qm9_data qm9_model --split 1000 200
Traceback (most recent call last):
File "spk_run.py", line 5, in
from schnetpack.utils.script_utils import settings
ImportError: cannot import name 'settings' from 'schnetpack.utils.script_utils'

Extremely low GPU utilization

Hi, I was playing around a bit with training SchNetPack. I noticed that when using --cuda on my Titan V it has very low GPU utilization. It uses around 10 GB of GPU memory but is at 0% GPU utilization practically 90% of the time, with some very short spikes in between (of around a second) where it seems to compute. I tried it with the QM9 and ANI1 examples. QM9 has slightly better utilization than ANI1, but it is still around 60-70% idle.

Have you also noticed such behavior?
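One thing that often helps in such cases (an assumption on my part, not a confirmed diagnosis of this report) is to make sure data loading is not the bottleneck, for example by using several loader workers. In the sketch below, train and val stand in for the dataset splits:

import schnetpack as spk

# several worker processes prepare batches in parallel so the GPU is kept busy
train_loader = spk.data.AtomsLoader(train, batch_size=64, num_workers=4)
val_loader = spk.data.AtomsLoader(val, batch_size=64, num_workers=2)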

Storing the symmetry functions calculated

Hi,

I am wondering whether there is an easy way to store the symmetry functions calculated for the data, so that they don't need to be recalculated every epoch. It is a very time-consuming step for my model. Thanks!

Best,
Mingjie

wrong split argument in evaluation when getting loaders

For spk_run the split argument for train is [n_train, n_val], while for eval it's ["train", "validation", "test"]. When getting the loaders for the train/val/test data (line 34), args is passed, which works in train mode, but in eval mode it is no longer the number of samples but instead the list with up to 3 fold names (with all 3, that is also one argument too many passed to the train_test_split function). I'm not sure, though, whether it should be replaced by train_args instead, since then the batch_size is not what was specified.

Force training with angular symmetry functions for clusters with different number of atoms

Hi,

It seems that the trainer is not working properly when I do force training with angular symmetry functions for clusters with different numbers of atoms (ElementalEnergy).

When I did the training for clusters with the same number of atoms, it was OK. By looking at the log, it seems that the losses are going down. However, if I use it for clusters with different numbers of atoms, the losses become NaN. This applies to both "Behler" and "Weighted" mode. Changing the learning rate of the Adam optimizer does not help. I don't know exactly why this is the case. My guess is that it could be due to the padding you apply to the representation when you have clusters of different sizes. Could you have a look at that? Thanks!

import os  # needed for os.path.exists / os.makedirs below

from schnetpack.data import AtomsData
import torch
import torch.nn.functional as F
from torch.optim import Adam
from torch.optim import LBFGS
import pickle
import schnetpack as spk
import schnetpack.atomistic as atm
import schnetpack.representation as rep
from schnetpack.datasets import *
import schnetpack.evaluation as eva
from schnetpack.metrics import MeanSquaredError

Name = 'debug'

if not os.path.exists(Name):
    os.makedirs(Name)


data = AtomsData('./db/master-jb.db', required_properties=['energy','forces'],collect_triples=True)

#40 training data, 5 val data, rest are for test
train, val, test = data.create_splits(30, 30)
train_loader = spk.data.AtomsLoader(train, batch_size=3, num_workers=0)
val_loader = spk.data.AtomsLoader(val)
test_loader = spk.data.AtomsLoader(test)
loader = [train_loader, val_loader, test_loader]
pickle.dump(loader, open('./'+Name+'/loader.sav','wb'))

reps = rep.BehlerSFBlock(n_radial=2, n_angular=2, elements=frozenset([46,79]), cutoff_radius=6.0, mode = 'weighted')
#print(reps.n_symfuncs)
output = ElementalEnergy(n_in=reps.n_symfuncs,n_layers=3,n_hidden=10, elements=frozenset([46,79]),return_force=True,create_graph=True)
model = atm.AtomisticModel(reps, output)

trainable_params = filter(lambda p: p.requires_grad, model.parameters())

metric_E = [MeanSquaredError(target = 'energy',model_output='y'),MeanSquaredError(target = 'forces',model_output='dydx')]
hook_E = spk.train.CSVHook("./"+Name, metric_E)

opt = Adam(trainable_params, lr=1e-4)

loss = lambda b, p: F.mse_loss(p["y"], b['energy'])+F.mse_loss(p["dydx"], b['forces'])
trainer = spk.train.Trainer(Name+"/", model, loss,
                      opt, train_loader, val_loader,hooks = [hook_E])

# start training
trainer.train(torch.device("cpu"))

Best,
Mingjie

LoggingHook for training data

Hi,

There is a problem in the LoggingHook for training batches.

def on_batch_end(self, trainer, train_batch, result, loss):
    if self.log_train_loss:
        self._train_loss += float(loss.data)
        self._counter += 1

The self._train_loss is never divided by self._counter, so when it is logged, it depends on how many batches the training data is divided into. Simply dividing by the count is not exact either, since the batch size might not be a factor of the total size, so I don't know what the best way to fix it is. Could you fix that? Thanks!
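One possible shape of a fix (a sketch only, not the library's actual hook implementation; the class and method names here are assumptions) is to divide the accumulated loss by the batch counter when the epoch ends:

class MeanLossLoggingHook:
    # Hypothetical hook that reports the mean per-batch training loss. Note that if
    # the last batch is smaller, this is the mean of per-batch losses rather than an
    # exact per-sample mean, which is the caveat raised above.
    def __init__(self, log_train_loss=True):
        self.log_train_loss = log_train_loss
        self._train_loss = 0.0
        self._counter = 0

    def on_batch_end(self, trainer, train_batch, result, loss):
        if self.log_train_loss:
            self._train_loss += float(loss.data)
            self._counter += 1

    def on_epoch_end(self, trainer):
        if self.log_train_loss and self._counter > 0:
            print("mean train loss:", self._train_loss / self._counter)
            self._train_loss = 0.0
            self._counter = 0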

JCTC paper training example

Thanks a lot for the great code!

When I tried to run the training example from the paper (Chart 1), I got some errors. If the import section is to stay untouched, the following lines should be changed to:
15: loader = spk.data.AtomLoader...
18: val_loader = spk.data.AtomsLoader...
28: trainer = spk.train.Trainer...

Error when running schnetpack_qm9.py in wacsf mode

Hi,
when running the schnetpack_qm9.py script this way:
schnetpack_qm9.py train wacsf qm9.db model2/ --split 100 100

I got the output:

INFO:root:Random state initialized with seed 3298687774
INFO:root:QM9 will be loaded...
INFO:schnetpack.data.atoms:The dataset has already been downloaded and stored at qm9.db
INFO:root:create splits...
INFO:root:load data...
INFO:root:calculate statistics...
INFO:schnetpack.data.loader:statistics will be calculated...
Traceback (most recent call last):
  File "/home/oliviermt/miniconda3/bin/schnetpack_qm9.py", line 78, in <module>
    representation = get_representation(args, train_loader=train_loader)
  File "/home/oliviermt/miniconda3/lib/python3.6/site-packages/scripts/script_utils/model.py", line 12, in get_representation
    if args.cutoff_function == "hard":
AttributeError: 'Namespace' object has no attribute 'cutoff_function'

And there is no error when using the schnet mode.

In file src/scripts/script_utils/model.py, the # build cutoff module block should be inside the if args.model == "schnet" block if it is not used in wacsf.

I corrected it here by commenting out the old block, and it seems to be working.

Thanks a lot!

qm9_*.py in the examples folder do not work...

python qm9_tutorial.py
INFO:root:get dataset
INFO:schnetpack.data.atoms:Starting download
INFO:root:Downloading GDB-9 data...
INFO:root:Done.
INFO:root:Extracting files...
INFO:root:Done.
INFO:root:Parse xyz files...
INFO:root:Parsed: 10000 / 133885
INFO:root:Parsed: 20000 / 133885
INFO:root:Parsed: 30000 / 133885
INFO:root:Parsed: 40000 / 133885
INFO:root:Parsed: 50000 / 133885
INFO:root:Parsed: 60000 / 133885
INFO:root:Parsed: 70000 / 133885
INFO:root:Parsed: 80000 / 133885
INFO:root:Parsed: 90000 / 133885
INFO:root:Parsed: 100000 / 133885
INFO:root:Parsed: 110000 / 133885
INFO:root:Parsed: 120000 / 133885
INFO:root:Parsed: 130000 / 133885
INFO:root:Write atoms to db...
INFO:root:Done.
INFO:root:Downloading GDB-9 atom references...
INFO:root:Done.
INFO:schnetpack.data.loader:statistics will be calculated...
INFO:root:build model
Traceback (most recent call last):
File "qm9_tutorial.py", line 38, in
spk.Atomwise(
AttributeError: module 'schnetpack' has no attribute 'Atomwise'

and

python qm9_schnet.py
Traceback (most recent call last):
File "qm9_schnet.py", line 20, in
output = schnetpack.atomistic.Atomwise()
TypeError: init() missing 1 required positional argument: 'n_in'

Help! I am using the latest code pulled from GitHub and Python 3.7; it seems the schnetpack version I use is wrong?

Problem with schnetpack_qm9.py

Hello,

I would like to ask if schnetpack_qm9.py is still working. No matter what I try, I eventually run into RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:441.

I only create a new virtualenv, pip install schnetpack and then I run:
schnetpack_qm9.py train schnet QM9 TestRun --split 10000 10000 --cuda

Output is following (when I already downloaded QM9):

INFO:root:Random state initialized with seed 3679303364
INFO:root:QM9 will be loaded...
INFO:root:create splits...
INFO:root:load data...
INFO:root:calculate statistics...
INFO:root:cached statistics was loaded...
INFO:root:The model you built has: 1676133 parameters
INFO:root:training...
Traceback (most recent call last):
  File "/home/kubaw/.virtualenvs/schnet/bin/schnetpack_qm9.py", line 357, in <module>
    train(args, model, train_loader, val_loader, device)
  File "/home/kubaw/.virtualenvs/schnet/bin/schnetpack_qm9.py", line 177, in train
    trainer.train(device)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/schnetpack/train/trainer.py", line 216, in train
    raise e
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/schnetpack/train/trainer.py", line 144, in train
    result = self._model(train_batch)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/schnetpack/atomistic.py", line 55, in forward
    inputs['representation'] = self.representation(inputs)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/schnetpack/representation/schnet.py", line 191, in forward
    r_ij = self.distances(positions, neighbors, cell, cell_offset)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/schnetpack/nn/neighbors.py", line 76, in forward
    return atom_distances(positions, neighbors, cell, cell_offsets, return_directions=self.return_directions)
  File "/home/kubaw/.virtualenvs/schnet/lib/python3.7/site-packages/schnetpack/nn/neighbors.py", line 36, in atom_distances
    offsets = cell_offsets.bmm(cell)
RuntimeError: cublas runtime error : the GPU program failed to execute at /pytorch/aten/src/THC/THCBlas.cu:441

I am not sure where this error comes from because for me a basic pytorch nn is working.
I have also noticed that the version installed with pip needs a folder as datadir, while the version installed with setup.py needs a .db file as datadir. It also seems that there is recent work on scripts using sacred, so it may be that the old scripts are not working anymore. It would be great if you could confirm that the script is working, as then I could search for the origin of the error elsewhere; or perhaps you have an idea where this error comes from.

md.py load_model error

When trying to load an existing model for MD calculations, I run into this error:

model = md.load_model("test_model")
Traceback (most recent call last):
File "", line 1, in
File "/usr/lib/python3.6/site-packages/schnetpack/md.py", line 377, in load_model
model.load_state_dict(torch.load(os.path.join(modelpath, 'best_model')))
File "/usr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for AtomisticModel:
Unexpected key(s) in state_dict: "output_modules.atomref.weight".

Possible incorrect variable used in schnetpack_matproj.py

The following part in "schnetpack_matproj.py" near line 257:

    split_path = os.path.join(args.modelpath, 'split.npz')
    if args.mode == 'train':
        if args.split_path is not None:
            copyfile(args.split_path, split_path)

    data_train, data_val, data_test = mp.create_splits(*train_args.split, split_file=split_path)

seems to create problems when the split is specified as two integers in train_args.split instead of a split file in args.split_path. The mp.create_splits() function will ignore the train_args.split parameters because the split_path variable will always be a valid string. One way to correct the behaviour could be:

    split_path = None
    if args.mode == 'train':
        if args.split_path is not None:
            split_path = os.path.join(args.modelpath, 'split.npz')
            copyfile(args.split_path, split_path)

    data_train, data_val, data_test = mp.create_splits(*train_args.split, split_file=split_path)

WarmRestartHook wrong relational operator?

Hey,
in the function WarmRestartHook, self.best_current is initialized with infinity.
In on_validation_end there is an if-condition that triggers when val_loss is bigger than best_current; this can never be reached. I think it should trigger when val_loss is smaller than best_current.
Then there is a condition that asks whether best_current is bigger than best_previous. I think this should also be reversed.

Maybe I misunderstood some things. In this case, I would be very happy for a short explanation.

I have 2 more questions, which I want to put in this ticket. I hope you can give your opinions.

  1. Does it make sense to combine warm restarts with ReduceLROnPlateauHook?
  2. Do you have experience with the large amount of memory that is necessary? I have small regions of proteins (10x10x10 Å) with a cutoff of 6 Å.
    SchNet has n_atom_basis 64, n_filters 64, n_interactions 10. A batch size bigger than 2 seems too big for an 8 GB 1080 GTX. Increasing the network to 128/128 with batch size 1 seems to be the maximum. Do you have experience with this and an idea how to handle it? A batch size of 1 or 2 seems very small. Did I maybe choose too big a network or cutoff? Maybe you can share your experience, which could help me with tuning.

Scalability of SchNet

Hi guys,

I was playing around with the package and wanted to know what the limits of SchNet are. So I tried to feed a protein (trypsin, 1700 atoms) into the network using the default settings and ran into CUDA out-of-memory errors (Titan V, 12 GB).
I tried to scale down the features, number of interaction blocks, etc. while using a batch size of 1 and still did not get it to work.
So what is your experience with the scalability of this network? Do you think it is due to the model itself (including distance matrices, features, rbf, etc.), the implementation (which could be optimized a bit more), or did I approach it wrong?

In case you want to reproduce it, I made a small script and a .db file including 100x trypsin with a dummy energy value.

Thanks for your help and this nice package.

import torch
import torch.nn.functional as F
from torch.optim import Adam

import schnetpack as spk
from schnetpack.data import AtomsData
import schnetpack.atomistic as atm
import schnetpack.representation as rep

data = AtomsData('3ptb.db', properties=['energy'])

# split in train and val
train, val, test = data.create_splits(80, 20)
loader = spk.data.AtomsLoader(train, batch_size=1, num_workers=1)
val_loader = spk.data.AtomsLoader(val)

# create model
reps = rep.SchNet(
    n_atom_basis=32,
    n_filters=128,
    n_interactions=1,
    cutoff=5.0,
    n_gaussians=25,
    normalize_filter=False,
    coupled_interactions=True,
    return_intermediate=False,
    max_z=100,
    trainable_gaussians=False,
    distance_expansion=None)
output = atm.Atomwise()
model = atm.AtomisticModel(reps, output).cuda()

opt = Adam(model.parameters(), lr=1e-4)
loss = lambda b, p: F.mse_loss(p["y"], b['energy'])
trainer = spk.train.Trainer("output/", model, loss, opt, loader, val_loader)

trainer.train(torch.device("cuda"))

3ptbdb.zip

Element wise MeanSquaredError

Hi,

I'm getting an error when I try to run RootMeanSquaredError('forces','dydx',element_wise=True) in my training script.

I fixed the problem by changing
self.n_entries += torch.sum(batch[Structure.atom_mask]) * y.shape[-1]
in line 135 of metrics.py to
self.n_entries += torch.sum(batch[Structure.atom_mask]).detach().cpu().data.numpy() * y.shape[-1]
Is this a good solution?

/Daniel

Total energy (Uo) reproduction (QM9)

Good day. I just found your tool and wanted to get acquainted with it by reproducing the results for the QM9 data. I set the available parameters according to the values given in the articles [J. Chem. Phys. 148, 241722 (2018) and Adv. Neural Inf. Process. Syst. 30 (2017), pp. 992-1002]. The parameter 'an exponential moving average over weights with decay rate 0.99' I set in schnetpack_qm9.py in the Adam optimizer (line 185) as "weight_decay=0.99".
Thus, my command is:
python3 schnetpack_qm9.py train schnet qm9.db output --split 110000 1000 --batch_size 32 --lr 0.001 --lr_decay 0.96 --features 64 --cutoff 1000 --cuda
For the first run, after ~2500 epochs the minimum RMSE_U0 was 1.2 kcal/mol. For the second run the minimum was 2.6 kcal/mol at around the 100th epoch, after which the RMSE started to increase and turned to NaN at the end.
I would thus appreciate any guidance on parameter settings to obtain the published results of 0.3-0.4 kcal/mol.

No inheritance of ASEEnvironmentProvider

class ASEEnvironmentProvider:

There are no functional issues, but I think ASEEnvironmentProvider should inherit BaseEnvironmentProvider like below.

class ASEEnvironmentProvider(BaseEnvironmentProvider):

And in my opinion, the name of the class should be changed to AseEnvironmentProvider, because in other modules (e.g., md) the word Ase is used for ASE-related classes (e.g., md.AseInterface). Naming conventions should be used consistently.

Division by Zero

Hey folks,

I'm trying this out with the ROCm version of PyTorch, so chances are there's an issue there.

python3 src/scripts/schnetpack_qm9.py train schnet ./tests/data/test_qm9.db .  --split 2 2 --cuda
INFO:root:Random state initialized with seed 1534720078
INFO:root:QM9 will be loaded...
INFO:schnetpack.data.atoms:The dataset has already been downloaded and stored at ./tests/data/test_qm9.db
INFO:root:create splits...
INFO:root:load data...
INFO:root:calculate statistics...
INFO:root:cached statistics was loaded...
INFO:root:The model you built has: 1676033 parameters
INFO:root:training...
Traceback (most recent call last):
  File "src/scripts/schnetpack_qm9.py", line 443, in <module>
    train(args, model, train_loader, val_loader, device)
  File "src/scripts/schnetpack_qm9.py", line 221, in train
    trainer.train(device)
  File "/home/philix/.pyenv/versions/3.7.2/lib/python3.7/site-packages/schnetpack-0.2.1-py3.7.egg/schnetpack/train/trainer.py", line 239, in train
  File "/home/philix/.pyenv/versions/3.7.2/lib/python3.7/site-packages/schnetpack-0.2.1-py3.7.egg/schnetpack/train/trainer.py", line 217, in train
ZeroDivisionError: float division by zero

KeyError: '_neighbor_pairs_j'

Hi, I was trying to run the hdnn model with my own ASE database, but it is giving me an error.

-------------------------------------------------------------------------
KeyError Traceback (most recent call last)
in ()
32
33 # start training
---> 34 trainer.train(torch.device("cpu"))

~\Anaconda3\lib\site-packages\schnetpack\train\trainer.py in train(self, device)
214 h.on_train_failed(self)
215
--> 216 raise e

~\Anaconda3\lib\site-packages\schnetpack\train\trainer.py in train(self, device)
142 }
143
--> 144 result = self._model(train_batch)
145 loss = self.loss_fn(train_batch, result)
146

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\schnetpack\atomistic.py in forward(self, inputs)
53 if self.requires_dr:
54 inputs[Structure.R].requires_grad_()
---> 55 inputs['representation'] = self.representation(inputs)
56
57 if isinstance(self.output_modules, nn.ModuleList):

~\Anaconda3\lib\site-packages\torch\nn\modules\module.py in call(self, *input, **kwargs)
487 result = self._slow_forward(*input, **kwargs)
488 else:
--> 489 result = self.forward(*input, **kwargs)
490 for hook in self._forward_hooks.values():
491 hook_result = hook(self, input, result)

~\Anaconda3\lib\site-packages\schnetpack\representation\hdnn.py in forward(self, inputs)
195 if self.ADF is not None:
196 # Get pair indices
--> 197 idx_j = inputs[Structure.neighbor_pairs_j]
198 idx_k = inputs[Structure.neighbor_pairs_k]
199 neighbor_pairs_mask = inputs[Structure.neighbor_pairs_mask]

KeyError: '_neighbor_pairs_j'

The database I use is an ASE db with 50 Au-Pd clusters with 13 atoms.

import torch
import torch.nn.functional as F
from torch.optim import Adam

import schnetpack as spk
import schnetpack.atomistic as atm
import schnetpack.representation as rep
from schnetpack.datasets import *

data = AtomsData('./db/Icosahedron-2-large-unique-50-test.db', properties=['energy'])
# split in train and val
train, val, test = data.create_splits(10, 10)
loader = spk.data.AtomsLoader(train, batch_size=2, num_workers=1)
val_loader = spk.data.AtomsLoader(val)

# create model
reps = rep.BehlerSFBlock(n_radial=22, n_angular=5, elements=frozenset([46,79]))
output = atm.ElementalAtomwise(reps.n_symfuncs)
model = atm.AtomisticModel(reps, output)

# filter for trainable parameters (https://github.com/pytorch/pytorch/issues/679)
trainable_params = filter(lambda p: p.requires_grad, model.parameters())

# create trainer
opt = Adam(trainable_params, lr=1e-4)
loss = lambda b, p: F.mse_loss(p["y"], b['energy'])
trainer = spk.train.Trainer("wacsf/", model, loss,
                      opt, loader, val_loader)

# start training
trainer.train(torch.device("cpu"))

Did I do something wrong in the code? Thanks! 
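One possible cause (my assumption; it is not confirmed in this thread) is that the angular symmetry functions need neighbor triples, which the dataset only provides when it is created with collect_triples=True, as in the force-training example elsewhere on this page:

from schnetpack.data import AtomsData

# hypothetical fix sketch: request triple indices when building the dataset so that
# keys such as '_neighbor_pairs_j' are present in each batch
data = AtomsData('./db/Icosahedron-2-large-unique-50-test.db',
                 properties=['energy'], collect_triples=True)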

Accidental letter in schnetpack_matproj.py

Thank you for providing such a nice package together with the source code. I noticed a very small issue in the script "schnetpack_matproj.py", where on line 210 there is an additional letter "f" on column 18. This prevents the script from running.

CUDA memory issue during optimization

I have been training on water clusters of various sizes and trying to optimize a 256-water cluster using the trained model. The training process worked fine, but when I try to optimize, I always get the following CUDA memory error. When I run on the CPU, the optimization is pretty slow (5 min per step). I would like to ask: is the requirement of this much memory legitimate, and what is making it so memory intensive? Or do you have any idea what might have gone wrong?
Thank you in advance for any help you could possibly provide!

Traceback (most recent call last):
File "optimize_water_256_wacsf.py", line 15, in
print("forces:", atoms.get_forces())
File "/home/xiaowei/.local/lib/python3.6/site-packages/ase/atoms.py", line 714, in get_forces
forces = self._calc.get_forces(self)
File "/home/xiaowei/.local/lib/python3.6/site-packages/ase/calculators/calculator.py", line 519, in get_forces
return self.get_property('forces', atoms)
File "/home/xiaowei/.local/lib/python3.6/site-packages/ase/calculators/calculator.py", line 552, in get_property
self.calculate(atoms, [name], system_changes)
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/schnetpack-0.2.1-py3.6.egg/schnetpack/ase_interface.py", line 92, in calculate
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in call
result = self.forward(*input, **kwargs)
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/schnetpack-0.2.1-py3.6.egg/schnetpack/atomistic.py", line 61, in forward
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in call
result = self.forward(*input, **kwargs)
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/schnetpack-0.2.1-py3.6.egg/schnetpack/representation/hdnn.py", line 366, in forward
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 494, in call
result = self.forward(*input, **kwargs)
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/schnetpack-0.2.1-py3.6.egg/schnetpack/representation/hdnn.py", line 238, in forward
File "/home/xiaowei/miniconda3/lib/python3.6/site-packages/schnetpack-0.2.1-py3.6.egg/schnetpack/nn/neighbors.py", line 204, in neighbor_elements
RuntimeError: CUDA out of memory. Tried to allocate 1.68 GiB (GPU 0; 7.77 GiB total capacity; 6.02 GiB already allocated; 648.62 MiB free; 17.94 MiB cached)

The following is the code I have been using to run optimization.

import torch
from schnetpack.ase_interface import SpkCalculator
from ase import Atoms
from ase.io import read
from ase.optimize import BFGS

path_to_model = "XX_water_wacsf/best_model"
model = torch.load(path_to_model)

atoms = read('water_256.xyz')
calc = SpkCalculator(model, device="cuda")
atoms.set_calculator(calc)
print("forces:", atoms.get_forces())
print("total_energy", atoms.get_potential_energy())
dyn = BFGS(atoms,trajectory='water_256_opt_BFGS_wacsf.traj',restart='water_256_opt_BFGS_wacsf.pckl')
dyn.run(fmax=0.05)

Dataloader stuck with AseEnvironmentProvider

Hi guys,

Thanks for the suggestions regarding the AseEnvironmentProvider.
It worked well with my dummy test set of 100 structures, but I discovered that for a larger database the whole data loading procedure gets stuck. This means that when the data is initialized the first time (either during loader.get_statistics() or in the trainer function later), it doesn't finish. I also had an issue where it continued to fill up my RAM and I needed to stop at 90 GB.
I set the number of workers in the dataloader to 0 to debug a bit, and it seems to be stuck in the ASE neighbour list function.
I tested this also with small molecules, and it's reproducible with the qm9 dataset.

Have you experienced something similar?

I used the same script as before with the AseEnvironment addition. The issue already arises with a database size of 3000 molecules.

Thanks for your help!

import torch
import torch.nn.functional as F
from torch.optim import Adam

import schnetpack as spk
from schnetpack.data import AtomsData
import schnetpack.atomistic as atm
import schnetpack.representation as rep

cutoff = 5.  # Angstrom
environment_provider = spk.environment.AseEnvironmentProvider(cutoff)
data = AtomsData('trypsin.db', required_properties=['energy'], environment_provider=environment_provider)

# split in train and val
train, val, test = data.create_splits(2800, 100)
loader = spk.data.AtomsLoader(train, batch_size=10, num_workers=4)
val_loader = spk.data.AtomsLoader(val)

# create model
reps = rep.SchNet(
    n_atom_basis=128,
    n_filters=128,
    n_interactions=1,
    cutoff=5.0,
    n_gaussians=25,
    normalize_filter=False,
    coupled_interactions=False,
    return_intermediate=False,
    max_z=100,
    trainable_gaussians=False,
    distance_expansion=None)
output = atm.Atomwise()
model = atm.AtomisticModel(reps, output).cuda()

opt = Adam(model.parameters(), lr=1e-3)
loss = lambda b, p: F.mse_loss(p["y"], b['energy'])
out_dir = 'test'
trainer = spk.train.Trainer(out_dir, model, loss, opt, loader, val_loader,
                            hooks=[spk.train.CSVHook(f'{out_dir}/log',
                                                     [spk.metrics.MeanAbsoluteError('energy', "y"),
                                                      spk.metrics.RootMeanSquaredError('energy', "y")],
                                                     every_n_epochs=1)])
trainer.train(torch.device("cuda"))

Evaluation.py _get_predicted(self, device) for clusters with different number of atoms

Hi,

I am trying to predict the forces of clusters using the best_model. However, it seems that if the clusters are of various sizes, the _get_predicted function does not work. The problem is in:

for p in predicted.keys():
    predicted[p] = np.vstack(predicted[p])

ValueError: all the input array dimensions except for the concatenation axis must match exactly

Thanks!
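A possible workaround (only a sketch, assuming predicted is the dictionary built inside _get_predicted) is to keep the per-structure arrays in a list instead of forcing them into one rectangular array, since structures with different atom counts cannot be stacked:

import numpy as np

# keep one array per batch/structure rather than calling np.vstack on arrays whose
# atom dimensions differ
for p in predicted.keys():
    predicted[p] = [np.asarray(entry) for entry in predicted[p]]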

Benchmark results in docs

As suggested in Issue #77, we should have a table of benchmark results in the docs for our scripts.

It would be great to have a script for that, so we can run it before every release to update the results table. It would take a set of cmd arguments for the scripts, run the models on our cluster and create a file with the table that can directly be parsed by the docs (probably rst).

hdnn.py distances not correct for periodically repeated bulk structure

Hi,

It seems that the distances calculated in hdnn.py are not correct.

# Compute radial functions
if self.RDF is not None:
    # Get atom type embeddings
    Z_rad = self.radial_Z(Z)
    # Get atom types of neighbors
    Z_ij = snn.neighbor_elements(Z_rad, neighbors)
    # Compute distances
    distances = snn.atom_distances(
        positions, neighbors, neighbor_mask=neighbor_mask
    )
    radial_sf = self.RDF(
        distances, elemental_weights=Z_ij, neighbor_mask=neighbor_mask
    )
else:
    radial_sf = None

If I have a Cu bulk with a single Cu atom repeated in all 3 directions, the distances vector looks like [0, 0, 0, ..., 0]. I believe it should be [a, a, a, ..., 2a, etc.] if a is the lattice constant. This is probably because the cell_offset is not used in the calculation? Could you have a look at it? Thanks!
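For illustration, the expected behaviour can be sketched as follows. This is not the library's implementation; the function below only shows how the integer cell offsets would enter the distance calculation for periodic structures:

import torch

def periodic_distances(positions, neighbors, cell, cell_offsets):
    # positions:    (n_batch, n_atoms, 3)
    # neighbors:    (n_batch, n_atoms, n_neighbors) indices of neighboring atoms
    # cell:         (n_batch, 3, 3) lattice vectors
    # cell_offsets: (n_batch, n_atoms, n_neighbors, 3) integer periodic-image offsets
    n_batch, n_atoms, n_nbh = neighbors.shape
    idx = neighbors.reshape(n_batch, -1, 1).expand(-1, -1, 3)
    pos_j = torch.gather(positions, 1, idx).reshape(n_batch, n_atoms, n_nbh, 3)
    # shift each neighbor into its correct periodic image before taking the distance
    shifts = torch.einsum("banx,bxy->bany", cell_offsets.float(), cell)
    r_ij = pos_j + shifts - positions.unsqueeze(2)
    return torch.norm(r_ij, dim=-1)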

ValueError: invalid filename or file not found

Hi all,
I have installed schnetpack on Windows.
The Python version is 3.7.3.
When I run pytest, it returns the following error:
ValueError: invalid filename or file not found "c:\users\fariba\appdata\local\programs\python\python37\lib\site-packages\schnetpack-0.2.1-py3.7.egg\schnetpack\sacred\calculator_ingredients.py"
Any comment is appreciated.

Bug in Train Embeddings for Elemental References?

I think the train_embeddings value of Atomwise is used incorrectly. When setting the default references in the constructor of Atomwise, the value of freeze for the embeddings class is set equal to train_embeddings, which means that if train_embeddings == True, then the embeddings are not trained.

Do I understand that correctly? And, if so, could I fix it by changing it to freeze=not train_embeddings?

One of the two relevant lines:

freeze=train_embeddings)
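For context, this is how PyTorch's pretrained-embedding constructor treats the flag: freeze=True excludes the weights from training, so passing freeze=train_embeddings inverts the intended meaning. A minimal illustration with nn.Embedding directly (not SchNetPack's wrapper; the reference values are dummies):

import torch
import torch.nn as nn

atomref_weights = torch.zeros(100, 1)   # dummy per-element reference energies
train_embeddings = True

# freeze must be the negation of train_embeddings for the flag to behave as named
atomref = nn.Embedding.from_pretrained(atomref_weights, freeze=not train_embeddings)
print(atomref.weight.requires_grad)     # True -> the references would be optimized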

Issue with materials project test script

Hi,

I am getting an error when trying to train the network using the included script for training the materials project data. The model seems to train fine when I set --property to anything other than band_gap, but when I use band_gap I am getting some KeyError:

Traceback (most recent call last):
  File "matproj.py", line 273, in <module>
    mean, stddev = train_loader.get_statistics(train_args.property, False)
  File "/home/paperspace/anaconda3/envs/ml/lib/python3.6/site-packages/schnetpack/data.py", line 408, in get_statistics
    for row in self:
  File "/home/paperspace/anaconda3/envs/ml/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 336, in __next__
    return self._process_next_batch(batch)
  File "/home/paperspace/anaconda3/envs/ml/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 357, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
KeyError: 'Traceback (most recent call last):\n  File "/home/paperspace/anaconda3/envs/ml/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop\n    samples = collate_fn([dataset[i] for i in batch_indices])\n  File "/home/paperspace/anaconda3/envs/ml/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>\n    samples = collate_fn([dataset[i] for i in batch_indices])\n  File "/home/paperspace/anaconda3/envs/ml/lib/python3.6/site-packages/schnetpack/data.py", line 100, in __getitem__\n    prop = row.data[p]\nKeyError: \'band_gap\'\n'

Any help would be greatly appreciated. Thank you.

RMSE logging when training data contains molecules of varying sizes

I'm unsure whether or not I'm getting representative RMSE logging when training on a data set that contains molecules of varying sizes (from 55 atoms to 190 atoms).

If I'm not mistaken, the following piece of code is run (per batch?) when logging the RMSE of a target:

https://github.com/atomistic-machine-learning/schnetpack/blob/master/src/schnetpack/metrics.py#L68

    def add_batch(self, batch, result):
        y = batch[self.target]
        if self.model_output is None:
            yp = result
        else:
            yp = result[self.model_output]

        diff = self._get_diff(y, yp)
        self.l2loss += torch.sum(diff.view(-1)).detach().cpu().data.numpy()
        self.n_entries += np.prod(y.shape)

https://github.com/atomistic-machine-learning/schnetpack/blob/master/src/schnetpack/metrics.py#L132

    def aggregate(self):
        return np.sqrt(self.l2loss / self.n_entries)

As far as I understand, there's some zero padding happening in the background when the training batch contains molecules of varying sizes. Am I correct in assuming that this zero padding increases the number self.n_entries (when using np.prod(y.shape)) such that my mean (for the smaller molecules) becomes a lot smaller than it should be?
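For illustration only (assuming the padding works as described above), here is the difference between counting entries from the padded tensor shape and counting them from an atom mask, as the element-wise metric quoted elsewhere on this page does:

import numpy as np
import torch

forces = torch.zeros(1, 190, 3)    # batch padded to the largest molecule
atom_mask = torch.zeros(1, 190)
atom_mask[0, :55] = 1.0            # only 55 real atoms in this molecule

n_entries_padded = np.prod(forces.shape)                                  # 570
n_entries_masked = int(torch.sum(atom_mask).item()) * forces.shape[-1]    # 165
print(n_entries_padded, n_entries_masked)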

qm9 dataset has no attribute "properties"

The qm9 dataset has an attribute called "available_properties", but I think it should be "properties" to be consistent with the rest. The "schnetpack_qm9.py" script also refers to "properties", which currently doesn't exist.

max() arg is an empty sequence error

(schnetpack) victor@DEEPLEARN3:~/chemistry/Codes/schnetpack_reproduce$ python3 src/scripts/schnetpack_qm9.py train schnet data/seed_2000/energy_U0_cosine/data/qm9.db data/seed_2000/energy_U0_cosine/model --split 109000 1000 --cuda --property energy_U0 --cutoff_function cosine --seed 2000 --logger tensorboard
INFO:root:Random state initialized with seed 2000      
INFO:root:QM9 will be loaded...
INFO:schnetpack.data.atoms:The dataset has already been downloaded and stored at data/seed_2000/energy_U0_cosine/data/qm9.db
INFO:root:create splits...
INFO:root:load data...
INFO:root:calculate statistics...
INFO:root:cached statistics was loaded...
INFO:root:The model you built has: 1676033 parameters
INFO:root:training...
Traceback (most recent call last):
  File "src/scripts/schnetpack_qm9.py", line 606, in <module>
    train(args, model, train_loader, val_loader, device)
  File "src/scripts/schnetpack_qm9.py", line 328, in train
    args.modelpath, model, loss, optimizer, train_loader, val_loader, hooks=hooks
  File "/home/victor/anaconda3/envs/schnetpack/lib/python3.6/site-packages/schnetpack-0.2.1-py3.6.egg/schnetpack/train/trainer.py", line 60, in __init__
  File "/home/victor/anaconda3/envs/schnetpack/lib/python3.6/site-packages/schnetpack-0.2.1-py3.6.egg/schnetpack/train/trainer.py", line 119, in restore_checkpoint
ValueError: max() arg is an empty sequence

Hi, when I run src/scripts/schnetpack_qm9.py it gives me the error "max() arg is an empty sequence". I have tried uninstalling schnetpack and installing it again from source.

Evaluation on validation split fails

The following command produces an empty evaluation.txt:

spk_run.py eval modeldir --split validation

The reason is that the split name in the code is expected to be "val". However, the util parsing script only allows the full word, i.e. "validation". So I think the code where "val" is still used should be changed to "validation" as well.

question about wACSFs

With the wACSF descriptors, is there still a separate neural network per element in the system as with the ACSFs?

"strict" argument in md.load_model()

model.load_state_dict(torch.load(os.path.join(modelpath, 'best_model')))

First of all, thank you for the excellent library (nice design patterns, easy-to-use, extendable, and so on...).

I think above line should be
model.load_state_dict(torch.load(os.path.join(modelpath, 'best_model')), strict=False)

Otherwise, in my case, md.load_model() doesn't work...

Eval mode of all scripts broken by torch.load

I think recently a change was made to save the model instead of the state as a checkpoint (which I agree is an improvement). Anyway, all scripts are still using load_state_dict instead of just model = torch.load(...), so the scripts now give an error when running eval like: AttributeError: 'DataParallel' object has no attribute 'copy'.

Using trained model as ASE calculator

I'm trying to use a trained model as a calculator object in ASE; however, I get an error.

Traceback (most recent call last):
  File "opti.py", line 30, in <module>
    dyn.run(fmax=0.01)
  File "/usr/local/lib/python3.6/dist-packages/ase/optimize/optimize.py", line 174, in run
    f = self.atoms.get_forces()
  File "/usr/local/lib/python3.6/dist-packages/ase/atoms.py", line 735, in get_forces
    forces = self._calc.get_forces(self)
  File "/usr/local/lib/python3.6/dist-packages/ase/calculators/calculator.py", line 460, in get_forces
    return self.get_property('forces', atoms)
  File "/usr/local/lib/python3.6/dist-packages/ase/calculators/calculator.py", line 493, in get_property
    self.calculate(atoms, [name], system_changes)
  File "/usr/local/lib/python3.6/dist-packages/schnetpack/md.py", line 94, in calculate
    model_results = self.model(model_inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/schnetpack/atomistic.py", line 55, in forward
    inputs['representation'] = self.representation(inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/schnetpack/representation/schnet.py", line 199, in forward
    v = interaction(x, r_ij, neighbors, neighbor_mask, f_ij=f_ij)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/schnetpack/representation/schnet.py", line 52, in forward
    v = self.cfconv.forward(x, r_ij, neighbors, neighbor_mask, f_ij=f_ij)
  File "/usr/local/lib/python3.6/dist-packages/schnetpack/nn/cfconv.py", line 67, in forward
    y = torch.gather(y, 1, nbh)
RuntimeError: Invalid index in gather at /pytorch/aten/src/TH/generic/THTensorMath.cpp:620

I have been trying different ASE scripts (geometry optimization, phonon dispersion calculation, etc.), but I am getting similar errors. Attached are the files for a geometry optimization of a (6,0) SWCNT with the trained model.

Any tips on how to fix this?

6,0_opti.zip
