
qmctorch's People

Contributors: felipez, matthijsdewit111, nicorenaud


Forkers: scorpjd, gxiaotian

qmctorch's Issues

Benchmark/Optimization of the MH sampling

The default sampler uses a simple Metropolis-Hastings scheme. No effort has been made so far to optimize the code on either CPU or GPU. As it is a central part of the code, it would be great to see if we can improve its performance.

We should first try to optimize it on a single CPU/GPU, and then use Horovod (or a similar framework) to distribute the work over multiple GPUs.
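
As a baseline for the benchmark, a minimal batched Metropolis-Hastings kernel could look like the sketch below. The callable pdensity (returning |psi|^2 per walker) and the fixed Gaussian step size are assumptions for illustration, not the actual QMCTorch API.

    import torch

    def metropolis_hastings_step(pos, pdensity, step_size=0.2):
        """One Metropolis-Hastings sweep over a batch of walkers.

        Args:
            pos (torch.Tensor): walker positions, shape (nwalkers, nelec * ndim)
            pdensity (callable): probability density |psi|^2 of each walker
            step_size (float): width of the symmetric Gaussian proposal
        """
        # symmetric Gaussian proposal around the current positions
        proposal = pos + step_size * torch.randn_like(pos)

        # acceptance ratio (proposal is symmetric, so no correction term)
        ratio = pdensity(proposal) / pdensity(pos)

        # accept/reject each walker independently
        accept = torch.rand(pos.shape[0], device=pos.device) < ratio
        pos[accept] = proposal[accept]
        return pos

Since all walkers are moved in a single tensor operation, this form is already GPU friendly; the benchmark could compare it against per-walker loops and different proposal widths.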

Acceleration Orbital Projection

At the moment we compute every Slater determinant of the CI expansion explicitly. However, we could compute only the ground-state determinant and obtain all the others from it using the properties of determinants that differ by a single column (see the sketch after the list below):

  • Single excitation from $A^{-1} B$: B.L. Hammond, appendix B1
  • Double excitation from $P A^{-1} A' P$: Claudia's "efficient derivative" paper
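
For the single-excitation case, all determinant ratios follow from one linear solve against the ground-state Slater matrix. A minimal sketch (shapes and names are illustrative, not the QMCTorch API):

    import torch

    def single_excitation_ratios(A, B):
        """Ratios det(A with occupied column i replaced by virtual column a) / det(A).

        A: (nbatch, nelec, nelec) values of the occupied orbitals (ground-state Slater matrix)
        B: (nbatch, nelec, nvirt) values of the virtual orbitals at the same positions

        By Cramer's rule the ratio for the excitation i -> a is (A^{-1} B)[i, a],
        so every singly excited determinant follows from one solve and no extra
        determinant evaluation.
        """
        return torch.linalg.solve(A, B)   # A^{-1} B, shape (nbatch, nelec, nvirt)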

STO fit of the GTO

The gradients obtained with pyscf are very noisy because the GTOs do not satisfy the cusp condition. One possible solution would be to fit the AOs obtained with pyscf to STOs. We could then use Slater AOs in the rest of the network and still have an acceptable first guess for the MO coefficients (a fitting sketch is given after the list below).

  • Fit the AOs (i.e. the contracted GTOs) of pyscf to STOs
  • Export the new basis parameters in mol.basis
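
A possible starting point for the fit, sketched here for a contracted 1s shell only (normalization and higher angular momenta are ignored; the function names and the least-squares approach are an assumption, not part of QMCTorch):

    import numpy as np
    from scipy.optimize import curve_fit

    def sto_1s(r, norm, zeta):
        """Radial part of a 1s Slater-type orbital."""
        return norm * np.exp(-zeta * r)

    def fit_gto_to_sto(coeffs, exponents, rmax=10.0, npts=200):
        """Fit a contracted 1s GTO (contraction coefficients and exponents,
        e.g. taken from mol.basis) to a single STO on a radial grid."""
        r = np.linspace(1e-3, rmax, npts)

        # radial value of the contracted GTO: sum_k c_k exp(-alpha_k r^2)
        gto = np.einsum('k,kr->r', coeffs, np.exp(-np.outer(exponents, r**2)))

        (norm, zeta), _ = curve_fit(sto_1s, r, gto, p0=[gto[0], 1.0])
        return norm, zeta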

Horovod solver

We do have a Horovod solver, but its performance has never been fully tested.

  • test if the SolverHorovod still works
  • test on multiple CPUs using MPI
  • test on multiple CPUs/GPUs

Improving sampling techniques

We could go beyond Metropolis-Hastings for the sampling. A few candidates are:

  • Hamiltonian Monte Carlo (already partially implemented in sampler/hamiltonian.py; a leapfrog sketch is given after this list)
  • An efficient QMC-specific sampling scheme: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.71.408 It would be great to implement this one!
  • Using PyMC3 (https://docs.pymc.io/). A lot is already implemented there, but I don't know if we can use it since our objective function is a PyTorch object. It would be great to look into it as it would bring many different methods!

For each method it would be great to have a small benchmark of its performance and, where possible, to make the implementation as efficient as possible.
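
For reference, the core of a Hamiltonian Monte Carlo update is a leapfrog integration of the walkers with auxiliary momenta. The sketch below assumes a callable log_prob returning log|psi|^2 per walker and flattened walker coordinates; it only illustrates the structure and is not meant to replace sampler/hamiltonian.py.

    import torch

    def hmc_step(pos, log_prob, n_leapfrog=10, eps=0.05):
        """One Hamiltonian Monte Carlo update of a batch of walkers.

        pos:      (nwalkers, nelec * ndim) current positions
        log_prob: callable returning log |psi|^2 for each walker
        """
        def grad_log_prob(x):
            x = x.detach().requires_grad_(True)
            return torch.autograd.grad(log_prob(x).sum(), x)[0]

        q = pos.clone()
        p = torch.randn_like(q)                        # resample the momenta
        h_old = -log_prob(q) + 0.5 * (p ** 2).sum(-1)  # initial Hamiltonian

        # leapfrog integration of the fictitious dynamics
        p = p + 0.5 * eps * grad_log_prob(q)
        for _ in range(n_leapfrog - 1):
            q = q + eps * p
            p = p + eps * grad_log_prob(q)
        q = q + eps * p
        p = p + 0.5 * eps * grad_log_prob(q)

        # Metropolis correction on the change of the Hamiltonian
        h_new = -log_prob(q) + 0.5 * (p ** 2).sum(-1)
        accept = torch.rand(q.shape[0], device=q.device) < torch.exp(h_old - h_new)
        pos[accept] = q.detach()[accept]
        return pos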

Horovod integration

Starting from the SolverOrbitalHorovod class, study the possibility of distributing the calculation over multiple GPUs. Both the sampling and the optimization are distributed. I'm not sure the class still works. Things to look at:

  • How the sampling is distributed.
  • How the memory is pinned.
  • How the optimization is distributed.

It would be great to run the code for small molecules (H2, Li2, NH3) on 1, 2 and 4 GPUs and see what scaling we obtain.
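
The usual Horovod/PyTorch pattern for the points above looks roughly like the sketch below; MySlaterJastrow is a purely hypothetical stand-in for the QMCTorch wave function.

    import torch
    import horovod.torch as hvd

    hvd.init()

    # pin each process to one GPU (relevant for the memory-pinning question above)
    if torch.cuda.is_available():
        torch.cuda.set_device(hvd.local_rank())

    wf = MySlaterJastrow()   # hypothetical stand-in for the QMCTorch wave function
    opt = torch.optim.Adam(wf.parameters(), lr=1e-3)

    # every rank starts from the same parameters; gradients are averaged across ranks
    hvd.broadcast_parameters(wf.state_dict(), root_rank=0)
    opt = hvd.DistributedOptimizer(opt, named_parameters=wf.named_parameters())

    # each rank then samples its own subset of walkers (e.g. nwalkers // hvd.size())
    # and the allreduce inside DistributedOptimizer averages the parameter updates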

Horovod test sometimes fails

It appears that the Horovod test sometimes fails because multiple processes try to create the HDF5 file of the Molecule.

E   OSError: Unable to open file (unable to lock file, errno = 11, error message = 'Resource temporarily unavailable')

h5py/h5f.pyx:96: OSError

During handling of the above exception, another exception occurred:

self = <horovod_tests.test_solver_orbital_horovod.TestSolverOribitalHorovod testMethod=test_single_point>

    def setUp(self):
        hvd.init()
    
        torch.manual_seed(101)
        np.random.seed(101)
    
        set_torch_double_precision()
    
        # molecule
>       mol = Molecule(atom='H 0 0 -0.69; H 0 0 0.69',
                       calculator='pyscf', basis='sto-3g',
                       unit='bohr', rank=hvd.local_rank())

horovod_tests/test_solver_orbital_horovod.py:27: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
qmctorch/scf/molecule.py:115: in __init__
    dump_to_hdf5(self, self.hdf5file,
qmctorch/utils/hdf5_utils.py:143: in dump_to_hdf5
    h5 = h5py.File(fname, 'a')
../anaconda3/envs/qmctorch/lib/python3.8/site-packages/h5py/_hl/files.py:424: in __init__
    fid = make_fid(name, mode, userblock_size,
../anaconda3/envs/qmctorch/lib/python3.8/site-packages/h5py/_hl/files.py:204: in make_fid
    fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
h5py/_objects.pyx:54: in h5py._objects.with_phil.wrapper
    ???
h5py/_objects.pyx:55: in h5py._objects.with_phil.wrapper
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   OSError: Unable to create file (unable to open file: name = 'H2_pyscf_sto-3g.hdf5', errno = 17, error message = 'File exists', flags = 15, o_flags = c2)

h5py/h5f.pyx:116: OSError

We might have to make sure that, when multiple processes are running, only rank 0 creates the molecule, or something along those lines (see the sketch below).
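
A possible guard, sketched below, lets only rank 0 write the file while the other ranks wait on a dummy allreduce (used as a barrier) before constructing their Molecule. This assumes the Molecule constructor reuses an existing HDF5 file rather than recreating it, which may itself require a small change in qmctorch.

    import torch
    import horovod.torch as hvd
    from qmctorch.scf import Molecule   # import path assumed from the traceback above

    hvd.init()

    # rank 0 creates the molecule and therefore the HDF5 file ...
    if hvd.rank() == 0:
        mol = Molecule(atom='H 0 0 -0.69; H 0 0 0.69',
                       calculator='pyscf', basis='sto-3g', unit='bohr')

    # ... a dummy allreduce acts as a barrier so the file exists before anyone else touches it
    hvd.allreduce(torch.zeros(1), name='wait_for_molecule')

    # the other ranks then construct their Molecule from the existing file
    if hvd.rank() != 0:
        mol = Molecule(atom='H 0 0 -0.69; H 0 0 0.69',
                       calculator='pyscf', basis='sto-3g', unit='bohr')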

Generic Jastrow Factor

The Jastrow factors can also be defined via a generic differentiable function (typically a small network), as seen in:

  • elec-nuc : electron_nuclei_generic.py
  • elec-elec : generic_jastrow.py
  • elec-elec-nuc : three_body_jastrow_generic.py

They can also be combined into a single Jastrow term, as seen in mixed_elec_nuc_pade_jastrow.py.

The differentiable function that defines each term is very flexible and could, for example, be a small fully connected network (see the sketch below). It would be great to run a few calculations using different networks and see what accuracy we get for the final energy.
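
As an illustration, a generic electron-electron kernel could be a small MLP acting on each distance, with the pair contributions summed in the exponent. The class below is a hypothetical sketch, not one of the files listed above.

    import torch
    from torch import nn

    class MLPJastrowKernel(nn.Module):
        """Generic electron-electron Jastrow kernel: each distance is mapped to a
        scalar by a small fully connected network and the pair terms are summed
        in the exponent."""

        def __init__(self, hidden=16):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(1, hidden), nn.Sigmoid(),
                nn.Linear(hidden, hidden), nn.Sigmoid(),
                nn.Linear(hidden, 1))

        def forward(self, ree):
            # ree: (nbatch, npairs) electron-electron distances
            k = self.net(ree.unsqueeze(-1)).squeeze(-1)   # (nbatch, npairs)
            return torch.exp(k.sum(-1))                   # (nbatch,) Jastrow factor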

Design of Neural Jastrow

Instead of using the Jastrow factors that have already been implemented, we could replace the entire Jastrow factor by a network. This network would take as input either the xyz coordinates of each electron/nucleus or the distances between electrons and nuclei. The output of the network would be a single scalar (per electronic configuration). The network should respect a few properties:

  • The network should be independent of the ordering of the positions (or distances) used as input
  • The value of the output should increase when the electrons are close together (that's the entire idea of the Jastrow)
  • It should respect the electron-electron cusp, i.e. have a peaked minimum when two electrons are at the same position
  • ...

A few possible architectures could be considered; this is a completely open research question! One permutation-invariant option is sketched below.
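
One way to obtain the ordering invariance of the first bullet is a DeepSets-style architecture: encode each electron-electron distance, sum the encodings, and decode the pooled vector into a single scalar. The sketch below only illustrates that idea; it does not enforce the cusp condition.

    import torch
    from torch import nn

    class PermutationInvariantJastrow(nn.Module):
        """Neural Jastrow whose output does not depend on the ordering of the
        input distances (sum pooling over the encoded pairs)."""

        def __init__(self, hidden=32):
            super().__init__()
            self.encode = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(),
                                        nn.Linear(hidden, hidden))
            self.decode = nn.Sequential(nn.Tanh(), nn.Linear(hidden, 1))

        def forward(self, ree):
            # ree: (nbatch, npairs) electron-electron distances
            pooled = self.encode(ree.unsqueeze(-1)).sum(dim=1)   # (nbatch, hidden)
            return self.decode(pooled).squeeze(-1)               # (nbatch,) scalar exponent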

Diffusion Monte Carlo

Implement a DMC-like routine. We need to:

  • Implement a Green's function sampling method
  • Interface the GF sampling with the Solver

Make tracking of the variational parameters optional

When the network contains a lot of parameters (e.g. FermiNet), storing all of them is prohibitively memory intensive (a few TB of data that we don't care about). The tracking of these parameters should be made optional in solver_base.track_observable.

Benchmark the current pade-jastrow factors

We now have three different types of Jastrow factors implemented:

  • nuclei-electron: qmctorch/wavefunction/jastrows/electron_nuclei_pade_jastrow.py
  • electron-electron: qmctorch/wavefunction/jastrows/pade_jastrow.py
  • electron-electron-nuclei: qmctorch/wavefunction/jastrows/three_body_pade_jastrow.py

They can be combined, as for example in mixed_elec_nuc_pade_jastrow.py. It would be interesting to run a few calculations using different combinations of the Jastrow terms to understand their influence on the final results. This could be especially interesting for NH3, where a simple electron-electron Jastrow doesn't do a great job.

Implement a traditional VMC algorithm

In a traditional VMC algorithm, the energy is accumulated during the sampling instead of after the sampling is done. It might be interesting to implement VMC as a qmctorch.solver (see the sketch below).
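
The structure of such a solver could be as simple as the loop below; sampler.walkers_init, sampler.step and wf.local_energy are hypothetical stand-ins for the QMCTorch sampler and wave-function calls.

    import torch

    def vmc_run(wf, sampler, nsteps=1000, nequil=100):
        """Minimal traditional VMC loop: the local energy is accumulated while
        the walkers are being propagated, not after the sampling is finished."""
        pos = sampler.walkers_init()
        eloc = []

        for step in range(nsteps):
            pos = sampler.step(pos)
            if step >= nequil:
                eloc.append(wf.local_energy(pos).detach())

        eloc = torch.cat(eloc)
        return eloc.mean(), eloc.var() / len(eloc)   # energy and variance of the mean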

Performance of QMCTorch on H2

H2 is the simplest system so we should start here. Things to try :

Single point calculation :

  • Impact of sampling on single point local energy.
  • Impact of basis set
  • Pyscf vs ADF

Optimization :

  • Impact of sampling/resampling on optimization
  • Impact of optimizer (ADAM, SGD, ...) on optimization
  • Impact of electronic configurations

trouble reproducing h2 wavefunction optimization example with PySCF?

I'm currently trying to reproduce the CPU wavefunction optimization example.

I don't have a license for SCM, so I can't use the adf calculator. The only change I made to h2.py was to replace adf with pyscf. But when I run the example, I don't get a nice plot of the variance dropping over time, as shown in the documentation. Instead, I get a chart that looks like the following:

[screenshot of the resulting optimization plot]

It is totally possible that this is user error, or that there are some other changes that need to be made for this to work with pyscf.

Also, if the software is currently in a work-in-progress state such that reports like this from end users are unhelpful, please let me know.

Backflow orbitals

We should implement backflow orbitals (see e.g. https://www.researchgate.net/publication/50248536_Quantum_Monte_Carlo_study_of_the_first-row_atoms_and_ions or https://www.cond-mat.de/events/correl20/manuscripts/foulkes.pdf). In short, we build the Slater matrix the same way as before, but the electrons are replaced by quasi-particles with positions:

$\tilde{r}_i = r_i + \sum_{j\neq i} \eta_{ij} r_j$

The resulting orbitals are multi-electronic, with variational parameters $\eta_{ij}$. The implementation of the SlaterJastrowOrbitals should help with the kinetic energy calculation and with the table method (a sketch of the transformation is given below).
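
A minimal version of the coordinate transformation, with one trainable $\eta_{ij}$ per electron pair (in practice $\eta$ would usually depend on the inter-electronic distance), could look like:

    import torch
    from torch import nn

    class Backflow(nn.Module):
        """Quasi-particle transformation q_i = r_i + sum_{j != i} eta_ij r_j."""

        def __init__(self, nelec):
            super().__init__()
            self.eta = nn.Parameter(1e-2 * torch.randn(nelec, nelec))

        def forward(self, pos):
            # pos: (nbatch, nelec, ndim) electron positions
            eta = self.eta - torch.diag(torch.diag(self.eta))   # remove the i == j terms
            return pos + torch.einsum('ij,bjd->bid', eta, pos)

The backflow-transformed positions would then be fed to the atomic orbitals before building the Slater matrix.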

Interpolation

  • Interpolate the AOs and MOs during the sampling.
  • We could also interpolate the AOs during the energy calculation when the kinetic energy is obtained through the Jacobi formula and the AOs are not optimized.

Performance of QMCTorch on LiH

LiH is a relatively simple system. Things to try :

Single point calculation :

  • Impact of sampling on single point local energy.
  • Impact of basis set
  • Pyscf vs ADF

Optimization :

  • Impact of sampling/resampling on optimization
  • Impact of optimizer (ADAM, SGD, ...) on optimization
  • Impact of electronic configurations

Stochastic Reconfiguration optimizer

Implement the stochastic reconfiguration method as a torch.optim class. This would allow us to compare the performance of traditional optimizers with QMC-specific optimizers (a skeleton is sketched below).
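
A skeleton of such an optimizer is sketched below. It only performs the regularized linear solve $S\,\delta p = g$ and the parameter update; estimating the overlap matrix S and the energy gradient g from the sampled walkers is left to the solver. The class and its step signature are an assumption, not an existing QMCTorch or PyTorch API.

    import torch
    from torch.optim import Optimizer

    class StochasticReconfiguration(Optimizer):
        """Skeleton of a stochastic reconfiguration step as a torch.optim class."""

        def __init__(self, params, lr=1e-2, eps=1e-4):
            super().__init__(params, dict(lr=lr, eps=eps))

        @torch.no_grad()
        def step(self, S, g):
            """S: (nparam, nparam) overlap matrix of the log-derivatives of psi,
               g: (nparam,) flat energy gradient, both estimated from the walkers."""
            group = self.param_groups[0]
            lr, eps = group['lr'], group['eps']

            # regularize the overlap matrix and solve S dp = g
            S = S + eps * torch.eye(S.shape[0], device=S.device)
            dp = torch.linalg.solve(S, g)

            # scatter the flat update back onto the parameters
            offset = 0
            for p in group['params']:
                n = p.numel()
                p.add_(dp[offset:offset + n].view_as(p), alpha=-lr)
                offset += n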

adf kffile read error in kftools

When using adf, I obtain an error from the calculator/adf.py file for kffile.read. This is probably an error on my side with regard to the plams package.
The error I obtain when using adf as calculator:

Traceback (most recent call last):
File "/home/breebaart/dev/QMCTorch/example/optimization/h2.py", line 20, in
unit='bohr')
File "/usr/lib/python3.6/qmctorch/wavefunction/molecule.py", line 78, in init
self.basis = self.calculator.run()
File "/usr/lib/python3.6/qmctorch/wavefunction/calculator/adf.py", line 40, in run
basis = self.get_basis_data(t21_path)
File "/usr/lib/python3.6/qmctorch/wavefunction/calculator/adf.py", line 85, in get_basis_data
basis.TotalEnergy = kf.read('Total Energy', 'Total energy')
File "/home/breebaart/.local/lib/python3.6/site-packages/scm/plams/tools/kftools.py", line 301, in read
ret = self.reader.read(section, variable)
AttributeError: 'NoneType' object has no attribute 'read'

GPU support

Performance of the GPU support:

  • test if GPU implementation is working

Spin flip in excitation

At the moment we only have excitations that conserve the spin of the electrons. We should change that.

  • add triplet support to orbital_configuration.py
  • propagate the configurations so that each routine accepts a different number of spin-up/spin-down electrons for each configuration

sampling in log domain

We currently perform the sampling in the normal domain. It is generally advised to do it in the log domain to avoid over/underflows.
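
The change is essentially one line in the acceptance test: with logpsi = log|psi| the Metropolis ratio becomes exp(2*(logpsi_new - logpsi_old)), and we compare its logarithm against log(u) instead of forming the densities. A sketch:

    import torch

    def log_domain_accept(logpsi_old, logpsi_new):
        """Metropolis acceptance computed entirely in the log domain, which avoids
        forming |psi|^2 and the associated over/underflows."""
        log_u = torch.log(torch.rand_like(logpsi_old))
        return log_u < 2.0 * (logpsi_new - logpsi_old)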

Remove kinpool

The KineticPooling is never used (or should never be used anyway). It should be removed.

Cube output

It would be great if each solver or wave function could export its molecular orbitals to the cube file format so that we can look at them in VMD/PyMOL.
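
A minimal Gaussian cube writer is sketched below (plain file format, not tied to the QMCTorch API); both VMD and PyMOL can read the result.

    import numpy as np

    def write_cube(fname, vals, atoms, origin, spacing):
        """Write MO values on a regular grid to a Gaussian cube file.

        vals:    (nx, ny, nz) orbital values on the grid
        atoms:   list of (atomic_number, x, y, z) in bohr
        origin:  (3,) grid origin in bohr
        spacing: (3,) grid spacing along x, y, z in bohr
        """
        nx, ny, nz = vals.shape
        with open(fname, 'w') as f:
            f.write('QMCTorch molecular orbital\ncube file\n')
            f.write(f'{len(atoms):5d} {origin[0]:12.6f} {origin[1]:12.6f} {origin[2]:12.6f}\n')
            f.write(f'{nx:5d} {spacing[0]:12.6f} {0.0:12.6f} {0.0:12.6f}\n')
            f.write(f'{ny:5d} {0.0:12.6f} {spacing[1]:12.6f} {0.0:12.6f}\n')
            f.write(f'{nz:5d} {0.0:12.6f} {0.0:12.6f} {spacing[2]:12.6f}\n')
            for anum, x, y, zc in atoms:
                f.write(f'{anum:5d} {float(anum):12.6f} {x:12.6f} {y:12.6f} {zc:12.6f}\n')
            # volumetric data: z runs fastest, six values per line
            for row in vals.reshape(nx * ny, nz):
                for i in range(0, nz, 6):
                    f.write(''.join(f'{v:13.5e}' for v in row[i:i + 6]) + '\n')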

Support periodic boundary conditions

It would be nice to support PBC, either via pyscf or BAND. The main task would be to add PBC to the sampler or to the wave function by simply wrapping the positions into the cell.

Replacing HF by DFT calculation

So far the SCF calculation is done at the HF level. Replacing it with a DFT calculation would potentially provide better first approximations of the MOs.

Dissociation curves

Compute the dissociation curves of:

  • H2
  • N2

Compare with available data and with FermiNet

Refactor Jastrow to help combine terms

It would be great to :

  • pass the e-e or e-n distances to the Jastrow factor so that the distance matrices can be shared between the different Jastrow terms
  • not compute the exponential in each individual Jastrow term but only on their sum, i.e. replace $J = \prod_i \exp(J_i)$ by $J = \exp(\sum_i J_i)$

Jastrow factor

We only have the 2-body Pade-Jastrow term at the moment. We could include more complex factors.

Speed up energy calculation

As we have accelerated the calculation of the wave function, we should now also speed up the energy calculation:

  • AO derivatives,
  • Jastrow derivatives
  • Kinetic operator

Process positions could be faster

In AtomicOrbitals, the method _process_position computes the distance between each electron and each orbital.

    def _process_position(self, pos):
        """Computes the positions/distances between electrons and orbitals.

        Args:
            pos (torch.tensor): positions of the walkers (Nbatch, Nelec x Ndim)

        Returns:
            torch.tensor, torch.tensor: positions of the electrons wrt the basis functions
                                        (Nbatch, Nelec, Norb, Ndim)
                                        distances between the electrons and the basis functions
                                        (Nbatch, Nelec, Norb)
        """
        # expand the atomic positions to one entry per basis function
        self.bas_coords = self.atom_coords.repeat_interleave(
            self.nshells, dim=0)

        # electron-basis displacement vectors
        xyz = (pos.view(-1, self.nelec, 1, self.ndim) -
               self.bas_coords[None, ...])

        # electron-basis distances
        r = torch.sqrt((xyz*xyz).sum(3))

        return xyz, r

As seen above, we first expand the atomic positions to basis positions and then compute the distances. We could instead first compute the distances between electrons and atoms and then expand to the basis functions (see the sketch below).
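
A sketch of the reordered computation: the displacement and the distance are evaluated per atom (natom entries instead of Norb) and only then expanded to the basis functions. The method below is meant as a drop-in rewrite of _process_position and assumes the same attributes (atom_coords, nshells, nelec, ndim).

    import torch

    def _process_position_fast(self, pos):
        """Compute electron-atom displacements/distances first, then expand to the basis."""
        xyz_atom = (pos.view(-1, self.nelec, 1, self.ndim)
                    - self.atom_coords[None, ...])            # (nbatch, nelec, natom, ndim)
        r_atom = torch.sqrt((xyz_atom * xyz_atom).sum(3))     # (nbatch, nelec, natom)

        # expand from atoms to basis functions only after the distance is computed
        xyz = xyz_atom.repeat_interleave(self.nshells, dim=2)
        r = r_atom.repeat_interleave(self.nshells, dim=2)
        return xyz, r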

Remove unused imports

There are many imported modules in the library that are not used. We need to remove them.

Logger

Implement logging of the principal routines
