graddft's Issues

There are still checkpoints and other files in the repo

Running du -h . in the root of this repo returns:

 32K	./grad_dft/interface
 28K	./grad_dft/utils
1.5M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21mu/variables
1.7M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21mu
1.5M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21mc/variables
1.7M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21mc
1.5M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21m/variables
1.7M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21m
1.5M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21/variables
1.7M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints/DM21
6.7M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21/checkpoints
6.8M	./grad_dft/external/density_functional_approximation_dm21/density_functional_approximation_dm21
 16K	./grad_dft/external/density_functional_approximation_dm21/cc
4.0K	./grad_dft/external/density_functional_approximation_dm21/.vscode
6.9M	./grad_dft/external/density_functional_approximation_dm21
6.9M	./grad_dft/external
7.2M	./grad_dft
 16K	./tests/unit
 48K	./tests/integration
 64K	./tests
1.5M	./models/DM21_model/variables
1.7M	./models/DM21_model
1.7M	./models
 64K	./image/README
 64K	./image
 20K	./examples/intermediate_examples
 48K	./examples/article_experiments
 32K	./examples/basic_examples
 28K	./examples/advanced_examples
344K	./examples
8.0K	./.github/workflows
8.0K	./.github
161M	./.git/objects/pack
  0B	./.git/objects/info
161M	./.git/objects
4.0K	./.git/info
4.0K	./.git/logs/refs/heads
4.0K	./.git/logs/refs/remotes/origin
4.0K	./.git/logs/refs/remotes
8.0K	./.git/logs/refs
 12K	./.git/logs
 60K	./.git/hooks
4.0K	./.git/refs/heads
  0B	./.git/refs/tags
4.0K	./.git/refs/remotes/origin
4.0K	./.git/refs/remotes
8.0K	./.git/refs
161M	./.git
 48K	./data/raw/dissociation
 84K	./data/raw
 84K	./data
171M	.

The question, @PabloAMC, is: do we need the checkpoints in the DM21 external folder? We certainly don't need the huge .pack objects in .git; I think these were committed by mistake. Typically, such files would be listed in .gitignore.

Testing the Non-XC part of the total energy

In #17, one of the points raised was about the correctness of the non-XC terms in the total energy.

Mostly, I was concerned with the inclusion of two-body terms, because these do not exist in DFT (outside the scope of the XC functional, at least). However, it appears that removing these terms makes the energy completely wrong, so I think my concern stemmed from a misnomer in the language used by PySCF rather than from any actual numerics being incorrect.

To make entirely sure that we are doing things correctly here, I recommend implementing tests for the non-XC total energy by checking that the dissociation curves of 3 diatomic molecules match the results from PySCF. Runs with no XC functional can be performed in PySCF using a dummy functional:

import numpy as np
from pyscf import gto, dft

mol = gto.M(
    atom = '''
    Li  0.   -0.37   0.
    F   0.    0.98   0.
    ''',
    basis = 'ccpvdz')

def zero_xc(xc_code, rho, spin=0, relativity=0, deriv=1, omega=None, verbose=None):
    # A fictitious XC functional that returns zero energy and potentials everywhere
    rho0, dx, dy, dz = rho[:4]
    vlapl = None
    vtau = None
    fxc = None
    kxc = None
    vgamma = np.zeros(shape=rho0.shape)
    vrho = np.zeros(shape=rho0.shape)
    exc = np.zeros(shape=rho0.shape)
    vxc = (vrho, vgamma, vlapl, vtau)
    return exc, vxc, fxc, kxc

mf = dft.RKS(mol)
mf = mf.define_xc_(zero_xc, 'GGA')
truth = mf.kernel(max_cycle=500)

We can then check the error (in the total energy) of our code by running:

HF_molecule = molecule_from_pyscf(mf, scf_iteration=500)
print(HF_molecule.nonXC() - truth)

A good benchmark for judging this error is chemical accuracy (1 kcal/mol ≈ 1.6 mHa), but in practice I hope to see roughly 100x less error than that.
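
A test along these lines might look like the following sketch, where the bond-length grid, the LiF example, and the 1e-5 Ha threshold are placeholders, the import path for molecule_from_pyscf is assumed, and zero_xc is the dummy functional defined above:

import numpy as np
from pyscf import gto, dft
from grad_dft import molecule_from_pyscf  # import path assumed

def test_non_xc_energy_matches_pyscf():
    # Placeholder bond-length grid (Angstrom) along the LiF dissociation curve
    for r in np.linspace(1.2, 3.0, 5):
        mol = gto.M(atom=f"Li 0. 0. 0.; F 0. 0. {r}", basis="ccpvdz")
        mf = dft.RKS(mol)
        mf = mf.define_xc_(zero_xc, "GGA")  # dummy zero functional from above
        truth = mf.kernel(max_cycle=500)
        molecule = molecule_from_pyscf(mf, scf_iteration=500)
        # Target: errors well below chemical accuracy (~1.6 mHa)
        assert abs(molecule.nonXC() - truth) < 1e-5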

Simplify README.md

Presently, the README.md contains too much mathematics. It should just show the simplest way to use the code, along with install instructions.

Adding install instructions

There is no requirements.txt in the repo root (~), nor any install instructions in README.md.

In further issues, we should also consider/discuss adding a setup.py (a rough sketch is below) and the other steps needed to make this package installable through PyPI.
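
For discussion, a minimal sketch of what such a setup.py might look like; the package name, version, and dependency list below are assumptions, not decisions:

from setuptools import setup, find_packages

setup(
    name="grad_dft",                        # assumed package name
    version="0.1.0",                        # placeholder version
    packages=find_packages(exclude=["tests", "examples"]),
    install_requires=["jax", "pyscf"],      # to be filled in from requirements.txt
    python_requires=">=3.9",
)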

Implement new loss functions in `train.py`

Presently, we have the "energy-only" loss in functional.py, currently named default_loss. We should move it into train.py and rename it energy_loss. In total, we should have 3 loss functions (a rough sketch of all three is given at the end of this issue):

  1. energy_loss: can be used with both self-consistent and non-self-consistent training.
  2. density_loss: only for use in self-consistent training (and maybe Harris-Foulkes training, if we implement it). Anything to do with the density in non-self-consistent training doesn't make sense, as we never update the 1-RDM, so the predicted density is static during training.
  3. energy_and_density_loss: for use in self-consistent training only, for the same reasons as 2.

There are some more considerations here as to what to do with the stop_gradient on the non-XC energy when using the different loss functions, but this will be handled in another issue.

We also have the "implied self consistency" approach I recommended some time ago. I've attached the whiteboard scrawling for this below:

[Attached: whiteboard photo, PXL_20230823_211058359.MP]

Long story short, self-consistency can be bypassed during training if we calculate the total energy by passing it a true many-body density. You then match the predicted density one step forward in an SCF cycle to the true density (the second term in the equation). This loss has the same minimum as fully self-consistent training.
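
For concreteness, here is a minimal sketch of the three proposed losses. The predict(params, molecule) -> (energy, rdm1) signature and the molecule.energy / molecule.rdm1 ground-truth fields are hypothetical stand-ins for whatever train.py settles on:

import jax.numpy as jnp

def energy_loss(params, predict, molecule):
    # Usable in both self-consistent and non-self-consistent training
    energy, _ = predict(params, molecule)
    return (energy - molecule.energy) ** 2

def density_loss(params, predict, molecule):
    # Only meaningful when the predicted 1-RDM is updated during training
    _, rdm1 = predict(params, molecule)
    return jnp.mean((rdm1 - molecule.rdm1) ** 2)

def energy_and_density_loss(params, predict, molecule, alpha=0.5):
    # alpha is an assumed weighting between the two terms
    e = energy_loss(params, predict, molecule)
    d = density_loss(params, predict, molecule)
    return alpha * e + (1 - alpha) * d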

Stable `eigh` for degenerate eigenvalues

When implementing self-consistent training methods, I thought I had fixed the NaN-gradients problem, but in reality, I only partly fixed it.

The core issue is that the reverse-mode Jacobian is undefined for degenerate eigenproblems: the eigenvector term of the gradient contains a factor $1/(\lambda_i - \lambda_j)$ for eigenvalues $\lambda$, which diverges when $\lambda_i = \lambda_j$.

I have implemented a "safe version" of the reverse-mode Jacobian which uses the Lorentzian broadening approach to mitigate this. I used the implementation suggested here, slightly modified for symmetric matrix inputs.

Broadening is applied only when eigenvalue differences fall below a user-set tolerance, so this method produces the same gradients as jnp.linalg.eigh for non-degenerate problems.
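
To illustrate the trick (a sketch of the broadening idea, not the repo's actual safe_eigh), the factor $1/(\lambda_j - \lambda_i)$ in the pullback is replaced by the Lorentzian-broadened $(\lambda_j - \lambda_i)/((\lambda_j - \lambda_i)^2 + \epsilon)$ only for (near-)degenerate pairs:

import jax.numpy as jnp

def safe_f_matrix(eigvals, tol=1e-8, eps=1e-12):
    # Off-diagonal factor F_ij ~ 1/(lam_j - lam_i) used in the eigh pullback
    diff = eigvals[None, :] - eigvals[:, None]   # diagonal is exactly zero
    near = jnp.abs(diff) < tol                   # (near-)degenerate pairs
    # Double-where pattern so neither branch yields NaN/inf under autodiff
    denom = jnp.where(near, 1.0, diff)
    broadened = diff / (diff ** 2 + eps)         # Lorentzian: 1/x -> x/(x^2 + eps)
    return jnp.where(near, broadened, 1.0 / denom)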

Design a pretty icon for the repo README.md

Discuss ideas for this.

I also think we should include a gif animation. Something like a heatmap or isosurface of the charge density evolving during training of a neural functional would be pretty neat.

We should come back to this once higher-priority issues are dealt with.

Make an example notebook displaying the different kind of loss functions

I want a notebook where we test using cost functions that:

(1) Train only the energy in a non-self-consistent way

(2) Train only the energy in a self-consistent way

(3) Train the energy and density in a way which enforces self-consistency

The example will train on the binding curve of H2 and test on the binding curve of LiH. I also wish to generate a gif showing the reduction of error in the LiH density at some bond length as a function of training epochs on H2. This would make a cool .gif animation for the repo README.md.

Fix unstable B88 tests

The B88 tests sometimes pass and sometimes fail. Will try fixing the grid level and reducing the tolerance.

Numerical stability during autodifferentiation of eigh

In #17 we discussed that it would be quite important to implement the training self-consistently. Unfortunately, I was having lots of NaN-related issues when trying to differentiate through the self-consistent loop. With a simple example, I have been able to narrow the problem down to an external implementation of the generalized eigenvalue problem, provided in https://github.com/XanaduAI/DiffDFT/blob/main/grad_dft/external/eigh_impl.py and originally reported in https://gist.github.com/jackd/99e012090a56637b8dd8bb037374900e. The author also has some notes in https://jackd.github.io/posts/generalized-eig-jvp/.
The problem is that jax.scipy.linalg.eigh only supports the standard problem (b = None), even though this is not stated in the docs: https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.eigh.html.
Substituting the line https://github.com/XanaduAI/DiffDFT/blob/da7d384fcb1bae547fa9370ae7846bf939a6d580/grad_dft/evaluate.py#L544 with anything else removes the error. For example, even though it is nonsense, I can do

mo_energy = fock[:,:,0]
mo_coeff = fock + fock**2

which preserves the right matrix shapes and retains the dependency on the Fock matrix (so gradients are not 0). Then, for a very simple LDA model with a single layer, the gradients all look good even when the number of SCF iterations is 35, though it breaks somewhere between 35 and 40 SCF iterations.
I think solving this should be a high-priority problem now.
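
For reference, one standard workaround (a sketch, not necessarily what we should adopt) is to reduce the generalized problem $FC = SC\epsilon$ to a standard symmetric one via a Cholesky factorization of the overlap matrix $S = LL^T$, so jnp.linalg.eigh can be differentiated directly. Note that the degenerate-eigenvalue gradient issue discussed in other issues still applies:

import jax.numpy as jnp
import jax.scipy.linalg as jsl

def generalized_eigh(fock, overlap):
    # Overlap Cholesky: S = L L^T
    chol = jnp.linalg.cholesky(overlap)
    # Transform to a standard symmetric problem: A = L^-1 F L^-T
    linv_f = jsl.solve_triangular(chol, fock, lower=True)
    a = jsl.solve_triangular(chol, linv_f.T, lower=True).T
    mo_energy, y = jnp.linalg.eigh(a)
    # Back-transform the eigenvectors: C = L^-T Y
    mo_coeff = jsl.solve_triangular(chol.T, y, lower=False)
    return mo_energy, mo_coeff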

Implement a limited number of high priority unit tests

There are presently no unit tests at all in the repo. It is very likely that we will not have time to achieve high test coverage, but we should identify some key areas to test.

As I work through the codebase internals, it will become clearer which parts these are, and we can discuss them below.

The return of the NaN

I've noticed that, for a few basis sets, NaN gradients are appearing again when training with the DIIS SCF loop, but not with the linear-mixing loop.

I think this is likely because degenerate eigenvectors/eigenvalues are being encountered in the calls to the jnp.linalg.eigh routines. We can try switching these out for our custom safe_eigh, which will hopefully fix the bug.

Make linear mixing SCF code Jittable

Much like the jittable DIIS code, it would be nice to have the linear-mixing code jittable too. Linear mixing, while slower, is typically less prone to instability, so it's good to have in our toolbox.
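
A minimal sketch of what a jittable linear-mixing driver could look like, using jax.lax.fori_loop; make_fock and rdm1_from_fock are hypothetical stand-ins for the real Grad-DFT routines:

import jax

def make_linear_mixing_scf(make_fock, rdm1_from_fock, alpha=0.3, n_iters=50):
    @jax.jit
    def scf(rdm1_init):
        def body(_, rdm1_old):
            fock = make_fock(rdm1_old)       # build the Fock matrix from the current 1-RDM
            rdm1_new = rdm1_from_fock(fock)  # diagonalize and rebuild the 1-RDM
            # Linear mixing: rho_out = (1 - alpha) * rho_old + alpha * rho_new
            return (1.0 - alpha) * rdm1_old + alpha * rdm1_new
        return jax.lax.fori_loop(0, n_iters, body, rdm1_init)
    return scf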

Reduce CI costs

macOS tests in the CI are expensive. Let's test only on Ubuntu and change the Actions trigger to pull requests rather than pushes.

Add license and licensing strings to python modules

Proceeding with Apache 2.0, we should include this license in the root directory and place the strings below in the header of all source files:

# Copyright 2023 Xanadu Quantum Technologies Inc.

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at

#     http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Formatting and linting

Xanadu's open-sourcing policy requires that we lint the code. I'm a fan of Black, so unless anybody is opposed, I will use this tool.

I will also add a pre-commit hook:

#!/bin/sh
# Format every staged Python file with Black, then re-stage it

set -e

git diff --staged --name-only --diff-filter=d -- "*.py" | while IFS= read -r file; do
  black "$file"
  git add "$file"
done

This ensures that new commits are always formatted.

Give credit to PySCF in accordance with its licensing

PySCF is provided under an Apache 2.0 license, and we have translated a large portion of it into JAX. We should therefore acknowledge this in accordance with the software license wherever needed.

Make a differentiable SCF procedure

Presently, a single iteration of the SCF procedure is fully differentiable and produces stable gradients.

However, when we update the relevant parameters in a new iteration of the loop using the logic of some SCF procedure (like DIIS, for example), gradient computations fail, as these updates are not presently fully differentiable in the code.

DIIS is a fairly advanced SCF iterator as they go, so we can try something much simpler, like linear mixing. I.e., update the charge density (equivalently, the 1-RDM) each iteration as:

$$ \rho_{out} = (1 - \alpha) \rho_{old} + \alpha \rho_{new} $$

where we pick some mixing parameter $0 \leq \alpha \leq 1$.

This is arguably the most stable and reliable (but slow) way of performing the SCF procedure.

Correctness checks

Grad-DFT achieves something fairly complex. That, alongside the fact that (i) development occurred without any unit testing and (ii) it has not been exposed to a large number of users, means that the probability that things (perhaps even important things) are wrong is quite high.

In this issue, I will post comments/questions about things I either don't understand or believe are incorrect. If we agree that something is indeed wrong, it will be raised in a separate issue and corrected.

Fix failing examples

A number of the examples in the ~/examples directory do not finish without errors.

More details will be given below on a module by module basis.

The repo is large and contains things that should probably live elsewhere

Running

du -h ~/.

shows:

533M	./DiffDFT

This is very large for a code repo. Tracking down the source of the data, it seems most of it is in:

~/checkpoint_dimers

and

~/ckpt_dimers

I assume these are the raw results of experiments. They should be moved elsewhere (Zenodo, once the paper is released), as this repo is for the code.

Implement the Harris-Foulkes energy

In #17, a concern was raised about the use of self-consistency in training. While I think we should still look into implementing this, a good workaround is to use the Harris-Foulkes functional instead of the energy functional.

When self-consistency has not been reached, this gives a much better estimate of the electronic energy than the DFT energy functional does.

I cannot see whether this is implemented in PySCF, though, so it could take a little more thought.
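
For reference, the textbook form of the Harris-Foulkes estimate evaluated at an input density $\rho_{in}$ (omitting the nuclear repulsion term; this is standard theory, not a quote from any codebase) is:

$$ E_{Harris}[\rho_{in}] = \sum_i \epsilon_i - E_H[\rho_{in}] + E_{xc}[\rho_{in}] - \int v_{xc}[\rho_{in}](\mathbf{r}) \, \rho_{in}(\mathbf{r}) \, d\mathbf{r} $$

where the $\epsilon_i$ are the occupied eigenvalues of the Kohn-Sham Hamiltonian built from $\rho_{in}$ and $E_H$ is the Hartree energy. Because the estimate is stationary to first order in the density error, it is well suited to non-self-consistent densities.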

Demonstrate parallel execution of a loss function

On an HPC cluster, each term in a mean-square loss can be calculated in an embarrassingly parallel fashion.

Unfortunately, the native way of doing this with JAX (using jax.vmap and jax.pmap) is not compatible with the input we must parallelize over: the Molecule object. This is because its data is stored in a "ragged" structure, i.e., the dimensions of the grid for one molecule very often differ from those for another, and likewise for the 1-RDM, so jnp.array([rdm1_1, rdm1_2]) will not work.

This means that for loss parallelism, we need to think differently. Sharding may be the way forward, but this requires more thought. A good reference is here: https://jax.readthedocs.io/en/latest/notebooks/Distributed_arrays_and_automatic_parallelization.html

I don't think we will get around to solving this problem before our release deadline, but if we want to do something with HPC, getting this right is non-negotiable.
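
Until then, one workaround worth prototyping (a sketch of my own suggestion, not an existing Grad-DFT API) is to pad the ragged per-molecule arrays to a common shape with a validity mask so that jax.vmap applies; whether the padding overhead is acceptable for realistic grids is untested:

import jax
import jax.numpy as jnp

def pad_1d(arr, size):
    # Zero-pad a per-grid-point array up to a common length
    return jnp.pad(arr, (0, size - arr.shape[0]))

def batch_grids(weights):
    # weights: list of ragged 1D arrays (e.g. grid weights), one per molecule
    n = max(w.shape[0] for w in weights)
    batch = jnp.stack([pad_1d(w, n) for w in weights])
    mask = jnp.stack([pad_1d(jnp.ones_like(w), n) for w in weights])
    return batch, mask  # padded points carry mask == 0 and drop out of sums

# Per-molecule reductions then vmap cleanly, e.g.:
# integrals = jax.vmap(lambda w, m: jnp.sum(w * m))(batch, mask)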

Implementing a documentation method

While I can see a few examples for using the code, there is no formal ~/docs folder.

What we use for this will depend on need, but I am most familiar with setting this up using Sphinx. We could then consider hosting the Sphinx docs at the github.io address associated with the future public repository.
