
horton's Introduction

HORTON: Helpful Open-source Research TOol for N-fermion systems. Copyright (C) 2011-2016 The HORTON Development Team

For more information, visit HORTON's website: http://theochem.github.io/horton/latest

HORTON 3 info:

HORTON 3 is the modular rewrite of HORTON. It is split into several repositories and will not be hosted here. The table below contains information on the progress of various modules.

module name         location                                             code  package  doc
gbasis              https://github.com/theochem/gbasis                   x     x
iodata              https://github.com/theochem/iodata                   x     x        x
grids + cell (old)  https://github.com/theochem/old_grids                x     x
grids (new)         https://github.com/theochem/qcgrids
cell (new)          https://github.com/theochem/cellcutoff               x     x
meanfield           https://github.com/theochem/meanfield                x     x
porcelain           https://github.com/QuantumElephant/horton-porcelain  RFC!

Acknowledgements

This software was developed using funding from a variety of international sources including, but not limited to: Canarie, the Canada Research Chairs, Compute Canada, the European Union's Horizon 2020 Marie Sklodowska-Curie Actions (Individual Fellowship No 800130), the Foundation of Scientific Research--Flanders (FWO), McMaster University, the National Fund for Scientific and Technological Development of Chile (FONDECYT), the Natural Sciences and Engineering Research Council of Canada (NSERC), the Research Board of Ghent University (BOF), and Sharcnet.

horton's People

Contributors

alimalek2000, cestdiego, crisely09, farnazh, kimt33, kzinovjev, matt-chan, paulwayers, ptecmer, sfias, susilehtola, tczorro, tovrstra


horton's Issues

Horton ERIs differ from PySCF in cc-pvtz

For the following test system, the HORTON integrals differ from PySCF:

from horton import *
import numpy as np

from horton.pyscf_wrapper import gobasis


# Construct a test system: a single atom at the origin
# (note: the title says 'dinitrogen', but the system defined below is one
# helium atom)
mol = IOData(title='dinitrogen')
mol.coordinates = np.array([[0.0, 0.0, 0.0]])
mol.numbers = np.array([2])

# Create the same Gaussian basis set through HORTON and through the PySCF
# wrapper
obasis = get_gobasis(mol.coordinates, mol.numbers, 'cc-pvtz')
obasis2 = gobasis(mol.coordinates, mol.numbers, 'cc-pvtz')

# Create a linalg factory
lf = DenseLinalgFactory(obasis.nbasis)

# Compute the electron repulsion integrals with both backends
eri = obasis.compute_electron_repulsion(lf)
print("=" * 50)
print("HORTON ERI")
print(eri._array)
print("=" * 50)
print("PYSCF ERI")
eri2 = obasis2.compute_electron_repulsion(lf)
# Transpose PySCF's index order once to match HORTON's convention
eri2._array = eri2._array.transpose([0, 2, 1, 3])
print(eri2._array)
print("=" * 50)
print(eri._array - eri2._array)
print(np.allclose(eri._array, eri2._array))

Parsing coverage output is not robust

The format of the screen output depends on the version of Coverage.py, so we'd better not rely on it. The XML output is not great either: it only provides line-by-line information, with far too much detail. It is probably also possible to interact directly with the coverage API to extract the information of interest.
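The API route could look roughly like this; a minimal sketch assuming the `coverage` package is importable (method names per Coverage.py's public API):

```python
# Sketch: query Coverage.py's API directly instead of parsing its output.
import coverage


def example():
    # Placeholder for the code under test.
    return sum(range(10))


cov = coverage.Coverage()
cov.start()
example()  # any Python executed between start() and stop() is measured
cov.stop()
cov.save()

# Extract exactly the information we want, independent of the output format.
data = cov.get_data()
for filename in sorted(data.measured_files()):
    executed = data.lines(filename) or []
    print("%s: %d executed lines" % (filename, len(executed)))
```

This stays stable across Coverage.py versions as long as the public API does, and we only pull out the numbers we actually care about.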

Write a regression tester (several unit tests are too slow and ineffective)

A regression tester, e.g. similar to what is used in the CP2K project, works as follows:

  1. A set of tests is defined that mimic typical user behavior. This is different from unit tests, where one tries to focus on one small piece of code at a time. In a regression tester, one test may (or should) touch many different parts of the code.

  2. Each test generates some output that is easily processed, i.e. a set of numbers that can be compared easily to previous runs. For each number, the test should also define a threshold for acceptable deviations from previous runs, to account for slight variations between CPU architectures, randomized integration grids, etc. JSON or another text-based format should be used to facilitate processing of the output. (A text-based format is needed because of the following point.)

  3. When a regression test is written, it must be executed by the author and the original result is also committed to the repository. Ideally, the reference outputs should contain one number per line to make diff stats easy to follow.

  4. The regression tests are executed on different architectures on a daily basis. Deviations from the original output should somehow become available to the relevant people, e.g. by e-mail or through a web interface. See for example: https://dashboard.cp2k.org/

  5. When a regression test suddenly gives a different result, the author should determine its cause: (a) a bug was introduced or (b) a bug was fixed. In case (a) the bug must be fixed and a unit test should be added to avoid similar bugs in future. In case (b), the new output of the regression test should be committed.

In the above scheme, point 3 is done differently in CP2K: each test node keeps its own set of reference values and they are not committed to the repository. Instead, a TEST_FILES_RESET file is committed with the names of all the tests that are affected by a commit. The regression tester on each test node can figure out from the revision history of TEST_FILES_RESET when reference outputs should be updated. This is not so nice because it allows for large differences between different architectures.

It seems logical to turn every example (in data/examples) into a regression test and vice versa: make every regression test an example. The run time of one test should be limited to one minute.

Once this system is in place, a lot of unit tests should be turned into regression tests. For example, a lot of HF/DFT and horton/part tests are in fact regression tests in a unit testing framework. It is a poor match, bound to cause trouble when refactoring code.

For a recent example of a regression test causing troubles in our unit tests: see PR #29.
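The comparison step in point 2 could be sketched as follows. The file layout and key names here are hypothetical; the actual format is still to be decided:

```python
import json


def compare_results(ref_path, new_path):
    """Compare new regression-test output against the committed reference.

    Hypothetical format: the reference maps each quantity to
    {"value": float, "threshold": float}; the new output maps each
    quantity to a plain float. Returns the names of quantities whose
    deviation exceeds the threshold.
    """
    with open(ref_path) as f:
        ref = json.load(f)
    with open(new_path) as f:
        new = json.load(f)
    failures = []
    for name, entry in sorted(ref.items()):
        deviation = abs(new[name] - entry["value"])
        if deviation > entry["threshold"]:
            failures.append(name)
    return failures
```

With per-number thresholds stored next to the reference values, each test documents how reproducible its own output is expected to be.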

Move QA scripts to separate repository

We should make the QA framework reusable in other projects. The current scripts are only loosely integrated with HORTON, so they can be factored out relatively easily. This would also improve the quality of the QA scripts, because they would get used in different scenarios.

Integrating or updating the QA scripts in a new project should work with a bootstrap script, e.g. as follows:

wget https://raw.githubusercontent.com/foo/qascripts/bootstrap.sh && bash ./bootstrap.sh

The other option is that we work with a git submodule: https://git-scm.com/book/en/v2/Git-Tools-Submodules. I'm generally not a fan of submodules but this may be a good fit because every project then has the freedom to make local modifications and rebase these regularly on the developments in the main qascripts repo. (That is different from a conventional dependency where all changes are done upstream.)

The following points should then also be addressed:

  • Make some scripts more flexible, e.g. to run only a selection of trapdoor scripts.
  • Rewrite some or even all shell scripts (e.g. test_all_*.sh) in Python to keep them maintainable while adding extra features.
  • Make a trapdoor.cfg.example and add trapdoor.cfg to .gitignore.
  • Make better use of dependencies.txt when installing/updating dependencies.
  • Make a standard module for reading and using dependencies.txt, e.g. including a check for duplicate entries.
  • Break up the checklist into different sections that can be easily referred to from other projects.
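The dependencies.txt module mentioned above could start as a minimal sketch like this (the file format is an assumption: one entry per line, '#' starts a comment):

```python
def read_dependencies(path):
    """Read a dependencies.txt-style file and check for duplicate entries.

    Assumed format: one dependency per line, '#' starts a comment, blank
    lines are ignored. Raises ValueError on duplicate package names.
    """
    seen = set()
    deps = []
    with open(path) as f:
        for line in f:
            entry = line.split('#', 1)[0].strip()
            if not entry:
                continue
            name = entry.split()[0].lower()
            if name in seen:
                raise ValueError("Duplicate dependency: %s" % name)
            seen.add(name)
            deps.append(entry)
    return deps
```

Centralizing the parsing in one module keeps the install, update and QA scripts in agreement about what the file means.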

Drop support for older distros (Ubuntu 12.04 and Fedora 21)

I've tried to install on Ubuntu 12.04, but it is a nightmare to get it working. Given that it is nearing its end of life, it makes little sense to keep supporting it with draconian instructions for backporting recent packages.

I have not tried to install on Fedora 21, but given that it is past its end of life, we can safely drop it and stop worrying.

Trapdoor scripts are too verbose

(This point has been brought up before by several people.) When a few lines are added or removed near the beginning of a big file, the trapdoor QA output often reports many "new" messages that are not really new: only their line numbers have changed. That is becoming a bigger nuisance than I originally thought, so I'd like to have it fixed.
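One way to fix this is to match messages on file and message text with multiplicity, ignoring line numbers entirely. A sketch (the message tuple layout is illustrative, not the actual trapdoor format):

```python
from collections import Counter


def truly_new_messages(old, new):
    """Return the messages in `new` that are not in `old`, ignoring
    line-number shifts.

    Each message is a (filename, lineno, text) tuple. Matching is done on
    (filename, text) with multiplicity, so a message that merely moved to
    another line is not reported as new.
    """
    budget = Counter((fn, text) for fn, _, text in old)
    result = []
    for fn, lineno, text in new:
        if budget[(fn, text)] > 0:
            budget[(fn, text)] -= 1
        else:
            result.append((fn, lineno, text))
    return result
```

The multiplicity bookkeeping matters: if a file legitimately gains a second copy of an existing message, that extra copy is still flagged.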

Tests using random numbers sometimes fail

This is mostly noticeable with the convoluted slow tests, and it only happens rarely. For example, when an initial guess for a WFN is random, it sometimes does not converge. All tests and examples should use fixed seeds for the random number generators to avoid this issue. (That said, random guesses are probably not the best examples.)

A context manager can be used to fix such problems relatively easily. See for example:

https://github.com/QuantumElephant/romin/blob/master/romin/test/random_seed.py
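The idea behind such a context manager, shown here with the standard library's random module (the linked romin example does the same for NumPy's global state):

```python
import contextlib
import random


@contextlib.contextmanager
def fixed_seed(seed):
    """Run a block with a fixed random seed, then restore the previous
    RNG state so the rest of the process is unaffected."""
    state = random.getstate()
    random.seed(seed)
    try:
        yield
    finally:
        random.setstate(state)


# Two runs with the same seed produce identical "random" numbers:
with fixed_seed(42):
    a = random.random()
with fixed_seed(42):
    b = random.random()
assert a == b
```

Because the previous state is restored in a finally block, tests that use this manager cannot leak a fixed seed into unrelated tests, even when they fail.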

Add setup.cfg files for Fedora 22, 23

For Fedora 22 and 23, these are just the same as for 21: data/setup_cfgs/setup.Linux-Fedora-21*. They probably also work for Fedora 24, but that should be tested.

Fix a few minor issues in git documentation

  • Refer to general download and install page for the installation of git and other development dependencies.
  • Put a copy of the pre-commit hook in tools and make it executable. Make a literalinclude of that file in the docs and provide the command cp tools/pre-commit .git/hooks to install the hook.

Additional Gaussian integrals

We should add integrals for forces and multipole moments. I believe the code is already lingering in some branches; it just needs to be dusted off a little.

Check for namespace collisions between different submodules

This can be tested with a trapdoor script: the namespace of each module must be imported separately and must be checked for overlaps. It would also be good to check that numpy and a few other popular packages do not end up in the HORTON namespace.
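A sketch of such a check using importlib (the module names passed in are placeholders; a real trapdoor script would list the HORTON submodules):

```python
import importlib


def find_collisions(module_names):
    """Import each module separately and report public names that are
    exported by more than one of them.

    Returns a list of (name, first_module, second_module) tuples. The same
    walk can also flag leaked third-party names (e.g. 'numpy' or 'np'
    appearing in a module's namespace).
    """
    owners = {}
    collisions = []
    for modname in module_names:
        mod = importlib.import_module(modname)
        for name in dir(mod):
            if name.startswith('_'):
                continue
            if name in owners and owners[name] != modname:
                collisions.append((name, owners[name], modname))
            else:
                owners.setdefault(name, modname)
    return collisions
```

Run against stdlib examples, find_collisions(['math', 'cmath']) reports shared names such as pi and sqrt, which is exactly the kind of overlap the trapdoor script would flag between HORTON submodules.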

Logging needs to be cleaned up and made consistent

The timer currently lives as long as the Python interpreter lives.

It would be useful to be able to start/stop/reset the timer, in case someone wants to use HORTON as part of a larger script, or run multiple jobs in one interpreter.
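What I have in mind, roughly (a hypothetical API; HORTON's actual timer object is organized differently):

```python
import time


class ResettableTimer(object):
    """A per-label timer that can be started, stopped and reset, so that
    multiple jobs can run in one interpreter without accumulating state."""

    def __init__(self):
        self.reset()

    def reset(self):
        """Discard all accumulated timings."""
        self.totals = {}
        self._running = {}

    def start(self, label):
        self._running[label] = time.time()

    def stop(self, label):
        elapsed = time.time() - self._running.pop(label)
        self.totals[label] = self.totals.get(label, 0.0) + elapsed


# Example: time one job, then reset before the next job starts.
timer = ResettableTimer()
timer.start('scf')
# ... run the job ...
timer.stop('scf')
timer.reset()
```

The key point is that reset() gives each job a clean slate, instead of timings silently accumulating across jobs for the lifetime of the interpreter.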

Typical error in unit tests

I just fixed one occurrence of the following:

assert abs(obasis.con_coeffs - np.array([0.15432897, 0.53532814, 0.44463454])).all()

It is missing a .max(). As written, the check is truthy whenever all the differences are nonzero, so it passes precisely when every element mismatches (and it even fails when the arrays are identical). It is better to use numpy.testing instead. In this case:

np.testing.assert_almost_equal(obasis.con_coeffs, [0.15432897, 0.53532814, 0.44463454])

We should at least fix all the cases where .max() was forgotten.
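To see why the buggy pattern is dangerous, compare both checks on deliberately mismatched arrays (requires numpy):

```python
import numpy as np

coeffs = np.array([0.15432897, 0.53532814, 0.44463454])
wrong = coeffs + 1.0  # deliberately mismatched values

# Buggy pattern: every difference is nonzero, hence truthy, so .all() is
# True and the assertion passes despite the mismatch.
assert abs(wrong - coeffs).all()

# Correct pattern: numpy.testing raises AssertionError on mismatch.
try:
    np.testing.assert_almost_equal(wrong, coeffs)
except AssertionError:
    print("mismatch detected")
```

A grep for "assert abs(" followed by ").all()" in the test suite should find the remaining occurrences.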

Numerical Poisson solver gives wrong result at the origin

The Poisson solver by itself works fine but the generated cubic splines make the wrong extrapolation for very small radii. They just return zero for r=0, which is wrong for l=0. A related problem is that for l>0, the IntGrid.eval_decomposition method returns NaN, which is not the correct answer in most cases.

Last steps (after fixing all other issues) for 2.0.1 release

Can be done in PR:

  • Update version number
  • Write release notes
  • Update year in copyright statements
  • Make docs available for different versions
  • Add Stijn to the list of authors in the HORTON citation

After PR gets merged:

  • Upload tar.gz file to github
  • Update website

Clean code using pycharm introspection

Lots of warnings from inspection:

  • unused locals
  • unused references
  • PEP8 violations
  • etc...

The Pycharm scope I currently use is:
(file[horton]:horton//*&&!file[horton]:horton/correlatedwfn/cached//*||file[horton]:scripts//*||file[horton]:tools//*||file[horton]:data//*)&&!file:*.pyx&&!file:*.cfg&&!file:cext.*&&!file:*.fchk&&!file:*.pxd

Add pylint exception to 'from horton import *' in tests

This is the proper way to import horton in tests because it mimics the typical usage pattern. To keep pylint happy, we should do it as follows:

from horton import *  # pylint: disable=wildcard-import,unused-wildcard-import

Evaluation of (XC) kernels

I have an extension of the meanfield code that allows one to compute the dot product of a kernel of (a part of) an effective Hamiltonian with a given (symmetric) two-index object. Needs to be rebased, cleaned and what not.

Fix a few more fedora 24 issues

  • Python 2.X package names have been renamed from python-* to python2-*
  • Fedora 24 has sufficiently recent packages for sphinx, sphinx_rtd_theme and breathe, so we don't need pip anymore for that part.
