qcengine's Introduction

QCEngine

A quantum chemistry program executor and I/O standardizer (QCSchema).

Example

A simple example of QCEngine's capabilities is as follows:

>>> import qcengine as qcng
>>> import qcelemental as qcel

>>> mol = qcel.models.Molecule.from_data("""
O  0.0  0.000  -0.129
H  0.0 -1.494  1.027
H  0.0  1.494  1.027
""")

>>> inp = qcel.models.AtomicInput(
    molecule=mol,
    driver="energy",
    model={"method": "SCF", "basis": "sto-3g"},
    keywords={"scf_type": "df"}
    )

These input specifications can be executed with the compute function along with a program specifier:

>>> ret = qcng.compute(inp, "psi4")

The results contain a complete record of the computation:

>>> ret.return_result
-74.45994963230625

>>> ret.properties.scf_dipole_moment
[0.0, 0.0, 0.6635967188869244]

>>> ret.provenance.cpu
Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz

See the documentation for more information.

License

BSD-3C. See the License File for more information.

qcengine's People

Contributors

ahurta92, anabiman, awvwgk, baritone0211, bdice, coltonbh, dgasmith, dotsdl, eljost, farhadrgh, ffangliu, hokru, joshhorton, jthorton, jturney, kexul, lnaden, loriab, mattwelborn, mattwthompson, maxscheurer, mfherbst, robertodr, sinamostafanejad, sjrl, stvogt, taylor-a-barnes, vivacebelles, wardlt, zachglick

qcengine's Issues

Bootcamp: Changeset 4

diff --git a/qcengine/programs/base.py b/qcengine/programs/base.py
index 5970827..9c0bcce 100644
--- a/qcengine/programs/base.py
+++ b/qcengine/programs/base.py
@@ -5,11 +5,8 @@
 from typing import Set
 from ..exceptions import InputError, ResourceError
 
-from .cfour import CFOURHarness
 from .dftd3 import DFTD3Harness
 from .entos import EntosHarness
-from .gamess import GAMESSHarness
-from .nwchem import NWChemHarness
 from .molpro import MolproHarness
 from .mopac import MopacHarness
 from .mp2d import MP2DHarness
@@ -97,7 +94,4 @@ def list_available_programs() -> Set[str]:
 register_program(DFTD3Harness())
 register_program(TeraChemHarness())
 register_program(MP2DHarness())
-#register_program(GAMESSHarness())
-#register_program(NWChemHarness())
-#register_program(CFOURHarness())
 register_program(EntosHarness())

Psi4 version checking

I installed QCEngine on a machine that had Psi4 v1.2.1 installed through Conda, and it was failing the Psi4-related tests without flagging a recognizable error related to the incompatible Psi4 version.

program options beyond flat plain-old-data

At some point QCEngine will have to confront what options look like in non psi4/cfour/qchem-like programs. For example,

dft
  direct
  ...
end

is NWChem syntax for enabling the boolean direct algorithm for DFT. Another example is ESTATE=0/1/0/0 for an array variable in Cfour. Even if the user knew the option and value they wanted (respectively, dft direct on and B1 state only in C2v), the settings of the keywords block in qcschema would be very different depending on whether they knew Python best (dft_direct = True, estate=[0, 1, 0, 0]) or the target program's domain language best. Naturally, the input file must be formattable from the qcschema keywords dict.

My philosophy has been that the keyword RHS must be in natural python format (True, [0, 1, 0, 0]) and the LHS must be predictable by someone who knows the program DSL (domain specific lang) with double underscore being any module separator, so dft__direct and estate. That way, we’re only transforming, not making a new DSL. Somehow, will have to work Molpro into this.

This much, as I see it, is in the qcengine domain, not the qcdb (which is concerned with translating LHS options). Any concerns/disputes/that-doesn’t-belong-here-arguments before I act like this is qcng’s philosophy, too?

  • qcschema precedence is not fleshed out
  • On __-separated vs. nested dict: I used to use the nested dict but find the __ separator much easier on the user. Since nested dict is an intermediate in: __-sep-string --> nested-dict --> formatted-input, I can see allowing either at the qcng level.
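The `__`-sep-string --> nested-dict step mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `expand_keywords` is a hypothetical helper name, not an existing QCEngine function, and it handles just the separator convention described in this issue.

```python
from typing import Any, Dict


def expand_keywords(flat: Dict[str, Any], sep: str = "__") -> Dict[str, Any]:
    """Expand {'dft__direct': True} into {'dft': {'direct': True}}.

    Hypothetical helper: the name and behaviour are assumptions for illustration.
    """
    nested: Dict[str, Any] = {}
    for key, value in flat.items():
        # Everything before the last separator names a module; the last piece is the option.
        *modules, option = key.split(sep)
        node = nested
        for module in modules:
            node = node.setdefault(module, {})
            if not isinstance(node, dict):
                raise ValueError(f"{key!r} nests under a keyword that already holds a plain value")
        node[option] = value
    return nested


print(expand_keywords({"dft__direct": True, "estate": [0, 1, 0, 0]}))
# {'dft': {'direct': True}, 'estate': [0, 1, 0, 0]}
```

Accepting either form at the qcng level would then amount to running flat keys through such an expansion and passing already-nested dicts through unchanged.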

DFTD3 Method error for incorrect atom distances

Describe the bug
This is more of a hunch than a hard error. In several cases where jobs were submitted to QCArchive in Angstrom rather than Bohr, DFTD3's error return was "Unsuccessful run. Possibly -D variant not available in dftd3 version.".

Additional context
It would be good to run this and see if we can break out "atom too close" errors from variant errors.

Complex, ordered multi-command input for Molpro

Is your feature request related to a problem? Please describe.
Molpro input files are very complex, and depend on an imperative programming structure. For example, CCSD can only be performed after a HF command. This leads to very complicated input files, such as the following:

memory,300,m

symmetry,nosym
geometry=geo.xyz

basis=cc-pVDZ

gthresh,ENERGY=1e-10
{df-rhf,;save,2100.2}

{ibba,antibonds=1,THRLOC_IB=1e-12;freezecore;orbital,2100.2}
{locali,boys;print,orbital;fock,2100.2;order,type=fock}

{df-lccsd(t),chgfrac=1.0,canblk=0;core,6}

I want a way to specify this input file inside an AtomicInput.

Describe the solution you'd like
A format for specifying commands, in order. Each command must support args, kwargs, and directives. Directives themselves have args and kwargs. args and kwargs are not ordered. Directives are not ordered. (@sjrl please verify this last point).

One possible format would be:

keywords["commands"]: List[Command]

class Directive: 
    name: str
    args: Optional[List[Union[str,int,bool,float]]]
    kwargs: Optional[Dict[str, Union[str,int,bool,float]]]

class Command(Directive): 
    directives: Set[Directive]

Under this structure, the above input file would become:

from qcelemental.models import Molecule

method = "df-lccsd(t)"
basis = "cc-pVDZ"  # note that this replaces the basis command
molecule = Molecule.from_file("geo.xyz")
# Directives are unordered, but dicts are unhashable and cannot be members of a
# Python set literal, so each group of directives is written as a list here.
keywords = {
    "memory": 300,
    "symmetry": "nosym",
    "gthresh": "energy=1e-10",  # gthresh is weird, and this is a lame solution
    "commands": [
        {"name": "df-rhf", "directives": [{"name": "save", "args": ["2100.2"]}]},
        {
            "name": "ibba",
            "kwargs": {"antibonds": True, "thrloc_ib": 1e-12},
            "directives": [
                {"name": "freezecore"},
                {"name": "orbital", "args": ["2100.2"]},
            ],
        },
        {
            "name": "locali",
            "args": ["boys"],
            "directives": [
                {"name": "print", "args": ["orbital"]},
                {"name": "fock", "args": ["2100.2"]},
                {"name": "order", "kwargs": {"type": "fock"}},
            ],
        },
        {
            "name": "df-lccsd(t)",
            "kwargs": {"chgfrac": 1.0, "canblk": 0},
            "directives": [{"name": "core", "args": [6]}],
        },
    ],
}
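To make the proposal more concrete, here is a rough sketch of how such command records might be rendered back into Molpro's brace syntax. It assumes dataclasses rather than whatever model type QCEngine would actually use, stores directives in a list for simplicity, and renders booleans as 0/1; the `render` methods and the `Value` alias are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

Value = Union[str, int, bool, float]


def _fmt(value: Value) -> str:
    # Assumption: booleans are written as 1/0, as in "antibonds=1" above.
    return str(int(value)) if isinstance(value, bool) else str(value)


@dataclass
class Directive:
    name: str
    args: List[Value] = field(default_factory=list)
    kwargs: Dict[str, Value] = field(default_factory=dict)

    def render(self) -> str:
        parts = [self.name]
        parts += [_fmt(a) for a in self.args]
        parts += [f"{k}={_fmt(v)}" for k, v in self.kwargs.items()]
        return ",".join(parts)


@dataclass
class Command(Directive):
    directives: List[Directive] = field(default_factory=list)

    def render(self) -> str:
        # The command itself comes first, then its directives, separated by ';'.
        pieces = [super().render()] + [d.render() for d in self.directives]
        return "{" + ";".join(pieces) + "}"


commands = [
    Command(
        "locali",
        args=["boys"],
        directives=[
            Directive("print", args=["orbital"]),
            Directive("fock", args=["2100.2"]),
            Directive("order", kwargs={"type": "fock"}),
        ],
    ),
    Command(
        "df-lccsd(t)",
        kwargs={"chgfrac": 1.0, "canblk": 0},
        directives=[Directive("core", args=[6])],
    ),
]
rendered = [c.render() for c in commands]
print("\n".join(rendered))
```

The order of the outer `commands` list is what carries the imperative structure; this sketch does not reproduce details such as the trailing comma in `{df-rhf,;save,2100.2}`.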

Describe alternatives you've considered

  • Template files (current solution).
  • dunder representation of nested dictionaries. Will lead to extremely long strings, and does not have the necessary ordered property of commands.

@loriab @sjrl

Document program detection

The documentation section "Environment Detection" should provide an example of how to see which programs were detected. I think that qcengine.list_available_programs() does this? (Same goes for procedures.)

Testing import standardization

Currently in the tests there is both from qcengine.testing import ... and from QCEngine import testing. We should settle on a single strategy for code cleanliness. I would propose from qcengine.testing import ... so that this strategy works for both functions and fixtures.

psi4 CCSD(T) D1/D2 diagnostics not reported for open shell case

I use the following block of code to do CCSD(T) calculations

psi4_atom_task = qcelemental.models.ResultInput (
    molecule= mol,
    driver="energy",
    model= {"method": "ccsd", "basis": "6-31g"},
)
ret=qcengine.compute(psi4_atom_task, "psi4")

ret.dict()['extras']

If the molecule is closed shell, the result in 'extras' includes non-zero values of the CC D1 DIAGNOSTIC and CC D2 DIAGNOSTIC. Example output for a water molecule in the singlet state:

{'qcvars': {'-D ENERGY': 0.0,
  'CC D1 DIAGNOSTIC': 0.015061336005482073,
  'CC D2 DIAGNOSTIC': 0.12393617164619228,
  'CC NEW D1 DIAGNOSTIC': 0.015061336005482073,
  'CC T1 DIAGNOSTIC': 0.006755831075299291,
  'CCSD CORRELATION ENERGY': -0.13940696350922102,
  'CCSD OPPOSITE-SPIN CORRELATION ENERGY': -0.11488872459315766,
  'CCSD SAME-SPIN CORRELATION ENERGY': -0.02451823891606328,
  'CCSD TOTAL ENERGY': -76.1195648218788,

However, if the calculation is for an open-shell system, CC D1 DIAGNOSTIC and CC D2 DIAGNOSTIC are always zero, regardless of molecular species. I suspect that QCEngine is not parsing out the D1 and D2 values for these calculations, resulting in zero values every time.

Add list of programs currently supported in docs

QCEngine supports executing a variety of quantum chemistry, semiempirical, and molecular mechanics programs. The programs currently supported appear to be:

> from qcengine import list_all_programs
> list_all_programs()
{'dftd3',
 'entos',
 'molpro',
 'mopac',
 'mp2d',
 'psi4',
 'rdkit',
 'terachem',
 'torchani'}

The docs do not currently indicate what QCEngine supports. I propose we add a docs page with the list of currently-supported programs, possibly with comments as to what is and is not supported for each.

MOPAC Codecov

MOPAC Codecov is not being uploaded due to "git not found":

/bin/sh: git: command not found
/bin/sh: hg: command not found
/bin/sh: git: command not found
/bin/sh: hg: command not found

      _____          _
     / ____|        | |
    | |     ___   __| | ___  ___ _____   __
    | |    / _ \ / _  |/ _ \/ __/ _ \ \ / /
    | |___| (_) | (_| |  __/ (_| (_) \ V /
     \_____\___/ \____|\___|\___\___/ \_/
                                    v2.0.15

==> Detecting CI provider
    Error running `git rev-parse --abbrev-ref HEAD || hg branch`: None
  -> Got branch from git/hg
  -> Got sha from git/hg
==> Preparing upload
Error: Commit sha is missing. Please specify via --commit=:she

This used to work, so unsure what happened. @Lnaden any ideas?

https://dev.azure.com/MolSSI/QCArchive/_build/results?buildId=107&view=logs&j=0e9986bc-4438-57a6-7391-1704fabd60a9&t=9bc28805-ff26-57a6-76fd-c5967bb8a1e9

Implement DFT-D4 harness

Is your feature request related to a problem? Please describe.
The DFT-D4 dispersion correction has been recently published and should be made available for commonly used quantum chemistry packages.

Describe the solution you'd like
An integration via the dftd4 C-API or Python-API in QCEngine.

Describe alternatives you've considered
IO based integration would be possible but is not really desirable.

Additional context
The reference implementation of dftd4 is available here: https://github.com/dftd4/dftd4.
A C++ ported version of dftd4 is available (used for ORCA 4.2.0).
I am also volunteering to implement the harness in QCEngine. (currently not enough time)

Update:
dftd4 is now available via conda-forge: https://anaconda.org/conda-forge/dftd4.

Pydantic Options

Move to pydantic options rather than raw dict for two reasons:

  • Validation is automatic and error message will look the same across QCA ecosystem,
  • Merging options is straightforward for local options, #3.

tests installation

Somehow that manifest isn't sending the message that the programs/tests/ need to be installed. (Hence psi's unhappiness, as its tests borrow data from qcng.)

python-scripts/qcengine
site-packages/qcengine-0.7.0.dist-info/INSTALLER
site-packages/qcengine-0.7.0.dist-info/LICENSE
site-packages/qcengine-0.7.0.dist-info/METADATA
site-packages/qcengine-0.7.0.dist-info/RECORD
site-packages/qcengine-0.7.0.dist-info/WHEEL
site-packages/qcengine-0.7.0.dist-info/entry_points.txt
site-packages/qcengine-0.7.0.dist-info/top_level.txt
site-packages/qcengine/__init__.py
site-packages/qcengine/_version.py
site-packages/qcengine/cli.py
site-packages/qcengine/compute.py
site-packages/qcengine/config.py
site-packages/qcengine/exceptions.py
site-packages/qcengine/extras.py
site-packages/qcengine/procedures/__init__.py
site-packages/qcengine/procedures/base.py
site-packages/qcengine/procedures/geometric.py
site-packages/qcengine/procedures/model.py
site-packages/qcengine/programs/__init__.py
site-packages/qcengine/programs/base.py
site-packages/qcengine/programs/cfour/__init__.py
site-packages/qcengine/programs/cfour/harvester.py
site-packages/qcengine/programs/cfour/keywords.py
site-packages/qcengine/programs/cfour/runner.py
site-packages/qcengine/programs/dftd3.py
site-packages/qcengine/programs/empirical_dispersion_resources.py
site-packages/qcengine/programs/entos.py
site-packages/qcengine/programs/gamess/__init__.py
site-packages/qcengine/programs/gamess/harvester.py
site-packages/qcengine/programs/gamess/runner.py
site-packages/qcengine/programs/model.py
site-packages/qcengine/programs/molpro.py
site-packages/qcengine/programs/mp2d.py
site-packages/qcengine/programs/nwchem.py
site-packages/qcengine/programs/psi4.py
site-packages/qcengine/programs/rdkit.py
site-packages/qcengine/programs/terachem.py
site-packages/qcengine/programs/torchani.py
site-packages/qcengine/programs/util/__init__.py
site-packages/qcengine/programs/util/hessparse.py
site-packages/qcengine/programs/util/pdict.py
site-packages/qcengine/stock_mols.py
site-packages/qcengine/testing.py
site-packages/qcengine/tests/__init__.py
site-packages/qcengine/tests/test_config.py
site-packages/qcengine/tests/test_procedures.py
site-packages/qcengine/tests/test_program_utils.py
site-packages/qcengine/tests/test_standard_suite.py
site-packages/qcengine/units.py
site-packages/qcengine/util.py

MOPAC nthread trials

MOPAC sets nthreads internally based on MKL_NUM_THREADS. We need to double-check that this is being passed down correctly before deploying on HPC machines.
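One way to check that the variable reaches a child process is sketched below. It does not use QCEngine's own execution helper; `run_with_threads` is a hypothetical function, and a small Python probe stands in for the MOPAC executable.

```python
import os
import subprocess
import sys


def run_with_threads(command, ncores):
    """Run `command` with MKL_NUM_THREADS set in the child's environment only."""
    env = dict(os.environ, MKL_NUM_THREADS=str(ncores))
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


# Stand-in for the real executable: a child process that echoes what it sees.
probe = [sys.executable, "-c", "import os; print(os.environ['MKL_NUM_THREADS'])"]
result = run_with_threads(probe, 4)
print(result.stdout.strip())
```

A real trial would replace the probe with the MOPAC command line and compare wall times for different `ncores` values.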

Bootcamp: Changeset 6

diff --git a/qcengine/programs/molpro.py b/qcengine/programs/molpro.py
index d618956..8054308 100644
--- a/qcengine/programs/molpro.py
+++ b/qcengine/programs/molpro.py
@@ -4,7 +4,7 @@ Calls the Molpro executable.

 import string
 import xml.etree.ElementTree as ET
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Set, Tuple, Optional

 from qcelemental.models import Result
 from qcelemental.util import parse_version, safe_version, which
@@ -26,7 +26,7 @@ class MolproHarness(ProgramHarness):
     version_cache: Dict[str, str] = {}

     # Set of implemented dft functionals in Molpro according to dfunc.registry (version 2019.2)
-    _dft_functionals = {
+    _dft_functionals: Set[str] = {
         "B86MGC", "B86R", "B86", "B88C", "B88", "B95", "B97DF", "B97RDF", "BR", "BRUEG", "BW", "CS1", "CS2",
         "DIRAC", "ECERFPBE", "ECERF", "EXACT", "EXERFPBE", "EXERF", "G96", "HCTH120", "HCTH147",
         "HCTH93", "HJSWPBEX", "LTA", "LYP", "M052XC", "M052XX", "M05C", "M05X", "M062XC",
@@ -49,9 +49,9 @@ class MolproHarness(ProgramHarness):
     }

     # Currently supported methods in QCEngine for Molpro
-    _scf_methods = {"HF", "RHF", "KS", "RKS"}
-    _post_hf_methods = {'MP2', 'CCSD', 'CCSD(T)'}
-    _supported_methods = {*_scf_methods, *_post_hf_methods}
+    _scf_methods: Set[str] = {"HF", "RHF", "KS", "RKS"}
+    _post_hf_methods: Set[str] = {'MP2', 'CCSD', 'CCSD(T)'}
+    _supported_methods: Set[str] = {*_scf_methods, *_post_hf_methods}

     class Config(ProgramHarness.Config):
         pass
@@ -113,10 +113,10 @@ class MolproHarness(ProgramHarness):
                 extra_infiles: Optional[List[str]] = None,
                 extra_outfiles: Optional[List[str]] = None,
                 as_binary: Optional[List[str]] = None,
-                extra_commands=None,
+                extra_commands: bool = None,
                 scratch_name: Optional[str] = None,
                 scratch_messy: bool = False,
-                timeout: Optional[int] = None):
+                timeout: Optional[int] = None) -> Tuple[bool, Dict[str, Any]]:
         """
         For option documentation go look at qcengine/util.execute
         """

standardize job name, commands

two items for a standardization pass:

  • (A) insofar as programs allow, should we standardize on a filename in qcng? e.g., qcengine_job.[in|out|inp|nw|mop] so it's clearer what are placeholders vs. runtime details.

  • (B) commands and extra_commands are deceptive in that while one can have arguments aplenty, qcng.util.execute is adamant about running only one command. propose singularizing for clarity.

Virtual core allocations

Check that virtual cores and limited core subscriptions are allocated correctly. This often comes up on VMs and supercomputers where a single node is not given entirely to one job. Appears to work on Travis, AWS, and ARC.
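As a rough illustration of the distinction, the sketch below compares the machine's total core count with the cores the current process is allowed to use. `usable_cores` is a hypothetical helper, not QCEngine's detection code, and it does not account for container (cgroup) quotas.

```python
import os


def usable_cores() -> int:
    """Cores this process may run on, falling back to the machine total."""
    if hasattr(os, "sched_getaffinity"):
        # Available on Linux; reflects restrictions set by schedulers or taskset.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


print("machine cores:", os.cpu_count())
print("usable cores: ", usable_cores())
```

On a node that is only partially allocated to a job, the two numbers can differ, which is the case this issue asks to verify.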

version collection on Windows

Describe the bug
Idk if it's Azure, Windows, or general script echoing, but the usual version printing is incorporating the path, and then the safe_version madly join/hyphenates the result. I'll fix this for psi4, so this is an fyi should others hit bizarre versions.

    def get_version(self) -> str:
        self.found(raise_error=True)

        which_prog = which("psi4")
        print("v0:", which_prog)
        print("v1:", self.version_cache)
        with popen([which_prog, "--version"]) as exc:
            exc["proc"].wait(timeout=30)
        print("v2:", exc["stdout"])
        print("v3:", exc["stdout"].strip())
        print("v4:", safe_version(exc["stdout"]))
        if which_prog not in self.version_cache:
            with popen([which_prog, "--version"]) as exc:
                exc["proc"].wait(timeout=30)
            self.version_cache[which_prog] = safe_version(exc["stdout"])
2019-12-15T22:49:01.4454072Z v2: 
2019-12-15T22:49:01.4454364Z 
2019-12-15T22:49:01.4454695Z D:\a\1\b>C:/tools/miniconda3/python.exe D:\a\1\b\install\bin\psi4 --version 
2019-12-15T22:49:01.4454997Z 
2019-12-15T22:49:01.4455300Z 1.4a2.dev345
2019-12-15T22:49:01.4455611Z 
2019-12-15T22:49:01.4455891Z 
2019-12-15T22:49:01.4456222Z v3: D:\a\1\b>C:/tools/miniconda3/python.exe D:\a\1\b\install\bin\psi4 --version 
2019-12-15T22:49:01.4456545Z 
2019-12-15T22:49:01.4456851Z 1.4a2.dev345
2019-12-15T22:49:01.4457188Z v4: -D-a-1-b-C-tools-miniconda3-python.exe.D-a-1-b-install-bin-psi4.-version.-1.4a2.dev345-
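A possible workaround, sketched under the assumption that the version is always the last non-empty line of stdout, is to discard the echoed command before handing the string to safe_version. `extract_version` is a hypothetical helper and the validation pattern is deliberately loose.

```python
import re


def extract_version(stdout: str) -> str:
    """Return the last non-empty line of `stdout` if it looks like a version."""
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("no version output found")
    candidate = lines[-1]
    # Loose check: starts with a digit, then digits, letters, dots, plus or hyphen.
    if not re.fullmatch(r"[0-9][0-9A-Za-z.+\-]*", candidate):
        raise ValueError(f"unexpected version output: {candidate!r}")
    return candidate


# Reconstructed from the v2 log lines above.
noisy = "\n".join(
    [
        "",
        r"D:\a\1\b>C:/tools/miniconda3/python.exe D:\a\1\b\install\bin\psi4 --version ",
        "",
        "1.4a2.dev345",
        "",
        "",
    ]
)
print(extract_version(noisy))
# 1.4a2.dev345
```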

Generate `basis` if not specified in OpenMMHarness

During #151, it was decided that the basis field in AtomicInput for the OpenMMHarness need not be specified. When not specified, it would be generated from the contents of url or offxml. This is partially implemented, but in a fragile way that is not desirable, in OpenMMHarness._generate_basis.

Ideally, the basis contents would be generated as:

  • f"{forcefield_name}-{hash(forcefield_schema)}" for non-versioned forcefields
  • f"{forcefield_name}-{forcefield_version}" for versioned forcefields

However, there does not appear to be a way to pull forcefield_version from the XML contents of, e.g., a SMIRNOFF force field. Supporting this will likely require adding that information to future releases of SMIRNOFF force fields.
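The two naming rules above could be sketched as follows. This is not the OpenMMHarness implementation: `forcefield_basis` is a hypothetical function, and it substitutes a truncated SHA-1 digest for `hash(...)` because Python's built-in `hash` of a string is salted per process and so is not reproducible between runs.

```python
import hashlib
from typing import Optional


def forcefield_basis(name: str, schema: str, version: Optional[str] = None) -> str:
    """Build a basis label from a force field name plus its version or content hash."""
    if version is not None:
        return f"{name}-{version}"
    # Assumption: seven hex characters are enough to tell force field contents apart.
    digest = hashlib.sha1(schema.encode("utf-8")).hexdigest()[:7]
    return f"{name}-{digest}"


print(forcefield_basis("smirnoff", "<SMIRNOFF/>", version="1.0.0"))
print(forcefield_basis("custom", "<ForceField/>"))
```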

Bootcamp: Changeset 1

diff --git a/qcengine/programs/nwchem/harvester.py b/qcengine/programs/nwchem/harvester.py
index fa3759d..273a896 100644
--- a/qcengine/programs/nwchem/harvester.py
+++ b/qcengine/programs/nwchem/harvester.py
@@ -1,11 +1,10 @@
 import re
 from decimal import Decimal
 
-import numpy as np
 import qcelemental as qcel
 from qcelemental.models import Molecule
 
-from ..util import load_hessian, PreservingDict
+from ..util import PreservingDict
 
 def harvest_output(outtext):
     """Function to separate portions of a NWChem output file *outtext*,

Standard QC test suite

A standard test suite for QC programs should be curated that covers, at a minimum, the following dimensions:

  • Driver: energy/gradient/Hessian
  • Reference: UHF/RHF
  • Relevant procedures: optimization

Expanded `data` options for `run` and `run-procedure` CLI

CLI should allow data input in all serialization formats supported by QCElemental, namely:

  • JSON
  • MsgPack

run and run-procedure CLI commands should have the additional optional arguments:

  • --input-encoding (default: None). None would try everything; this would need to be implemented in qcelemental.basemodels.ProtoModel.parse_raw.
  • --output-encoding (default: JSON).
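The proposed flags and the "None tries everything" behaviour might look roughly like this. It is a sketch only: `build_parser` and `decode` are hypothetical names, the fallback logic lives here rather than in `ProtoModel.parse_raw`, and MsgPack is only attempted if the third-party `msgpack` package happens to be installed.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="qcengine-sketch")
    parser.add_argument("data", help="serialized input")
    parser.add_argument("--input-encoding", choices=["json", "msgpack"], default=None)
    parser.add_argument("--output-encoding", choices=["json", "msgpack"], default="json")
    return parser


def decode(raw: bytes, encoding=None):
    """Decode `raw` with the named encoding, or try each known one in turn."""
    decoders = {"json": lambda b: json.loads(b.decode("utf-8"))}
    try:
        import msgpack  # optional dependency, assumed but not required

        decoders["msgpack"] = lambda b: msgpack.unpackb(b, raw=False)
    except ImportError:
        pass

    order = [encoding] if encoding else list(decoders)
    for name in order:
        if name not in decoders:
            raise ValueError(f"encoding {name!r} is not available")
        try:
            return decoders[name](raw)
        except Exception:
            continue  # fall through to the next candidate encoding
    raise ValueError("could not decode input with any known encoding")


args = build_parser().parse_args(["input.json", "--output-encoding", "msgpack"])
print(args.input_encoding, args.output_encoding)
print(decode(b'{"driver": "energy"}'))
```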

Error classification

It would be good to add error classification to these models so that downstream programs can decide what should happen. Several examples:

  • InputError - (non-recoverable) error in the user input (e.g., incorrect keyword or method)
  • SetupError - (recoverable) example: scratch directory is not writeable
  • ConvergenceError - (recoverable) likely requires options tweaking to enhance iterative convergence.
  • RandomError - (recoverable) random seg fault or the like.

Recoverable/non-recoverable in a distributed computing sense where an upstream manager can make the decision to resubmit.

My initial thought here is that we build these as Exception classes so that the compute command can capture them and either process them into a proper JSON error message or let them raise. I am usually not a fan of custom error classes, but this is a good case where there is a variety of behaviors you want to elicit depending on the type of error.

It would be good to kick around the different error types for a bit before implementing.
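As a starting point for that discussion, the classification could be sketched as a small exception hierarchy with a recoverable flag. All class names, the `error_type` strings, and the `run_and_capture` wrapper below are hypothetical and are not QCEngine's existing exceptions.

```python
class ClassifiedError(Exception):
    """Base class; subclasses declare whether a resubmission could help."""

    error_type = "unknown_error"
    recoverable = False


class InputError(ClassifiedError):
    error_type = "input_error"  # e.g. incorrect keyword or method


class SetupError(ClassifiedError):
    error_type = "setup_error"  # e.g. scratch directory is not writeable
    recoverable = True


class ConvergenceError(ClassifiedError):
    error_type = "convergence_error"  # likely needs options tweaking
    recoverable = True


class RandomError(ClassifiedError):
    error_type = "random_error"  # random seg fault or the like
    recoverable = True


def run_and_capture(func, *args, raise_error=False, **kwargs):
    """Call `func`; turn classified errors into a JSON-friendly dict or re-raise."""
    try:
        return {"success": True, "result": func(*args, **kwargs)}
    except ClassifiedError as exc:
        if raise_error:
            raise
        return {
            "success": False,
            "error": {
                "error_type": exc.error_type,
                "error_message": str(exc),
                "recoverable": exc.recoverable,
            },
        }


def failing_job():
    raise ConvergenceError("SCF did not converge in 50 iterations")


print(run_and_capture(failing_job))
```

An upstream manager would then only need to look at `recoverable` to decide whether to resubmit.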

CLI tests

Tests are needed for the CLI. Testing technology from QCFractal should be ported over.

Bootcamp: Changeset 5

diff --git a/qcengine/programs/molpro.py b/qcengine/programs/molpro.py
index d618956..8054308 100644
--- a/qcengine/programs/molpro.py
+++ b/qcengine/programs/molpro.py
@@ -4,7 +4,7 @@ Calls the Molpro executable.

 import string
 import xml.etree.ElementTree as ET
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Set, Tuple, Optional

 from qcelemental.models import Result
 from qcelemental.util import parse_version, safe_version, which
@@ -26,7 +26,7 @@ class MolproHarness(ProgramHarness):
     version_cache: Dict[str, str] = {}

     # Set of implemented dft functionals in Molpro according to dfunc.registry (version 2019.2)
-    _dft_functionals = {
+    _dft_functionals: Set[str] = {
         "B86MGC", "B86R", "B86", "B88C", "B88", "B95", "B97DF", "B97RDF", "BR", "BRUEG", "BW", "CS1", "CS2",
         "DIRAC", "ECERFPBE", "ECERF", "EXACT", "EXERFPBE", "EXERF", "G96", "HCTH120", "HCTH147",
         "HCTH93", "HJSWPBEX", "LTA", "LYP", "M052XC", "M052XX", "M05C", "M05X", "M062XC",
@@ -49,9 +49,9 @@ class MolproHarness(ProgramHarness):
     }

     # Currently supported methods in QCEngine for Molpro
-    _scf_methods = {"HF", "RHF", "KS", "RKS"}
-    _post_hf_methods = {'MP2', 'CCSD', 'CCSD(T)'}
-    _supported_methods = {*_scf_methods, *_post_hf_methods}
+    _scf_methods: Set[str] = {"HF", "RHF", "KS", "RKS"}
+    _post_hf_methods: Set[str] = {'MP2', 'CCSD', 'CCSD(T)'}
+    _supported_methods: Set[str] = {*_scf_methods, *_post_hf_methods}

     class Config(ProgramHarness.Config):
         pass
@@ -113,10 +113,10 @@ class MolproHarness(ProgramHarness):
                 extra_infiles: Optional[List[str]] = None,
                 extra_outfiles: Optional[List[str]] = None,
                 as_binary: Optional[List[str]] = None,
-                extra_commands=None,
+                extra_commands: bool = None,
                 scratch_name: Optional[str] = None,
                 scratch_messy: bool = False,
-                timeout: Optional[int] = None):
+                timeout: Optional[int] = None) -> Tuple[bool, Dict[str, Any]]:
         """
         For option documentation go look at qcengine/util.execute
         """

Found, not found, and incomplete programs

As was mentioned in #212, some programs require additional dependencies to function beyond the executable or Python project. Two examples are Orca (CCLib) and NWChem (networkx). The CLI support for this is fairly straightforward, as shown in #212 by @loriab:

>>> Program information
Available programs:
dftd3 v3.2.1
mp2d v1.1

Incomplete programs (see devtools/conda-envs for install help):
nwchem -- needs networkx
orca -- needs cclib

Other supported programs:
cfour entos gamess molpro mopac nwchem openmm psi4 qchem rdkit terachem torchani turbomole

The main question is if we need to expand the found() syntax or some other resource to determine the difference between a "runnable" state and "found, but incomplete". A few proposals:

  • Leave found() as is and have a new function runnable() that includes dependencies in its checks.
  • Add an additional kwarg to found(include_dependancies=True).
  • Have found() return an enum or similar object that contains a variety of states True, found_executable, found_dependancies, None.
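The third proposal could be sketched as below. The `FoundState` enum, its member names, and the `detect` function are hypothetical, and only importable Python packages are considered as dependencies here.

```python
import importlib.util
import shutil
import sys
from enum import Enum
from typing import Sequence


class FoundState(Enum):
    NOT_FOUND = "not_found"
    INCOMPLETE = "found_executable"  # executable present, dependencies missing
    RUNNABLE = "runnable"


def detect(executable: str, python_deps: Sequence[str] = ()) -> FoundState:
    """Classify a program by its executable and top-level Python dependencies."""
    if shutil.which(executable) is None:
        return FoundState.NOT_FOUND
    missing = [dep for dep in python_deps if importlib.util.find_spec(dep) is None]
    return FoundState.INCOMPLETE if missing else FoundState.RUNNABLE


# The running interpreter stands in for a QC executable in this demonstration.
print(detect(sys.executable, ["json"]))
print(detect(sys.executable, ["no_such_module_for_sketch"]))
print(detect("no-such-program-for-sketch"))
```

A boolean found() could stay backward compatible by returning `detect(...) is not FoundState.NOT_FOUND`, with runnable() comparing against `FoundState.RUNNABLE`.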

Per-computation settings

Settings are currently global and cannot be overridden on a per-computation basis. Local options passed into a computation should be able to override the global state to allow for more flexibility.
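The intended precedence can be sketched with a plain dictionary merge. The option names follow the `local_options` seen elsewhere in this tracker (`ncores`, `memory`, `scratch_directory`, `retries`); the default values and the `resolve_options` helper are assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# Assumed global defaults; real values would come from environment detection.
GLOBAL_OPTIONS: Dict[str, Any] = {
    "ncores": 4,
    "memory": 8.0,
    "scratch_directory": None,
    "retries": 0,
}


def resolve_options(local_options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the global options with per-computation overrides applied."""
    merged = dict(GLOBAL_OPTIONS)  # copy so the global state is never mutated
    for key, value in (local_options or {}).items():
        if key not in merged:
            raise KeyError(f"unknown option {key!r}")
        merged[key] = value
    return merged


print(resolve_options({"ncores": 16, "scratch_directory": "/dev/shm"}))
```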

UnboundLocalError: local variable 'output_data' referenced before assignment

Describe the bug
Running jobs with qcfractal-manager, and came across this "failed" job:

[W 191116 19:55:18 managers:586] Job 369024 failed: unknown_error - Msg: QCEngine Execution Error:
    Traceback (most recent call last):
      File "/export/home/tgokey/opt/lib/python3.7/site-packages/qcengine-0.12.0-py3.7.egg/qcengine/util.py", line 74, in compute_wrapper
        yield metadata
      File "/export/home/tgokey/opt/lib/python3.7/site-packages/qcengine-0.12.0-py3.7.egg/qcengine/compute.py", line 86, in compute
        output_data = executor.compute(input_data, config)
      File "/export/home/tgokey/opt/lib/python3.7/site-packages/qcengine-0.12.0-py3.7.egg/qcengine/programs/psi4.py", line 134, in compute
        output_data["schema_name"] = "qcschema_output"
    UnboundLocalError: local variable 'output_data' referenced before assignment

There isn't anything else in the log regarding this job.

To Reproduce
Just spun up a manager... this is the failed job:

[D 191116 01:48:49 base_adapter:153] Submitted Task:
    {'id': '369024', 'spec': {'function': 'qcengine.compute', 'args': [{'molecule': {'symbols': ['C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'C', 'N', 'N', 'N', 'O', 'O', 'H', 'H', 'H', '
H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H', 'H'], 'geometry': [0.57426417, -1.76307953, -1.16386869, -1.47978368, -2.24619508, -2.75941109, 2.74413569, -3.24931665, -1.34716956, -1.33181318, -4.20770247, -4.5
1471714, 13.32191332, -8.35663934, -12.83410606, 11.44467574, -8.78436772, -11.03868877, 8.23420593, -6.55351199, -13.53594165, 2.90643258, -5.22572166, -3.10567323, 0.84445409, -5.70346844, -4.7099867, 12.65282185, -7.0102777, 
-15.00519621, 8.93197759, -7.90773264, -11.3598525, 10.15762671, -6.16505774, -15.27477129, 12.43132868, -5.06715346, -18.53910154, 4.67927175, -7.92177835, -9.23975043, 5.29529728, -6.8051532, -3.28146083, 0.9106814, -7.8429122
4, -6.62874052, 4.78079419, -9.28500297, -4.67875428, 12.84651209, -3.85373275, -21.03942941, 14.04207412, -6.26749508, -17.13523185, 3.46901028, -8.79729428, -7.06684771, 7.2380414, -8.46174168, -9.37478738, 3.53211844, -6.8187
0128, -10.93700297, 10.00778802, -4.91063299, -17.55305051, 0.48296456, -0.23916519, 0.21021945, -3.1766797, -1.09487934, -2.64203791, 4.34180367, -2.87932082, -0.1054238, -2.91940231, -4.57621612, -5.76946934, 15.23438845, -9.0
449066, -12.55556938, 11.91887755, -9.82707218, -9.3304528, 6.32496233, -5.88249497, -13.80579222, 6.76321643, -5.74408794, -4.29250307, 6.03371138, -7.19630468, -1.3859967, -0.26204768, -9.4189406, -5.96004396, 0.15060304, -7.2
1422316, -8.43483111, 6.5075239, -10.38278599, -4.9578624, 3.53561928, -10.48988482, -3.54591095, 11.61663072, -4.69734817, -22.47359554, 12.41252019, -1.83262451, -20.95148419, 14.81484793, -4.10679705, -21.59280923, 7.99605943
, -9.50891788, -7.97487656], 'masses': [12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 12.0, 14.00307400443, 14.00307400443, 14.00307400443, 15.99491461957, 15.99491461957, 
1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.00782503223, 1.0
0782503223, 1.00782503223], 'real': [True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True,
 True, True, True, True, True, True, True, True], 'fragments': [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39]], 'fragment_c
harges': [0.0], 'fragment_multiplicities': [1], 'schema_name': 'qcschema_molecule', 'schema_version': 2, 'name': 'C18H17N3O2', 'identifiers': {'molecule_hash': '56efe06472ee562b18994fa4b6334ff553f9dd23', 'molecular_formula': 'C1
8H17N3O2'}, 'comment': None, 'molecular_charge': 0.0, 'molecular_multiplicity': 1, 'atom_labels': ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '
', '', '', '', '', '', '', ''], 'atomic_numbers': [6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 8, 8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'mass_numbers': [12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
 12, 12, 12, 12, 12, 12, 12, 12, 14, 14, 14, 16, 16, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'connectivity': [[0, 1, 1.0], [0, 2, 2.0], [0, 23, 1.0], [1, 3, 2.0], [1, 24, 1.0], [2, 7, 1.0], [2, 25, 1.0], [3, 8, 1.0],
 [3, 26, 1.0], [4, 5, 2.0], [4, 9, 1.0], [4, 27, 1.0], [5, 10, 1.0], [5, 28, 1.0], [6, 10, 2.0], [6, 11, 1.0], [6, 29, 1.0], [7, 8, 2.0], [7, 14, 1.0], [8, 15, 1.0], [9, 11, 2.0], [9, 18, 1.0], [10, 20, 1.0], [11, 22, 1.0], [12,
 17, 1.0], [12, 18, 2.0], [12, 22, 1.0], [13, 19, 1.0], [13, 20, 1.0], [13, 21, 2.0], [14, 16, 1.0], [14, 30, 1.0], [14, 31, 1.0], [15, 19, 1.0], [15, 32, 1.0], [15, 33, 1.0], [16, 19, 1.0], [16, 34, 1.0], [16, 35, 1.0], [17, 36
, 1.0], [17, 37, 1.0], [17, 38, 1.0], [20, 39, 1.0]], 'fix_com': True, 'fix_orientation': True, 'fix_symmetry': None, 'provenance': {'creator': 'QCElemental', 'version': 'v0.5.0', 'routine': 'qcelemental.molparse.from_schema'}, 
'id': '2847582', 'extras': None}, 'driver': 'hessian', 'model': {'method': 'b3lyp-d3bj', 'basis': 'dzvp'}, 'id': None, 'schema_name': 'qcschema_input', 'schema_version': 1, 'keywords': {'maxiter': 200, 'scf_properties': ['dipole
', 'quadrupole', 'wiberg_lowdin_indices', 'mayer_indices']}, 'extras': {'_qcfractal_tags': {'program': 'psi4', 'keywords': '2'}}, 'provenance': {'creator': 'QCElemental', 'version': 'v0.5.0', 'routine': 'qcelemental.models.resul
ts'}}, 'psi4'], 'kwargs': {'local_options': {'memory': 32.0, 'ncores': 16, 'scratch_directory': '/dev/shm', 'retries': 2}}}, 'parser': 'single', 'status': 'RUNNING', 'program': 'psi4', 'procedure': None, 'manager': 'OpenFF_Moble
y_HPC-gplogin2.gp.local-9a1f67ec-e1d2-40f6-88b4-50b87a37a760', 'priority': 0, 'tag': 'openff', 'base_result': {'ref': 'result', 'id': '4227774'}, 'error': None, 'modified_on': '2019-11-16T09:48:49.770283', 'created_on': '2019-08
-07T23:42:40.267434'

Additional context
Been running for ~24 hours, with 40 jobs completed no problem. This is the only job to do this. Using v0.12.0.

Geometry Optimization within QC Codes

Is your feature request related to a problem? Please describe.
I would like to use the optimizers built into QC codes to perform geometry optimizations. My concern is that using geomeTRIC could be especially inefficient with MPI codes, as it will call mpiexec very frequently.

Describe the solution you'd like
A Procedure that calls the geometry optimization for NWChem.

Describe alternatives you've considered

  • Adding a driver for geometry optimizations. However, none of the existing compute drivers seem to affect the geometry.
  • Modifying geomeTRIC/qcengine to preserve NWChem output files so it can read in restart files. I think this might be a better option than the "solution I'd like", as it could work with any code that writes restart files and allows me to use geomeTRIC's optimizer.

Additional context
I've got time to work on this, just need some direction :)

Bootcamp: Changeset 2

diff --git a/qcengine/programs/empirical_dispersion_resources.py b/qcengine/programs/empirical_dispersion_resources.py
index d760d38..ac5bbf9 100644
--- a/qcengine/programs/empirical_dispersion_resources.py
+++ b/qcengine/programs/empirical_dispersion_resources.py
@@ -557,8 +557,7 @@ def from_arrays(name_hint=None, level_hint=None, param_tweaks=None, dashcoeff_su
     elif dashlevel_candidate_1 is not None and dashlevel_candidate_2 is not None:
         if dashlevel_candidate_1 != dashlevel_candidate_2:
             raise InputError(
-                f"""Inconsistent -D correction level ({dashlevel_candidate_2} != {dashlevel_candidate_1}) from name_hint ({name_hint}) and level_hint ({level_hint})"""
-            )
+                f"""Inconsistent -D correction level ({dashlevel_candidate_2} != {dashlevel_candidate_1}) from name_hint ({name_hint}) and level_hint ({level_hint})""")
     dashleveleff = dashlevel_candidate_1 or dashlevel_candidate_2
 
     allowed_params = dashcoeff[dashleveleff]['default'].keys()
diff --git a/qcengine/programs/nwchem/harvester.py b/qcengine/programs/nwchem/harvester.py
index fa3759d..72433b8 100644
--- a/qcengine/programs/nwchem/harvester.py
+++ b/qcengine/programs/nwchem/harvester.py
@@ -574,7 +574,7 @@ def harvest_hessian(hess):
     Hess file name has to be "nwchem.hess". (default)
 
     """
-    hess = hess.splitlines()
+    raise NotImplementedError()
 
 
 def harvest(p4Mol, nwout, **largs):  #check orientation and scratch files

Turbomole support

Hi,
a request for a TURBOMOLE interface came up in this geomeTRIC issue.
As I already developed a TM-wrapper for my own code I'd be willing to work on an implementation in QCEngine.
Right now, I imagine the way my wrapper works is quite incompatible with QCEngine. In my code you have to prepare a control file beforehand (through the TM utility define) that gets passed to the wrapper class. All contents of this directory get copied to a temporary directory, from where the calculation is actually run.

What is needed for QCEngine is probably a wrapper for define?!
What would you consider to be the minimal feature set that should be implemented?

Best regards
Johannes

Incorrect units for molecule xyz in Molpro input

Molpro reads the molecule XYZ in Angstrom by default, whereas QCElemental provides the coordinates in bohr by default. Therefore, either the XYZ needs to be converted to Angstrom, or Molpro must somehow be told that the XYZ is in bohr.
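
For the first option, a minimal sketch of the conversion might look like the following. The helper name geometry_to_angstrom is hypothetical and not part of QCEngine or QCElemental, and the constant is the CODATA 2018 Bohr radius hard-coded for illustration; the flat geometry layout follows the QCSchema molecule convention.

```python
# CODATA 2018 Bohr radius expressed in Angstrom (hard-coded for illustration).
BOHR_TO_ANGSTROM = 0.529177210903


def geometry_to_angstrom(symbols, geometry_bohr):
    """Format a flat bohr geometry as XYZ lines in Angstrom.

    `symbols` is a list of element symbols; `geometry_bohr` is a flat
    list [x0, y0, z0, x1, ...] with three coordinates per atom.
    """
    if len(geometry_bohr) != 3 * len(symbols):
        raise ValueError("geometry must hold three coordinates per atom")
    lines = []
    for i, sym in enumerate(symbols):
        # Convert this atom's three coordinates from bohr to Angstrom.
        x, y, z = (c * BOHR_TO_ANGSTROM for c in geometry_bohr[3 * i:3 * i + 3])
        lines.append(f"{sym} {x:.10f} {y:.10f} {z:.10f}")
    return "\n".join(lines)


# H2 with a bond length of 1.4 bohr.
xyz = geometry_to_angstrom(["H", "H"], [0.0, 0.0, 0.0, 0.0, 0.0, 1.4])
print(xyz)
```

Converting at input-writing time keeps the stored molecule in bohr, so nothing else in the pipeline has to change.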

Bootcamp: Changeset 3

diff --git a/qcengine/programs/gamess/runner.py b/qcengine/programs/gamess/runner.py
index fd1f6ed..d284dc3 100644
--- a/qcengine/programs/gamess/runner.py
+++ b/qcengine/programs/gamess/runner.py
@@ -75,9 +75,9 @@ def build_input(self, input_model: 'ResultInput', config: 'JobConfig',
     def fake_input(self, input_model: 'ResultInput', config: 'JobConfig',
                     template: Optional[str] = None) -> Dict[str, Any]:
 
-# Note decr MEMORY=100000 to get
-# ***** ERROR: MEMORY REQUEST EXCEEDS AVAILABLE MEMORY
-# to test gms fail
+        # Note decr MEMORY=100000 to get
+        # ***** ERROR: MEMORY REQUEST EXCEEDS AVAILABLE MEMORY
+        # to test gms fail
         infile = \
 """ $CONTRL SCFTYP=ROHF MULT=3 RUNTYP=GRADIENT COORD=CART $END
  $SYSTEM TIMLIM=1 MEMORY=800000 $END
@@ -93,7 +93,6 @@ def fake_input(self, input_model: 'ResultInput', config: 'JobConfig',
 Hydrogen   1.0   -0.82884     0.7079   0.0
  $END
 """
-
         # edits to rungms
         # set SCR=./
         # set USERSCR=./
@@ -102,8 +101,7 @@ def fake_input(self, input_model: 'ResultInput', config: 'JobConfig',
         return {
             "commands": [which("rungms"), "gamess"],  # rungms JOB VERNO NCPUS >& JOB.log &
             "infiles": {
-                #"gamess.inp": infile,
-                "gamess.inp": input_model.extras['gamess.inp'],
+                "gamess.inp": infile
             },
             "scratch_directory": config.scratch_directory,
             "input_result": input_model.copy(deep=True),

Create OpenMM engine harness

We would like to be able to execute OpenMM workloads, if possible, using QCEngine. This should involve creating a qcengine.programs.model.ProgramHarness subclass. Use the existing *Harness implementations as inspiration.

This issue should define the scope for this implementation; it serves as the nexus for discussion on this addition.

List Known Packages

QCEngine should be able to list all currently found execution packages (Psi4, RDKit, etc).
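
One way such a listing could work is sketched below using only the standard library. The registry contents, the detection kinds, and the function names are assumptions made for illustration; they are not QCEngine's actual mechanism.

```python
import importlib.util
import shutil

# Hypothetical registry: program name -> (detection kind, target).
# "exe" looks for an executable on PATH; "module" looks for an
# importable Python module.
KNOWN_PROGRAMS = {
    "psi4": ("exe", "psi4"),
    "rdkit": ("module", "rdkit"),
}


def is_found(kind, target):
    """Return True if the program can be located on this machine."""
    if kind == "exe":
        return shutil.which(target) is not None
    if kind == "module":
        return importlib.util.find_spec(target) is not None
    raise ValueError(f"unknown detection kind: {kind}")


def list_found_programs(registry=KNOWN_PROGRAMS):
    """Return the sorted names of all registry entries that were found."""
    return sorted(
        name for name, (kind, target) in registry.items() if is_found(kind, target)
    )


print(list_found_programs())
```

The output depends on what is installed locally, so no expected result is shown.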

Error Handling

QCEngine currently assumes that errors should be passed back upstream to be handled by the calling program. An option should be added so that errors can instead be raised by compute and compute_procedure.
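
A minimal sketch of the two behaviours, assuming a flag named raise_error: the compute function below is a stand-in written for this example, not QCEngine's function, and the shape of the returned dictionary is likewise an assumption.

```python
def compute(input_data, runner, raise_error=False):
    """Run `runner(input_data)`; report a failure as data or as an exception.

    With raise_error=False the error is returned upstream in the result;
    with raise_error=True the original exception propagates to the caller.
    """
    try:
        return {"success": True, "result": runner(input_data)}
    except Exception as exc:
        if raise_error:
            raise  # re-raise with the original traceback intact
        return {
            "success": False,
            "error": {"error_type": type(exc).__name__, "error_message": str(exc)},
        }


def failing_runner(_):
    raise ValueError("SCF did not converge")


# Default: the error comes back as data.
print(compute({}, failing_runner))
```

Keeping the default as "return the error" preserves the current behaviour for callers that already inspect the result.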

Traceback is not complete.

Currently, if yield in utils.compute_wrapper raises an exception, compute_procedure raises an uninformative UnboundLocalError and the error_message does not get printed out:

distributed.worker - WARNING -  Compute Failed
Function:  compute_procedure
args:      ({'schema_name': 'qc_schema_optimization_input', 'schema_version': 1, 'keywords': {'coordsys': 'tric', 'constraints': {'set': [{'type': 'dihedral', 'indices': [3, 5, 7, 6], 'value': '180,0'}, {'type': 'dihedral', 'indices': [5, 7, 6, 8]}]}, 'program': 'rdkit'}, 'qcfractal_tags': {'procedure': 'optimization', 'keywords': {'coordsys': 'tric', 'constraints': {'set': [{'type': 'dihedral', 'indices': [3, 5, 7, 6], 'value': '180,0'}, {'type': 'dihedral', 'indices': [5, 7, 6, 8]}]}, 'program': 'rdkit'}, 'program': 'geometric', 'qc_meta': {'driver': 'gradient', 'method': 'UFF', 'basis': '', 'options': None, 'program': 'rdkit'}, 'tag': None}, 'initial_molecule': {'symbols': ['C', 'C', 'C', 'C', 'C', 'C', 'C', 'N', 'O', 'H', 'H', 'H', 'H', 'H', 'H', 'H'], 'geometry': [1.5068158, 2.15098616, 0.22531301, 0.67956586, 0.88903407, 2.38658242, 0.82729471, 1.26190607, -2.16125162, -0.82724459, -1.26193986, 2.16127032, -0.67959486, -0.88903165, -2.38659572, -1.50683663, -2.15095462, -0.22531796, -3.8732
kwargs:    {}
Exception: UnboundLocalError("local variable 'output_data' referenced before assignment",)
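
The failure mode can be reproduced in isolation. The sketch below is a simplified reconstruction, not the actual QCEngine code: it assumes the wrapper is a context manager that swallows exceptions and records them, so a variable assigned inside the with block is never bound when the body fails. Binding output_data up front and checking the recorded status lets the real traceback surface.

```python
import traceback
from contextlib import contextmanager


@contextmanager
def compute_wrapper(metadata):
    """Swallow any exception from the body and record it in `metadata`."""
    try:
        yield metadata
        metadata["success"] = True
    except Exception:
        metadata["success"] = False
        metadata["error_message"] = traceback.format_exc()


def compute_procedure(input_data, run):
    output_data = None  # bound up front so a failure cannot leave it undefined
    with compute_wrapper({}) as metadata:
        output_data = run(input_data)
    if not metadata["success"]:
        # Surface the recorded traceback instead of an UnboundLocalError.
        return {"success": False, "error_message": metadata["error_message"]}
    return {"success": True, "result": output_data}


def failing_run(_):
    raise RuntimeError("optimizer failed")


print(compute_procedure({}, failing_run)["error_message"])
```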

Cache `openmm_system` upon creation in `OpenMMHarness`

During #151, caching of the openmm_system was proposed as an optimization that could be valuable when we compute many gradients/energies for the same molecule in the same set of jobs. We currently cache the generated off_forcefield, so the same mechanism can be utilized for openmm_system.

The key/hash used for the cache must be selected with care. It must be insensitive to rotations or translations of the molecule, but should be sensitive to charge states, connectivity, and forcefield parameters. From @peastman:

One option for this is to serialize the System to XML with XmlSerializer.serialize(system) then compute a hash from the string. This will detect any change to the System or the Forces it contains, but will be unaffected by changes to particle positions.
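
The hashing step could be sketched as follows. The function name system_cache_key and the choice of SHA-256 are assumptions, and the XML strings are short placeholders rather than real OpenMM output; in practice the input would be the string returned by XmlSerializer.serialize(system).

```python
import hashlib


def system_cache_key(serialized_system: str) -> str:
    """Hash a serialized System string into a fixed-length cache key."""
    return hashlib.sha256(serialized_system.encode("utf-8")).hexdigest()


# Placeholder strings standing in for serialized Systems.
xml_a = "<System><Particle mass='12.0'/></System>"
xml_b = "<System><Particle mass='14.0'/></System>"

# Identical serializations map to the same key; any change to the
# serialized content produces a different key.
print(system_cache_key(xml_a) == system_cache_key(xml_a))
print(system_cache_key(xml_a) == system_cache_key(xml_b))
```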
