
exatomic's People

Contributors

adamphil, avmarchenko, chiroptical, codeliciousness, dependabot[bot], gitter-badger, herbertludowieg, tjduigna


exatomic's Issues

Generalize the Gaussian frequency parser and add nmr shielding tensor parser

Is your feature request related to a problem? Please describe.
I find it a bit frustrating that we cannot use the normal Gaussian output from a geometry optimization followed by a frequency analysis. Also, it would be nice if we could extract the frequency displacements at higher precision when the HPModes keyword is specified.
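
For the HPModes part, a minimal sketch of what a generalized parser might look for, assuming the high-precision block marks its rows with "Frequencies ---" (three dashes) as opposed to the normal-precision "Frequencies --":

import re

# Regex for the high-precision frequency rows printed when HPModes is given;
# the three-dash marker distinguishes them from the normal-precision block.
HP_FREQ = re.compile(r'^\s*Frequencies\s*---\s*(.*)$')

def hpmode_frequencies(lines):
    """Collect all high-precision frequencies from an iterable of output lines."""
    freqs = []
    for line in lines:
        m = HP_FREQ.match(line)
        if m:
            freqs.extend(float(v) for v in m.group(1).split())
    return freqs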

Describe the solution you'd like
Design the program to take a more generalized output and extract the data.
I want to implement an NMR shielding tensor parser (one already exists in the Fchk parser), but I do not have a good handle on the units used in the .fchk files.

Describe alternatives you've considered
Use the Fchk parser instead, but there are some issues with that; for example, if you did not save the information properly you have to re-do the frequency calculation.

Quirks of the travis build

Describe the bug
This is just a minor issue with the way .travis.yml builds exatomic. When mimicking the installation, the pip install -e exa/ command pulls in pip installs of the following packages:

Installing collected packages: scipy, networkx, kiwisolver, cycler, matplotlib, seaborn, exa
  Running setup.py develop for exa

To Reproduce
Steps to reproduce the behavior:

Run the commands in the .travis.yml starting from:

- conda create -n test python=$PYTHONVER

to

- pip install -e exa/

Expected behavior
I expected there to be no additional dependencies for pip to install when installing the git clone of exa.
We already explicitly enumerate the packages needed for the exa/exatomic stack in .travis.yml, so perhaps we should let conda install the dependencies above that are not already picked up.

Desktop (please complete the following information):

  • OS: WSL (Ubuntu 14.04)
  • Browser: Chrome 71.0.3578.98
  • Version: discovered while on #135

Additional context
This might be a more suitable issue for exa instead, but since exa and exatomic are tightly coupled, it makes sense that we git clone exa directly during the exatomic build; therefore it may only pertain to the exatomic build.

NWChem frequency and gradient parser

Is your feature request related to a problem? Please describe.
It would be good to have a frequency parser for NWChem so that we could use the vibrational averaging code with more than one program's frequency calculations.

Describe the solution you'd like
A parser similar to Gaussian that could give the normal mode displacements reported by NWChem.

Describe alternatives you've considered
Writing spaghetti code that uses crude searching methods.

An automated way for Editors to handle exceptions

I think this might work out nicely. It inherits the standard metaclass for Editor (and Universe) and just checks for parse methods, then decorates them with a try/except that prints the stack trace so that we don't lose that debugging information. That way, when an AttributeError is thrown for a property that wasn't set, we can see where the parsing function went wrong. It also avoids the boilerplate of many try/excepts within the parse methods themselves and/or manual decorators. It may be too generic in catching all exceptions. What do you think?

import exatomic
from types import FunctionType
from logging import exception as stacktrace

class ParseMeta(exatomic.container.Meta):
    @classmethod
    def check_parse(cls, func):
        # Wrap a parse method so that any exception is logged (with its full
        # traceback) instead of propagating.
        def wrapper(self):
            try:
                return func(self)
            except Exception as e:
                stacktrace('{} failed with error: {}'.format(func.__name__, e))
            return
        return wrapper

    def __new__(cls, name, bases, clsdict):
        # Decorate every parse_* method at class-creation time.
        for k, v in clsdict.items():
            if isinstance(v, FunctionType) and k.startswith('parse_'):
                clsdict[k] = cls.check_parse(v)
        return super(ParseMeta, cls).__new__(cls, name, bases, clsdict)
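
For clarity, here is a minimal standalone sketch of the same decorate-at-class-creation pattern with no exatomic dependency (DemoMeta, Demo, and parse_atom are illustrative names only), showing the intended behavior:

import logging

class DemoMeta(type):
    @classmethod
    def check_parse(mcs, func):
        def wrapper(self):
            try:
                return func(self)
            except Exception:
                # log the full traceback instead of letting the exception propagate
                logging.exception('%s failed', func.__name__)
        return wrapper

    def __new__(mcs, name, bases, clsdict):
        for k, v in clsdict.items():
            if callable(v) and k.startswith('parse_'):
                clsdict[k] = mcs.check_parse(v)
        return super(DemoMeta, mcs).__new__(mcs, name, bases, clsdict)

class Demo(metaclass=DemoMeta):
    def parse_atom(self):
        raise AttributeError("simulated parsing failure")

Demo().parse_atom()   # logs the traceback and returns None instead of raising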

Molecular orbital issues

Describe the bug
When adding molecular orbitals I get the error:
TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'Categorical'
in the return statements of the basis.BasisSet.primitives and functions methods.
I checked the dtypes of uni.basis_set_order and the L column shows category instead of int.

To Reproduce
This is actually the test code from the docs:

from exatomic.base import resource
from exatomic import gaussian
uni = gaussian.Output(resource('g09-ch3nh2-631g.out')).to_universe()
uni.add_molecular_orbitals(vector=range(10))

If I change the return value of the respective methods to return n * n.index.get_level_values('L').map(mapper).astype(np.int64) instead of return n * n.index.get_level_values('L').map(mapper), it works without any issues.
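
A minimal reproduction of the underlying pandas behavior (standalone, not exatomic code), showing why the explicit integer cast fixes it:

import numpy as np
import pandas as pd

L = pd.Series([0, 1, 2], dtype='category')   # mimics the category-typed 'L' values
arr = np.array([1.0, 2.0, 3.0])

# arr * L                    # raises TypeError: categorical dtype does not support arithmetic
arr * L.astype(np.int64)     # works once 'L' is cast back to an integer dtype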

Screenshots
Screenshots attached: exatomic_orbitals_issue_1, exatomic_orbitals_issue_2, exatomic_orbitals_issue_3.

Desktop (please complete the following information):

  • Windows
  • Opera
  • Git org master branch
  • Python 3.7.2

Additional context
I also updated all of the conda packages and there was no change.

Logging

Is your feature request related to a problem? Please describe.
Currently there isn't a systematic logging system for exatomic. The value would be both for debugging and for more descriptive information about "handled" errors, e.g. when a parser tries to parse a presumably correct file but fails because the format of the file has changed slightly. Other use cases could include logging widget information.

Describe the solution you'd like
exatomic could in principle write a persistent log to ~/.exatomic/ or similar. Standard library logging configuration should do in this case (it handles formatting, rotation, etc.).
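
A minimal stdlib-only sketch of what that could look like (the ~/.exatomic/ directory and file names below are suggestions, not existing exatomic paths):

import logging
import logging.handlers
import os

logdir = os.path.join(os.path.expanduser('~'), '.exatomic')
os.makedirs(logdir, exist_ok=True)

handler = logging.handlers.RotatingFileHandler(
    os.path.join(logdir, 'exatomic.log'), maxBytes=1024**2, backupCount=3)
handler.setFormatter(logging.Formatter('%(asctime)s %(name)s %(levelname)s %(message)s'))

logger = logging.getLogger('exatomic')
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info('parsers can then log "handled" errors here')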

Describe alternatives you've considered
Currently we don't have a logging solution implemented.

Additional context
Looking for additional thoughts about this (originally suggested by @tjduigna !)

Gaussian Input generator

Hey guys, I am trying to generate a bunch of Gaussian inputs for some SP calculations and just wanted to check that I'm doing this right.

import exatomic
from exatomic import gaussian
path = "path_to_some_xyz_file"
uni = exatomic.XYZ(path).to_universe()
gaussian.Input(uni)

When I do this it throws the following error

TypeError: Unknown type for arg data: <class 'exatomic.core.universe.Universe'>

When I just pass the path of the xyz file, it prints out the original file. Also, if I try gaussian.Input(uni=uni) it throws TypeError: __init__() got an unexpected keyword argument 'uni'. Thanks for your help.

Orbital labeling on Universe Widget

Is your feature request related to a problem? Please describe.
It would be a good idea to change the labeling of the fields.

Describe the solution you'd like
Change the orbital labels so that they match the actual orbital labels.

Describe alternatives you've considered
Writing out what the indices should be every time the add_molecular_orbitals function is called.

Where to save images when non-persistent

Removing the requirement of the persistent .exa directory and sqlite database on disk is good for the shallow use case. Now say we want to run non-persistent but also want to save PNGs to disk: where should we do this? I remember the trick you used for finding the OS's natural temporary directory (although not where the code is). I suppose that would be the natural choice? In the persistent case we can associate the UID of a universe with the saved PNGs, but for the non-persistent case a simple incrementing scheme (0001.png, 0002.png, etc.) seems sufficient. I hesitate to commit the save-PNG patch until we have a systematic solution for this. What are your thoughts?
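
A rough sketch of the non-persistent case, assuming the "natural temporary directory" trick is tempfile.gettempdir() and using the simple incrementing scheme (the subdirectory name is made up):

import os
import tempfile

outdir = os.path.join(tempfile.gettempdir(), 'exatomic_images')
os.makedirs(outdir, exist_ok=True)

# next file name in the 0001.png, 0002.png, ... scheme
existing = [f for f in os.listdir(outdir) if f.endswith('.png')]
next_png = os.path.join(outdir, '{:04d}.png'.format(len(existing) + 1))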

Parse_cubes does not account for different frames

E.g. if a geometry is oriented differently in one cube file than in another, the algorithm only populates the universe (onebody, etc.) with the first cube file.

A check whether _gen_one_df returns the same dataframe as for the first cube file, creating a new frame if not, ought to sort it out.
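
Something along these lines (hypothetical helper; _gen_one_df is the function referenced above, and the comparison here uses a coordinate tolerance rather than exact equality):

import numpy as np
import pandas as pd

def assign_frame(first_atom: pd.DataFrame, current_atom: pd.DataFrame, last_frame: int) -> int:
    """Reuse the existing frame if the geometry matches the first cube file,
    otherwise start a new frame for this cube."""
    cols = ['x', 'y', 'z']
    if (len(first_atom) == len(current_atom)
            and np.allclose(first_atom[cols].values, current_atom[cols].values)):
        return last_frame
    return last_frame + 1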

VROA

Is your feature request related to a problem? Please describe.
I would like to implement a vibrational Raman optical activity (VROA) module in exatomic based on the equations in the paper J. Chem. Phys. 127, 134101 (2007); https://doi.org/10.1063/1.2768533. This would be a continuation of the VA project and is linked to issue #128.
I already have most of the code written; it just needs polishing and some intensive tests.

Add a to_xyz bound method to Atom

This is purely for convenience, but instead of repeating

atom[['symbol', 'x', 'y', 'z']].to_csv(sep=' ', float_format='%    .8f', header=False, index=False, quoting=csv.QUOTE_NONE, escapechar=' ')

we can simply wrap that in a bound method of the atom table.

atom.to_xyz()

def to_xyz(self):
    return self[['symbol', 'x', 'y', 'z']].to_csv(sep=' ', float_format='%    .8f', header=False, index=False, quoting=csv.QUOTE_NONE, escapechar=' ')
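
A hedged variant of the same idea that also prepends the atom count and comment line expected by the .xyz format (a standalone function over a dataframe with symbol/x/y/z columns, not an existing exatomic method):

import csv
import pandas as pd

def to_xyz(atom: pd.DataFrame, comment: str = '') -> str:
    body = atom[['symbol', 'x', 'y', 'z']].to_csv(
        sep=' ', float_format='%    .8f', header=False, index=False,
        quoting=csv.QUOTE_NONE, escapechar=' ')
    return '{}\n{}\n{}'.format(len(atom), comment, body)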

NWChem input

Is your feature request related to a problem? Please describe.
So, the nwchem input generator is a bit of a mess as it stands. I had to add a lot of inputs to be able to use it, and I feel it might be good to use a different approach.

Describe the solution you'd like
I think it would be good if we could simplify it with templates. At least, I have found that approach to be the most useful when I needed to generate a lot of files from displaced structures that use exactly the same input and where the only thing that changes is the coordinates. Maybe we can add a method to the class that takes a template (or templates) as input and splices in the coordinates (see the sketch below).
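
A rough sketch of the template route using only the standard library (the template contents and function names below are illustrative, not a proposed exatomic API):

from string import Template

NWCHEM_TEMPLATE = Template("""\
start $title
geometry units angstrom
$coords
end
basis
 * library 6-31G
end
task scf energy
""")

def render_input(title, atom):
    # atom: a dataframe with 'symbol', 'x', 'y', 'z' columns (e.g. uni.atom)
    coords = '\n'.join(' {} {:14.8f} {:14.8f} {:14.8f}'.format(s, x, y, z)
                       for s, x, y, z in atom[['symbol', 'x', 'y', 'z']].itertuples(index=False))
    return NWCHEM_TEMPLATE.substitute(title=title, coords=coords)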

Describe alternatives you've considered
Appending more and more variables.

Additional context
I wonder if it would also be worthwhile to make a few basic templates for some of the more common calculations, like geometry optimizations and whatnot. Open to any suggestions you might have.

Memory check for adding molecular orbitals

The add_molecular_orbitals function should estimate memory usage prior to attempting to build fields and raise a memory error if necessary. For example, using an exatomic-provided universe...

import exatomic
from exatomic.base import resource
u = exatomic.Universe.load(resource("adf-lu.hdf5"))
u.add_molecular_orbitals(field_params=dict(rmin=-10, rmax=10, nr=nr),
                         vector=sorted(u.momatrix['orbital'].unique()))

...depending on the value of nr in the above example, one can quickly exhaust RAM. In production work, one may want to inspect (at high quality) hundreds of orbitals. The warning/memory estimator should also account for the fact that the field data created is duplicated on the frontend (for visualization).
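
A back-of-the-envelope estimator for the example above (a hypothetical helper, not an existing exatomic function): each field is roughly an nr**3 grid of float64 values, and the data ends up duplicated on the frontend.

def estimate_field_memory_gib(n_orbitals, nr, bytes_per_value=8, copies=2):
    """Rough memory footprint, in GiB, of n_orbitals fields on an nr**3 grid."""
    return n_orbitals * nr**3 * bytes_per_value * copies / 1024**3

estimate_field_memory_gib(100, 101)   # ~1.5 GiB for 100 orbitals on a 101**3 grid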

API

From a design standpoint is it nicer to have an API that looks like:

universe.primitive_atoms

or

universe.get_primitive_atoms(inplace=True)
universe.primitive_atoms

In the former approach, primitive_atoms (a dataframe) is a property that will be computed if it doesn't exist. In the latter, it must first be explicitly computed before it can be accessed; note also that in the latter the default is to return the primitive_atoms rather than attach them to the universe.
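
For reference, a minimal sketch of the former (lazy property) approach; the class and method names are illustrative only:

class DemoUniverse:
    def __init__(self):
        self._primitive_atoms = None

    def compute_primitive_atoms(self):
        # stand-in for the real computation
        return ['H', 'O', 'H']

    @property
    def primitive_atoms(self):
        # computed on first access, cached thereafter
        if self._primitive_atoms is None:
            self._primitive_atoms = self.compute_primitive_atoms()
        return self._primitive_atoms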

Incorrect units conversion in diffusion.einstein_relation()

Describe the bug
Time units are converted incorrectly for the computation of diffusion coefficients in diffusion.einstein_relation()

To Reproduce
Steps to reproduce the behavior:
1. Read in a dynamics universe and add time data (in ps) to the frame table
2. Compute D(t) using diffusion.einstein_relation()

import exatomic
from exatomic import qe   # assuming qe here is exatomic's qe subpackage
from exatomic.algorithms import diffusion

# 'scratch' points at the user's data directory
atom = qe.parse_xyz(scratch + 'waterbox/emass380/waterbox.pos', symbols=(['O']*64 + ['H']*128))

uni = exatomic.Universe(atom = atom)

atom['frame'] = atom['frame'].astype(int)
atomnve = atom[(atom['frame']>=30000) & (atom['frame']%1000==0)]
unve = exatomic.Universe(atom=atomnve)

unve.atom['label'] = unve.atom.get_atom_labels()

unve.atom['label'] = unve.atom['label'].astype(int)

# Add the unit cell dimensions to the frame table of the universe
a = 23.46
for i, q in enumerate(("x", "y", "z")):
    for j, r in enumerate(("i", "j", "k")):
        if i == j:
            unve.frame[q+r] = a
        else:
            unve.frame[q+r] = 0.0
    unve.frame["o"+q] = 0.0
unve.frame['periodic'] = True

unve.frame['time'] = unve.frame.index*0.000145
Dt = diffusion.einstein_relation(unve)
Dt.plot()
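
For reference, the conversion the Einstein relation needs if the positions are in bohr (atomic units of length) and frame['time'] is supplied in picoseconds; D = MSD / (6 t) in bohr^2/ps then scales to cm^2/s as follows (a sketch of the expected factor, not the current implementation):

BOHR_TO_CM = 5.29177210903e-9      # 1 bohr in cm
PS_TO_S = 1.0e-12                  # 1 ps in s
BOHR2_PER_PS_TO_CM2_PER_S = BOHR_TO_CM**2 / PS_TO_S   # ~2.80e-5

def to_cm2_per_s(d_bohr2_per_ps):
    return d_bohr2_per_ps * BOHR2_PER_PS_TO_CM2_PER_S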

Expected behavior
Obtain an accurate D(t) with units of cm^2/s.
Desktop (please complete the following information):

  • OS: Linux
  • Browser: chrome

Update documentation

I propose we move away from hand-written reStructuredText (rst) files. We can use sphinx-apidoc to generate rst files for all subpackages/modules. A slight modification to index.rst will let the docs use the programmatically generated API docs. We should of course keep the getting-started guide and other helpful pieces (sphinx-apidoc is for the API only).

Unit cells are not fully supported

Currently only a simple cubic unit cell is displayed (if available). This is a complete hack and should be fixed. The relevant code is in exatomic/widgets/traits.py:frame_traits and js/src/appthree.js:add_unit_axis. There is a comment in that function that says "Hack to also add unit...", which is the place to start on the frontend side of things.

This code (and the Python code in traits.py) should be generalized to support an arbitrary cell and to support unique cells for every frame in a multiframe universe.

labels check box in gui display options

We need to include an option to display atom labels in the 3D render of the universe. This enables things like computing bond summary tables, because the code currently can't tell apart (for example) radial from axial bonding schemes and simply groups all of those bonds into one average bond length.

qe parser

Bug with string typing/decoding in exatomic/qe/cp/dynamics.py.
Input: parse_symbols_from_input(path)
Error: 'in <string>' requires string as left operand, not bytes
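
A minimal illustration of that error and the usual fix, decoding the bytes before the in comparison (standalone, not the parser code itself):

text = 'ATOMIC_SPECIES  O 15.999 O.pbe.UPF'   # str line from the input file
key = b'ATOMIC_SPECIES'                       # bytes that were never decoded

# key in text                  # TypeError: 'in <string>' requires string as left operand, not bytes
key.decode('utf-8') in text    # True once the bytes are decoded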

rcParams-like for styling to circumvent the gui

In the context of the workflow for snapping shots of isosurfaces: it is easy enough to save the canvas as a PNG once it is set up nicely. Now say I have my set of defaults: the camera position, the field index, the material of choice for atoms (spheres), the material of choice for surfaces, etc. These are not the defaults in the GUI and as such require manual setting each time a universe is loaded, which precludes automating the process. We should think about how to programmatically set these values before the rendering of a universe.

Although I am not sure this solves the automation problem, it would certainly be useful regardless of this use case.
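
A sketch of what an rcParams-like defaults container could hold (none of these keys exist in exatomic today; they only illustrate the idea):

display_params = {
    'camera.position': (0.0, 0.0, 40.0),
    'field.index': 0,
    'field.iso': 0.03,
    'atom.material': 'phong',
    'surface.material': 'standard',
}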

Compute Zero-Point Vibrational Corrections

Implement code that can be used to perform vibrational averaging as explained in the paper J. Phys. Chem. A, 2005, 109 (38), pp 8617–8623 (DOI: 10.1021/jp051685y).
This is a re-write of the authors' Perl and FORTRAN scripts in Python.

field values not getting passed to JS

auni = atomic.Cube(myfilepath).to_universe()
auni

shows the universe and hitting enter inside the isovalue option hits march_cubes2

console.log(field); // at the top of march_cubes2
returns

Field Object

values: NewFloat32Array[1]
0: NaN

On the python side I can look at auni.field and auni.field_values both of which look correct.

Small bug in XYZ parser

XYZ breaks with the following error message if there are 2 blank lines at the end of a single geometry.

line 45, in parse_atom
counts = nats.values.astype(np.int64)
ValueError: cannot convert float NaN to integer

Just need to dropna() on nats at some point before line 45, but this needs tests.
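
A minimal illustration of the failure and the suggested dropna() fix (standalone; nats mimics the per-geometry atom counts, with the trailing blank lines showing up as NaN):

import numpy as np
import pandas as pd

nats = pd.Series(['3', np.nan], dtype=object)

# nats.values.astype(np.int64)            # ValueError: cannot convert float NaN to integer
nats.dropna().values.astype(np.int64)     # array([3])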

Implementation of the universal force field

Is your feature request related to a problem? Please describe.
Given an exatomic.Universe object, it would be really convenient to be able to perform molecular mechanics. A simple, universal way to do this would be to implement the UFF (universal force field).

Describe the solution you'd like
This would involve building a framework for computing molecular mechanics that is flexible to accept any type of classical force field/parameter set.
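
To give a flavor of the kind of per-term functions such a framework would collect, here is a tiny sketch of a harmonic bond-stretch term, E = 0.5 k (r - r0)^2; the parameters are placeholders, not real UFF values:

import numpy as np

def bond_stretch_energy(xyz_i, xyz_j, k, r0):
    """Harmonic bond stretch: E = 0.5 * k * (r - r0)**2."""
    r = np.linalg.norm(np.asarray(xyz_i) - np.asarray(xyz_j))
    return 0.5 * k * (r - r0)**2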

Describe alternatives you've considered
One alternative would be to provide a shim/connector or better support for external software that performs molecular mechanics. The downside is needing to have third-party software compiled/installed and configured in order to perform calculations locally. One upside is not needing to put together a framework for force fields.

Re-factor of the javascript extensions

Given that all the relevant data from a universe is shipped to a UniverseWidget on construction, it is entirely possible that the widgets can be re-factored to be exported and embedded in standalone HTML pages. There are a few main problems with the current implementation preventing this.

1: The current GUI implementation relies on logic requiring the Python kernel to execute updates to the frontend. This can be seen with just the Folder widget, whose "view" is essentially controlled on the Python side. This requires a custom extension of the Folder on the javascript side so that view-update logic happens in the frontend.

// Assumes the usual widget extension imports, e.g.:
// const control = require('@jupyter-widgets/controls');
// const _ = require('underscore');

class FolderModel extends control.VBoxModel {

    defaults() {
        return _.extend(super.defaults(), {
            _model_name: "FolderModel",
            _view_name: "FolderView",
            _model_module_version: semver,
            _view_module_version: semver,
            _model_module: "exatomic",
            _view_module: "exatomic",
            show: false
        })
    }

}

class FolderView extends control.VBoxView {

    initialize(parameters) {
        super.initialize(parameters);
        this.pWidget.addClass('widget-folder');
    }

    update_children() {
        // Logic here to either show just the controller Button
        // or the entire folder of widgets
    }

}

2: Many of the ExatomicScene traitlets are updated by GUI widgets via observe methods. Alternatively, the scene traitlets could be linked to the GUI widgets via jslink (see the sketch below), which would put most of the scene update logic directly into the frontend as well.
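
A generic illustration of jslink (plain ipywidgets, no exatomic specifics): the two 'value' traits stay in sync purely in the frontend, with no round trip through the Python kernel. The same call would link a GUI control to an ExatomicScene trait; the trait names are omitted here because they are implementation specific.

import ipywidgets as widgets

a = widgets.FloatSlider(description='iso', min=0.0, max=0.1, step=0.001)
b = widgets.FloatText()
widgets.jslink((a, 'value'), (b, 'value'))   # synchronization happens in the browser
widgets.VBox([a, b])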

Admittedly, these two tasks may not be sufficient to actually accomplish this goal. However, without refactoring both of them, I think it would be impossible to tell.

Garbage collecting fields

It looks like the close button doesn't quite take care of everything as expected. After closing a universe with high-resolution fields, on computers with smaller amounts of RAM the jupyter notebook server starts spitting out this message:

Malformed HTTP message from 127.0.0.1: Content-Length too long

Also auto-save of the notebook fails and the whole browser can become sluggish.

Perhaps some of the other jupyter projects have run into this, but messing with the iopub data rate limits and whatnot may have unintended consequences.

Additionally, the order of operations for replacing fields in a universe is a bit backwards. If replace is True, it should ideally delete the old fields and then compute the new ones, whereas right now it computes the new ones, then deletes the old ones, then attaches the new ones to the universe.

Transformation manipulations for the atom table

It would be convenient to perform operations such as rotations and translations on the atom table via simple methods:

uni.atom.rotate(theta=10)    # degrees

Supposing theta is the angle in the xy plane, this function would rotate each frame's atomic coordinates by 10 degrees in the xy plane. Another example:

uni.atom.translate(dx=-10)

This example would translate all x coordinates by minus 10 units. These functions could be extended to other convenient methods such as uni.atom.center(.....).
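
A hedged sketch of what atom.rotate(theta=...) could do for the in-plane (xy) case; atom is assumed to be a dataframe with 'x' and 'y' columns, and this is not an existing exatomic method:

import numpy as np
import pandas as pd

def rotate_xy(atom: pd.DataFrame, theta: float) -> pd.DataFrame:
    """Rotate coordinates by theta degrees about the z axis."""
    rad = np.deg2rad(theta)
    c, s = np.cos(rad), np.sin(rad)
    out = atom.copy()
    out['x'] = c * atom['x'] - s * atom['y']
    out['y'] = s * atom['x'] + c * atom['y']
    return out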

Compute molecule center of mass for periodic systems

Currently not implemented (if periodic = True in Universe.frame, Universe.compute_molecule_com will raise an error). A workaround (if molecules are not spliced across the periodic boundary condition) is to set periodic to False.

To fix this issue, we need to account for spliced molecules in periodic boundary conditions; some of this machinery has been implemented as part of the "visual atom" table. Secondly, we may or may not need to rework the "projected atom" table (the 3x3x3 projection of the periodic unit atom table).
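
One standard approach for the periodic center of mass, sketched per coordinate (map the coordinate onto a circle, average in (cos, sin) space, map back; see Bai and Breen, 2008). This is a sketch of the idea only, not the machinery in the visual/projected atom tables:

import numpy as np

def periodic_com_1d(x, masses, box_length):
    theta = 2 * np.pi * np.asarray(x) / box_length
    m = np.asarray(masses)
    xi = np.average(np.cos(theta), weights=m)
    zeta = np.average(np.sin(theta), weights=m)
    return box_length * (np.arctan2(-zeta, -xi) + np.pi) / (2 * np.pi)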

Support for system creation

Currently we have no support for creating molecules, crystals, periodic cells, etc. This is a long term issue to track the development of these features.

Centralized quantum-code input generator and output reader

I think we could really benefit from a class that takes a universe as input to generate inputs for any quantum code, or takes file paths and reads the respective outputs.
Something I have been playing around with for an output reader:

import os
import re

import numpy as np
import pandas as pd

from exatomic import gaussian


class VA:
    # Map user-supplied program names to the corresponding output parser class.
    prog = {'gauss': gaussian.Fchk, 'gaussian': gaussian.Fchk}

    def __init__(self, path, soft, temp=None, *args, **kwargs):
        self.soft = self.prog[soft]
        self.path = path

    def get_gradients(self):
        grad_path = self.path + "gradient/"
        gradient = []
        for file in os.listdir(grad_path):
            if not file.endswith(".fchk"):
                continue
            ed = self.soft(grad_path + file)
            ed.parse_gradient()
            df = ed.gradient
            # Label each gradient with the integer index embedded in the file name.
            fdx = list(map(int, re.findall(r'\d+', file)))
            if len(fdx) != 1:
                raise NotImplementedError(
                    "Cannot determine a single integer label for the gradient dataframe")
            df['file'] = np.tile(fdx, len(df))
            gradient.append(df)
        self.gradients = pd.concat(gradient).reset_index(drop=True)
        self.gradients.sort_values(by=['file', 'label'], inplace=True)
        self.gradients.reset_index(drop=True, inplace=True)
        self.force_vector = self.gradients[['Z', 'label', 'symbol', 'frame', 'file']].copy()
        self.force_vector['vector'] = np.linalg.norm(self.gradients[['fx', 'fy', 'fz']].values, axis=1)

I tried using a predefined dictionary where you pass in a string and it looks up the key in the dict; if it's not found we could raise a NotImplementedError to exit safely. Another way I tried was to have the user pass the output parser class to be used directly; both approaches worked fine in this implementation.
This code is for a very specific example, but we could probably generalize it to look in a user-defined directory for the files.

Robust electronic structure visualization

The "add_molecular_orbitals" and associated functionality seems to more or less work for the quantum codes that have the required parsers (molcas, gaussian, nwchem?, adf used to?). However, due to internal differences in the order of operations and conventions for dealing with segmented/contracted/cartesian/spherical/cubic functions, it is not easy to guarantee that evaluation of molecular orbitals on a grid by "quantum-code agnostic" routines gives necessarily idential results to the quantum codes themselves.

This issue tracks progress towards what essentially amounts to implementing computation of the overlap matrix and validating it against a given set of molecular orbital coefficients. This procedure should be done on the fly, "per universe", and throw a warning if something doesn't look quite right. This type of check will probably never be computed for STOs.
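
The validation described above, in matrix form: for orthonormal MOs, the coefficient matrix C and the AO overlap matrix S should satisfy C^T S C = I. A minimal check (hypothetical helper; the array layouts in exatomic may differ):

import numpy as np

def mos_look_orthonormal(C, S, tol=1e-6):
    """C: AO-by-MO coefficient matrix, S: AO overlap matrix."""
    return np.allclose(C.T @ S @ C, np.eye(C.shape[1]), atol=tol)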

From absolute positions to sliced nearest neighbors

This issue (epic) will track progress on the workflow from periodic absolute nuclear coordinates through projected coordinates, visual coordinates, and selection of nearest neighbors by various methods of classification.

implement write_xyz

Should support writing a universe as a trajectory xyz or as individual xyz files.

Also include the units and put something meaningful in the comment line.
