znh5md's People

Contributors

dependabot[bot], pre-commit-ci[bot], pythonfz

znh5md's Issues

retry if file locked

Add a retry option for writing. This could be useful, for example, if the file is being read via zndraw at the same time.
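
A minimal sketch of what such a retry option could look like. The helper name `retry_on_locked` and its parameters are hypothetical and not part of ZnH5MD, and it assumes that a locked HDF5 file surfaces as an `OSError` (or a subclass of it):

```python
import time


def retry_on_locked(func, attempts=5, delay=0.1, exceptions=(OSError,)):
    """Call ``func`` until it succeeds or ``attempts`` is exhausted.

    ``func`` is any zero-argument callable, e.g. a lambda wrapping the write.
    Assumption: a locked file raises ``OSError`` or one of its subclasses.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # give up and surface the last error
            time.sleep(delay * attempt)  # wait a bit longer after each failure
```

The write call would then be wrapped, for example `retry_on_locked(lambda: db.add(reader))`.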

Property not available in between

Read from a list of ASE Atoms objects in which only a few have a calculator and the others don't.
This currently raises some strange index errors.

Lazy Sequence support

It would be a great feature, especially for `znh5md.ASEH5MD`, to have:

  • an iterator that prefetches data (if __iter__ is called, load the next n batches and not just one at a time for better performance when looping)
  • a lazy sequence, if accessed atoms[:20] the first 20 atoms objects are kept in memory and not loaded again
  • an interface based on numpy and not dask because there is no benefit in using dask here.
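
The first two points above could be sketched roughly as follows. This is an illustration only: `LazySequence` and the `loader` callable (which takes a list of indices and returns the matching items) are hypothetical names, not existing ZnH5MD API.

```python
class LazySequence:
    """Load items on demand and keep what was loaded in memory.

    ``loader`` is a hypothetical callable that takes a list of indices and
    returns the matching items (for example Atoms objects read from disk).
    """

    def __init__(self, loader, length, prefetch=10):
        self._loader = loader
        self._length = length
        self._prefetch = prefetch
        self._cache = {}

    def __len__(self):
        return self._length

    def _ensure_loaded(self, indices):
        missing = [i for i in indices if i not in self._cache]
        if missing:  # a single read for all items that are not cached yet
            for i, item in zip(missing, self._loader(missing)):
                self._cache[i] = item

    def __getitem__(self, key):
        if isinstance(key, slice):
            indices = list(range(*key.indices(self._length)))
            self._ensure_loaded(indices)
            return [self._cache[i] for i in indices]
        index = key + self._length if key < 0 else key
        if not 0 <= index < self._length:
            raise IndexError(key)
        self._ensure_loaded([index])
        return self._cache[index]

    def __iter__(self):
        for start in range(0, self._length, self._prefetch):
            # fetch a whole batch up front instead of one item per step
            yield from self[start:start + self._prefetch]
```

With this design, accessing `seq[:20]` twice triggers only one call to the loader, and looping over the sequence reads `prefetch` items per call.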

`znh5md convert` support `*.gro` and `.xtc`

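import ase.io
import znh5md

# reference structure that provides the atomic numbers
ref = ase.io.read("nvt_eq.gro")

reader = znh5md.io.ChemfilesReader("nvt_eq.xtc")
db = znh5md.io.DataWriter("nvt_eq.h5")
db.initialize_database_groups()
db.add(reader)

x = znh5md.ASEH5MD("nvt_eq.h5")
atoms_list = x.get_atoms_list()
for atoms in atoms_list:
    # copy the species from the .gro reference onto every frame
    atoms.set_atomic_numbers(ref.get_atomic_numbers())
ase.io.write("nvt_eq.xyz", atoms_list)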

add `GroupNotFound` error

If a group does not exist in the file but someone tries to access it, raise a `GroupNotFound` error.
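
A possible shape for this, as a sketch only. The helper `get_group` is a hypothetical name, and deriving the error from `KeyError` is an assumption made here so that existing `except KeyError` handlers keep working:

```python
class GroupNotFound(KeyError):
    """Raised when a requested group is missing from the file."""


def get_group(file, name):
    """Return ``file[name]`` or raise ``GroupNotFound``.

    ``file`` can be any mapping-like object; an open ``h5py.File`` supports
    the same ``in`` / ``[]`` access that is used here.
    """
    if name not in file:
        raise GroupNotFound(f"Group '{name}' does not exist in the file")
    return file[name]
```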

Make variable PBC optional

Because variable PBC is not part of H5MD, this should be a kwarg, and enabling it should log a warning that the file might not be readable!
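
One way the opt-in could behave, sketched under assumptions: the function name `resolve_pbc` and the kwarg name `variable_pbc` are hypothetical, and raising an error when the PBC values change without the kwarg is a design choice made for this illustration.

```python
import logging

log = logging.getLogger(__name__)


def resolve_pbc(pbc_per_frame, variable_pbc=False):
    """Decide which PBC values to store.

    ``pbc_per_frame`` is a non-empty list with one (x, y, z) tuple of
    booleans per frame. Returns a single tuple by default, or the full
    per-frame list when ``variable_pbc`` is enabled.
    """
    if variable_pbc:
        log.warning(
            "Storing per-frame PBC is not part of H5MD; "
            "the file might not be readable."
        )
        return [tuple(pbc) for pbc in pbc_per_frame]
    unique = {tuple(pbc) for pbc in pbc_per_frame}
    if len(unique) > 1:
        raise ValueError("PBC changes between frames; pass variable_pbc=True")
    return unique.pop()
```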

Support for changing number of particles

Currently ZnH5MD assumes a constant number of particles per trajectory. In the IPSuite use case, for example, that might not hold, so we should support changes in the number of particles.
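
One possible approach is to pad every frame with NaN up to the largest particle count and strip the padding again on read. This is an assumption made for illustration, not a decision recorded in this issue, and the helper names `pad_frames` and `unpad_frame` are hypothetical:

```python
import numpy as np


def pad_frames(frames):
    """Stack frames of shape (n_atoms_i, 3) into (n_frames, max_atoms, 3)."""
    max_atoms = max(len(frame) for frame in frames)
    padded = np.full((len(frames), max_atoms, 3), np.nan)
    for i, frame in enumerate(frames):
        padded[i, : len(frame)] = frame  # rows past len(frame) stay NaN
    return padded


def unpad_frame(padded_frame):
    """Drop the rows that consist only of NaN padding."""
    keep = ~np.isnan(padded_frame).all(axis=1)
    return padded_frame[keep]
```

The padded array has a fixed shape and can be stored as a regular dataset, at the cost of extra storage for short frames.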

MDAnalysis can't read files

Reading the file with MDAnalysis fails with

KeyError: "Can't open attribute (can't locate attribute: 'dimension')"

using

import MDAnalysis as mda

u = mda.Universe.empty(n_atoms=300)
u.load_new('nodes/ASEMD/trajectory.h5', format='H5MD', in_memory=True)

Support reader to atoms_list

Currently one has to do:

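import ase.io
import znh5md

# reference structure that provides the atomic numbers
ref = ase.io.read("nvt_eq.gro")

reader = znh5md.io.ChemfilesReader("nvt_eq.xtc")
db = znh5md.io.DataWriter("nvt_eq.h5")
db.initialize_database_groups()
db.add(reader)

x = znh5md.ASEH5MD("nvt_eq.h5")
atoms_list = x.get_atoms_list()
for atoms in atoms_list:
    # copy the species from the .gro reference onto every frame
    atoms.set_atomic_numbers(ref.get_atomic_numbers())
ase.io.write("nvt_eq.xyz", atoms_list)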

but would like to do:

reader = znh5md.io.ChemfilesReader("nvt_eq.xtc")
for atoms in reader.get_atoms_list(): ...

use `multiprocessing` for `remove_nan`

This operation maps a function over a list, so it could use

import multiprocessing

num_cores = multiprocessing.cpu_count()
with multiprocessing.Pool(processes=num_cores) as pool:
    # process_single_structure must be a picklable, module-level function
    structures = pool.map(process_single_structure, data_array)

for better performance.

add mask by species / name wrapping traj.position.get_dataset(species=["Na", "Cl"]) or `species=[1, 2]`

There are a few different possibilities to solve this:

  1. Assume indices are ordered, create a mask once and apply it to every batch
  2. Assume that the key species always exists and add species=["Na", "Cl"]. Iterate over the species dataset and create a mask for every step / batch
  3. Pass a dataset, e.g. traj.species to the get_dataset (or some variation) together with a filter for full flexibility if you want to filter by species, ids, charge, ...
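
The first option above (build the mask once, apply it to every batch) could be sketched with numpy like this. The function names are hypothetical, and the batch layout `(n_confs, n_atoms, n_dim)` is assumed:

```python
import numpy as np


def species_mask(species, selection):
    """Boolean mask over atoms whose species is in ``selection``."""
    return np.isin(np.asarray(species), selection)


def select_atoms(batch, mask):
    """Apply a per-atom mask to a batch of shape (n_confs, n_atoms, n_dim)."""
    return batch[:, mask]
```

The same mask works for names such as `["Na", "Cl"]` and for integer species such as `[1, 2]`, but it is only valid as long as the atom order does not change between configurations.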

Test on Windows

ZnH5MD doesn't seem to work on Windows.

  • fix encoding of boundary
  • check paths

Improvement List

Change List

  • species handling "particles//velocity". See MDSuite database or Espresso dump; no files available
  • #3
  • add batch_size as argument to get_dataset.
  • transpose to be always (n_confs, n_atoms, n_dim) via argument
  • loop_indices should support slices like [::2] every second
  • default prefetch to batch size
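
Slice support combined with a batch size could look like the sketch below. `batched_indices` is a hypothetical helper, not the existing `loop_indices` implementation:

```python
def batched_indices(n_confs, selection=slice(None), batch_size=2):
    """Yield lists of configuration indices, ``batch_size`` at a time.

    ``selection`` is a slice such as ``slice(None, None, 2)`` for ``[::2]``.
    """
    indices = range(*selection.indices(n_confs))
    for start in range(0, len(indices), batch_size):
        yield list(indices[start:start + batch_size])
```

`slice.indices` resolves open-ended and negative bounds against the number of configurations, so the batching logic only ever sees a plain `range`.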

Write Files

It could be possible to write to the database by assigning traj.positions = tf.data.Dataset, which would then be written to H5MD. A property setter is probably not sufficient, because one might want to pass additional arguments. An alternative could be znh5md.update_database(property=traj.position, dataset=tf.data.Dataset, append=True)

Add calculator support to `ase.Atoms.todict`

The todict / fromdict methods don't support calculators. Use SinglePointCalculator to also support this information, including the full results.
Also add to_json() and from_json() methods.

`ase.io.iread` and `ase.io.write`

This library is designed around ase. I don't think the dask features are used by anyone.
Thus I'd suggest redesigning it fully around ase.

This would include

  • maybe use pyh5md
  • include ase.io.iread compatibility (eventually make this an ASE feature)
  • ase.io.write

Some context managers for better performance

Can we use context managers?

ZnH5MD often uses file = open(filename) without closing the file.
We should check if we can replace this with context managers.
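
The pattern in question, shown on a plain text file as a minimal sketch. The function names are illustrative only; `h5py.File` supports the same `with` protocol:

```python
import os
import tempfile


def read_first_line(filename):
    """Read the first line; the file is closed when the with-block exits."""
    with open(filename) as file:
        return file.readline()


def demo():
    """Write a small temporary file and read its first line back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        with open(path, "w") as file:
            file.write("hello\nworld\n")
        return read_first_line(path)
```

Because the file is closed even when the body raises, no handle is left open after an error.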

PBC as dynamic property

H5MD does not allow for PBC to be dynamic. For IPS we can either

  • allow dynamic PBC
  • make each configuration a new particle group
