cfiles's Introduction

Chemfiles: a library for reading and writing chemistry files

Chemfiles is a high-quality library for reading and writing trajectory files created by computational chemistry simulation programs. To help you access the information stored in these files (atomic positions, velocities, names, topology, etc.), Chemfiles provides a simple and unified interface to a variety of file formats.

  • unified: the same code will work with all supported formats;
  • simple: the interface is easy to use and extensively documented.

You can use Chemfiles to conduct post-processing analysis and extract physical information about the systems you're simulating, to convert files from one format to another, to write trajectories with your own simulation software, and for anything else that requires reading or writing the file formats used in computational chemistry.

Chemfiles is used in several scientific software packages:

  • cfiles provides ready-to-use analysis algorithms for simulation trajectories as a command line tool;
  • lemon is a framework for rapidly mining structural information from the Protein Data Bank;
  • lumol is a prototype of a universal, extensible molecular simulation engine, supporting both molecular dynamics and Metropolis Monte Carlo simulations;
  • ANA detects cavities and calculates their volume and flexibility in macromolecular structures and molecular dynamics trajectories.

This repository contains the core of the chemfiles library — written in C++11, with a C99 interface. You can also use chemfiles from other languages: Python 2&3, Fortran, Rust, and Julia.

Is chemfiles for you?

You might want to use chemfiles if any of these points appeals to you:

  • you don't want to spend time writing and debugging a file parser;
  • you use binary formats because they are faster and take up less disk space;
  • you write analysis algorithms and want to read more than one trajectory format;
  • you write simulation software and want to use more than one format for input or output.

Other libraries do roughly the same job as chemfiles; have a look at them if chemfiles is not for you. We also explain here why we could not use them instead of creating a new library.

  • OpenBabel is a C++ library providing conversions between more than 110 formats. It is more complex than chemfiles, and distributed under the GPL license.
  • VMD molfile plugins are a collection of plugins written in C and C++ used by VMD to read/write trajectory files. They do not support a variable number of atoms in a trajectory.
  • MDTraj, MDAnalysis, cclib are Python libraries providing analysis and read capabilities for trajectories. Unfortunately, they are only usable from Python.

Chemfiles Features

  • Reads both text (XYZ, PDB, ...) and binary (NetCDF, TNG, ...) file formats;
  • Transparently reads and writes compressed files (.gz, .xz and .bz2);
  • Filters atoms with a rich selection language, including constraints on multiple atoms;
  • Supports non-constant numbers of atoms in trajectories;
  • Easy-to-use programming interface in Python, C++, C, Fortran 95, Julia and Rust;
  • Cross-platform and usable from Linux, OS X and Windows;
  • Open source and freely available (3-clause BSD license).

Contact / Contribute / Cite

Chemfiles is free and open source. Your contributions are always welcome!

If you have questions or suggestions, or need help, please open an issue or join us on our Gitter chat room.

If you are using Chemfiles in a published scientific study, please cite us using the following DOI: https://doi.org/10.5281/zenodo.3653157.

Getting Started

Here, we'll help you get started with the C++ and C interface. If you want to use Chemfiles with another language, please refer to the corresponding documentation.

Installing Compiled Packages

We provide compiled packages of the latest Chemfiles release for Linux distributions. You can use your package manager to download them here.

We also provide conda packages in the conda-forge community channel for Linux and OS X. This package provides the C++, C and Python interfaces. Install the conda package by running:

conda install -c conda-forge chemfiles

Find more information about pre-compiled packages in the documentation.

Building from Source

You will need cmake and a C++11 compiler.

git clone https://github.com/chemfiles/chemfiles
cd chemfiles
mkdir build
cd build
cmake ..
make
make install

Usage Examples

This is what the interface looks like in C++:

#include <iostream>
#include "chemfiles.hpp"

int main() {
    chemfiles::Trajectory trajectory("filename.xyz");

    auto frame = trajectory.read();
    std::cout << "There are " << frame.size() << " atoms in the frame" << std::endl;

    auto positions = frame.positions();
    // Do awesome science with the positions here !
}

License

Guillaume Fraux created and maintains Chemfiles, which is distributed under the 3-clause BSD license. By contributing to Chemfiles, you agree to distribute your contributions under the same license.

Chemfiles depends on multiple external libraries, which are distributed under their respective licenses. The licenses of all external libraries should be compatible with chemfiles's 3-clause BSD license. One notable exception, depending on your use case, is Gemmi, which is distributed under the Mozilla Public License version 2. You can use the CHFL_DISABLE_GEMMI=ON CMake flag to remove this dependency.

The AUTHORS file lists all contributors to Chemfiles. Many thanks to all of them!

cfiles's Issues

Add and use angle conversion function

We could really use deg2rad and rad2deg functions for degrees => radians and radians => degrees conversions.

This is an easy issue, with mentoring and help from me available if you want to make this your first contribution.
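As a starting point, the two helpers could look like the minimal sketch below. The names deg2rad and rad2deg come from this issue; the PI constant and the exact signatures are assumptions, not existing cfiles code.

```cpp
#include <cassert>
#include <cmath>

// pi computed portably: M_PI is not guaranteed by the C++ standard
const double PI = std::acos(-1.0);

// Convert an angle from degrees to radians
inline double deg2rad(double degrees) {
    return degrees * PI / 180.0;
}

// Convert an angle from radians to degrees
inline double rad2deg(double radians) {
    return radians * 180.0 / PI;
}
```

The same rad2deg helper could then be reused wherever angles are written to output files.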

wrap and unwrap algorithms

cfiles wrap would rewrap all atoms inside the unit cell.

cfiles unwrap would try to unwrap atoms, using the velocities to predict and correct boundary crossing.
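
A minimal sketch of the wrapping step, assuming an orthorhombic cell with its origin at a corner and positions stored as plain arrays. Vector3, wrap_coordinate and wrap_positions are hypothetical names used for illustration only; triclinic cells and the velocity-based unwrap are not covered.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector3 = std::array<double, 3>;

// Wrap a single coordinate into the interval [0, length)
inline double wrap_coordinate(double x, double length) {
    assert(length > 0.0);
    double wrapped = x - length * std::floor(x / length);
    // Rounding can produce exactly `length` for tiny negative x
    if (wrapped >= length) {
        wrapped -= length;
    }
    return wrapped;
}

// Wrap all positions inside an orthorhombic cell with the given side lengths
inline void wrap_positions(std::vector<Vector3>& positions, const Vector3& lengths) {
    for (auto& position : positions) {
        for (std::size_t i = 0; i < 3; i++) {
            position[i] = wrap_coordinate(position[i], lengths[i]);
        }
    }
}
```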

Trajectory information

We could add a cfiles info <file> that would query and print metadata on a file. What could be interesting:

  • number of steps;
  • number of atoms (variable or constant);
  • unit cell (if constant);
  • presence of velocities;
  • any other ideas?

Add selections in Convert

Could we add an option to the Convert command to write only a given selection?
For example, I have a nanotube and water in my system and I would like a file with only the water molecules. I could use
cfiles convert --selection="only water" myfile.xyz water.xyz
Would it be consistent with the purpose of the command?

Cannot build with g++ 11 or later

[  0%] Building CXX object CMakeFiles/docopt.dir/external/docopt/docopt.cpp.o
In file included from /tmp/cfiles/external/docopt/docopt.h:12,
                 from /tmp/cfiles/external/docopt/docopt.cpp:9:
/tmp/cfiles/external/docopt/docopt_value.h: In member function 'void docopt::value::throwIfNotKind(Kind) const':
/tmp/cfiles/external/docopt/docopt_value.h:98:36: error: 'runtime_error' is not a member of 'std'
   98 |                         throw std::runtime_error(std::move(error));
      |                                    ^~~~~~~~~~~~~
/tmp/cfiles/external/docopt/docopt_value.h: In member function 'long int docopt::value::asLong() const':
/tmp/cfiles/external/docopt/docopt_value.h:286:44: error: 'runtime_error' is not a member of 'std'
  286 |                                 throw std::runtime_error( str + " contains non-numeric characters.");
      |                                            ^~~~~~~~~~~~~

and many more errors in docopt. The solution is to add #include <stdexcept> to external/docopt/docopt_value.h. This fixes the build:

diff --git a/external/docopt/docopt_value.h b/external/docopt/docopt_value.h
index a923219..2003a7a 100644
--- a/external/docopt/docopt_value.h
+++ b/external/docopt/docopt_value.h
@@ -13,6 +13,7 @@
 #include <vector>
 #include <functional> // std::hash
 #include <iosfwd>
+#include <stdexcept>
 
 namespace docopt {
 

Density profile

An analysis algorithm computing the density profile of some particle along a given axis.

The algorithm should support a profile along a basis axis (x, y, or z), or along an axis given by the user as three coordinates ([2,2,1] for example).
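
The core of such an algorithm is a histogram of the atomic positions projected on the normalized axis. The sketch below shows that step only, for a single frame; axis_histogram and Vector3 are hypothetical names, and turning the raw counts into a density (dividing by the slab volume and the number of frames) is left out.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vector3 = std::array<double, 3>;

// Count how many positions project into each bin along `axis`,
// for projections in the interval [min, max)
inline std::vector<double> axis_histogram(
    const std::vector<Vector3>& positions, Vector3 axis,
    double min, double max, std::size_t nbins
) {
    assert(nbins > 0 && max > min);
    double norm = std::sqrt(axis[0] * axis[0] + axis[1] * axis[1] + axis[2] * axis[2]);
    assert(norm > 0.0);
    // Normalize the axis so that [2,2,1] and [4,4,2] give the same profile
    for (auto& component : axis) {
        component /= norm;
    }

    std::vector<double> counts(nbins, 0.0);
    double width = (max - min) / static_cast<double>(nbins);
    for (const auto& p : positions) {
        double projection = p[0] * axis[0] + p[1] * axis[1] + p[2] * axis[2];
        if (projection < min || projection >= max) {
            continue; // outside of the histogram range
        }
        auto bin = static_cast<std::size_t>((projection - min) / width);
        if (bin >= nbins) {
            bin = nbins - 1; // guard against rounding at the upper edge
        }
        counts[bin] += 1.0;
    }
    return counts;
}
```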

Distance distribution

As there is a command for angle distribution, we could implement either a second command for distance distribution or do a "distribution" command where you can choose between angle or distance distribution.

Throw an error when no angles are found

Running cfiles angle -s "..." should return an error if no angles matching the selection were found, to allow the user to check for typos. Currently the code just continues and outputs NaN in the data file.

We are already doing this for the rdf. Adding it for angles should only require adding the corresponding check to the Angle::accumulate function.

I am willing to mentor anyone wanting to work on this. If you are interested, please comment here or on gitter.

Catch exceptions in std::stod

In utils.cpp, we should handle invalid arguments (for the steps and unit cell parsers) by catching the std::invalid_argument exception thrown by std::stod.
For now, given a wrong argument (e.g. --cell=sdt:yr:4), the program only prints: "Error stod"
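
One possible shape for this is a small wrapper around std::stod that turns its exceptions into a message naming the offending input. This is a sketch only: parse_double is a hypothetical helper, not an existing function in utils.cpp, and the error messages are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Parse `input` as a double, throwing a descriptive error on failure
inline double parse_double(const std::string& input) {
    std::size_t processed = 0;
    double value = 0.0;
    try {
        value = std::stod(input, &processed);
    } catch (const std::invalid_argument&) {
        throw std::runtime_error("can not convert '" + input + "' to a number");
    } catch (const std::out_of_range&) {
        throw std::runtime_error("'" + input + "' is out of range for a number");
    }
    // std::stod stops at the first invalid character, so "4x" would be
    // accepted as 4 without this check
    if (processed != input.size()) {
        throw std::runtime_error("unexpected characters after the number in '" + input + "'");
    }
    return value;
}
```

The caller that splits --cell=a:b:c on ':' can then catch std::runtime_error once and print the message before exiting.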

Selections within selections

For now, it is possible to select atoms, bonds, angles or dihedrals on the basis of their components' name, type, coordinates...
It could be interesting to select atoms based on their bonded neighbours. We could create a selection inside a selection, for example:
"atoms: name O and bonded to {name H and x<5}"
It could also be interesting to add a "water" selection.

Improve test coverage

As of writing this issue, the test coverage is 53%. We should add more tests to bring this up to at least 80%; 90% would be better.

Tests are in the test directory, and they are Python scripts checking the output for different runs of the main binary.

Code coverage is then collected by codecov. If you want to help here, click on the codecov link, pick a file with less than 80% coverage, and add a test checking the output for this specific code path. If you don't know how to trigger a specific code path, ask here. I am willing to mentor anyone wanting to get started on the code with this issue.

Convert the angle distribution histogram to degrees

The histograms currently output the distribution against the angle in radians. I believe it would be easier to use and read if the angle was in degrees.

This involves modifying the Angles::finish function to convert angles from radians to degrees.

I am willing to mentor anyone wanting to work on this. If you are interested, please comment here or on gitter.

admin rights

One needs rights to write to /usr/local/bin/cfiles.

CMake Error at cmake_install.cmake:42 (file):
  file INSTALL cannot copy file
  "/home/X/XX/cfiles/build/cfiles"
  to "/usr/local/bin/cfiles".

Don't you think this is undesirable for this kind of tool?

MSD giving wrong results with trajectory within NPT ensemble

Hi, the trajectory is available at this link: https://www.dropbox.com/s/lez3brm58lg13ya/1000K-NPT-200ps.gro?dl=0. It was obtained from classical MD in the NPT ensemble for 200 ps at 1000 K.
Calculating the MSD with this command:
cfiles msd trj.gro --unwrap --selection "name O" -o O.dat
the result was:
[image: 1000K-NPT-C7 5-200ps]
This is abnormal: the MSD of the oxygen atoms is higher than that of the sodium atoms.
If I keep the cell size constant, using this command:
cfiles msd trj.xyz --unwrap --cell 16.03:16.03:16.03 --selection "name O" -o O.dat
the result was:
[image: Graph1]
which is much more reasonable in terms of experimental findings and other computational results.
@fxcoudert

Solvation number

Why not add an algorithm that computes the solvation number from the radial distribution function?
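
For reference, the usual approach is to integrate 4πρ g(r) r² from 0 up to a cutoff, typically the first minimum of g(r), where ρ is the number density of the solvent atoms. The sketch below does this with the trapezoidal rule on tabulated values; coordination_number and its parameters are hypothetical names, and the choice of the cutoff is left to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Running coordination number: 4 pi rho times the integral of g(r) r^2,
// from the first tabulated r up to `cutoff`, using the trapezoidal rule.
// `r` must be sorted in increasing order; `g` holds g(r) at the same points.
inline double coordination_number(
    const std::vector<double>& r, const std::vector<double>& g,
    double density, double cutoff
) {
    assert(r.size() == g.size());
    const double pi = std::acos(-1.0);
    double integral = 0.0;
    for (std::size_t i = 0; i + 1 < r.size() && r[i + 1] <= cutoff; i++) {
        double left = g[i] * r[i] * r[i];
        double right = g[i + 1] * r[i + 1] * r[i + 1];
        integral += 0.5 * (left + right) * (r[i + 1] - r[i]);
    }
    return 4.0 * pi * density * integral;
}
```

With g(r) = 1 everywhere, this reduces to the number of atoms in a sphere of radius cutoff at the given density, which makes for an easy sanity check.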
