computationalradiationphysics / libsplash
libSplash - Simple Parallel file output Library for Accumulating Simulation data using HDF5
License: GNU Lesser General Public License v3.0
@PrometheusPi reported a problem reading particle bool attributes with h5py in ComputationalRadiationPhysics/picongpu#682.
The problem is, of course, that HDF5 has no native boolean type, which is why we implemented a 1-byte (?) bitfield as a work-around.
h5py instead uses HDF5 enums as a work-around; these are then mapped directly to a 1-byte numpy bool while reading, see also the HDF5 Enum Docs (section 8).
The idea now is to switch to that representation as well to increase compatibility (that will increase the internal file format version).
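For reference, a minimal sketch of how such an enum-based bool type could be created with the HDF5 C API (the member names FALSE/TRUE mirror h5py's convention; variable names are illustrative, not libSplash code):

```c
/* sketch: an 8-bit enum type with FALSE=0, TRUE=1, matching h5py */
hid_t boolType = H5Tenum_create(H5T_NATIVE_INT8);
int8_t val = 0;
H5Tenum_insert(boolType, "FALSE", &val);
val = 1;
H5Tenum_insert(boolType, "TRUE", &val);
/* use boolType when writing the attribute, then H5Tclose(boolType) */
```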
Related to a misunderstanding in #127
add a unified header file (splash.h) which allows testing for parallel I/O support
I'll have to fix the Travis support.
Seems that the line sudo apt-get install -qq libcppunit-dev libboost-program-options-dev $APTMPI $APTHDF5
is broken, investigating...
we should also support linking with HDF5 statically
In principle, invalid offsets are bound to fail. However, zero reads must be allowed with invalid offsets.
Cross-posting of ComputationalRadiationPhysics/picongpu#330 since the grid/poly file splitting already happens in splash2xdmf.py.
During the deployment of CPack routines in #70 I realized a slight problem with the "huge number" (>1) of include files we produce.
Technically, a user only sees splash.h, but of course this one pulls in more header dependencies. If I wanted to install libSplash the clean way, I would specify a prefix like /usr (that's the default).
By that, our two libraries (.so, .a) go to /usr/lib/, our binaries (splashtools, later splash2txt) to /usr/bin, and our headers/includes to /usr/include.
So far, so nice.
Unfortunately, that introduces the risk of header file name collisions in /usr/include. As a work-around, boost uses a separate folder like boost/allHeaders.h (that is, of course, a kind of namespace approach).
I tried to realize this at install time, which is possible. But since we mix tests, examples and real tools a little (some depend on already-built libs, others get built directly from their sources), one must technically also move the source tree:
src/include/* -> src/include/splash/
to reflect that, and one has to change all internal includes to #include "splash/..." to be consistent.
I know it's a major change, but I suggest changing it, and the earlier the better.
How about adding a simple data format version to version.hpp?
A change in the libSplash version does not necessarily mean (in)compatible HDF5 files...
We could increment it (e.g. start with 1 in Splash 1.1) for any incompatible changes that occur later on.
We could also use major and minor for compatible and incompatible data format changes, but maybe a single number is enough.
This data format version should be added as standard meta data to each data file created with libSplash, especially for parallel I/O.
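A minimal sketch of the single-number variant (macro and function names are made up for illustration, not actual libSplash identifiers):

```cpp
#include <cassert>

// Hypothetical: one integer in version.hpp that only changes on
// incompatible file format changes, independent of the library version.
#define SPLASH_FILE_FORMAT_VERSION 1

// A reader could refuse files written with a newer, incompatible format.
inline bool canReadFormat(int fileFormatVersion)
{
    return fileFormatVersion <= SPLASH_FILE_FORMAT_VERSION;
}
```

Writers would store this number as a standard attribute in every file; readers compare it against their own constant before parsing anything else.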
Create Fortran bindings to make libSplash usable in Fortran programs
Parallel_SimpleDataTest has been disabled and is currently not tested
Add a FindSplash.cmake module for CMake's find_package.
This module should find Splash and set some useful but space-wasting variables like SPLASH_VERSION, SPLASH_FORMAT, SPLASH_PARALLEL, ... in CMake.
Copied to e.g. <INSTALL PATH>/cmake, it is found if $SPLASH_ROOT(/bin) is in $PATH: doc
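A sketch of how downstream projects might then consume it (the variable names are the ones proposed above; the exact set is still open):

```cmake
# hypothetical downstream usage of the proposed FindSplash.cmake
find_package(Splash REQUIRED)
if(SPLASH_PARALLEL)
    message(STATUS "Found parallel libSplash ${SPLASH_VERSION}")
endif()
include_directories(${SPLASH_INCLUDE_DIRS})
target_link_libraries(myApp ${SPLASH_LIBRARIES})
```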
Btw: I would prefer "talking" (doc) about Splash rather than libSplash, since otherwise we should call our libraries liblibsplash.so :)
Some write/read interfaces are quite long and have several overloaded versions. To increase usability, use helper/meta classes like Domain
and Selection
to reduce the number of required parameters and avoid confusing their order. Moreover, this will reduce the number of required overloaded functions.
Let's perform some benchmarks.
At least on our Panasas (hypnos) and on taurus (Lustre?).
They should end up like the usual (parallel) HDF5 benchmarks.
HDF5 1.8.12 is out - the world is a better place now! ✨
Do some of these changes affect us?
Shouldn't this entry in localDomain come from the Selection input? Like this:
Domain localDomain(Dimensions(0, 0, 0), select.size);
Furthermore, shouldn't the offset be the offset of the local domain within the globalDomain? That does not make sense for a ParallelDomain write, so (0,0,0) is fine, I guess.
If N processes call the collective ParallelDataCollector::read(...) and some of them read a zero amount of data, the read call hangs.
This is the case when all processes that read zero data set sizeRead to zero and the pointer buf to NULL, e.g.
read(100, "dataName", Dimensions(0, 0, 0), NULL)
The bug is triggered by this check, because not all read calls arrive at H5Dread. This is not allowed if collective operations are used.
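A possible fix, sketched with the HDF5 C API (identifiers like elementsToRead, memSpace, fileSpace are placeholders, not the actual internals): ranks with nothing to read keep an empty selection instead of skipping the collective call.

```c
/* sketch: every rank must reach the collective H5Dread */
if (elementsToRead == 0)
{
    H5Sselect_none(memSpace);   /* zero-element selection in memory */
    H5Sselect_none(fileSpace);  /* ...and in the file               */
}
H5Dread(dataset, nativeType, memSpace, fileSpace, xferPlist, buf);
```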
@psychocoderHPC , @ax3l , @bussmann
I would like to brainstorm new ideas, feature requests for the next version of libSplash.
Some ideas:
We should make use of an MPI_Info object:
- striping_factor / striping_unit
- direct_io should rather be true; use the env variable MPICH_MPIIO_HINTS instead of hard-coding; stating to use cb_align 2
- fill_value via NULL, H5Pset_fill_value and H5D_FILL_TIME_NEVER
- H5Pset_alignment to the disk block size is reported to improve performance; also IBM_largeblock_io=true for the MPI hints in H5Pset_fapl_mpio

If $build and $1 are not set, a small help text would be useful.
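The MPI_Info hints mentioned above could be set roughly like this (a sketch: hint names follow MPI-IO/ROMIO conventions, the values are examples only and need tuning per file system, and faplId stands for an already-created file access property list):

```c
MPI_Info info;
MPI_Info_create(&info);
MPI_Info_set(info, "striping_factor", "16");      /* Lustre: OST count */
MPI_Info_set(info, "striping_unit", "4194304");   /* Lustre: 4 MiB     */
MPI_Info_set(info, "IBM_largeblock_io", "true");  /* GPFS              */
/* hand the info object to HDF5 via the file access property list */
H5Pset_fapl_mpio(faplId, MPI_COMM_WORLD, info);
```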
@ax3l Any concerns over releasing this as version 1.2.3?
Need to update the version header. No update of the file format is required.
There is a bug in ParallelDataCollector in getMaxID(): it always returns zero.
I added some more debug output and saw that the bug is in listFilesInDir().
These are my debug changes (only the log_msg):
// extract id from filename
int32_t id = atoi(
fname.substr(name.size(), fname.size() - 3 - name.size()).c_str());
ids.insert(id);
log_msg(2, "add file to max id search list %s with id %i", fname.c_str(),id);
I get this output:
[1,1]<stderr>:[SPLASH_LOG:1] add file to max id search list h5_500.h5 with id 0
[1,1]<stderr>:[SPLASH_LOG:1] add file to max id search list h5_0.h5 with id 0
[1,1]<stderr>:[SPLASH_LOG:1] add file to max id search list h5_1000.h5 with id 0
There is something wrong inside the parameter calculation for atoi().
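A hypothetical reconstruction of the faulty extraction (function name and the exact filename scheme are assumptions based on the log output above): if the base name passed in lacks the trailing underscore, substr() starts one character too early and atoi() parses a string beginning with '_' as 0.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

// For files named "<name>_<id>.h5" (e.g. "h5_500.h5" with name "h5"),
// the id starts after the base name AND the underscore; starting at
// name.size() alone yields "_500", which atoi() reads as 0.
int32_t extractId(const std::string &fname, const std::string &name)
{
    const std::size_t start = name.size() + 1;         // skip "<name>_"
    const std::size_t len   = fname.size() - 3 - start; // drop ".h5"
    return std::atoi(fname.substr(start, len).c_str());
}
```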
The debian package h5utils contains some interesting tools.
It might be useful to play around with them on our data sets and to document some useful applications.
quote:
@f-schmitt-zih @psychocoderHPC
We were discussing whether it would be more useful to define the domainOffset starting with (0,0,0) from the globalDomOffset.
That would "extract" the globalDomOffset as a meta attribute for the whole domain, for example if I want to follow a moving window and need an absolute position.
testIntersection would have to read the globalDomOffset in this case.
Another nice fact is that we already store particles this way right now, because it is the intuitive way of seeing a moving simulation window in post-processing (adding the global offset again for an absolute position is also possible, but a little less often needed).
According to FindHDF5.cmake, we should remove
OPTION(PARALLEL "enable parallel MPI I/O" OFF)
and instead check
HDF5_IS_PARALLEL - Whether or not HDF5 was found with parallel IO support
from FIND_PACKAGE(HDF5).
Easy and nice.
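Roughly like this (a sketch; the compile definition name is illustrative, not an existing libSplash flag):

```cmake
find_package(HDF5 REQUIRED)
if(HDF5_IS_PARALLEL)
    add_definitions(-DSPLASH_SUPPORTED_PARALLEL)
else()
    message(STATUS "HDF5 was built without parallel I/O support")
endif()
```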
We should describe somewhere the generic HDF5 structure we create.
Even if we prefer using libSplash for reading and writing, it's good manners to transparently allow other HDF5 readers/writers to jump in.
A tool reading a png picture and translating it into a scalar 2D grid entry would be really useful (e.g. to start simulations with a gas profile like the PIConGPU logo).
Option: transform to a 3D field by cloning in one direction.
An explicit point to free all allocated resources would be beneficial. Otherwise, there is no nice way to free the internal MPI_Comm of a stack-allocated PDC in main before the user calls MPI_Finalize.
At least for .deb packages, yeah!
Getting started: http://www.cmake.org/Wiki/CMake:Packaging_With_CPack
Example: http://www.cmake.org/Wiki/CMake/CPackExample
Deb, RPM & OSX: http://www.cmake.org/Wiki/CMake:CPackPackageGenerators#DEB_.28UNIX_only.29
Add these packages as binary downloads to each new libSplash release.
All write methods (e.g. writeDomain) do not support writing empty data with a NULL data pointer.
Error:
[1,15]:terminate called after throwing an instance of 'DCollector::DCException'
[1,15]: what(): Exception for SerialDataCollector::write: a parameter was NULL
Build & test the examples during Travis CI runs, too.
How can one compare a stored type annotation to a corresponding DCollector::CollectionType again?
DomainCollector::DomDataClass data_class;
DataContainer *particles_container =
dataCollector.readDomain(simulationStep,
name.c_str(),
domain_offset,
domain_size,
&data_class);
DomDataClass::getDataType() returns an H5DataType (not useful, because we only know DCollector::CollectionTypes).
How can I find my float/ColTypeFloat or double/ColTypeDouble from data_class?
1-3D --> nD
to prevent interference with user-app MPI messages
I would like to add at least a parallel write example for the sandbox MD simulation (= particles only) I wrote here.
One of the interfaces of writeDomain in at least the parallel domain collector is broken for the following use case.
Apply the following patch to a checked-out version of v1.1.1 to test. I assume the overloaded member in v1.2.0 is also affected, and the error may lie in ParallelDataCollector::gatherMPIWrite:
diff a/examples/2Din1Dtop/2Din1Dtop.cpp b/examples/2Din1Dtop/2Din1Dtop.cpp
new file mode 100644
index 0000000..db6f21c
--- /dev/null
+++ b/examples/2Din1Dtop/2Din1Dtop.cpp
@@ -0,0 +1,70 @@
+// Copyright 2014 Axel Huebl
+//
+// LGPL
+//
+// In this example I am going to write a 2D data set
+// which is distributed over a 1D MPI topology
+
+#include <mpi.h>
+#include <splash/splash.h>
+
+int main()
+{
+ using namespace splash;
+ MPI_Init(NULL, NULL);
+
+ int size, rank, vrank;
+ MPI_Comm_size( MPI_COMM_WORLD, &size );
+ MPI_Comm_rank( MPI_COMM_WORLD, &rank );
+ // "simulate" a moving window
+ vrank = ( rank + 3 ) % size;
+
+ {
+ ParallelDomainCollector pdc(
+ MPI_COMM_WORLD, MPI_INFO_NULL, Dimensions(size, 1, 1), 10 );
+
+ /* use a shifted "virtual rank" for data distribution */
+ DataCollector::FileCreationAttr fAttr;
+ Dimensions mpiPosition( vrank, 0, 0 );
+ fAttr.mpiPosition.set( mpiPosition );
+
+ /* open file */
+ pdc.open( "testDomain", fAttr );
+
+ /* write my virtual rank -> output should be purely ascending */
+ const int numVal = 2;
+ ColTypeFloat ctFlt;
+ float a[] = {(float)vrank, (float)vrank};
+
+ /* sizes and offsets, naming conventions see
+ https://github.com/ComputationalRadiationPhysics/picongpu/issues/128#issuecomment-41366257 */
+ Dimensions localDomainSize( 1, numVal, 1 );
+ Dimensions localDomainOffset( vrank, 0, 0 );
+
+ Dimensions globalDomainSize( size, numVal, 1 );
+ Dimensions globalDomainOffset( 0, 0, 0 );
+
+ /* call write routine */
+ pdc.writeDomain( 0, /* time step */
+ ctFlt,
+ 2, /* 2D data set */
+ localDomainSize,
+ "myfield",
+ localDomainOffset, /* ignored anyway ... :( */
+ localDomainSize, /* ignored anyway ... :( */
+ globalDomainOffset,
+ globalDomainSize,
+ DomainCollector::GridType,
+ a );
+
+ /* close and return */
+ pdc.close();
+ }
+
+ int fin;
+ MPI_Finalized( &fin );
+ if( !fin )
+ MPI_Finalize();
+
+ return 0;
+}
diff a/examples/2Din1Dtop/testOutput.py b/examples/2Din1Dtop/testOutput.py
new file mode 100755
index 0000000..ab41370
--- /dev/null
+++ b/examples/2Din1Dtop/testOutput.py
@@ -0,0 +1,15 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+#
+# Copyright 2014 Axel Huebl
+#
+# LGPL
+#
+
+import h5py as h5
+
+f=h5.File("testDomain_0.h5", "r")
+data=f['data/0/myfield']
+
+print "shape", data.shape
+print "data", data[:,:]
diff a/examples/CMakeLists.txt b/examples/CMakeLists.txt
index a13f5e8..53c1105 100644
--- a/examples/CMakeLists.txt
+++ b/examples/CMakeLists.txt
@@ -39,7 +39,7 @@ SET(CMAKE_BUILD_TYPE Debug)
OPTION(WITH_MPI "build MPI examples" OFF)
SET(EXAMPLES domain_read/domain_read)
-SET(MPI_EXAMPLES domain_read/domain_read_mpi domain_write/domain_write_mpi)
+SET(MPI_EXAMPLES domain_read/domain_read_mpi domain_write/domain_write_mpi 2Din1Dtop/2Din1Dtop)
FOREACH(EXAMPLE_NAME ${EXAMPLES})
SET(EXAMPLE_FILES "${EXAMPLE_FILES};${EXAMPLE_NAME}.cpp")
to test:
mpirun -n 4 2Din1Dtop.cpp.out
~/src/libSplash/examples/2Din1Dtop/testOutput.py
output (wrong)
shape (2, 4)
data
[[ 3. 0. 1. 2.]
[ 3. 0. 1. 2.]]
output (should be)
shape (2, 4)
data
[[ 0. 1. 2. 3. ]
[ 0. 1. 2. 3. ]]
It would be a good idea to create a version file like boost/version.hpp.
This would enable users to check the version number with CMake's VERSION_LESS compare operator, as seen in cmake-2.X/Modules/FindBoost.cmake, for upcoming interface changes and releases.
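A sketch of what that check could look like downstream, assuming the version gets exported to CMake (the variable name is hypothetical):

```cmake
find_package(Splash)
if(Splash_VERSION VERSION_LESS "1.2.0")
    message(FATAL_ERROR "libSplash >= 1.2.0 required")
endif()
```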
Yes, probably one day. But that's another topic.
Glad you asked! Well, there are two ways: one way would be putting the new version number in every time you push to master (e.g. major.minor.patchlvl).
The other would be not to commit to master on a daily basis at all, but to consider every pull request/commit to master and its resulting merge a release (and therefore to develop in a separate dev branch).
We kind of try to follow that strategy in PIConGPU, but the right way to do it is described in a-successful-git-branching-model.
Anyway, that method allows setting up a hook for post-commits to master. Sadly, this will only work client-side or via a double commit, since GitHub only supports post-receive hooks (for obvious security reasons with pre-receive hooks).
Last but not least: since every commit on master (yes, merging pull requests creates a merge commit, too) should be considered a release, it is self-evident to put an annotated tag on it which marks it as a GitHub release :)
Allow the user to change the internal file structure to avoid the timestep group.
Required to use Splash files with VisIt.
Add an internal flag stating that this group is missing, to allow transparent reading of such files.
HDF5 seems to support dimension scales and labels [1]. Is this already included in libSplash?
If not, do you think this might be a good feature to add?
I think this would be extremely useful for documenting the physics quantities stored in arrays.
I have not found this option in the HDF5 documentation, but in some tutorials [2] of the HDF Group.
Analyzing *.h5 files with such scales using h5dump, it looks like scales are actually attributes:
DATASET "data" {
DATATYPE H5T_IEEE_F32LE
DATASPACE SIMPLE { ( 4, 3, 2 ) / ( 4, 3, 2 ) }
DATA {
(0,0,0): 1, 1,
(0,1,0): 1, 1,
(0,2,0): 1, 1,
(1,0,0): 1, 1,
(1,1,0): 1, 1,
(1,2,0): 1, 1,
(2,0,0): 1, 1,
(2,1,0): 1, 1,
(2,2,0): 1, 1,
(3,0,0): 1, 1,
(3,1,0): 1, 1,
(3,2,0): 1, 1
}
ATTRIBUTE "DIMENSION_LABELS" {
DATATYPE H5T_STRING {
STRSIZE H5T_VARIABLE;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
DATA {
(0): "z", NULL , "x"
}
ATTRIBUTE "DIMENSION_LIST" {
DATATYPE H5T_VLEN { H5T_REFERENCE { H5T_STD_REF_OBJECT }}
DATASPACE SIMPLE { ( 3 ) / ( 3 ) }
DATA {
(0): (DATASET 8560 /z1 ), (DATASET 8288 /y1 ),
(2): (DATASET 1536 /x1 , DATASET 1808 /x2 )
}
}
}
[1] http://docs.h5py.org/en/latest/high/dims.html
[2] http://www.hdfgroup.org/HDF5/Tutor/h5dimscale.html
We could try to write an xdmf file during DataCollector.close() that describes all the written HDF5 data sets.
It might be an option to automatically remember all the attributes we wrote since the DataCollector was opened.
Positive effect: this feature would allow us to use the native HDF5/XDMF readers of tools like VisIt and ParaView.
Document the writeDomain method's param id: read from? ... write to?
It seems that libSplash uses resizable datasets in any case. This might be good for data that may change size, but not for data with a fixed size (e.g. magnetic field data). Always allowing resizable datasets might cost performance.
For information on resizable datasets, see [1].
As an example in PIConGPU, see the following h5ls -r *.h5 dump:
/ Group
/custom Group
/data Group
/data/2000 Group
/data/2000/fields Group
/data/2000/fields/Density_e Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/Density_i Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/EnergyDensity_e Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/EnergyDensity_i Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/FieldB Group
/data/2000/fields/FieldB/x Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/FieldB/y Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/FieldB/z Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/FieldE Group
/data/2000/fields/FieldE/x Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/FieldE/y Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/fields/FieldE/z Dataset {12/Inf, 512/Inf, 192/Inf}
/data/2000/particles Group
/data/2000/particles/e Group
/data/2000/particles/e/globalCellIdx Group
/data/2000/particles/e/globalCellIdx/x Dataset {29491200/Inf}
/data/2000/particles/e/globalCellIdx/y Dataset {29491200/Inf}
/data/2000/particles/e/globalCellIdx/z Dataset {29491200/Inf}
/data/2000/particles/e/momentum Group
/data/2000/particles/e/momentum/x Dataset {29491200/Inf}
/data/2000/particles/e/momentum/y Dataset {29491200/Inf}
/data/2000/particles/e/momentum/z Dataset {29491200/Inf}
/data/2000/particles/e/momentumPrev1 Group
/data/2000/particles/e/momentumPrev1/x Dataset {29491200/Inf}
/data/2000/particles/e/momentumPrev1/y Dataset {29491200/Inf}
/data/2000/particles/e/momentumPrev1/z Dataset {29491200/Inf}
/data/2000/particles/e/particles_info Dataset {32/Inf}
/data/2000/particles/e/position Group
/data/2000/particles/e/position/x Dataset {29491200/Inf}
/data/2000/particles/e/position/y Dataset {29491200/Inf}
/data/2000/particles/e/position/z Dataset {29491200/Inf}
/data/2000/particles/e/weighting Dataset {29491200/Inf}
/data/2000/particles/i Group
/data/2000/particles/i/globalCellIdx Group
/data/2000/particles/i/globalCellIdx/x Dataset {29491200/Inf}
/data/2000/particles/i/globalCellIdx/y Dataset {29491200/Inf}
/data/2000/particles/i/globalCellIdx/z Dataset {29491200/Inf}
/data/2000/particles/i/momentum Group
/data/2000/particles/i/momentum/x Dataset {29491200/Inf}
/data/2000/particles/i/momentum/y Dataset {29491200/Inf}
/data/2000/particles/i/momentum/z Dataset {29491200/Inf}
/data/2000/particles/i/momentumPrev1 Group
/data/2000/particles/i/momentumPrev1/x Dataset {29491200/Inf}
/data/2000/particles/i/momentumPrev1/y Dataset {29491200/Inf}
/data/2000/particles/i/momentumPrev1/z Dataset {29491200/Inf}
/data/2000/particles/i/particles_info Dataset {32/Inf}
/data/2000/particles/i/position Group
/data/2000/particles/i/position/x Dataset {29491200/Inf}
/data/2000/particles/i/position/y Dataset {29491200/Inf}
/data/2000/particles/i/position/z Dataset {29491200/Inf}
/data/2000/particles/i/weighting Dataset {29491200/Inf}
/header Group
All datasets have the option to become infinitely large (marked by .../Inf).
With (parallel) HDF5 it should be possible to create both fixed- and arbitrarily-sized datasets.
A Python example to illustrate this is given here:
from mpi4py import MPI
import h5py
rank = MPI.COMM_WORLD.rank
print("Hello from processor {}".format(rank))
f = h5py.File('example_dataSize.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD)
f.create_dataset('dataset_fixed', (10,5), dtype='f')
f.create_dataset('dataset_variable1', (10,5), maxshape=(10,10), dtype='f')
f.create_dataset('dataset_variable2', (10,5), maxshape=(None,None), dtype='f')
f.close()
The corresponding HDF5 file looks like this when using h5ls -r *.h5:
/ Group
/dataset_fixed Dataset {10, 5}
/dataset_variable1 Dataset {10, 5/10}
/dataset_variable2 Dataset {10/Inf, 5/Inf}
Is there a reason to always use arbitrarily-sized datasets?
[1] http://docs.h5py.org/en/latest/high/dataset.html#resizable-datasets
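For comparison, the same distinction sketched with the HDF5 C API (a fixed-size dataspace simply passes NULL for maxdims, while a growable one needs maxdims larger than dims, and the dataset then requires a chunked layout):

```c
hsize_t dims[2]    = {10, 5};
hsize_t maxdims[2] = {10, 10};

/* fixed size: maxdims defaults to dims when NULL is passed */
hid_t fixedSpace    = H5Screate_simple(2, dims, NULL);
/* resizable up to maxdims: requires chunking on the dataset */
hid_t growableSpace = H5Screate_simple(2, dims, maxdims);
```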
Write a CHANGELOG.md for each new commit in master.
With a special highlight on interface changes.
Commit/pull into master from an intermediate release-XYZ branch -> tag it as vX.Y
need support for 64bit integer basetypes (signed+unsigned)