computationalradiationphysics / graybat
Graph Approach for Highly Generic Communication Schemes Based on Adaptive Topologies
License: Other
When trying to compile HASEonGPU on Windows, the GrayBat build fails with several errors (and one warning ;) ):
graybat\utils/serialize_tuple.hpp(35):
error C2061: syntax error : identifier 'uint'
graybat\utils/serialize_tuple.hpp(44):
error C2992: 'boost::serialization::Serialize' : invalid or missing template parameter list
graybat\utils/serialize_tuple.hpp(37) : see declaration of 'boost::serialization::Serialize'
graybat\utils/serialize_tuple.hpp(58):
error C2913: explicit specialization; 'boost::serialization::Serialize' is not a specialization of a class template
The build environment was Visual Studio Community 2013.
Divide the test cases into general test cases, which test algorithms on all policy combinations of a cage, and special test cases, which test the policies themselves.
It would be great to have a policy for local communication that is built on top of std::thread and does not depend on MPI, Boost.MPI or ZMQ. This would allow graybat to be used for simple single-node applications on systems that don't have any of the HPC communication libraries installed.
In the end, this would help to implement ComputationalRadiationPhysics/haseongpu#113 efficiently.
Wrong graphical representation when executing
with round-robin mapping and 2-4 peers.
ZMQ and BMPI have slightly different demands on the data.
Therefore, only a single header has to be included!
Introduce some debug output library like dout again.
Each blocking non-local communication method should raise an exception when a given timeout has expired. Idea taken from GPI.
The signaling server only listens on port 5000. This value should be configurable (by parameter or ini file).
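A minimal sketch of the timeout idea, assuming incoming messages end up in an in-memory queue. `TimedMailbox` and `TimeoutError` are hypothetical names for illustration and are not part of graybat; only standard library facilities are used.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <stdexcept>
#include <utility>

// Hypothetical exception type, not defined by graybat.
struct TimeoutError : std::runtime_error {
    TimeoutError() : std::runtime_error("receive timed out") {}
};

// Hypothetical mailbox standing in for a blocking non-local receive.
template <typename T>
class TimedMailbox {
public:
    // Called by whoever delivers a message (e.g. a receiver thread).
    void put(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(value));
        }
        ready_.notify_one();
    }

    // Blocks until a message is available.
    // Throws TimeoutError if none arrives within `timeout`.
    T recv(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        const bool arrived = ready_.wait_for(
            lock, timeout, [this] { return !queue_.empty(); });
        if (!arrived) {
            throw TimeoutError();
        }
        T value = std::move(queue_.front());
        queue_.pop_front();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> queue_;
};
```

A caller would wrap `recv` in a try/catch block and decide whether to retry, report the peer as unreachable, or abort.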
Because of the nice error output
Graph partitioning is the perfect tool to map
a graph to a set of peers. A full framework is provided by Metis. It should not be too complex to include in graybat as a mapping functor.
To see test progress. Can be found here
Because it is common for C++ libraries!
In file included from /home/fabian/DSP/include/graybat/include/graybat/utils/hana/include/boost/hana.hpp:50:0,
from /home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:24,
from /home/fabian/DSP/src/Input/GrayBatReader.hpp:8,
from /home/fabian/DSP/src/Fitter.cpp:3:
/home/fabian/DSP/include/graybat/include/graybat/utils/hana/include/boost/hana/config.hpp:52:5: warning: #warning "You appear to be using GCC, which is not supported by Hana yet." [-Wcpp]
^
In file included from /home/fabian/DSP/src/Input/GrayBatReader.hpp:8:0,
from /home/fabian/DSP/src/Fitter.cpp:3:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp: In constructor ‘graybat::communicationPolicy::ZMQ::ZMQ(std::string, std::string, unsigned int)’:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:175:15: warning: ‘graybat::communicationPolicy::ZMQ::maxMsgID’ will be initialized after [-Wreorder]
unsigned maxMsgID;
^
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:165:16: warning: ‘const int graybat::communicationPolicy::ZMQ::zmqHwm’ [-Wreorder]
const int zmqHwm;
^
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:184:6: warning: when initialized here [-Wreorder]
ZMQ(const std::string masterUri,
^
In file included from /home/fabian/DSP/src/Input/GrayBatReader.hpp:8:0,
from /home/fabian/DSP/src/Fitter.cpp:3:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp: In static member function ‘static char* graybat::communicationPolicy::ZMQ::s_recv(zmq::socket_t&)’:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:332:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (message.size() == -1)
^
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp: In member function ‘graybat::communicationPolicy::ZMQ::Uri graybat::communicationPolicy::ZMQ::getUri(zmq::socket_t&, graybat::communicationPolicy::ZMQ::ContextID, graybat::communicationPolicy::ZMQ::VAddr)’:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:323:6: warning: control reaches end of non-void function [-Wreturn-type]
}
Current collective-like methods on vertices, such as spread and collect,
are not implemented on top of the communication policy collectives.
GrayBat users should be able to choose between collective / non-collective
and blocking / non-blocking variants of these kinds of methods.
Related to PR #39
Because it brings support for arbitrary data types!
Fine-grained modeling of a simulation, e.g. by every cell of the simulation domain,
can lead to unnecessary communication overhead, since peers may exchange
more messages than are really necessary.
The solution is a graph partitioning of the communication graph and a unified
communication of these partitions. This should decrease the number of
communication calls between peers while still sticking to fine-grained modeling
of the simulation, which is really handy.
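A rough sketch of the "unified communication" step, assuming a partitioning has already assigned every vertex to a peer. The names `Message`, `bundleByPeerPair` and `vertexToPeer` are made up for illustration and are not graybat API. Messages between vertices are grouped by (source peer, destination peer), so each peer pair needs only one communication call.

```cpp
#include <map>
#include <utility>
#include <vector>

// Hypothetical message between two vertices of the communication graph.
struct Message {
    unsigned srcVertex;
    unsigned dstVertex;
    double payload;
};

using PeerPair = std::pair<unsigned, unsigned>;  // (source peer, destination peer)

// vertexToPeer[v] is the peer (partition) that hosts vertex v.
// Returns one bundle per peer pair; messages that stay on the same
// peer are skipped because they need no network communication.
std::map<PeerPair, std::vector<Message>> bundleByPeerPair(
    const std::vector<Message>& messages,
    const std::vector<unsigned>& vertexToPeer) {
    std::map<PeerPair, std::vector<Message>> bundles;
    for (const Message& m : messages) {
        const unsigned srcPeer = vertexToPeer.at(m.srcVertex);
        const unsigned dstPeer = vertexToPeer.at(m.dstVertex);
        if (srcPeer == dstPeer) {
            continue;  // local exchange
        }
        bundles[PeerPair(srcPeer, dstPeer)].push_back(m);
    }
    return bundles;
}
```

With four vertices on two peers, four vertex-level messages can collapse into two bundles, i.e. two sends instead of up to four.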
Good documentation is very important for a communication
library like this. This documentation can be more or less
autogenerated from annotated source code by doxygen.
A good example of this kind of documentation is the CUB
project.
Are our CI tests (drone.io) not running any more?
Can we enable those again and/or shall we use travis?
The C++14 dependency should be enforced by CMake, or the configuration should fail with an error.
I had a discussion with @fabian-jung about a mapping function that maps groups of peers to groups
of vertices. The motivation is that some peers may take special roles in the communication graph, such as source or sink of data. Particular communication vertices therefore need to be mapped to these peers. The synopsis of this filter mapping could be:
Filter(PeerTag, VertexTag, Mapping)
Here, PeerTag is an identifier for a peer group and VertexTag is an identifier for a vertex group. Mapping is a normal mapping that maps peer groups to vertex groups. For example, data source vertices could be mapped to data source peers:
graybat::mapping::Filter("SourcePeer", "SourceVertex", graybat::mapping::RoundRobin())
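One way such a filter could behave, as a sketch under assumptions: peers and vertices carry a string tag, and the inner mapping receives the peer's rank within its group, the group size and the tagged vertices. The types and the free functions `filterMapping` and `roundRobin` are hypothetical and do not reflect graybat's actual mapping interface.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical vertex with a group tag.
struct Vertex {
    unsigned id;
    std::string tag;
};

// Inner mapping: (rank within peer group, peer group size, candidate vertices)
// -> vertices hosted by that peer.
using Mapping = std::function<std::vector<Vertex>(
    unsigned, unsigned, const std::vector<Vertex>&)>;

// Simple round-robin distribution used as the inner mapping.
std::vector<Vertex> roundRobin(unsigned rank, unsigned count,
                               const std::vector<Vertex>& vertices) {
    std::vector<Vertex> mine;
    for (std::size_t i = rank; i < vertices.size(); i += count) {
        mine.push_back(vertices[i]);
    }
    return mine;
}

// Applies `inner` only between peers tagged `peerTag` and vertices tagged
// `vertexTag`. peerTags[p] is the tag of peer p; peerID is the calling peer.
std::vector<Vertex> filterMapping(const std::string& peerTag,
                                  const std::string& vertexTag,
                                  const Mapping& inner,
                                  const std::vector<std::string>& peerTags,
                                  unsigned peerID,
                                  const std::vector<Vertex>& vertices) {
    if (peerTags.at(peerID) != peerTag) {
        return {};  // this peer is not part of the group
    }
    unsigned rank = 0;   // position of this peer inside its group
    unsigned count = 0;  // number of peers in the group
    for (unsigned p = 0; p < peerTags.size(); ++p) {
        if (peerTags[p] == peerTag) {
            if (p < peerID) ++rank;
            ++count;
        }
    }
    std::vector<Vertex> tagged;
    for (const Vertex& v : vertices) {
        if (v.tag == vertexTag) tagged.push_back(v);
    }
    return inner(rank, count, tagged);
}
```

Peers outside the group receive no vertices from this filter; a complete mapping would combine several filters so that every vertex is hosted somewhere.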
Implementing some space-filling curves would be straightforward and would allow fantastic optimizations for next-neighbor communication patterns.
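As an illustration, a Z-order (Morton) curve is one of the simplest space-filling curves. The sketch below is not taken from graybat; it interleaves the bits of 2D cell coordinates so that cells close together in space tend to get nearby indices. `spreadBits` and `mortonIndex` are names chosen for this example.

```cpp
#include <cstdint>

// Spreads the lower 16 bits of x so that a zero bit sits between
// each pair of original bits (bit i moves to bit 2*i).
inline std::uint32_t spreadBits(std::uint32_t x) {
    x &= 0x0000FFFFu;
    x = (x | (x << 8)) & 0x00FF00FFu;
    x = (x | (x << 4)) & 0x0F0F0F0Fu;
    x = (x | (x << 2)) & 0x33333333u;
    x = (x | (x << 1)) & 0x55555555u;
    return x;
}

// Z-order index of the cell (x, y); both coordinates must fit in 16 bits.
inline std::uint32_t mortonIndex(std::uint32_t x, std::uint32_t y) {
    return spreadBits(x) | (spreadBits(y) << 1);
}
```

A mapping functor could sort vertices by `mortonIndex` and hand out consecutive chunks to peers, which keeps most next-neighbor communication on the same peer.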
ZeroMQ is quite an interesting communication library with transports for IPC, TCP, TIPC, multicast, ...
Probably several of the implemented functionalities would be interesting, each independently as a CommunicationPolicy.
GitHub: https://github.com/zeromq
Homepage: http://zeromq.org/
Create some special example folder
The constructor of BGL only adds edges to the graph, assuming that all vertices are somehow connected to some global graph. But this cannot be assumed!
Use add_vertex(v, g)
which is documented here for all vertices provided in the graph description.
Some libraries :
The ZeroMQ CP uses a single-master signaling approach to provide a mapping from peer URIs to virtual addresses. This signaling should be moved into its own class/library/policy, and maybe an existing solution should be used.
probably naming the repo grayBat
would be easier -> just imagine the "I have to press shift to enter the directory" after git clone ...
no one wants to enter a directory starting with an uppercase letter.
The ZMQ CP contains a thread that receives messages from other peers. These messages are written into a queue within a multikeymap. Access to this data structure is guarded by a mutex, so no concurrent writes occur. The main thread accesses the message queue by busy waiting. This busy waiting
should be replaced by a wait on a condition variable!
See this branch for first ideas.
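A sketch of how the busy waiting could be replaced with `std::condition_variable`. The class `KeyedMessageBox` is hypothetical and only imitates the idea of a queue inside a multi-key map; the key layout (source address, tag) is an assumption.

```cpp
#include <condition_variable>
#include <deque>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the message store of the ZMQ CP.
class KeyedMessageBox {
public:
    using Key = std::pair<unsigned, unsigned>;  // assumed: (source address, tag)
    using Message = std::vector<char>;

    // Called by the receiver thread for every incoming message.
    void enqueue(const Key& key, Message msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queues_[key].push_back(std::move(msg));
        }
        // Waiters may wait on different keys, so wake all of them.
        arrived_.notify_all();
    }

    // Called by the main thread. Sleeps (no busy waiting) until a
    // message for `key` is available, then removes and returns it.
    Message waitDequeue(const Key& key) {
        std::unique_lock<std::mutex> lock(mutex_);
        arrived_.wait(lock, [&] {
            auto it = queues_.find(key);
            return it != queues_.end() && !it->second.empty();
        });
        std::deque<Message>& queue = queues_[key];
        Message msg = std::move(queue.front());
        queue.pop_front();
        return msg;
    }

private:
    std::mutex mutex_;
    std::condition_variable arrived_;
    std::map<Key, std::deque<Message>> queues_;
};
```

The predicate passed to `wait` is re-checked after every wake-up, so spurious wake-ups and notifications for other keys are handled correctly.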
There is no need for users to create all these objects
by themselves. A single class managing the communication
from the user's perspective should be enough. This class should
later also provide distribution, load balancing and fault tolerance hooks.
This assumption should not hold, since a vertex
can usually send more than a single element within
a gather operation.
The gather and allGather operation within the Cave.hpp
need to be adapted!
Some mechanisms used to implement the zmq communication policy can be reused
to implement other communication policies (e.g. based on boost::asio).
These mechanisms should be modularized and extracted into a class using CRTP.
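A minimal CRTP sketch of that idea: shared logic lives in a base class template, and each transport only provides the raw send and receive operations. All names here (`CommunicationBase`, `LoopbackTransport`, `sendRaw`, `recvRaw`) are invented for illustration and are not graybat API.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Shared mechanisms, parameterized on the concrete transport (CRTP).
template <typename Derived>
class CommunicationBase {
public:
    using VAddr = unsigned;
    using Buffer = std::vector<char>;

    // Common bookkeeping, then static dispatch to the transport.
    void send(VAddr dest, const Buffer& data) {
        ++sentCount_;
        derived().sendRaw(dest, data);
    }

    Buffer recv(VAddr src) { return derived().recvRaw(src); }

    std::size_t sentCount() const { return sentCount_; }

private:
    Derived& derived() { return static_cast<Derived&>(*this); }
    std::size_t sentCount_ = 0;
};

// Toy transport that stores messages in memory instead of using a network.
class LoopbackTransport : public CommunicationBase<LoopbackTransport> {
public:
    void sendRaw(VAddr dest, const Buffer& data) { stored_[dest] = data; }
    Buffer recvRaw(VAddr src) { return stored_[src]; }

private:
    std::map<VAddr, Buffer> stored_;
};
```

A ZeroMQ or boost::asio based policy would implement `sendRaw` and `recvRaw` with its own sockets while reusing everything in the base class, without virtual function calls.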
Messages with multiple endpoints
I am thinking about a deployment tool that starts peers in a local or distributed environment and is as easy to use as mpirun/mpiexec. There could be a wrapper for mpiexec which also starts the
signaling server for zmq and sets environment variables.
Since these should not be copied!
Not needed. CAL should be replaced by CommunicationPolicy directly!
Introduce a receive interface where a vertex can receive data from any edge it is
connected to. This could look like the following:
cage.recv(anyEdge, data)
where anyEdge
will be filled with the information of the edge through which the data was sent.
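The intended output-parameter semantics could look roughly like the sketch below. `Edge` and `VertexInbox` are hypothetical types, and for brevity this `recv` returns `false` instead of blocking when nothing is pending; a real implementation would presumably block.

```cpp
#include <deque>
#include <utility>
#include <vector>

// Hypothetical edge description.
struct Edge {
    unsigned id;
    unsigned source;
    unsigned target;
};

// Hypothetical per-vertex inbox collecting data from all incoming edges.
class VertexInbox {
public:
    // Called when data arrives over `edge`.
    void deliver(const Edge& edge, std::vector<int> data) {
        pending_.emplace_back(edge, std::move(data));
    }

    // Receives from whichever edge delivered first.
    // `anyEdge` is overwritten with the edge the data came through.
    // Returns false if nothing is pending.
    bool recv(Edge& anyEdge, std::vector<int>& data) {
        if (pending_.empty()) {
            return false;
        }
        anyEdge = pending_.front().first;
        data = std::move(pending_.front().second);
        pending_.pop_front();
        return true;
    }

private:
    std::deque<std::pair<Edge, std::vector<int>>> pending_;
};
```

After a successful call the vertex can inspect `anyEdge.source` to find out who sent the data and reply over the same edge.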
The master branch depends on dout, but dout is not contained in the master branch.
This seems to be a race condition when one peer destructs and the other constructs its ZMQ CP.
Only occurs when the ZMQ CP is instantiated massively often.
The idea is similar to the publisher/subscriber scheme where one publisher sends
the same message to multiple subscribers. Instead of the publisher having to send
a message to each of its subscribers, it would be more comfortable to do this only
once and let graybat take care of distributing the message to all subscribers.
A communication policy should provide the hardware topology
of its network as a graph. The communication graph can then
be mapped to the hardware graph. This issue is loosely coupled
with issue #21, since a communication graph with
more vertices than available peers needs to be partitioned first
and then mapped to the hardware graph as a second step.
How can this hardware graph be retrieved?
No one knows what this is up to now 😿
Since multicasts are not supported yet, this leads to a deadlock.
The zmq_signaling server only reads packets that are sent from localhost. Packets from remote hosts are ignored.
Communication based on TCP/UDP over IP
http://www.boost.org/doc/libs/1_57_0/doc/html/boost_asio.html
Give the genericCommunicator some fancy name ✨
See Discussion in PR #39