
graybat's People

Contributors

ax3l, erikzenker, fabian-jung

graybat's Issues

Compilation fails with VS2013

When trying to compile HASEonGPU for Windows, the GrayBat build fails with several errors (and one warning ;) ):

graybat\utils/serialize_tuple.hpp(35): 
    error C2061: syntax error : identifier 'uint'

graybat\utils/serialize_tuple.hpp(44):
    error C2992: 'boost::serialization::Serialize' : invalid or missing template parameter list
      graybat\utils/serialize_tuple.hpp(37) : see declaration of 'boost::serialization::Serialize'

graybat\utils/serialize_tuple.hpp(58):
    error C2913: explicit specialization; 'boost::serialization::Serialize' is not a specialization of a class template

The build environment was Visual Studio Community 2013.
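A likely cause of the first error is that `uint` is not a standard C++ type. It is typically a typedef pulled in by POSIX-style system headers, so GCC on Linux accepts it while MSVC does not. The sketch below shows a recursive tuple helper whose non-type template parameter is `std::size_t` instead. `TupleSum` and `sumTuple` are hypothetical names for illustration and are not the contents of serialize_tuple.hpp.

```cpp
#include <cstddef>
#include <tuple>

// Hypothetical helper: visits tuple elements N-1 ... 0 recursively.
// The template parameter is std::size_t, a standard type, rather than `uint`.
template <std::size_t N>
struct TupleSum {
    template <typename... Ts>
    static long apply(const std::tuple<Ts...>& t) {
        return static_cast<long>(std::get<N - 1>(t)) + TupleSum<N - 1>::apply(t);
    }
};

// The explicit specialization for 0 terminates the recursion.
template <>
struct TupleSum<0> {
    template <typename... Ts>
    static long apply(const std::tuple<Ts...>&) {
        return 0;
    }
};

// Sums all (integer-convertible) elements of a tuple.
template <typename... Ts>
long sumTuple(const std::tuple<Ts...>& t) {
    return TupleSum<sizeof...(Ts)>::apply(t);
}
```

If the primary template compiles, the follow-up errors C2992 and C2913 about the specialization would be expected to disappear as well, since they refer to the same declaration.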

Special and general test cases

Divide the test cases into general test cases, which test algorithms on all policy combinations of a cage, and special test cases, which test the policies themselves.

C++11 Thread Communication Policy

It would be great to have a policy for local communication that is built on top of std::thread and does not depend on MPI, Boost.MPI or ZMQ. This would allow graybat to be used for simple single-node applications on systems that don't have any of the HPC communication libraries installed.
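A minimal sketch of what such a policy could look like, using only the C++11 standard library. Every peer is a std::thread and owns one mailbox. `ThreadLocalComm`, `send`, `recv` and `ringExchange` are hypothetical names and do not reflect the existing graybat communication policy interface. The payload type is fixed to `std::vector<int>` to keep the example short.

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical in-process communication policy: one mailbox per peer.
class ThreadLocalComm {
public:
    explicit ThreadLocalComm(std::size_t nPeers) : boxes_(nPeers) {}

    // Copies the payload into the mailbox of the destination peer.
    void send(std::size_t dest, const std::vector<int>& data) {
        Mailbox& box = boxes_.at(dest);
        {
            std::lock_guard<std::mutex> lock(box.mutex);
            box.messages.push_back(data);
        }
        box.ready.notify_one();
    }

    // Blocks until a message for peer `self` is available, then returns it.
    std::vector<int> recv(std::size_t self) {
        Mailbox& box = boxes_.at(self);
        std::unique_lock<std::mutex> lock(box.mutex);
        box.ready.wait(lock, [&box] { return !box.messages.empty(); });
        std::vector<int> msg = std::move(box.messages.front());
        box.messages.pop_front();
        return msg;
    }

private:
    struct Mailbox {
        std::mutex mutex;
        std::condition_variable ready;
        std::deque<std::vector<int>> messages;
    };
    std::vector<Mailbox> boxes_;
};

// Each peer thread sends its id to its right neighbour in a ring and
// stores the id it received from its left neighbour.
std::vector<int> ringExchange(std::size_t nPeers) {
    ThreadLocalComm comm(nPeers);
    std::vector<int> received(nPeers, -1);
    std::vector<std::thread> peers;
    for (std::size_t id = 0; id < nPeers; ++id) {
        peers.emplace_back([&comm, &received, id, nPeers] {
            comm.send((id + 1) % nPeers, {static_cast<int>(id)});
            received[id] = comm.recv(id).front();
        });
    }
    for (std::thread& t : peers) {
        t.join();
    }
    return received;
}
```

Each thread writes only its own slot of `received`, so the result vector needs no extra locking.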

In the end, this would somewhat help implementing ComputationalRadiationPhysics/haseongpu#113 efficiently

Roundrobin correction

The graphical representation is wrong when executing
with round-robin mapping and 2-4 peers.

Uniform data transfer format

ZMQ and BMPI have slightly different demands on the data.

  • ZMQ sends data as a blob and does not care about data types.
  • BMPI serializes complex data types with Boost.Serialization, and
    primitive data types are sent using the MPI primitive types.
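One possible uniform format is a plain byte blob that every policy can transport. The sketch below covers only trivially copyable element types. Complex types would first have to go through a serializer, which is not shown here. `Blob`, `toBlob` and `fromBlob` are hypothetical names, not part of graybat.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical uniform wire format: a contiguous sequence of bytes.
using Blob = std::vector<unsigned char>;

// Packs a vector of trivially copyable values into a byte blob.
template <typename T>
Blob toBlob(const std::vector<T>& values) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "complex types need a serializer first");
    Blob blob(values.size() * sizeof(T));
    if (!blob.empty()) {
        std::memcpy(blob.data(), values.data(), blob.size());
    }
    return blob;
}

// Unpacks a byte blob into a vector of T; trailing partial elements are dropped.
template <typename T>
std::vector<T> fromBlob(const Blob& blob) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "complex types need a deserializer");
    std::vector<T> values(blob.size() / sizeof(T));
    if (!values.empty()) {
        std::memcpy(values.data(), blob.data(), values.size() * sizeof(T));
    }
    return values;
}
```

This format assumes that sender and receiver share the same endianness and type sizes, which holds within one homogeneous cluster but not in general.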

Divide api into core and extended api

  • Core API provides point-to-point communication
  • Extended API provides collective communication
  • By default, extended methods are implemented using the core API
    • New communication policies only need to implement the core API
    • If support for collective communication is available, then the
      default extended methods can be overridden
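The split listed above could be sketched with CRTP: the extended API supplies default collectives written purely in terms of the core API, and a policy with native collectives shadows them. All names (`ExtendedApi`, `CoreOnlyPolicy`, `NativeCollectivePolicy`) are hypothetical, and the in-memory inboxes only stand in for a real transport.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Extended API: default collectives built only on the core API of Derived.
template <typename Derived>
class ExtendedApi {
public:
    // Default broadcast: one point-to-point send per non-root peer.
    void broadcast(std::size_t root, int value) {
        Derived& self = static_cast<Derived&>(*this);
        for (std::size_t peer = 0; peer < self.size(); ++peer) {
            if (peer != root) {
                self.send(peer, value);
            }
        }
    }
};

// A policy that implements only the core API (size, send, recv).
class CoreOnlyPolicy : public ExtendedApi<CoreOnlyPolicy> {
public:
    explicit CoreOnlyPolicy(std::size_t nPeers) : inbox_(nPeers) {}

    std::size_t size() const { return inbox_.size(); }

    void send(std::size_t dest, int value) {
        inbox_.at(dest).push_back(value);
        ++sendCalls;
    }

    // Assumes a message is pending for `self`.
    int recv(std::size_t self) {
        int value = inbox_.at(self).front();
        inbox_.at(self).pop_front();
        return value;
    }

    std::size_t sendCalls = 0;

protected:
    std::vector<std::deque<int>> inbox_;
};

// A policy with a native collective shadows the default broadcast.
class NativeCollectivePolicy : public CoreOnlyPolicy {
public:
    using CoreOnlyPolicy::CoreOnlyPolicy;

    void broadcast(std::size_t root, int value) {
        for (std::size_t peer = 0; peer < size(); ++peer) {
            if (peer != root) {
                inbox_[peer].push_back(value);
            }
        }
        ++collectiveCalls;
    }

    std::size_t collectiveCalls = 0;
};
```

With static dispatch the choice between default and native collective is made at compile time, which fits a policy-based design. Virtual functions would be an alternative if the policy had to be chosen at run time.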

Compiler warnings g++ 5.2.0

In file included from /home/fabian/DSP/include/graybat/include/graybat/utils/hana/include/boost/hana.hpp:50:0,
from /home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:24,
from /home/fabian/DSP/src/Input/GrayBatReader.hpp:8,
from /home/fabian/DSP/src/Fitter.cpp:3:
/home/fabian/DSP/include/graybat/include/graybat/utils/hana/include/boost/hana/config.hpp:52:5: warning: #warning "You appear to be using GCC, which is not supported by Hana yet." [-Wcpp]
 #warning "You appear to be using GCC, which is not supported by Hana yet."
 ^

In file included from /home/fabian/DSP/src/Input/GrayBatReader.hpp:8:0,
from /home/fabian/DSP/src/Fitter.cpp:3:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp: In constructor ‘graybat::communicationPolicy::ZMQ::ZMQ(std::string, std::string, unsigned int)’:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:175:15: warning: ‘graybat::communicationPolicy::ZMQ::maxMsgID’ will be initialized after [-Wreorder]
unsigned maxMsgID;
^
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:165:16: warning: ‘const int graybat::communicationPolicy::ZMQ::zmqHwm’ [-Wreorder]
const int zmqHwm;
^
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:184:6: warning: when initialized here [-Wreorder]
ZMQ(const std::string masterUri,
^
In file included from /home/fabian/DSP/src/Input/GrayBatReader.hpp:8:0,
from /home/fabian/DSP/src/Fitter.cpp:3:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp: In static member function ‘static char* graybat::communicationPolicy::ZMQ::s_recv(zmq::socket_t&)’:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:332:22: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
if (message.size() == -1)
^
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp: In member function ‘graybat::communicationPolicy::ZMQ::Uri graybat::communicationPolicy::ZMQ::getUri(zmq::socket_t&, graybat::communicationPolicy::ZMQ::ContextID, graybat::communicationPolicy::ZMQ::VAddr)’:
/home/fabian/DSP/include/graybat/include/graybat/communicationPolicy/ZMQ.hpp:323:6: warning: control reaches end of non-void function [-Wreturn-type]
}
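Apart from the Hana notice, the warnings fall into three classes: -Wreorder, -Wsign-compare and -Wreturn-type. The sketch below shows the usual remedy for each on small hypothetical stand-ins (`Endpoint`, `hasErrorSize`, `lookupUri`). It is not the actual code of ZMQ.hpp, whose function bodies are not shown in the log.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

class Endpoint {
public:
    // -Wreorder: members are initialized in declaration order, so the
    // initializer list names hwm_ before maxMsgId_ to match that order.
    Endpoint(int hwm, unsigned maxMsgId) : hwm_(hwm), maxMsgId_(maxMsgId) {}

    int hwm() const { return hwm_; }
    unsigned maxMsgId() const { return maxMsgId_; }

private:
    const int hwm_;      // declared first, initialized first
    unsigned maxMsgId_;  // declared second, initialized second
};

// -Wsign-compare: compare an unsigned size against an unsigned sentinel
// instead of the signed literal -1.
bool hasErrorSize(std::size_t size) {
    return size == static_cast<std::size_t>(-1);
}

// -Wreturn-type: every control path either returns a value or throws.
std::string lookupUri(const std::vector<std::string>& uris, std::size_t vaddr) {
    if (vaddr < uris.size()) {
        return uris[vaddr];
    }
    throw std::out_of_range("no URI known for this virtual address");
}
```

Whether a size equal to the maximum unsigned value is the right error check in s_recv is a separate question. Checking the return value of the receive call itself may be the more robust fix.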

Collectives on Single Vertices

Current collective-like methods on vertices, such as spread and collect,
are not implemented on top of the collectives of the communication policy.

GrayBat users should be able to choose between collective / non-collective
and blocking / non-blocking variants of these methods.

Related to PR #39

Hierarchical Communication Pattern

Fine-grained modeling of a simulation, e.g. at the level of every cell of the
simulation domain, can lead to unnecessary communication overhead, since peers
may exchange more messages than is really necessary.

The solution is a graph partitioning of the communication graph and a unified
communication of these partitions. This should decrease the number of
communication calls between peers while still sticking to fine-grained modeling
of the simulation, which is really handy.
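The effect of unified communication can be sketched as bundling: all per-edge messages that travel between the same pair of peers are grouped, so that each pair needs one communication call instead of one per edge. The names `EdgeMessage`, `PeerPair` and `bundleByPeer` are hypothetical, and the vertex-to-peer table is assumed to come from an existing mapping or partitioning step that is not shown.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One message travelling along one edge of the communication graph.
struct EdgeMessage {
    std::size_t srcVertex;
    std::size_t destVertex;
    int payload;
};

// (source peer, destination peer)
using PeerPair = std::pair<std::size_t, std::size_t>;

// Groups per-edge messages by the peers hosting their end vertices.
// vertexToPeer[v] is the peer that hosts vertex v. Messages whose two
// vertices live on the same peer end up under a (p, p) key and would
// need no network transfer at all.
std::map<PeerPair, std::vector<EdgeMessage>>
bundleByPeer(const std::vector<EdgeMessage>& messages,
             const std::vector<std::size_t>& vertexToPeer) {
    std::map<PeerPair, std::vector<EdgeMessage>> bundles;
    for (const EdgeMessage& m : messages) {
        PeerPair key(vertexToPeer.at(m.srcVertex), vertexToPeer.at(m.destVertex));
        bundles[key].push_back(m);
    }
    return bundles;
}
```

With four vertices on two peers, four edge messages can collapse into two bundles, one per direction, while the application keeps addressing individual vertices.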

Create some nice doxygen documentation

Good documentation is very important for a communication
library like this. Such documentation can be more or less
autogenerated from annotated source code by doxygen.
A good example of this kind of documentation is the CUB
project.

CI - Not Running

Are our CI tests (drone.io) not running any more?

Can we enable those again and/or shall we use travis?

Filter mapping

I had a discussion with @fabian-jung about a mapping function that maps groups of peers to groups of vertices. The motivation for this mapping is that there might be peers that take special roles in the communication graph, like source or sink of data. Therefore, particular communication vertices need to be mapped onto these peers. The synopsis of this filter mapping could be:

Filter(PeerTag, VertexTag, Mapping)

Where PeerTag is an identifier for a peer group and the VertexTag is an identifier for a vertex group. The Mapping is a normal Mapping that maps peer groups to vertex groups. Data source vertices could be mapped to data source peers e.g.:

graybat::mapping::Filter("SourcePeer", "SourceVertex", graybat::mapping::RoundRobin())
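A rough sketch of the semantics such a filter could have, with round-robin hard-wired as the inner mapping: only peers carrying the peer tag and vertices carrying the vertex tag take part. `Tagged` and `filterRoundRobin` are hypothetical names, and the real graybat mapping interface may look quite different.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A peer or a vertex together with the group tag it belongs to.
struct Tagged {
    std::size_t id;
    std::string tag;
};

// Returns (vertex id, peer id) pairs: every vertex tagged `vertexTag` is
// assigned round-robin to the peers tagged `peerTag`. Vertices and peers
// with other tags are ignored. The result is empty if no peer matches.
std::vector<std::pair<std::size_t, std::size_t>>
filterRoundRobin(const std::vector<Tagged>& peers,
                 const std::vector<Tagged>& vertices,
                 const std::string& peerTag,
                 const std::string& vertexTag) {
    std::vector<std::size_t> groupPeers;
    for (const Tagged& p : peers) {
        if (p.tag == peerTag) {
            groupPeers.push_back(p.id);
        }
    }

    std::vector<std::pair<std::size_t, std::size_t>> assignment;
    if (groupPeers.empty()) {
        return assignment;
    }

    std::size_t next = 0;
    for (const Tagged& v : vertices) {
        if (v.tag == vertexTag) {
            assignment.emplace_back(v.id, groupPeers[next % groupPeers.size()]);
            ++next;
        }
    }
    return assignment;
}
```

Several such filters with different tag pairs could be combined to cover the whole graph, for example one for source vertices and one for everything else.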

Extra example folder

Create a dedicated examples folder

  • Each example in a separate folder (gol, nbody)
  • Each example described by its own README.md

ZMQ Outsource signaling

The ZeroMQ CP uses a single-master signaling approach to provide a mapping from peer URIs to virtual addresses. This signaling should be moved into its own class/library/policy, and an existing solution should be used if one fits.

repo name

probably naming the repo grayBat would be easier -> just imagine the "I have to press shift to enter the directory" after git clone ...

no one wants to enter a directory starting with an uppercase letter.

ZMQ blocking wait on multikeymap and message queue instead of busy waiting

The ZMQ CP contains a thread that receives messages from other peers. These messages are written into a queue within a multikeymap. Access to this data structure is protected by a mutex, so no concurrent writes occur. The main thread accesses the message queue by busy waiting. This busy waiting should be replaced by a blocking wait on a condition variable!

See this branch for first ideas.

Communication Policy Skeleton

Some mechanisms used to implement the zmq communication policy can be reused
to implement other communication policies (boost::asio).
These mechanisms should be modularized and extracted into a class using CRTP.

Graybat deployment tool

I am thinking about a deployment tool that starts peers in a local or distributed environment and is as easy to use as mpirun/mpiexec. There could be a wrapper for mpiexec which also starts the
signaling server for zmq and sets environment variables.

Receive from any edge

Introduce a receive interface where a vertex can receive data from any edge it is
connected to. This could look like the following:

cage.recv(anyEdge, data)

where anyEdge will be filled with the information of the edge through which the data was sent.
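The out-parameter semantics could be sketched as follows. `Edge` and `VertexInbox` are hypothetical types, the payload is reduced to `int`, and unlike a real receive this variant does not block but reports through its return value whether anything was pending.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical edge descriptor: which vertex sent to which vertex.
struct Edge {
    std::size_t source;
    std::size_t target;
};

// Pending messages of one vertex, regardless of the edge they arrived on.
class VertexInbox {
public:
    // Called by the transport when data arrives over `edge`.
    void deliver(const Edge& edge, int data) {
        pending_.emplace_back(edge, data);
    }

    // Takes the oldest pending message, fills `anyEdge` with the edge it
    // arrived on and `data` with its payload. Returns false if nothing
    // is pending, in which case both outputs are left unchanged.
    bool recv(Edge& anyEdge, int& data) {
        if (pending_.empty()) {
            return false;
        }
        anyEdge = pending_.front().first;
        data = pending_.front().second;
        pending_.pop_front();
        return true;
    }

private:
    std::deque<std::pair<Edge, int>> pending_;
};
```

The caller can then inspect `anyEdge.source` to decide how to process the data or where to send a reply.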

Dependent on dout

The master branch depends on dout, but dout is not contained in the master branch.

zmq/send_recv test blocks

This seems to be a race condition when one peer destructs its ZMQ CP while the other constructs its own.
It only occurs when the ZMQ CP is instantiated a very large number of times.

Edge with multiple targets or same message over multiple edges

The idea is similar to the publisher/subscriber scheme, where one publisher sends
the same message to multiple subscribers. Instead of the publisher having to send
a message to each of its subscribers, it would be more comfortable to do this only
once and let graybat take care of distributing this message to all subscribers.

Hardware topology interface for communicationPolicy

A communication policy should provide the hardware topology
of its network as a graph. Thus, the communication graph can
be mapped to the hardware graph. This issue is coupled
a little bit with issue #21, since a communication graph with
more vertices than available peers needs to be partitioned first
and mapped to the hardware graph in a second step.

How can this hardware graph be retrieved?

  • By some communication benchmarks at initialization time
  • By some user input
  • By some network discovery tools (netloc, hwloc, lsnettopo, Libtopomap)
  • By the existing communication library (MPI virtual topologies)

Renaming

Give the genericCommunicator some fancy name ✨
