I get intermittent crashes when I try to reduce RealVectorObservables. Here is code to reproduce the problem:
https://gist.github.com/aeantipov/f0e0a2af5a47c6b24171
Sometimes the run completes without problems; sometimes it fails with the error below. Here is the shell output (sorry, it's long):
(sci) [18:40] antipov@aantipov ~/code/alpscoretest/build $ mpirun --np 3 ./reduce_test02
e1: [1,..5..,1] #29855 +/-[0.0217015,..5..,0.0217015] Tau: [31.6243,..5..,31.6243] Bins: [[1,..5..,1],..232..,[0,..5..,0]]#128
e2: [1,..5..,1] #29855 +/-[0.0217015,..5..,0.0217015] Tau: [31.6243,..5..,31.6243] Bins: [[1,..5..,1],..232..,[0,..5..,0]]#128
(sci) [18:40] antipov@aantipov ~/code/alpscoretest/build $ mpirun --np 3 ./reduce_test02
e1: [1,..5..,1] #31224 +/-[0.0214054,..5..,0.0214054] Tau: [31.6809,..5..,31.6809] Bins: [[1,..5..,1],..242..,[0,..5..,0]]#128
e2: [1,..5..,1] #31224 +/-[0.0214054,..5..,0.0214054] Tau: [31.6809,..5..,31.6809] Bins: [[1,..5..,1],..242..,[0,..5..,0]]#128
(sci) [18:40] antipov@aantipov ~/code/alpscoretest/build $ mpirun --np 3 ./reduce_test02
libc++abi.dylib: terminating with uncaught exception of type std::logic_error: No alps::mpi::reduce available for this type NSt3__16vectorINS0_IdNS_9allocatorIdEEEENS1_IS3_EEEE
In /Users/antipov/_local/include/alps/accumulator/mpi.hpp on 89 in reduce_impl
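1 reduce_test02 0x000000010089911c void alps::mpi::detail::reduce_impl<std::__1::vector<std::__1::vector<double, std::__1::allocator<double> >, std::__1::allocator<std::__1::vector<double, std::__1::allocator<double> > > >, std::__1::plus<double> >(boost::mpi::communicator const&, std::__1::vector<std::__1::vector<double, std::__1::allocator<double> >, std::__1::allocator<std::__1::vector<double, std::__1::allocator<double> > > > const&, std::__1::plus<double>, int, boost::integral_constant<bool, false>, boost::integral_constant<bool, false>) + 1260
2 reduce_test02 0x000000010089852b alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::binning_analysis_tag, alps::accumulator::detail::simple_observable_type<std::__1::vector<double, std::__1::allocator<double> > > >::collective_merge(boost::mpi::communicator const&, int) const + 603
3 reduce_test02 0x0000000100897cf0 alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::max_num_binning_tag, alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::binning_analysis_tag, alps::accumulator::detail::simple_observable_type<std::__1::vector<double, std::__1::allocator<double> > > > >::collective_merge(boost::mpi::communicator const&, int) const + 32
4 reduce_test02 0x0000000100896e8c alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::max_num_binning_tag, alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::binning_analysis_tag, alps::accumulator::detail::simple_observable_type<std::__1::vector<double, std::__1::allocator<double> > > > >::collective_merge(boost::mpi::communicator const&, int) + 364
5 reduce_test02 0x000000010086163b alps::mcmpiadapter<sim1, alps::check_schedule>::collect_results(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) const + 683
6 reduce_test02 0x00000001008612fd alps::mcmpiadapter<sim1, alps::check_schedule>::collect_results() const + 45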
7 reduce_test02 0x000000010085f7a1 main + 513
8 libdyld.dylib 0x00007fff9139e5fd start + 1
[phys-098-161:38639] *** Process received signal ***
[phys-098-161:38639] Signal: Abort trap: 6 (6)
[phys-098-161:38639] Signal code: (0)
[phys-098-161:38639] [ 0] libc++abi.dylib: terminating with uncaught exception of type std::logic_error: No alps::mpi::reduce available for this type NSt3__16vectorINS0_IdNS_9allocatorIdEEEENS1_IS3_EEEE
In /Users/antipov/_local/include/alps/accumulator/mpi.hpp on 89 in reduce_impl
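1 reduce_test02 0x000000010412911c void alps::mpi::detail::reduce_impl<std::__1::vector<std::__1::vector<double, std::__1::allocator<double> >, std::__1::allocator<std::__1::vector<double, std::__1::allocator<double> > > >, std::__1::plus<double> >(boost::mpi::communicator const&, std::__1::vector<std::__1::vector<double, std::__1::allocator<double> >, std::__1::allocator<std::__1::vector<double, std::__1::allocator<double> > > > const&, std::__1::plus<double>, int, boost::integral_constant<bool, false>, boost::integral_constant<bool, false>) + 1260
2 reduce_test02 0x000000010412852b alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::binning_analysis_tag, alps::accumulator::detail::simple_observable_type<std::__1::vector<double, std::__1::allocator<double> > > >::collective_merge(boost::mpi::communicator const&, int) const + 603
3 reduce_test02 0x0000000104127cf0 alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::max_num_binning_tag, alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::binning_analysis_tag, alps::accumulator::detail::simple_observable_type<std::__1::vector<double, std::__1::allocator<double> > > > >::collective_merge(boost::mpi::communicator const&, int) const + 32
4 reduce_test02 0x0000000104126e8c alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::max_num_binning_tag, alps::accumulator::impl::Accumulator<std::__1::vector<double, std::__1::allocator<double> >, alps::accumulator::binning_analysis_tag, alps::accumulator::detail::simple_observable_type<std::__1::vector<double, std::__1::allocator<double> > > > >::collective_merge(boost::mpi::communicator const&, int) + 364
5 reduce_test02 0x00000001040f163b alps::mcmpiadapter<sim1, alps::check_schedule>::collect_results(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) const + 683
6 reduce_test02 0x00000001040f12fd alps::mcmpiadapter<sim1, alps::check_schedule>::collect_results() const + 45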
7 reduce_test02 0x00000001040ef7a1 main + 513
8 libdyld.dylib 0x00007fff9139e5fd start + 1
[phys-098-161:38638] *** Process received signal ***
[phys-098-161:38638] Signal: Abort trap: 6 (6)
[phys-098-161:38638] Signal code: (0)
[phys-098-161:38638] [ 0] 0 libsystem_platform.dylib 0x00007fff89fb35aa _sigtramp + 26
[phys-098-161:38638] [ 1] 0 ??? 0x0000000000000000 0x0 + 0
[phys-098-161:38638] [ 2] 0 libsystem_c.dylib 0x00007fff901dfb1a abort + 125
[phys-098-161:38638] [ 3] 0 libc++abi.dylib 0x00007fff94b17f31 __cxa_bad_cast + 0
[phys-098-161:38638] [ 4] 0 libsystem_platform.dylib 0x00007fff89fb35aa _sigtramp + 26
[phys-098-161:38639] [ 1] 0 ??? 0x0000000000000000 0x0 + 0
[phys-098-161:38639] [ 2] 0 libsystem_c.dylib 0x00007fff901dfb1a abort + 125
[phys-098-161:38639] [ 3] 0 libc++abi.dylib 0x00007fff94b17f31 __cxa_bad_cast + 0
[phys-098-161:38639] [ 4] 0 libc++abi.dylib 0x00007fff94b3d93a _ZL25default_terminate_handlerv + 240
[phys-098-161:38639] [ 5] 0 libobjc.A.dylib 0x00007fff8838a322 _ZL15_objc_terminatev + 124
[phys-098-161:38639] [ 6] 0 libc++abi.dylib 0x00007fff94b3b1d1 _ZSt11__terminatePFvvE + 8
[phys-098-161:38639] [ 7] 0 libc++abi.dylib 0x00007fff94b3d93a _ZL25default_terminate_handlerv + 240
[phys-098-161:38638] [ 5] 0 libobjc.A.dylib 0x00007fff8838a322 _ZL15_objc_terminatev + 124
[phys-098-161:38638] [ 6] 0 libc++abi.dylib 0x00007fff94b3b1d1 _ZSt11__terminatePFvvE + 8
[phys-098-161:38638] [ 7] 0 libc++abi.dylib 0x00007fff94b3ac5b _ZN10__cxxabiv1L22exception_cleanup_funcE19_Unwind_Reason_CodeP17_Unwind_Exception + 0
[phys-098-161:38638] [ 8] 0 libc++abi.dylib 0x00007fff94b3ac5b _ZN10__cxxabiv1L22exception_cleanup_funcE19_Unwind_Reason_CodeP17_Unwind_Exception + 0
[phys-098-161:38639] [ 8] 0 reduce_test02 0x000000010089932e _ZN4alps3mpi6detail11reduce_implINSt3__16vectorINS4_IdNS3_9allocatorIdEEEENS5_IS7_EEEENS3_4plusIdEEEEvRKN5boost3mpi12communicatorERKT_T0_iNSC_17integral_constantIbLb0EEESM_ + 1790
[phys-098-161:38639] [ 9] 0 reduce_test02 0x000000010089852b _ZNK4alps11accumulator4impl11AccumulatorINSt3__16vectorIdNS3_9allocatorIdEEEENS0_20binning_analysis_tagENS0_6detail22simple_observable_typeIS7_EEE16collective_mergeERKN5boost3mpi12communicatorEi + 603
[phys-098-161:38639] [10] 0 reduce_test02 0x0000000100897cf0 _ZNK4alps11accumulator4impl11AccumulatorINSt3__16vectorIdNS3_9allocatorIdEEEENS0_19max_num_binning_tagENS2_IS7_NS0_20binning_analysis_tagENS0_6detail22simple_observable_typeIS7_EEEEE16collective_mergeERKN5boost3mpi12communicatorEi + 32
[phys-098-161:38639] [11] 0 reduce_test02 0x0000000100896e8c _ZN4alps11accumulator4impl11AccumulatorINSt3__16vectorIdNS3_9allocatorIdEEEENS0_19max_num_binning_tagENS2_IS7_NS0_20binning_analysis_tagENS0_6detail22simple_observable_typeIS7_EEEEE16collective_mergeERKN5boost3mpi12communicatorEi + 364
[phys-098-161:38639] [12] 0 reduce_test02 0x000000010412932e _ZN4alps3mpi6detail11reduce_implINSt3__16vectorINS4_IdNS3_9allocatorIdEEEENS5_IS7_EEEENS3_4plusIdEEEEvRKN5boost3mpi12communicatorERKT_T0_iNSC_17integral_constantIbLb0EEESM_ + 1790
[phys-098-161:38638] [ 9] 0 reduce_test02 0x000000010412852b _ZNK4alps11accumulator4impl11AccumulatorINSt3__16vectorIdNS3_9allocatorIdEEEENS0_20binning_analysis_tagENS0_6detail22simple_observable_typeIS7_EEE16collective_mergeERKN5boost3mpi12communicatorEi + 603
[phys-098-161:38638] [10] 0 reduce_test02 0x0000000104127cf0 _ZNK4alps11accumulator4impl11AccumulatorINSt3__16vectorIdNS3_9allocatorIdEEEENS0_19max_num_binning_tagENS2_IS7_NS0_20binning_analysis_tagENS0_6detail22simple_observable_typeIS7_EEEEE16collective_mergeERKN5boost3mpi12communicatorEi + 32
[phys-098-161:38638] [11] 0 reduce_test02 0x0000000104126e8c _ZN4alps11accumulator4impl11AccumulatorINSt3__16vectorIdNS3_9allocatorIdEEEENS0_19max_num_binning_tagENS2_IS7_NS0_20binning_analysis_tagENS0_6detail22simple_observable_typeIS7_EEEEE16collective_mergeERKN5boost3mpi12communicatorEi + 364
[phys-098-161:38638] [12] 0 reduce_test02 0x00000001040f163b _ZNK4alps12mcmpiadapterI4sim1NS_14check_scheduleEE15collect_resultsERKNSt3__16vectorINS4_12basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEENS9_ISB_EEEE + 683
[phys-098-161:38638] [13] 0 reduce_test02 0x00000001040f12fd _ZNK4alps12mcmpiadapterI4sim1NS_14check_scheduleEE15collect_resultsEv + 45
[phys-098-161:38638] [14] 0 reduce_test02 0x000000010086163b _ZNK4alps12mcmpiadapterI4sim1NS_14check_scheduleEE15collect_resultsERKNSt3__16vectorINS4_12basic_stringIcNS4_11char_traitsIcEENS4_9allocatorIcEEEENS9_ISB_EEEE + 683
[phys-098-161:38639] [13] 0 reduce_test02 0x00000001008612fd _ZNK4alps12mcmpiadapterI4sim1NS_14check_scheduleEE15collect_resultsEv + 45
[phys-098-161:38639] [14] 0 reduce_test02 0x000000010085f7a1 main + 513
[phys-098-161:38639] [15] 0 libdyld.dylib 0x00007fff9139e5fd start + 1
[phys-098-161:38639] *** End of error message ***
0 reduce_test02 0x00000001040ef7a1 main + 513
[phys-098-161:38638] [15] 0 libdyld.dylib 0x00007fff9139e5fd start + 1
[phys-098-161:38638] *** End of error message ***
Received signal 15
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 38639 on node phys-098-161 exited on signal 6 (Abort trap: 6).
--------------------------------------------------------------------------