Comments (11)
I guess there are other build errors after fixing that:
[ 7%] Building CXX object runtime/CMakeFiles/RealmRuntime.dir/realm/deppart/partitions.cc.o
In file included from /lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/legion/legion_redop.cu:24:
In file included from /lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/legion.h:59:
In file included from /lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/legion/legion_redop.h:1836:
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/legion/legion_redop.inl:141:14: error: unknown type name '__forceinline__'
__device__ __forceinline__
^
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/legion/legion_redop.inl:148:14: error: unknown type name '__forceinline__'
__device__ __forceinline__
^
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/legion/legion_redop.inl:159:14: error: unknown type name '__forceinline__'
__device__ __forceinline__
^
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/legion/legion_redop.inl:160:10: error: expected ';' after top level declarator
uint8_t __uint2ubyte(unsigned int value, unsigned offset)
^
;
[ 29%] Building CXX object runtime/CMakeFiles/RealmRuntime.dir/realm/hip/hip_access.cc.o
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/realm/hip/hip_internal.cc:1135:47: error: no member named 'hip_fold_excl_fn' in 'Realm::ReductionOpUntyped'
(redop_info.is_exclusive ? redop->hip_fold_excl_fn : redop->hip_fold_nonexcl_fn) :
~~~~~ ^
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/realm/hip/hip_internal.cc:1135:73: error: no member named 'hip_fold_nonexcl_fn' in 'Realm::ReductionOpUntyped'
(redop_info.is_exclusive ? redop->hip_fold_excl_fn : redop->hip_fold_nonexcl_fn) :
~~~~~ ^
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/realm/hip/hip_internal.cc:1136:47: error: no member named 'hip_apply_excl_fn' in 'Realm::ReductionOpUntyped'
(redop_info.is_exclusive ? redop->hip_apply_excl_fn : redop->hip_apply_nonexcl_fn));
~~~~~ ^
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/realm/hip/hip_internal.cc:1136:74: error: no member named 'hip_apply_nonexcl_fn' in 'Realm::ReductionOpUntyped'
(redop_info.is_exclusive ? redop->hip_apply_excl_fn : redop->hip_apply_nonexcl_fn));
~~~~~ ^
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/realm/hip/hip_internal.cc:1500:20: error: no member named 'hip_apply_excl_fn' in 'Realm::ReductionOpUntyped'; did you mean 'cpu_apply_excl_fn'?
if(!redop->hip_apply_excl_fn)
^~~~~~~~~~~~~~~~~
cpu_apply_excl_fn
/lustre/orion/cmb138/scratch/seshuy/legion_s3d_flow_control/legion/runtime/realm/redop.h:62:14: note: 'cpu_apply_excl_fn' declared here
void (*cpu_apply_excl_fn)(void *lhs_ptr, size_t lhs_stride,
from legion.
Yes, CI is now failing on master commit 423ddfb: https://code.olcf.ornl.gov/ci/csc335/dev/legion/-/pipelines/9890
It previously worked on master commit 13d4101: https://code.olcf.ornl.gov/ci/csc335/dev/legion/-/pipelines/9862
from legion.
I'm not immediately seeing how any of this is related, but here are the commits in this range:
$ git log --graph --decorate --oneline 13d4101cccc75ae67e880068617608e90fb8c3e4..423ddfb3943c6e86cd269b045b67ba99e3463502
* 423ddfb39 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'rpath-fixes' into 'master'
|\
| * 63eeeda1b (origin/rpath-fixes) Don't use lib/ blindly, ask cmake what to use
| * 110a83c35 Add libpython.so directory to BUILD_RPATH for librealm.so
| * 065d1cc05 Set RPATH on legion_python to @ORIGIN/../lib
| * 22b554eef Make sure we append to all RPATHs, instead of setting them
* 4024b4723 Merge branch 'test_args' into 'master'
* bcfd3e031 realm: update the machine_config unit test to fully test command line parser and machine config API
from legion.
I created a fix for the HIP module (https://gitlab.com/StanfordLegion/legion/-/merge_requests/908), but I cannot reproduce the redop error with HIP_TARGET=CUDA. The function is there: https://gitlab.com/StanfordLegion/legion/-/blob/master/runtime/realm/redop.h#L94
from legion.
Other users are hitting this, so I went ahead and merged https://gitlab.com/StanfordLegion/legion/-/merge_requests/908. I'm not sure what to tell you about the latest NUMA failure, but hopefully this unblocks @syamajala and others.
from legion.
New CI run for master is here: https://code.olcf.ornl.gov/ci/csc335/dev/legion/-/pipelines/10240
from legion.
Fixed the initial issue. I'll open a new bug for the NUMA one.
from legion.
Follow-on issue is #1547.
from legion.
I made a local branch that merges master into control_replication and I am still not able to build on Frontier. It doesn't look like the CI actually passed; did it fail in the prep step?
The same errors are still there.
from legion.
I think the CI build errors come from the filesystem? They seem to be intermittent; this one passed (the build at least):
https://code.olcf.ornl.gov/ci/csc335/dev/legion/-/pipelines/10250
from legion.
FYI, CI is passing now. The only thing that fails is #1547.
https://code.olcf.ornl.gov/ci/csc335/dev/legion/-/pipelines/10316
I'm pretty sure this is I/O instability.
from legion.