Comments (14)
Hi, what MPI version and fabric are you using? We have seen some implementations not behaving correctly.
from swift.
I'm using openmpi 3.1.3 with InfiniBand.
Thanks for the quick response.
from swift.
Ok. So that seems fine. And what version of the OFED driver are you using? The regular Linux-kernel one or the Mellanox-optimised?
from swift.
Also, what transport library are you using in OpenMPI? We recommend psm and not psm2. That is, running with --mca btl vader,self --mca mtl psm.
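As a rough sketch of how those options could sit in a launch line: the rank count, the swift_mpi binary path and the parameter file name below are placeholders, not values taken from this thread. The script only prints the command so it can be checked before use.

```shell
#!/bin/sh
# OpenMPI options recommended above: the shared-memory (vader) and self
# BTLs, with the psm MTL selected explicitly.
MCA_OPTS="--mca btl vader,self --mca mtl psm"

# Placeholders (assumptions): number of ranks, binary and parameter file.
NRANKS=4
SWIFT_BIN=./swift_mpi
PARAMS=params.yml

# Build and print the full command instead of running it directly.
CMD="mpirun -np $NRANKS $MCA_OPTS $SWIFT_BIN $PARAMS"
echo "$CMD"
```

Under a scheduler such as Slurm, the same options would go into the job script in place of the plain mpirun line.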
from swift.
> Ok. So that seems fine. And what version of the OFED driver are you using? The regular Linux-kernel one or the Mellanox-optimised?
I think the Mellanox-optimised version.
> We recommend psm and not psm2. That is, running with --mca btl vader,self --mca mtl psm.
I will try that.
from swift.
> Ok. So that seems fine. And what version of the OFED driver are you using? The regular Linux-kernel one or the Mellanox-optimised?
> I think the Mellanox-optimised version
Right, then that is likely the issue. Their current driver hangs if too many asynchronous communications are in flight at a given point in time. SWIFT makes extensive use of this mechanism, so you may be facing this issue here.
from swift.
Would removing the Mellanox driver fix the issue?
from swift.
I can only speculate as I have never seen this issue on machines where we have control over things but it may help.
Otherwise, trying a different mtl in OpenMPI might help (instead of changing the driver).
from swift.
> Right, then that is likely the issue. Their current driver hangs if too many asynchronous communications are in flight at a given point in time. SWIFT makes extensive use of this mechanism, so you may be facing this issue here.
I seem to get the same problem on a system with the regular driver.
from swift.
Did you try changing the mtl to psm?
from swift.
Hi Jan,
Have you had some luck with the code?
from swift.
I'm having some trouble as the OpenMPI version I'm using apparently doesn't support psm and installing it with psm is a little problematic at the moment, but I'm still on it.
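One way to check whether a given OpenMPI build includes the psm MTL is to look at the component list printed by ompi_info. A minimal sketch, assuming the usual "MCA mtl: <name> (...)" line format in that output; has_mtl is a hypothetical helper name, not part of OpenMPI:

```shell
#!/bin/sh
# has_mtl NAME: read ompi_info-style output on stdin and succeed if it
# contains an "MCA mtl: NAME" line. The trailing "( |$)" keeps "psm"
# from also matching "psm2".
has_mtl() {
    grep -Eq "MCA mtl: $1( |\$)"
}

if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_mtl psm; then
        echo "psm MTL available"
    else
        echo "psm MTL not found in this OpenMPI build"
    fi
else
    echo "ompi_info not on PATH"
fi
```

If the component is missing, passing --mca mtl psm to mpirun cannot select it, which would match the trouble described above.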
from swift.
I'm also using Slurm as the job scheduler; I forgot to mention that earlier. I hope that is not interfering with anything.
from swift.
> Did you try changing the mtl to psm?
Using psm did not seem to resolve the issue.
Edit: I made another mistake while running. Using psm does seem to resolve the problem.
from swift.