Comments (6)
Hi
How many OpenMP threads do you use? This behaviour is typical of oversubscribing the cores of your CPU, either because you run too many threads per core and rely on hyperthreading, or simply because several OpenMP threads are pinned to the same core. Make sure your OpenMP threads are correctly pinned.
When the number of OpenMP threads is not specified, some systems use the maximum number available, which leads to poor performance. If you want to run the code without OpenMP at all and only check the MPI behaviour, you can compile the code with the option config=noopenmp.
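A quick way to spot oversubscription before launching is to compare the number of MPI processes times the threads per process against the physical cores of the node. The sketch below is only illustrative: the function names are hypothetical, and it assumes that an unset OMP_NUM_THREADS makes each process start one thread per logical CPU, which is a common runtime default but not a guarantee.

```python
import os


def threads_in_flight(n_mpi, omp_num_threads=None, logical_cpus=None):
    """Estimate how many threads a hybrid run starts on one node.

    Assumption: when OMP_NUM_THREADS is unset, each MPI process starts
    one thread per logical CPU (common default, not guaranteed).
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    if omp_num_threads is None:
        env = os.environ.get("OMP_NUM_THREADS")
        # OMP_NUM_THREADS may be a comma-separated list (nested
        # parallelism); only the first entry matters here.
        omp_num_threads = int(env.split(",")[0]) if env else logical_cpus
    return n_mpi * omp_num_threads


def is_oversubscribed(n_mpi, physical_cores, omp_num_threads=None,
                      logical_cpus=None):
    """True if the run would start more threads than physical cores."""
    total = threads_in_flight(n_mpi, omp_num_threads, logical_cpus)
    return total > physical_cores
```

For instance, on a 16-core node exposing 32 logical CPUs, `is_oversubscribed(4, 16, logical_cpus=32)` is true when OMP_NUM_THREADS is unset (4 x 32 threads under the assumption above), while `is_oversubscribed(4, 16, omp_num_threads=1)` is false.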
Let us know how that goes.
from smilei.
I confirm Arnaud's comment.
Below is the behaviour observed on a dual-socket E5-2670 node, with OMP_NUM_THREADS fixed to 1.
It was run with Intel MPI, which handles MPI process affinity sensibly even when it is not specified explicitly.
| Number of MPI processes | 1 | 2 | 4 | 8 | 16 |
|---|---|---|---|---|---|
| Time loop | 25.06 | 12.53 | 6.36 | 3.38 | 1.86 |
| Particles | 24.51 | 12.22 | 6.13 | 3.20 | 1.70 |
| Maxwell | 0.22 | 0.09 | 0.05 | 0.03 | 0.02 |
| Sync Particles | 0.11 | 0.05 | 0.04 | 0.03 | 0.03 |
| Sync Fields | 0.00 | 0.01 | 0.01 | 0.03 | 0.03 |
| Sync Densities | 0.01 | 0.04 | 0.06 | 0.05 | 0.05 |
| Efficiency | 100.00% | 100.00% | 98.57% | 92.68% | 84.12% |
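
The efficiency row appears to follow the usual strong-scaling definition, efficiency = T(1) / (N x T(N)), applied to the time loop. That definition is an assumption on my part, not something stated in the thread; a minimal sketch to recompute it from the timings above (the function name is hypothetical):

```python
def parallel_efficiency(times):
    """Strong-scaling efficiency relative to the smallest process count.

    `times` maps a number of MPI processes to the measured wall time.
    Assumed definition: eff(N) = N0 * T(N0) / (N * T(N)), where N0 is
    the smallest process count present in `times`.
    """
    base_n = min(times)
    base_t = times[base_n]
    return {n: base_n * base_t / (n * t) for n, t in sorted(times.items())}


# "time loop" row of the table, keyed by number of MPI processes
time_loop = {1: 25.06, 2: 12.53, 4: 6.36, 8: 3.38, 16: 1.86}

for n, eff in parallel_efficiency(time_loop).items():
    print(f"{n:2d} MPI processes: {100 * eff:.2f}%")
```

Values recomputed from the rounded timings can differ from the table by roughly a tenth of a percentage point, presumably because the table was built from unrounded timings.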
from smilei.
OK, you are right.
For some reason, when OMP_NUM_THREADS is not set, the scalability problem I mentioned in this issue
shows up as soon as I use more than 2 MPI processes. So one has to set this environment variable explicitly. The speed-up is then clear and linear when using 1 or 2 threads, but for higher thread counts I see no real improvement. Maybe some other hybrid MPI/OpenMP effect?
from smilei.
As I said in my first message, if you do not set OMP_NUM_THREADS, the system sometimes picks a poor value. The SMILEI simulation log tells you how many OpenMP threads are being used, so you can check this.
You should be able to reproduce an almost linear improvement with OpenMP as well, but it is more difficult to achieve because you have to properly pin your threads to the cores.
The first thing we advise is not to use OpenMP across different sockets, so make sure you allocate at least one MPI process per socket.
Then you have to allocate at least as many cores to each MPI process as it has OpenMP threads.
It is often good to bind threads to cores.
Finally, make sure your threads are properly spread across your physical cores and not stacked on the same core because of hyperthreading.
All of this is completely independent of SMILEI and is just standard good practice for hybrid codes on many-core systems. It is difficult to tell you exactly how to do it because it depends on your environment, compiler, MPI version, etc.
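The rules above can be sketched as a small planning helper. This is only an illustration under stated assumptions: `hybrid_layout` is a hypothetical function, and it uses a stricter rule than the advice itself (the thread count must divide the cores of a socket so that no process can straddle two sockets). OMP_NUM_THREADS, OMP_PROC_BIND and OMP_PLACES are standard OpenMP environment variables; how the MPI launcher maps processes to sockets still depends on your MPI implementation and is not covered here.

```python
def hybrid_layout(sockets, cores_per_socket, threads_per_process):
    """Plan an MPI x OpenMP layout for one node (hypothetical helper).

    Simplifying assumption: threads_per_process must divide
    cores_per_socket, so every MPI process fits inside one socket and
    each thread gets its own physical core.
    """
    if threads_per_process < 1:
        raise ValueError("need at least one OpenMP thread per process")
    if cores_per_socket % threads_per_process != 0:
        raise ValueError(
            "threads per process must divide the cores of one socket"
        )
    processes_per_socket = cores_per_socket // threads_per_process
    return {
        "mpi_processes": sockets * processes_per_socket,
        "processes_per_socket": processes_per_socket,
        "env": {
            # One value for every MPI process.
            "OMP_NUM_THREADS": str(threads_per_process),
            # Keep threads where they start instead of migrating.
            "OMP_PROC_BIND": "true",
            # One place per physical core, so two threads do not end up
            # on the hyperthreads of the same core.
            "OMP_PLACES": "cores",
        },
    }
```

For a node with 2 sockets of 8 cores each, `hybrid_layout(2, 8, 4)` gives 4 MPI processes (2 per socket) with 4 threads each, while `hybrid_layout(2, 8, 16)` is rejected because a process would have to span both sockets.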
from smilei.
Thanks a lot for the detailed explanation.
from smilei.
Anytime. Hope this will help you get the most out of SMILEI. Let us know if we can be of further help.
from smilei.