Comments (3)
I will have a look at it. In my fork I have a branch, /soerenjalas/fbpic/tree/test_tpb, where the 1D and 2D threads per block can be set via environment variables (if you want, I can open a WIP pull request). I will run some benchmarks similar to what @RemiLehe did with the threading stuff.
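As a rough illustration of the idea (not the code in the branch), threads-per-block values could be read from the environment along these lines. The variable names `FBPIC_TPB_1D` and `FBPIC_TPB_2D`, the helper `read_tpb`, and the default values are all hypothetical; the names actually used in the branch are not stated here.

```python
import os


def read_tpb(name, default):
    """Return a threads-per-block value from the environment variable
    `name`, falling back to `default` when the variable is unset."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    value = int(raw)  # raises ValueError if the text is not an integer
    if value < 1:
        raise ValueError(f"{name} must be a positive integer, got {value}")
    return value


# Hypothetical variable names and defaults, for illustration only.
TPB_1D = read_tpb("FBPIC_TPB_1D", 256)
TPB_2D = read_tpb("FBPIC_TPB_2D", 16)
```

With this sketch, a benchmark could be repeated with a different setting by exporting the (hypothetical) variable before launching the script, without editing any source file.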
from fbpic.
That sounds great!! Thanks a lot for taking care of this; I think we can really get some non-negligible speed-up from this!
The tests of the 1D threads per block (tpb) on several GPU types show that the deposition and gathering methods could benefit from an adjustment of the tpb settings. A reasonable value for all GPUs would be 64.
Note that even though it's usually recommended to use a multiple of 32 tpb, the deposition kernels perform best with 48 tpb on the K80 and P100. I don't know why.
The GPU used below was actually a K80 on Jureca
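For reference, here is a minimal sketch of how a 1D tpb setting translates into a launch configuration, and of why a value such as 48 is unusual. It assumes the NVIDIA warp size of 32 threads; the helper names `blocks_per_grid_1d` and `idle_lanes` are made up for this example and are not part of fbpic. The sketch only does the arithmetic and does not explain the measured performance.

```python
def blocks_per_grid_1d(n_items, tpb=64):
    """Number of blocks needed so that n_items each get one thread
    (ceiling division of n_items by tpb)."""
    return (n_items + tpb - 1) // tpb


def idle_lanes(tpb, warp_size=32):
    """Threads left unused in the last warp of each block when tpb
    is not a multiple of the warp size."""
    warps = (tpb + warp_size - 1) // warp_size
    return warps * warp_size - tpb


for tpb in (32, 48, 64, 128):
    print(tpb, blocks_per_grid_1d(1_000_000, tpb), idle_lanes(tpb))
```

A block of 48 threads occupies two warps and leaves 16 lanes of the second warp idle, which is why multiples of 32 are the usual recommendation.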