Comments (11)
Would it be easier/better to adopt a simpler grid generation algorithm --- e.g. a tree-structure with fixed-sized patches?
from ibamr.
That may be more efficacious, but I think just redoing the load balancer will be easier. If we use a different grid generation algorithm we'll have to redo a load balancer and also the gridding algorithm instead of just the load balancer. Furthermore, being able to chop boxes in more arbitrary ways might be advantageous for IB load balancing.
from ibamr.
This has come up a couple of times on slack in different places so I'll summarize what I know here.
I've been running the nozzle benchmark on my machine to try and address some IFED scalability issues - at this point we have more parallelization problems with SAMRAI then IFED, which is good since, fundamentally, we can fix a lot of these by just writing a better LoadBalancer
.
A few highlights from callgrind:
- some basic stuff which is very easy to patch (#1562, #1563, #1564) and marginally improves performance (each is a few percent of the overall solver) (the 'recompute overlapping boxes on demand' patch actually makes my benchmark run twice as fast)
- SAMRAI's communication uses extra buffers on both ends - we can get another marginal improvement in communication cost by writing directly into
PatchData
from MPI's buffer. - we spend roughly half our time when doing a
MatMult
communicating ghost data: that's good motivation to get rid of some more intermediate copies in SAMRAI and reduce the total number of ghost cells by writing a better load balancer. Reducing copies is a marginal improvement compared to writing a really good load balancer. Maybe we can collaborate with the AMREX people on this? I haven't found a ton of load-balancing algorithms for patch-based FDM in the literature. I should look more. - Nodal IFED is now fast enough that we spend a lot more time communicating Cartesian-grid data than actually doing IFED :) The global FE scatter isn't that bad (and there isn't much more we can do here - see drwells/fiddle@d24a478) but doing a global redistribution of Cartesian grid data is about 10x as expensive as doing IFED stuff. Scattering the accumulated forces back is also quite slow (maybe 3x the cost of doing the actual spreading). This implies that we need to get rid of
SecondaryHierarchy
and have a single global partitioning of the Cartesian-grid data which load balances both the total number of cells/processor and number of IB points/processor.
Update 1: some relevant discussion on AMR in the Parthenon paper (AFAICT they use trees, not patches)
- https://arxiv.org/pdf/2202.12309.pdf: "PARTHENON provides infrastructure for AMR with both cell- and face-centered data. The size of these arrays must be the same on all MeshBlocks, and moreover the overall domain must contain an integer number of MeshBlocks in each dimension. However, the number and size of individual MeshBlocks tiling the computational domain is arbitrary."
Update 2: I spent a little bit playing around with a possible new load-balancing algorithm for patches. It does a better job than the default LoadBalancer because it alternates between chopping and balancing. The algorithm is something like:
- Calculate a weight for each patch (e.g., number of cells)
- use a bin packing algorithm (e.g., Karmarkar-Karp) to balance patches between bins
- pair the biggest and smallest bin quartiles together. The bigger bin donates something (either a patch or a subset of a patch) to the smaller bin to move it close to the optimal workload.
- Repeat 6-7 times.
My rough Python implementation of this algorithm converges nicely: the difference between the biggest and smallest bins roughly halves at each step and we create a constant (n_procs / 4) number of new patches at most. It's just a simulacrum (I approximate bad chops by adding some randomness) but I get far better results than SAMRAI's load balancer (60 total patches instead of 300, maximum difference of 100 instead of 3000)
Update 3: We may have found a capable undergraduate student to work on this this summer.
Another thing we should examine, if we end up using integer programming, is penalizing using more ghost regions. Formulations of bin packing as an integer program typically involve minimizing some quantity r
. We could penalize the existence of ghost regions by just adding up the number of ghost regions present with certain bin packing operations. It might not help a lot but it could keep adjacent patches preferentially together (and then we can just merge them in a postprocessing step).
from ibamr.
from ibamr.
@knepley We can probably interpret this load-balancing problem as a weighted graph-balancing problem. A fundamental difference between this and a more normal situation (e.g., FEM) is that we can chop patches. For example, on a single processor SAMRAI generates
for exactly the same grid as the one in my first picture. Just partitioning this isn't going to work since there are only 16 patches: we need a way to chop them up in a nice way too.
from ibamr.
In some sense, we don't have to use SAMR as long as the levels are properly nested. Would it make sense to use p4est
?
from ibamr.
Also, by checking the refinement criteria you input, you can force p4est to deliver blocks if you want, and the 2:1 balance is also optional.
from ibamr.
Is it possible straightforward to get p4est
to generate what SAMR folks call "properly nested" grids?
from ibamr.
Would it make sense to use
p4est
?
Maybe! That's essentially what Parthenon is doing (and a lot of other libraries). Notably, AMREX does not (they still use SAMR).
Is it straightforward to get
p4est
to generate what SAMR folks call "properly nested" grids?
I believe this is the default - e.g., everything in deal.II is properly nested.
from ibamr.
Seems pretty tempting to try to use p4est
here...
from ibamr.
I have been using it for a while now. All the plasma stuff I am doing with Mark Adams uses p4est. You can convert seamlessly to a Plex (and pretty seamlessly back). We have some built-in stuff to define Vec objects that define the refinement/coarsening (see VecTagger and DMAdaptLabel/DMAdaptMetric).
from ibamr.
Related Issues (20)
- Bug in IBTK_MPI::allGather() HOT 3
- Diffusion time stepping scheme in AdvDiffSemiImplicitHierarchyIntegrator HOT 8
- some problems about the guide "building on Linux" HOT 3
- IBAMR 0.14: January 26, 2023 HOT 5
- Remove our dependency on open.cdash.org
- bug in nullspace removal? HOT 7
- IBAMR 0.15: June 7, 2024 HOT 2
- Add a function to index utilities which can compute cell centers
- CartGridFunction::setDataOnPatch() should probably be protected HOT 2
- P_problem_coef objects HOT 5
- Specifying P_problem_coef in StaggeredStokesOperator class HOT 12
- Add an automatic restart option
- INSStaggeredHierarchyIntegrator::initializePatchHierarchy() doesn't always get called HOT 3
- More restart DB cleanups HOT 2
- CFL is unset in levelsetops2d.f.m4
- Add an interface to `CartGridFunction` for determining times within time steps HOT 1
- Add a uniform interface for time extrapolation
- Running a parallel LU test on the CI HOT 4
- Bug in HierarchyGhostCellInterpolation and helper classes
- Minor bug in #1707 HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ibamr.