amrex-astro / Microphysics

Common astrophysical microphysics routines with interfaces for the different AMReX codes.

Home Page: https://amrex-astro.github.io/Microphysics
License: Other
We should have `eos_finalize()` and `actual_eos_finalize()` functionality.
It would be fun to order pizza for those of us participating in the mini-hackathon.
I'd be happy to chip in $10.
We scale based on `abs(y) + dt*abs(ydot)`, but shouldn't we try `abs(y + dt*ydot)` too? Maybe `scaling_method = 3`?
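A sketch of what that branch might look like in the integrator (this assumes the existing methods are dispatched on an integer `scaling_method`; the array names `scal`, `y`, and `ydot` are illustrative, not the actual variables):

```fortran
! sketch only -- variable names are placeholders
select case (scaling_method)
case (1)
   ! current approach: magnitude of the state plus magnitude of the predicted change
   scal(:) = abs(y(:)) + dt * abs(ydot(:))
case (3)
   ! proposed: magnitude of the predicted state itself
   scal(:) = abs(y(:) + dt * ydot(:))
end select
```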
We should investigate making a tabular EOS in (rho, e) -- this would be especially useful for SDC. Perhaps the thing to tabulate is the entropy: we can then express it in terms of (rho, e) and get p and T via partial derivatives.
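For reference, a sketch of the thermodynamics if we tabulate the specific entropy $s(\rho, e)$: from the first law, $\mathrm{d}e = T\,\mathrm{d}s - p\,\mathrm{d}(1/\rho)$, we get

$$\frac{1}{T} = \left(\frac{\partial s}{\partial e}\right)_\rho, \qquad p = -\rho^2\, T \left(\frac{\partial s}{\partial \rho}\right)_e,$$

so both $T$ and $p$ follow directly from derivatives of the tabulated $s$.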
Applying the `temp_scale` and `ener_scale` to the Jacobian elements after they are filled doesn't seem right for the derivative wrt T. E.g., we do:

```fortran
bs % jac(net_itemp,:) = bs % jac(net_itemp,:) * inv_temp_scale
```

but that shouldn't apply to `bs % jac(net_itemp, net_itemp)`: since both the temperature equation and the temperature variable carry the scaling, the factors cancel on the diagonal.
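To spell out the algebra (a sketch): if the state is rescaled by a diagonal matrix $S$, $y' = S^{-1} y$, then the RHS becomes $f'(y') = S^{-1} f(S y')$ and the Jacobian transforms as

$$J' = S^{-1} J S,$$

so diagonal elements like $\partial \dot{T}/\partial T$ are left unchanged by the scaling.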
The current value of `ode_scale_floor` means that trace abundances have essentially no influence on the error estimation and convergence. We should try to link `ode_scale_floor` to `small_x` somehow.
In particular, there are no tests of anything other than `burning_mode = 1`. We also need to check whether we have coverage of `do_constant_volume_burn`.
The size of the system allocated in the BS `actual_integrator_sdc.F90` is `SVAR`, but shouldn't it really be `SVAR_EVOLVE`? This affects, for example, the tolerances.
The SDC integrators don't currently handle how we update species when `nspec_evolve` < `nspec`. Since these still have advective terms, we still need to do some integration. But the current `update_unevolved_species` mechanism is probably not enough.
We don't have any SDC unit tests in the suite
We should create a Microphysics-wide rate tabulation module that operates on the T-dependence of the reaction rates.
None of the tests exercise the `use_eos_in_rhs` or `dT_crit` options.
This gives fairly accurate results relative to the direct rate evaluation method, but is much faster on CPUs and essentially necessary for GPUs.
This can be done by setting `use_tables` to `.true.` in the `aprox13/_parameters` file.
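A sketch of that change (the `_parameters` files list name / datatype / default; treat the exact column layout here as an assumption):

```
# in networks/aprox13/_parameters: make tabulated rates the default
use_tables        logical        .true.
```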
To allow the code to be used by things other than AMReX codes, we should write wrappers for things like `bl_error`, etc.
We want to make VODE work with the SDC interface. Unlike the BS integrator, there is no VODE analog of the `bs_t` type. We need to do the following:

- Create a version of `vode_type.F90` for SDC. This will need to have a `clean_state` that fixes up the internal energy and a `fill_unevolved_variables` routine. There will be no `update_thermodynamics` routine, and there are no `vode_to_eos` or `eos_to_vode` routines.
- Create `vode_to_sdc` and `sdc_to_vode` routines (see the sketch after this list).
- The general `rpar.F90` that lives in `integrator/` will need some different components -- see the `BS/` version for comparison. In particular, it will need the rho and momentum indices. We will probably also need to store the advective sources here.
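A very rough skeleton of the two conversion routines (everything here is hypothetical: the `sdc_t` type and its `y` component, the `irp_u` index, and the exact split between `y` and `rpar` would all need to follow the real SDC data layout):

```fortran
! hypothetical sketch only -- type and index names are placeholders
subroutine sdc_to_vode(sdc, y, rpar)
  type(sdc_t),     intent(in)  :: sdc
  real(kind=dp_t), intent(out) :: y(SVAR_EVOLVE), rpar(n_rpar_comps)

  ! evolved variables go into the VODE integration vector ...
  y(:) = sdc % y(1:SVAR_EVOLVE)
  ! ... unevolved variables (rho, momenta, advective sources) ride along in rpar
  rpar(irp_u:irp_u+(SVAR-SVAR_EVOLVE)-1) = sdc % y(SVAR_EVOLVE+1:SVAR)
end subroutine sdc_to_vode

subroutine vode_to_sdc(y, rpar, sdc)
  real(kind=dp_t), intent(in)    :: y(SVAR_EVOLVE), rpar(n_rpar_comps)
  type(sdc_t),     intent(inout) :: sdc

  sdc % y(1:SVAR_EVOLVE) = y(:)
  sdc % y(SVAR_EVOLVE+1:SVAR) = rpar(irp_u:irp_u+(SVAR-SVAR_EVOLVE)-1)
end subroutine vode_to_sdc
```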
With AMReX coming online, we need to decouple these routines from the BoxLib / AMReX dependency. The main place this comes in is through calls to `bl_error` and the use of `bl_constants_module`.

We can instead provide a `microphysics_error` and `microphysics_constants`. These can simply wrap the BoxLib or AMReX routines, assuming they provide the necessary info. We then need a build-time way of letting Microphysics know which of the libraries to link in.
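A minimal sketch of what the error wrapper could look like (assuming the BoxLib backend for concreteness; the module and routine names follow the issue text, and a build-time switch would select the AMReX equivalent instead):

```fortran
module microphysics_error_module
  implicit none
contains
  subroutine microphysics_error(message)
    ! for now, simply forward to the BoxLib routine
    use bl_error_module, only: bl_error
    character(len=*), intent(in) :: message
    call bl_error(message)
  end subroutine microphysics_error
end module microphysics_error_module
```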
We use `esum()` to do exact sums of specific terms in the RHS of the ODEs to prevent roundoff. But `esum()` is slow. At the moment, we have a general routine with a large `max_esum_size` -- this also causes trouble on the GPUs.

We should experiment with creating specific `esum()` routines for the number of terms involved, e.g., `esum3()`, `esum4()`, `esum5()`, ... We know the count ahead of time, since we are explicitly calling `esum()` on specific combinations of terms in the rate equations.
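For example, a fixed-size three-term version might look like the following (just a sketch, using Neumaier-style compensated summation; the existing general `esum()` may use a different exact-summation algorithm, and `dp_t` is assumed to come from `bl_types`):

```fortran
! sketch: three-term compensated sum (Neumaier variant of Kahan summation)
function esum3(a, b, c) result(s)
  use bl_types, only: dp_t
  real(kind=dp_t), intent(in) :: a, b, c
  real(kind=dp_t) :: s, comp, t

  s = a
  comp = 0.0_dp_t

  t = s + b
  if (abs(s) >= abs(b)) then
     comp = comp + ((s - t) + b)   ! recover low-order bits lost from b
  else
     comp = comp + ((b - t) + s)   ! recover low-order bits lost from s
  end if
  s = t

  t = s + c
  if (abs(s) >= abs(c)) then
     comp = comp + ((s - t) + c)
  else
     comp = comp + ((c - t) + s)
  end if
  s = t

  s = s + comp                     ! apply the accumulated correction
end function esum3
```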
The rate caches should be removed from the `burn_t`.
If we encounter a situation where the Coulomb corrections make the pressure, energy, or entropy negative, we currently just turn them off. We should instead bring them smoothly to zero to prevent discontinuous derivatives.
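One possible way to do that (a sketch; the switch form and thresholds are entirely hypothetical): multiply the Coulomb corrections by a smooth factor such as

$$f = \frac{1}{2}\left[1 + \tanh\left(\frac{x - x_0}{\delta}\right)\right],$$

where $x$ measures how far the corrected quantity is from going negative (e.g., the ratio of the corrected to the uncorrected pressure), and $x_0$ and $\delta$ control where and how sharply the corrections shut off. Unlike the current hard cutoff, $f$ and its derivatives are continuous.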
We should modify the `bdf_t` to include a `burn_t` directly, eliminating much of the work done in `bdf_to_burn`. This will mirror what is done with BS.
We have no GPU tests in the PGI test suite. We should pick some basic tests to add coverage.
At the moment, the EOS returns all possible thermodynamic quantities, but sometimes we don't need all of these. We should create an `EXTRA_THERMO` preprocessor flag that will turn off some of the less-needed quantities. This should also be hooked into the `eos_t` type in the application codes.
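A sketch of how this might look in the type definition (assuming we run the source through the preprocessor; the fields shown are just examples of candidates for the "less-needed" category):

```fortran
type :: eos_t
   real(kind=dp_t) :: rho, T, p, e, s
#ifdef EXTRA_THERMO
   ! derivatives wrt composition -- only compiled in when requested
   real(kind=dp_t) :: dpdA, dpdZ, dedA, dedZ
#endif
end type eos_t
```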
A table has been started to keep track of which integrators are able to integrate different networks on the CPU (space is also available for a similar table for the GPU, but it isn't populated yet; we should work on the CPU before trying to work on the GPU anyway).
This issue addresses VBDF failures on the CPU. As the table shows, VBDF fails for the `aprox13` and `aprox19` networks using the configuration and input found in the unit test.

I'm currently comparing the integration of VBDF with VODE, which in theory implement the same algorithms. For `aprox19` I've isolated the cell that fails for VBDF, which VODE seems fine with. I'm currently working to find where the algorithms deviate such that VODE is able to converge to a result while VBDF is not.
VODE's `clean_state` should include the calls to `renormalize_species` like BS does. Also, VODE doesn't have a check to ensure that the temperature stays reasonable, like BS does.
This comes out of discussions with Sam Jones, Aron Michel, and @carlnotsagan
We should implement the neutrino losses from the weak reactions. This would mean keeping track of each reaction and what the actual Q value is (subtracting neutrino losses), and evolving an enuc equation that uses these Q values.
From Sam:

> I think we estimated the neutrino energy losses, and even though they were smaller than I had expected, I agree that they're still important.
>
> ...
>
> The way I would implement it would be to introduce the Q value (binding energy difference between products and reactants) for each reaction, and additionally a Q_neu for the weak reactions, which is the average neutrino energy per reaction, Q_neu = eps_neu/lambda, where eps_neu and lambda are the neutrino luminosity [MeV/s] and the rate [/s] from the LMP tables, respectively. Q_neu is of course 0 for the reactions involving the strong nuclear force. Then the energy generation is the sum of the number of times a reaction takes place multiplied by (Q - Qneu).
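In equation form (a transcription of Sam's prescription, with $r_k$ the number of times reaction $k$ occurs per unit mass per unit time):

$$\dot{\epsilon}_{\rm nuc} = \sum_k r_k \left(Q_k - Q_{\nu,k}\right), \qquad Q_{\nu,k} = \frac{\epsilon_{\nu,k}}{\lambda_k},$$

with $Q_{\nu,k} = 0$ for reactions mediated by the strong interaction.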
When playing with OpenACC, there were compiler issues with Fortran parameters on GPUs. We got rid of the parameters to make things play nice. With our new CUDA methodology, we should go back to parameters, e.g., the variable `pi` in helmholtz/actual_eos.F90.
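For example (a sketch; the actual declaration and the exact literal used in `actual_eos.F90` may differ):

```fortran
! restore pi to a compile-time parameter rather than a plain module variable
real(kind=dp_t), parameter :: pi = 3.1415926535897932384626433832795029e0_dp_t
```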
From Sam:

> I found a bug in your implementation of approx21 in BoxLib. The Jacobian is fine, but the RHSs do not include terms for fe56 and cr56 (i.e., they are zero). Looks like it was copied from approx19 but not modified for approx21.
Testing has shown system implementations of BLAS are much more efficient than compiling in BLAS ourselves. We should switch the CUDA version of VODE90 to use cuBLAS and check performance.
In particular, since cuBLAS calls require an on-device kernel launch, it will be interesting to see whether the overall performance gains from cuBLAS are worthwhile.
Some networks use `He4`, others `he4` -- we should be consistent.
In `bs_type_sdc` we dimension:

```fortran
real(kind=dp_t) :: u(n_rpar_comps), u_init(n_rpar_comps), udot_a(n_rpar_comps)
```

but these should really be dimensioned as `SVAR-SVAR_EVOLVE`.
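I.e., the declaration would presumably become (assuming `SVAR` and `SVAR_EVOLVE` are accessible where `bs_type_sdc` is defined):

```fortran
real(kind=dp_t) :: u(SVAR-SVAR_EVOLVE), u_init(SVAR-SVAR_EVOLVE), udot_a(SVAR-SVAR_EVOLVE)
```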
This should be done via a call to `eos_get_max_temp` so that the user's probin variables get used properly.
Currently this would be inconsistent with some EOS tables that are pre-generated with a specific set of constants, but it could still be useful in the long run for separate codes to have a shared set of constants.
We should completely remove the `integrate_molar_fraction` option and instead rely on the networks to always return things in terms of dX/dt. This will cut back on the complexity of the code a lot, eliminating many unnecessary conversions.
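For reference, the conversion this pushes into the networks is simple: since $X_k = A_k Y_k$,

$$\frac{\mathrm{d}X_k}{\mathrm{d}t} = A_k \frac{\mathrm{d}Y_k}{\mathrm{d}t},$$

so a network that works internally in molar fractions just multiplies its $\dot{Y}_k$ by $A_k$ before returning.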
The helmholtz EOS can represent a significant computational cost. We could consider vectorizing it.
We ask for species to be evolved to a tolerance of 1.d-12 (in `integration/_parameters`). This is pretty tight. We need to check whether it can be relaxed. We can relax it on a network-by-network basis (using priorities in the `_parameters` files). It seems that the original aprox13 and aprox19 networks used tolerances of 1.e-6.
Max has suggested we profile the SDC integration to determine how expensive the EOS calls really are.
The motivation for this is that the EOS calls use rho, e as input variables and it may be worthwhile to think about how to formulate T integration source terms so we could use rho, T as input variables to the EOS instead.
The cost of the EOS should be more apparent using tabulated rates, so this is related to issue #12
Consider using the new rate from this compilation:
https://journals.aps.org/rmp/abstract/10.1103/RevModPhys.89.035007
@carlnotsagan can advise us :)
At the moment, we have two VODE-style integrators, VODE and VODE90.
Earlier testing indicates they yield identical integration answers but there may be performance differences. We should compare the performance of each and determine whether it is worth switching to VODE90.
Issues to consider:
Since MAESTRO is now moving to the C++ AMReX, we should migrate the test suite drivers in Microphysics to use the C++ AMReX as well. A good starting point is `test_react` in Castro.
VODE90 currently uses LINPACK (since that's what VODE used originally). We should switch it over to LAPACK so we can use system-optimized LAPACK routines. This table shows the LAPACK equivalent functions: http://www.netlib.org/lapack/lug/node147.html (note: the table has routines starting with `s` for single precision, but ours, of course, use `d`).
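For the dense factor/solve pair, the translation should be mechanical. A sketch (double-precision routines; the argument lists follow the standard reference implementations and are worth double-checking against our VODE90 source):

```fortran
! LINPACK: factor A, then solve A x = b (job = 0 for no transpose)
call dgefa(a, lda, n, ipvt, info)
call dgesl(a, lda, n, ipvt, b, 0)

! LAPACK equivalents
call dgetrf(n, n, a, lda, ipvt, info)
call dgetrs('N', n, 1, a, lda, ipvt, b, n, info)
```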
We should add `unit_tests/burn_cell/` to the Docs.
We should move the stellar conductivity routine from MAESTRO to here.
Many of the results from GPU-accelerated unit-test code appear to be wrong. As a concrete example, I've built an accelerated and a CPU-only executable of the `test_react` unit test.

Build and execute the accelerated binary, and move the output for later comparison (note that I've suppressed the output of the commands):

```
cd $MICROPHYSICS_HOME/unit_test/test_react
make COMP=PGI NETWORK_DIR=ignition_simple ACC=t -j6
./main.Linux.PGI.acc.exe inputs_ignition.BS
mv react_ignition_test_react.BS react_ignition_test_react.BS.ACC
```

Build and execute the CPU-only binary:

```
make COMP=PGI NETWORK_DIR=ignition_simple -j6
./main.Linux.PGI.exe inputs_ignition.BS
```

If I now compare the two output files, we see they're very different:

```
fcompare.Linux.gfortran.exe --infile1 react_ignition_test_react.BS --infile2 react_ignition_test_react.BS.ACC

 variable name             absolute error            relative error
                           (||A - B||)               (||A - B||/||A||)
 ----------------------------------------------------------------------
 level = 1
 density                   0.2384185791E-06          0.1192092896E-15
 temperature               0.6854534149E-06          0.9792191642E-15
 Xnew_carbon-12            0.9999999997              0.9999999999
 Xnew_oxygen-16            0.7999999999              0.9999999999
 Xnew_magnesium-24         0.9999999997              9.999436761
 Xold_carbon-12            0.9999999997              0.9999999999
 Xold_oxygen-16            0.7999999999              0.9999999999
 Xold_magnesium-24         0.9999999997              9.999999997
 wdot_carbon-12            0.2812178371E-03          1.000000000
 wdot_oxygen-16            0.1110223025E-14          1.000000000
 wdot_magnesium-24         0.2812178371E-03          1.000000000
 rho_Hnuc                  0.3150192097E+24          1.000000000
```

So while many networks and integrators seem to be able to compile and run without crashing, it's not clear how many are generating correct physical results. I've seen a similar issue with the VBDF integrator, so it doesn't appear to be specific to an integrator or network. These results are from `bender`, which has PGI 16.9 and a GeForce GTX 960 GPU (with CUDA 8.0 drivers and CUDA 7.5 compilers).
The BS integrator does not allow for different tolerances on each component, like we do with VODE. We should generalize it so that we can specify a separate rtol for each integration variable.
The basic GPU test described in Issue #15 fails. On my local machine, I get:

```
[ajacobs@xrb test](development *)$ ./testburn.Linux.PGI.acc.exe
 Initializing Helmholtz EOS and using Coulomb corrections.
FATAL ERROR: data in update device clause was not found on device 1: name=pi
 file:/home/ajacobs/Codebase/Microphysics/networks/ignition_simple/test/../../../EOS/helmholtz/actual_eos.F90 actual_eos_init line:1327
```

On Stony Brook's `bender`, I get what may be an error in the system configuration:

```
[ajacobs@bender test](development)$ ./testburn.Linux.PGI.acc.exe
 Initializing Helmholtz EOS and using Coulomb corrections.
modprobe: FATAL: Module nvidia-uvm not found in directory /lib/modules/4.7.5-200.fc24.x86_64
call to cuInit returned error 999: Unknown
```

The error happens with and without debug symbols. The error seems to be saying `pi` isn't initialized, but it is declared in `actual_eos.F90`. I'm investigating the error now.
Not all the networks need the same amount of rate storage. `num_rate_groups` should be defined on a network-by-network basis.
When building `test_react` with

```
make COMP=PGI NDEBUG= OMP= NETWORK_DIR=ignition_simple INTEGRATOR_DIR=VBDF ACC=t
```

errors like the following come up:

```
ptxas /tmp/pgaccBw5JrAtcYokR.ptx, line 1842; fatal : Parsing error near '-': syntax error
ptxas fatal : Ptx assembly aborted due to errors
PGF90-S-0155-Compiler failed to translate accelerator region (see -Minfo messages): Device compiler exited with error status code (../../integration/VBDF/actual_integrator.F90: 1)
  0 inform,   0 warnings,   1 severes, 0 fatal for
make: *** [t/Linux.PGI.debug.acc/o/actual_integrator.o] Error 2
make: *** Waiting for unfinished jobs....
```

By commenting out and slowly uncommenting, I've traced at least one trigger of the error to a derived-type assignment in Microphysics/integration/VBDF/actual_integrator.F90, in the `initial_timestep()` subroutine: `ts_temp = ts`.

However, after writing and using a copy subroutine for `bdf_ts` types, the error continues. It seems any use of `ts_temp` triggers the error, even `ts_temp%neq = 1`.
We now have the ability to use VODE or BS as the backup integrator for failed integrations -- this needs to be documented.
To my knowledge, the PGI test suite doesn't do any tests that utilize the GPU. I recommend adding a test using the `ignition_simple` network, the `BS` integrator, and the GPU (`ACC=t`). Something along the lines of:

```
$ cd $MICROPHYSICS_HOME/networks/ignition_simple/test
$ make COMP=PGI ACC=t
$ ./testburn.Linux.PGI.debug.acc.exe
```

Something like this should serve as a minimal verification that basic GPU code is working. As more integrators and/or networks robustly utilize the GPU, we can add similar tests for them (in this case, the default GNUmakefile has already chosen the `BS` integrator for us).
If integration fails and we reset to the initial state to try again, we need to reset `T_old` and the cv/cp too, for consistency. Perhaps this would be easier with a `bs_init` variable, so we can just do `bs = bs_init` and go.