Comments (7)
The fact that the density is different is telling -- nothing should be changing the density in this unit test, although the difference is only at roundoff level.
from microphysics.
I've compared temp_zone and dens_zone calculated inside the loop between PGI serial (debug) and gfortran serial (with aprox13's input; I haven't tried it for ignition_simple).
Printing those variables with 15 significant figures shows they can differ by about 1E-7, which is the absolute difference in density you see above. My guess is that the roundoff error differs between the log10 or power operations as implemented by PGI and by GNU.
Doing the same for PGI-serial vs PGI-acc, I see smaller differences, but at least one difference nonetheless, e.g.
dens_zone = 3414548.873833601
vs.
dens_zone = 3414548.873833600
That suggests the difference you see in density, as Mike said, is roundoff, and that the integrator may not necessarily be doing anything to the density.
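To put the PGI-serial vs PGI-acc numbers in context, here is a minimal Python sketch (not part of the test code) that measures the gap between the two printed dens_zone values in units of the last place (ULPs). The 10**log10 round trip at the end only illustrates how such operations can perturb the last digits; it does not reproduce either compiler's math library.

```python
import math

# Values copied from the comment above (PGI-serial vs PGI-acc).
dens_serial = 3414548.873833601
dens_acc = 3414548.873833600

abs_diff = abs(dens_serial - dens_acc)
rel_diff = abs_diff / abs(dens_serial)

# math.ulp(x) (Python 3.9+) is the spacing between x and the next double.
ulps = abs_diff / math.ulp(dens_serial)

print(f"absolute difference: {abs_diff:.3e}")
print(f"relative difference: {rel_diff:.3e}")
print(f"difference in ULPs:  {ulps:.0f}")
print(f"machine epsilon:     {2.0**-52:.3e}")

# Illustration only: a log10 followed by a power of 10 need not return
# exactly the starting value, because each step rounds its result.
roundtrip = 10.0 ** math.log10(dens_serial)
print(f"10**log10(x) - x:    {roundtrip - dens_serial:.3e}")
```

A gap of only a few ULPs is what one would expect from differently rounded transcendental functions, rather than from the integrator modifying the density.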
good -- forgot that we are doing the exponentiation there.
Some sleuthing indicates that xn_zone contains junk on the GPU. After adding a print statement to main.f90, building and running with
make COMP=PGI NDEBUG=t -j6
./main.Linux.PGI.exe inputs_3alpha.BS
shows that all xn_zone values are bounded by 0 <= xn_zone <= 1.0, as they should be. However, building and running with
make COMP=PGI ACC=t NDEBUG=t -j6
./main.Linux.PGI.acc.exe inputs_3alpha.BS.ACC
produces bad output, such as
...
j, kk, xn_zone: 4 13 1584893192.466650
j, kk, xn_zone: 4 13 1584893192.466650
j, kk, xn_zone: 4 13 1584893192.466650
j, kk, xn_zone: 4 13 1584893192.466650
j, kk, xn_zone: 4 13 1584893192.466650
j, kk, xn_zone: 4 13 1584893192.466650
j, kk, xn_zone: 1 15 1.1651353297957938E-004
j, kk, xn_zone: 1 15 1.1651353297957938E-004
...
I'll continue investigating the origin of this, but wanted to note it in the issue thread.
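For scanning a long run of such prints, a small Python sketch like the following could flag every value outside the expected range. It assumes the print format shown above ("j, kk, xn_zone:" followed by two integers and a real); the helper name find_bad_mass_fractions is hypothetical and not part of the test code.

```python
import re

# Lines copied from the output above; the format is assumed to be
# "j, kk, xn_zone: <j> <kk> <value>".
sample_output = """\
j, kk, xn_zone: 4 13 1584893192.466650
j, kk, xn_zone: 1 15 1.1651353297957938E-004
"""

LINE_RE = re.compile(r"j, kk, xn_zone:\s+(\d+)\s+(\d+)\s+(\S+)")


def find_bad_mass_fractions(text):
    """Return (j, kk, value) tuples whose value lies outside [0, 1]."""
    bad = []
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not xn_zone prints
        j = int(match.group(1))
        kk = int(match.group(2))
        value = float(match.group(3))
        # NaN fails both comparisons, so test the valid range and negate.
        if not (0.0 <= value <= 1.0):
            bad.append((j, kk, value))
    return bad


for j, kk, value in find_bad_mass_fractions(sample_output):
    print(f"out of range: j={j} kk={kk} xn_zone={value}")
```

Piping the program's standard output to a file and passing its contents to the helper would give a quick list of the affected (j, kk) pairs.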
A quick note for the record: Max and I looked into this in depth, and the issue appears to come from two things: 1) we're using pf on the GPU without ever having it in a data statement (it seems PGI should have complained), and 2) pf has a Fortran character array (not supported by PGI on the GPU) and a bound procedure (also not something I would expect to work on the GPU, though we don't actually try to use the procedure or the character array). It's not clear how, but using this type on the GPU seems to be corrupting memory, which may be why xn_zone contains garbage. Will look into this more tomorrow.
oh fun.
After the code was changed to not use pf, the error appears to have gone away. A GPU and CPU comparison now yields
fcompare react_ignition_test_react.VBDF react_ignition_test_react.VBDF.ACC/
 variable name          absolute error      relative error
                        (||A - B||)         (||A - B||/||A||)
 ----------------------------------------------------------------------
 level = 1
 density                0.000000000         0.000000000
 temperature            0.000000000         0.000000000
 Xnew_carbon-12         0.1465605415E-10    0.4396816244E-10
 Xnew_oxygen-16         0.000000000         0.000000000
 Xnew_magnesium-24      0.1465605415E-10    0.4396775531E-10
 Xold_carbon-12         0.000000000         0.000000000
 Xold_oxygen-16         0.000000000         0.000000000
 Xold_magnesium-24      0.000000000         0.000000000
 wdot_carbon-12         0.1465605415E-09    0.4748322189E-05
 wdot_oxygen-16         0.000000000         0.000000000
 wdot_magnesium-24      0.1465605415E-09    0.4748322189E-05
 rho_Hnuc               0.1581619200E+18    0.4574365704E-05
Relative errors are at most about 5e-6 between CPU and GPU.
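As a reading aid for the two columns, here is a minimal Python sketch of the quantities named in the header, ||A - B|| and ||A - B||/||A||. It assumes a max norm; the norm fcompare actually uses is not stated in this thread, and the arrays below are made-up illustration data, not values from the plotfiles.

```python
def max_norm(values):
    """Largest absolute value in a sequence (assumed norm, see lead-in)."""
    return max(abs(v) for v in values)


def compare(a, b):
    """Return (absolute error, relative error) of b with respect to a."""
    abs_err = max_norm([x - y for x, y in zip(a, b)])
    norm_a = max_norm(a)
    # Report zero relative error when the reference itself is all zeros.
    rel_err = abs_err / norm_a if norm_a > 0.0 else 0.0
    return abs_err, rel_err


# Hypothetical data: one zone differs by 2e-5 out of a maximum of 4.0.
cpu = [1.0, 2.0, 4.0]
gpu = [1.0, 2.0, 4.0 + 2.0e-5]
abs_err, rel_err = compare(cpu, gpu)
print(f"absolute error: {abs_err:.3e}")
print(f"relative error: {rel_err:.3e}")

# Consistency check on the rho_Hnuc row above: dividing the absolute error
# by the relative error recovers the implied ||A||.
implied_norm = 0.1581619200e18 / 0.4574365704e-05
print(f"implied ||A|| for rho_Hnuc: {implied_norm:.4e}")
```

The large absolute error in rho_Hnuc is therefore not alarming on its own: the quantity itself is of order 1e22, so the relative error stays near 5e-6.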