
cice5's Introduction

Overview

This repository contains the trunk from the subversion (svn) repository of the Los Alamos Sea Ice Model, CICE, including release tags through version 5.1.2.

More recent versions are found in the CICE and Icepack repositories, which are maintained by the CICE Consortium.

If you expect to make any changes to the code, we recommend that you work in the CICE and Icepack repositories. Changes made to code in this repository will not be accepted, other than critical bug fixes.


cice5's People

Contributors

aekiss, aidanheerdegen, eclare108213, marshallward, nichannah, penguian, rmholmes, russfiedler


cice5's Issues

diagnostic bug in CICE when outputting 4d fields

Background:
For each grid point and ice category, cice computes its thermodynamics over nkice layers (this is 4 in ACCESS-OM2).

Under the BL99 thermodynamics option, the conductive salinity profile (aka 'ice internal salinity') is fixed and prescribed by a function, so we know what values to expect. For a given point and thickness category, the expected values for the four layers are [0.64920187, 2.354581, 3.0310922, 3.1892977].

The Issue:
However, if Sinz is output (f_sinz is the namelist field flag), the values do not appear as I expect.

Here's an example taken from the 1° model, given an Xarray DataArray:

In [4]: type(sinz)
Out[4]: xarray.core.dataarray.DataArray

In [5]: sinz.shape
Out[5]: (1, 5, 4, 300, 360)

Selecting a point for a given time (the time dimension has length 1 in this case), lat and lon (a point that has ice, i.e. aice > 0), and the first thickness category:

In [6]: sinz.isel(time=0,ni=30,nj=40,nc=0).values
Out[6]: array([0.64920187, 0.64920187, 0.64920187, 0.64920187], dtype=float32)

We do not see the four distinct layer values we expect.

Consider the same time,lat,lon for all layers and thickness categories:

In [8]: sinz.isel(time=0,ni=30,nj=40).shape
Out[8]: (5, 4)
In [9]: sinz.isel(time=0,ni=30,nj=40).values
Out[9]: 
array([[0.64920187, 0.64920187, 0.64920187, 0.64920187],
       [0.64920187, 2.354581  , 2.354581  , 2.354581  ],
       [2.354581  , 2.354581  , 3.0310922 , 3.0310922 ],
       [3.0310922 , 3.0310922 , 3.0310922 , 3.1892977 ],
       [3.1892977 , 3.1892977 , 3.1892977 , 3.1892977 ]], dtype=float32)

The values appear to be ordered along the wrong dimensions. What I think is the correct answer can be recovered (for this time, lat, lon) by:

In [10]: temp = sinz.isel(time=0,ni=30,nj=40).values

In [11]: temp.reshape((4,5)).transpose()
Out[11]: 
array([[0.64920187, 2.354581  , 3.0310922 , 3.1892977 ],
       [0.64920187, 2.354581  , 3.0310922 , 3.1892977 ],
       [0.64920187, 2.354581  , 3.0310922 , 3.1892977 ],
       [0.64920187, 2.354581  , 3.0310922 , 3.1892977 ],
       [0.64920187, 2.354581  , 3.0310922 , 3.1892977 ]], dtype=float32)

These code snippets were produced in IPython using xarray, but the same result is obtained with a variety of tools, including inspecting the netCDF file with ncview.

If there is an issue here, this may affect other 4d fields like Tinz, but these may be harder to diagnose from their values, as they aren't prescribed and fixed like Sinz.

An example output file with the Sinz variable can be found on gadi at:

/home/548/sxa548/access-om2-sample_output/iceh.2018-08-15.nc
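
The same reinterpretation can be applied to the whole field rather than one point at a time. The sketch below assumes the bug is purely a swap of the category and layer axes when the field is written (as the single-point example above suggests), that the dimension order is (time, nc, layer, nj, ni), and that the variable in the sample file is named Sinz; none of that is guaranteed.

import xarray as xr

ds = xr.open_dataset('iceh.2018-08-15.nc')    # e.g. the sample file above
sinz = ds['Sinz']

vals = sinz.values                            # shape (1, 5, 4, 300, 360)
t, nc, nk, nj, ni = vals.shape
# reinterpret the flat (nc*layer) block at each grid point as (layer, nc),
# then swap the axes back so the result is dimensioned (time, nc, layer, nj, ni)
fixed = vals.reshape(t, nk, nc, nj, ni).swapaxes(1, 2)
sinz_fixed = sinz.copy(data=fixed)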

cice output has the wrong date

Since the switch to libaccessom2, restarted CICE jobs have an incorrect date. Clearly the libaccessom2 model time synchronisation checks are also not working correctly.

halo update bugs in ACCESS driver?

Hi @russfiedler, following on from #68, these also look wrong to me (they should be field_type_scalar, otherwise their signs will be flipped across the tripole seam in halo updates):

call ice_HaloUpdate(um_tmlt, halo_info, field_loc_center,field_type_vector)
call ice_HaloUpdate(um_bmlt, halo_info, field_loc_center,field_type_vector)

and

call ice_HaloUpdate(um_swflx, halo_info,field_loc_center,field_type_vector)
call ice_HaloUpdate(um_lwflx, halo_info,field_loc_center,field_type_vector)
call ice_HaloUpdate(um_shflx, halo_info,field_loc_center,field_type_vector)
call ice_HaloUpdate(um_press, halo_info,field_loc_center,field_type_vector)
call ice_HaloUpdate(um_co2, halo_info, field_loc_center, field_type_vector)
call ice_HaloUpdate(um_wnd, halo_info, field_loc_center, field_type_vector)

Initialise offset in pack_coupling_array

At line 462 of https://github.com/COSIMA/cice5/blob/master/drivers/auscom/cpl_interface.F90
in subroutine pack_coupling_array, the variable offset should be initialised to 0.
The current lack of initialisation causes errors such as

Segmentation fault: address not mapped to object

History: in commit c132689, which resolves issue #21, the variable offset should have been initialised to 0 at line 465 of subroutine pack_coupling_array. Compare with line 436 in subroutine unpack_coupling_array in the same commit.

Add compression level namelist parameter

Netcdf compression level affects cice runtime, e.g. in 3mo runs with daily output (/g/data3/hh5/tmp/cosima/access-om2-01/01deg_jra55v13_ryf8485_spinup7_newexe):

level                  Timer 12: ReadWrite   file size
0 (no compression*)    1510 s                1600 MB
1                      1833 s                 252 MB
5                      2501 s                 236 MB

It would be nice to be able to control this speed/space trade-off via a namelist parameter setting the compression level.
Only a few code changes are required - see 9e69c99.

*This test still used nf90_def_var_deflate, but at level 0 - perhaps it would be faster to skip nf90_def_var_deflate entirely when the level is 0. Before switching to netCDF4 (d2ef6b1) IO took about 1300 s. It's unclear whether the extra 200 s in the table is the cost of netCDF4 or of the level-0 deflate step.

Slack discussion: https://arccss.slack.com/archives/C9Q7Y1400/p1557809112231100
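
As a quick sanity check of what level a given history file was actually written with, the deflate settings can be queried per variable. This is just a sketch using the netCDF4-python filters() call; the file name is a placeholder.

import netCDF4

ds = netCDF4.Dataset('iceh.2018-08-15.nc')
for name, var in ds.variables.items():
    f = var.filters()    # e.g. {'zlib': True, 'complevel': 1, ...}, or None for netCDF3 files
    if f and f.get('zlib'):
        print(name, 'complevel =', f['complevel'])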

Support non-BGC configs

I should have thought a bit harder before merging in @hakaseh's support for coupled BGC (10b3527).

CICE now expects to find surface nitrate and algae in the coupling fields, so it won't work with our usual physics-only configurations.

I guess this could be fixed with lots of #ifdefs, but that would give us two executables for each resolution. Is there a more elegant way to do it?

diag bugs with AusCOM driver (affecting ACCESS-OM2)

Thanks to Stewart Allen for flagging this issue (see Slack discussion https://arccss.slack.com/archives/C6PP0GU9Y/p1627269245007400).

These diagnostics are identical in the ACCESS-OM2 output, but shouldn't be:

  • fresh and fresh_ai
  • fsalt and fsalt_ai
  • fhocn and fhocn_ai
  • fswthru and fswthru_ai

This issue probably also affects these diagnostics and their _ai counterparts: alvdr, alidr, alvdf, alidf, fNO, fNH, fN, fSil, but I haven't checked.

I did a test run in /home/156/aek156/payu/1deg_jra55_iaf_cice_diag_test, which gives these test results for equality of some diagnostics and their _ai counterparts:

import xarray as xr

ds = xr.open_dataset('/scratch/v45/aek156/access-om2/archive/1deg_jra55_iaf_cice_diag_test/output132/ice/OUTPUT/iceh.1968-02.nc')
allvars = list(ds.variables.keys())
# monthly-mean variables (ending in '_m') that also have an '_ai' counterpart
vs = [v for v in allvars if v[:-2]+'_ai_m' in allvars]

for v in vs:
    # True means the diagnostic is identical to its _ai counterpart
    print(v, ds[v].equals(ds[v[:-2]+'_ai_m']))

prints

snow_m False
rain_m False
fswabs_m False
flat_m False
fsens_m False
flwup_m False
evap_m False
fresh_m True
fsalt_m True
fhocn_m True
fswthru_m True

Backport weight-per-block set by file from CICE6

Presently the amount of work done in each block is estimated as a linear function of latitude. This is obviously not a very close approximation to the amount of ice work at a given geographic location.

CICE6 has an option to set the work weight for each grid point using a file containing a 2d field. We think this will allow a much more accurate specification of the amount of work and hence better load balancing.

This issue will back-port the CICE6 functionality to our version of CICE.
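
For illustration only, one plausible way to build such a 2d weight field is from time-mean ice concentration in an existing history file. The variable name aice_m, the file names, and the choice of concentration as a proxy for work are all assumptions, and the file format CICE6 actually expects should be checked against its documentation.

import xarray as xr

ds = xr.open_dataset('iceh.1968-02.nc')             # assumed input history file
weight = ds['aice_m'].mean('time').fillna(0.0)      # crude proxy for ice work per cell
weight.rename('work_weight').to_netcdf('cice_block_weights.nc')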

abort_ice does not pass an error code to MPI_abort

This is the same issue Martin Dix discovered:

https://accessdev.nci.org.au/trac/ticket/318

The call signature for MPI_Abort is
https://www.open-mpi.org/doc/v1.10/man3/MPI_Abort.3.php

MPI_ABORT(COMM, ERRORCODE, IERROR)
    INTEGER COMM, ERRORCODE, IERROR

comm        Communicator of tasks to abort.
errorcode   Error code to return to invoking environment.

It is called here:
https://github.com/OceansAus/cice5/blob/master/mpi/ice_exit.F90#L61
but the value of ierr passed as the error code is never set in the routine, so it is whatever the compiler happened to initialise it to.

They had a similar issue with oasis_abort:

https://portal.enes.org/oasis/faq-forum/oasis3-forum/real-coupled-models/548853210

Apparently they decided oasis_abort should default to a non-zero value; it is an abort call, after all.

Put atm coupling field halo updates after ocean send

Presently the halo updates in the CICE ACCESS coupling code slow the ocean down: the ocean is waiting to receive from the ice while the ice does halo updates.

This issue moves the halo updates to after the ocean communication.

restart and input directories are the same

The information model for MOM (and I think other models) is for inputs to be read from one directory, outputs to be saved to a different directory, and restarts to be saved into another, unique directory. In this way the contents of the restart directory can be copied/linked into the input directory for the next run.

Currently sicemass, u_star and the coupling fields are read from restart_dir

https://github.com/OceansAus/cice5/blob/fe7300227107bde802a217ff0d6ef7f92a6eb6c2/drivers/auscom/CICE_RunMod.F90#L106
https://github.com/OceansAus/cice5/blob/05597824ac633a1c6ce444ac78b651f3844092e1/drivers/auscom/CICE_InitMod.F90#L170

and written to restart_dir

https://github.com/OceansAus/cice5/blob/fe7300227107bde802a217ff0d6ef7f92a6eb6c2/drivers/auscom/CICE_RunMod.F90#L228

This can cause issues: if these files are symbolic links, then writing to them will overwrite the previous version of the restart file.

I would like to have separate INPUT and RESTART directories.

Thoughts, @nicjhan @aekiss @marshallward?

Can't output Tinz

As reported here, ACCESS-OM2 1deg_jra55_ryf aborts when f_tinz is set to anything other than 'x'. It aborts the first time the data would be written.

Abort with message Unknown Error: Unrecognized error code in file /g/data/v45/aek156/CHUCKABLE/access-om2/src/cice5/ParallelIO/src/clib/pio_darray_int.c at line 687

This is the offending line: https://github.com/NCAR/ParallelIO/blob/7e242f78bd1b4766518aff44fda17ff50eed6188/src/clib/pio_darray_int.c#L687

Possibly related: #62 (comment)

It has been possible to output Tinz in other runs, e.g. 0.1° IAF.

Valgrind error in CICE

I have been running valgrind on ACCESS-OM2 to track down a segfault. Currently I am getting the following error message before a crash. I don't know whether this is what triggers the crash, so it may be a general problem with CICE.

==31611== Invalid write of size 8
==31611== at 0x50C326: ice_gather_scatter_mp_scatter_global_dbl_ (ice_gather_scatter.f90:959)
==31611== by 0x5AFA4F: ice_read_write_mp_ice_read_nc_xy_ (ice_read_write.f90:1163)
==31611== by 0x41E93A: cpl_forcing_handler_mp_get_u_star_ (cpl_forcing_handler.f90:251)
==31611== by 0x40F5EC: cice_init (CICE_InitMod.f90:199)
==31611== by 0x40F5EC: cice_initmod_mp_cice_initialize_ (CICE_InitMod.f90:63)
==31611== by 0x40C841: MAIN__ (CICE.f90:56)
==31611== by 0x40C7DD: main (in /short/x77/nah599/access-om2/bin/cice_auscom_1440x1080_480p_maxblocks_4.exe)
==31611== Address 0x2c6fdc80 is 8 bytes after a block of size 22,312 alloc'd
==31611== at 0x4C2A8FA: malloc (vg_replace_malloc.c:298)
==31611== by 0x9279AB: _mm_malloc (in /short/x77/nah599/access-om2/bin/cice_auscom_1440x1080_480p_maxblocks_4.exe)
==31611== by 0x8A4E07: for_alloc_allocatable (in /short/x77/nah599/access-om2/bin/cice_auscom_1440x1080_480p_maxblocks_4.exe)

The command line I used to run this was:

mpirun --mca orte_base_help_aggregate 0 -wdir /short/x77/nah599/access-om2/work/025deg_jra55_ryf/atmosphere -np 1 /short/public/access-om2/bin/yatm_c2868e5b.exe : -wdir /short/x77/nah599/access-om2/work/025deg_jra55_ryf/ocean -np 1455 /short/x77/nah599/access-om2/bin/fms_ACCESS-OM_quantify_load_imbalance.x : -wdir /short/x77/nah599/access-om2/work/025deg_jra55_ryf/ice -np 393 -x LD_PRELOAD=/home/599/nah599/more_home/usr/local/lib/valgrind/libmpiwrap-amd64-linux.so /home/599/nah599/more_home/usr/local/bin/valgrind --main-stacksize=200000000 --max-stackframe=200000000 --error-limit=no --freelist-vol=10000000 --suppressions=/short/v45/nah599/more_home/mom-run-scheduler/valgrind_suppressions.txt /short/x77/nah599/access-om2/bin/cice_auscom_1440x1080_480p_maxblocks_4.exe

Diagnostic output and restarts not compressed

Slack conversation:

Hi All. Just a quick query on what you all think we should do about CICE output. Right now I have a few problems with the CICE output — I dislike the single file per month, I don’t know why we need to have all CICE output stored in ice/OUTPUT/ rather than just ice/ and, finally, it is uncompressed! A quick test with 025deg output indicates that, for monthly output, CICE output is costing us 4 times the MOM output. By individually compressing each file, we could reduce the ice storage by a factor >5 and the total storage by a factor of 2.5. Obviously this is a no-brainer, and we should do it.
At the same time we could also consider trimming down the number of files. I would like this from a user point of view, but maybe I am just old-fashioned. It would require us to have a postprocessing script to collate monthly files in (say) annual files. My quick tests today indicated it would only save a few %, and we would have to re-build the cookbook database once we change the file structure. Any thoughts on whether we should do this?
Finally, should we automate compression and/or collation of CICE output within payu for future runs?

aidan [3:39 PM]
Should look first to see how simple it might be to accomplish the compression part in CICE itself

andy [4:40 PM]
OK, yes, let’s see what CICE can do. In the meantime, Aidan, do you have time to attempt a postprocessing script for us to trawl through existing cice files and do a straight compression? I can then test on some of our less important datasets before we set it going for real.

aidan [4:40 PM]
I’ll add it to the list (and prioritise!)
Can you say EXACTLY what you want done, preferably with an example directory and description of before and after

andy [4:55 PM]
No worries. In my testing, I copied a CICE OUTPUT directory to /home/157/amh157/v45/amh157/temp. There I made a parallel directory OUTPUT_PROCESSED, and tested a few of the files with the following command:
nccopy -d 5 -7 OUTPUT/iceh.2256-01.nc OUTPUT_PROCESSED/iceh.2256-01.nc
Basically, I guess the best strategy is to nccopy every file like that and overwrite the old one??
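
A minimal sketch of that postprocessing step, assuming nccopy is on PATH and that writing the compressed copies into a parallel OUTPUT_PROCESSED directory (as in the test above) is preferable to overwriting in place:

import subprocess
from pathlib import Path

src = Path('OUTPUT')
dst = Path('OUTPUT_PROCESSED')
dst.mkdir(exist_ok=True)

for f in sorted(src.glob('iceh.*.nc')):
    # -d 5: deflate level 5; -7: netCDF-4 classic-model output
    subprocess.run(['nccopy', '-d', '5', '-7', str(f), str(dst / f.name)], check=True)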

Investigate using parallel IO

It may be worth trying to compile with parallel IO using PIO (setenv IO_TYPE pio).

We currently compile CICE with serial IO (setenv IO_TYPE netcdf in bld/build.sh), so one CPU does all the IO and we end up with an Amdahl's law situation that limits the scalability with large core counts.

At 0.1° CICE is IO-bound when doing daily outputs (see Timer 12 in ice_diag.d), and the time spent in CICE IO accounts for almost all the time MOM waits for CICE (oasis_recv in access-om2.out), so the whole coupled model is waiting on one CPU. With daily CICE output at 0.1° this is ~19% of the model runtime (it's only ~2% without daily CICE output). Lowering the compression level to 1 (#33) has helped (the MOM wait was 23% with level 5), and omitting static field output (#32) would also help.

Also I understand that PIO doesn't support compression - is that correct?

@russfiedler had these comments on Slack:

  • I have a feeling that the CICE parallel IO hadn't really been tested, or there was some problem with it.
  • We would have to update the netCDF versions being used in CICE for a start.
  • The distributors of PIO note that they need netCDF 4.6.1 and HDF5 1.10.4 or later for their latest version. There's a bug in parallel collective IO in earlier HDF5 versions. The NCI version of netCDF 4.6.1 is built with HDF5 1.10.2! Marshall noted above that Rui found a performance drop-off when moving from 1.10.2 to 1.10.4.
  • The gather is done on all the small tiles, so each PE sends a single horizontal slab several times to the root PE for each level.
  • The number of MPI calls is probably the main issue. It looks like there's an individual send/recv for each tile rather than either a bulk send of the tiles or something more funky using MPI_Gather(v) and MPI_Type_create_subarray.

Slack discussion: https://arccss.slack.com/archives/C9Q7Y1400/p1557272377089800

Code cleanup

The CICE code and particularly the AUSCOM driver has accumulated some mess - the usual bad formatting, unused variables, pointless code changes, etc.

It would be nice to clean this up mainly for readability but also to make it easier to track and apply changes from other CICE repositories such as:

https://github.com/CICE-Consortium

and

https://github.com/NCAR/CICE

The code cleanup should not change results.

Add option to not output static data in history files

The grid-related fields below are static but are included in every output .nc file, wasting a lot of runtime and storage. It would be good to provide a namelist flag which would write this grid data to a separate .nc file once per run, and omit it from all other .nc files. It looks like this would require code changes, e.g. at

! define information for required time-invariant variables

float TLON(nj, ni) ;
float TLAT(nj, ni) ;
float ULON(nj, ni) ;
float ULAT(nj, ni) ;
float NCAT(nc) ;
float tmask(nj, ni) ;
float blkmask(nj, ni) ;
float tarea(nj, ni) ;
float uarea(nj, ni) ;
float dxt(nj, ni) ;
float dyt(nj, ni) ;
float dxu(nj, ni) ;
float dyu(nj, ni) ;
float HTN(nj, ni) ;
float HTE(nj, ni) ;
float ANGLE(nj, ni) ;
float ANGLET(nj, ni) ;
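
Until such a namelist flag exists, a rough post-processing sketch (not the proposed code change) can strip the static fields listed above from an existing history file with xarray; the variable list is taken from the listing and the file names are placeholders.

import xarray as xr

static = ['TLON', 'TLAT', 'ULON', 'ULAT', 'NCAT', 'tmask', 'blkmask',
          'tarea', 'uarea', 'dxt', 'dyt', 'dxu', 'dyu', 'HTN', 'HTE',
          'ANGLE', 'ANGLET']

ds = xr.open_dataset('iceh.2018-08-15.nc')
# drop whichever of the static fields are present and write a trimmed copy
ds.drop_vars(static, errors='ignore').to_netcdf('iceh.2018-08-15.trimmed.nc')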

WOMBAT requires 10m winds to be passed in OM mode

This is also needed if the new Langmuir mixing parameterisation is to be used in the OM version. In fully coupled mode the winds are passed, but not when running in OM mode.

I'd rather not hard-code this, as it would break existing configs. Using CPP preprocessing (it's in the MOM code as ACCESS_WND) is nasty, but I'll probably have to do it for the moment. I'd rather this were done on the fly via a flag that gets read in somewhere, using the OASIS error codes to test whether the field should be passed or not.

NetCDF large file option is ignored for history files

CICE has a namelist option to turn on netCDF large file support. This flag is only applied to restarts, not to diagnostic output.

We have been getting this error when writing output:

ice: Error in nf90_enddef: NetCDF: One or more variable sizes violate format constraints

This error usually happens when the file size is too big for the netCDF file format in use. We hope that enabling large file support for the diagnostic output will make this go away.
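
A quick way to check which format the existing history files were actually written in (a sketch; the file name is a placeholder):

import netCDF4

ds = netCDF4.Dataset('iceh.2018-08-15.nc')
# e.g. 'NETCDF3_CLASSIC', 'NETCDF3_64BIT_OFFSET' (large file support) or 'NETCDF4'
print(ds.data_model)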

All non-master error output going to a single file and being overwritten

We seem to be losing error messages in CICE. At the moment we're guessing that it's because we have all (non-master) error output going to a single file. The error messages coming from one PE are being overwritten by info/debug output coming from others.

This issue should fix the code so that error messages are sent to stderr.

An alternative is to have an individual output file for each PE; however, this has downsides, such as producing too many files and making it difficult to find the one with the relevant error message.

Crash in thickness_changes (ice_therm_vertical.f90)

It turns out the crash I thought was in MATM (COSIMA/matm#4) is actually in CICE.

This is not a CICE issue, but I thought it important to document in case someone else has the same problem.

To recap, this is the ACCESS-OM-1deg JRA55 RYF config, but with a new (KDS50) vertical level scheme. I have interpolated the initial conditions but as far as I know nothing else depends on the ocean vertical grid.

The crash is a divide by zero; the initial traceback has no information:

Image              PC                Routine            Line        Source             
cice_auscom_360x3  000000000092C391  Unknown               Unknown  Unknown
cice_auscom_360x3  000000000092A4CB  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000008D9274  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000008D9086  Unknown               Unknown  Unknown
cice_auscom_360x3  0000000000857A49  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000008623F9  Unknown               Unknown  Unknown
libpthread-2.12.s  00002B9E5C3CE7E0  Unknown               Unknown  Unknown
cice_auscom_360x3  0000000000624132  Unknown               Unknown  Unknown
cice_auscom_360x3  0000000000621D5C  Unknown               Unknown  Unknown
cice_auscom_360x3  00000000005F92C6  Unknown               Unknown  Unknown
cice_auscom_360x3  000000000040E75C  Unknown               Unknown  Unknown
cice_auscom_360x3  000000000040C47D  Unknown               Unknown  Unknown
cice_auscom_360x3  000000000040C41E  Unknown               Unknown  Unknown
libc-2.12.so       00002B9E5C5FAD1D  __libc_start_main     Unknown  Unknown
cice_auscom_360x3  000000000040C329  Unknown               Unknown  Unknown

even though I recompiled CICE with -g. If I load the core dump with gdb, I get this info:

#4  ice_therm_vertical::thickness_changes (nx_block=Cannot access memory at address 0x1
) at ice_therm_vertical.f90:1556
#5  0x0000000000621d5c in ice_therm_vertical::thermo_vertical (nx_block=Cannot access memory at address 0x1
) at ice_therm_vertical.f90:421
#6  0x00000000005f92c6 in ice_step_mod::step_therm1 (dt=Cannot access memory at address 0x1
) at ice_step_mod.f90:481
#7  0x000000000040e75c in ice_step () at CICE_RunMod.f90:323
#8  cice_runmod::cice_run () at CICE_RunMod.f90:180
#9  0x000000000040c47d in icemodel () at CICE.f90:57
#10 0x000000000040c41e in main ()
#11 0x00002b9e5c5fad1d in __libc_start_main () from /lib64/libc.so.6
#12 0x000000000040c329 in _start ()
(gdb) where
#0  0x00002b9e5c60e495 in raise () from /lib64/libc.so.6
#1  0x00002b9e5c60fc75 in abort () from /lib64/libc.so.6
#2  0x0000000000861d4c in for__signal_handler ()
#3  <signal handler called>
#4  ice_therm_vertical::thickness_changes (nx_block=Cannot access memory at address 0x1
) at ice_therm_vertical.f90:1556
#5  0x0000000000621d5c in ice_therm_vertical::thermo_vertical (nx_block=Cannot access memory at address 0x1
) at ice_therm_vertical.f90:421
#6  0x00000000005f92c6 in ice_step_mod::step_therm1 (dt=Cannot access memory at address 0x1
) at ice_step_mod.f90:481
#7  0x000000000040e75c in ice_step () at CICE_RunMod.f90:323
#8  cice_runmod::cice_run () at CICE_RunMod.f90:180
#9  0x000000000040c47d in icemodel () at CICE.f90:57
#10 0x000000000040c41e in main ()
#11 0x00002b9e5c5fad1d in __libc_start_main () from /lib64/libc.so.6
#12 0x000000000040c329 in _start ()
(gdb) bt full                                                                                                                                  
#0  0x00002b9e5c60e495 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00002b9e5c60fc75 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000000000861d4c in for__signal_handler ()
No symbol table info available.
#3  <signal handler called>
No symbol table info available.
#4  ice_therm_vertical::thickness_changes (nx_block=Cannot access memory at address 0x1
) at ice_therm_vertical.f90:1556
        phi_i_mushy = 0.84999999999999998
        qbot0 = 0
        qbotp = 0
        qbotm = 0
        hstot = 0
        wk1 = 0
        qbot = 0
        ts = 0
        ti = 0
        tmlts = 0
        ij = 30936576
        j = 21206080
        i = 33728
        dzi = (( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...) ...)
#5  0x0000000000621d5c in ice_therm_vertical::thermo_vertical (nx_block=Cannot access memory at address 0x1
) at ice_therm_vertical.f90:421
        my_task = 7
        dhi = 0
        ij = 30936576
        fadvocn = (( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...) ...)
        iage = (( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...) ...)

The line number is probably not reliable (the -O2 flag was still on), so I think the crash is here:

https://github.com/OceansAus/cice5/blob/49b36d4bfb97328e818d428d0b8438144dbd69a1/source/ice_therm_vertical.F90#L1554

as qbotp = 0.

I'm guessing there is some issue with the ice initial conditions, but I don't know why changing the ocean vertical grid would affect the ice. Any ideas?

sectrobin block distribution scheme for CICE

We have been using roundrobin to distribute CICE blocks amongst ranks; however, 'sectrobin' looks more suitable because it may decrease communication overhead, allowing us to put more blocks per rank and hence get better load balancing.

It is also just nice to have in terms of flexibility.

Update CICE halos after all coupling calls and before time step

Presently, ice field halos are updated immediately after receiving from the atmosphere/ocean. This is bad because it means halo updates occur between the coupling calls that send to and receive from the ocean, so the ocean ends up waiting on ice halo updates, which can be slow.

This issue moves all halo updates to a single place directly after the coupling but before the time step.

chio namelist value ignored

The value of chio is only used once, to calculate cpchr here:

cpchr = -cp_ocn*rhow*chio

but then cpchr is redefined here before it is used for anything:
if (trim(fbot_xfer_type) == 'Cdn_ocn') then
   ! Note: Cdn_ocn has already been used for calculating ustar
   ! (formdrag only) --- David Schroeder (CPOM)
   cpchr = -cp_ocn*rhow*Cdn_ocn(i,j)
else ! fbot_xfer_type == 'constant'
   ! 0.006 = unitless param for basal heat flx ala McPhee and Maykut
   cpchr = -cp_ocn*rhow*0.006_dbl_kind
endif

fbot_xfer_type = 'constant' is the default, which effectively hard-codes chio=0.006, ignoring whatever was set in the namelist file.
This bug seems to have been introduced in the upgrade to CICE 5.1.2 in 2015.
Thanks to Paul Sandery for pointing out the insensitivity to this parameter.

Broken timer

@russfiedler said:

Looks like one of the timers in CICE got broken in the layout update. It's the one that measures the time CICE is waiting for the ocean.
Timer 18: waiting_o 10817.66 seconds
Timer stats (node): min = 10817.63 seconds
                    max = 10817.66 seconds
                    mean = 10817.64 seconds
