The mc_kernel from seismology

ADD LABEL TO NETCDF FILE: how to..

Hi,
I notice that individual misfit kernels are labelled in the xdmf file but NOT in the netcdf file. Can @sstaehler implement this? It would be great to be able to keep track of each kernel, labelled, via netcdf.

Thanks,
ET

improve test coverage

Obviously, some parts of the code cannot be tested by unit tests, but nevertheless, the current coverage is problematic.

Everybody feel free to add tests.

Kernel times 3D model should give dT measurements from SpecFEM

Should be done after we are sure that the kernels are similar to Dahlen kernels

Better control on memory consumption

@auerl and @sstaehler reported that usage on HPC machines is difficult due to the limited memory per node (~1GB) on many systems. It would be necessary to find the main memory users and give the user better control over it.

Was already mentioned in #32. b88be9b reduced memory consumption drastically by not keeping the whole mesh in memory, but there are still issues.

Idea: Let the user set a maximum amount of available memory in the inparam file and calculate buffer sizes from that.

stf deconvolution

because when the srf reader was removed, there was no stf reader anymore. should be reactivated after implementing the new stf reader,

Unnecessary features to remove

INT_OVER_BACKGROUND_MODEL

Test Fails

https://github.com/seismology/mc_kernel/blob/master/src/test_readfields.f90#L170

The file inparam_test_merged appears to be missing - does it need to be added?

time windows?

seem to be broken for kernel, but not for seismograms

Kernel values should be independent of scalar moment

Just adding this issue to be able to document the results.

Kernel values should be independent of source/receiver location

with same distance, of course

Normalization term initialized temperamentally

https://github.com/seismology/mc_kernel/blob/master/src/kernel.f90#L435-L460

I noticed when compiling mc_kernel on my local machine, the following warning for kernel.f90:

'Extension: Unary operator following arithmetic operator (use parentheses) at (1)
Warning: ‘normalization_term’ may be used uninitialized in this function [-Wmaybe-uninitialized]'.

By placing brackets around each of the four -1.d0 terms in the above lines (e.g. calc_misfit_kernel(itrace) = this%integrate_parseval( timeseries_cut, this%seis_velo_cut_fd) &
* this%normalization * (-1.d0)) , this warning goes away.

Code cannot be compiled with Intel due to issues in inv_mesh.f90

When compiling with the Intel compiler collection the compilation fails at line 1909 of inv_mesh.f90 with the following error message:

inv_mesh.f90(1909): error #6632: Keyword arguments are invalid without an explicit interface. [COMMAND]
call execute_command_line(command = sys_cmd, wait = .true., exitstat = exitstat, &
----------------------------^
[...]

Works, when commenting out this part of the code.

Save intermediate results

Due to the unpredictable runtime of the code, it would be good to store intermediate results. Would allow to restart calculation at the point of failure.

Easy to implement: in module master_queue.

Should use netcdf, therefore some preparations necessary, like put_var routines.

Vsh, Vpv, ...

Some minus signs seem to be wrong in computations of physical kernels (refer to Fichtner's book, p.169 eq 9.21):

mc_kernel/src/kernel.f90

Lines 749 to 753 in 9d2981d

    
           physical_kernel(it, :) = 2.d0 * bg_model%c_rho * bg_model%c_vsh**2 * & 
        
                (  2 * base_kernel(it, :, 1)                        &  ! Lambda 
        
                 +     base_kernel(it, :, 2)                        &  ! Mu 
        
                 +     base_kernel(it, :, 5)                        &  ! B 
        
                 + 2 * base_kernel(it, :, 6) * (1-bg_model%c_eta))     ! C

(reported by Alex)

Definition K_eta

Hi,

I was looking into the methodology of MC_Kernel, and comparing that to the equations given in Fichtner 2011. What I noticed is that kernel K_eta is slightly different defined in MC_Kernel than in Fichtner 2011:
MC_Kernel: 2*\rho ( v_{ph}^2 - v_{sh}^2 ) K_c
Fichtner: \rho ( v_{ph}^2 - 2*v_{sh}^2 ) K_c
Is this a bug or is there a reason this kernel is differently defined?

Cheers,

Janneke

Random NetCDF crashes for very long runs

Not much more to say. The code sometimes crashes with obviously wrong NetCDF error messages ('Parallel access failed', where we're not using any parallel access) on the ETH machine. My suspicion is the NetCDF library itself.

This issue is just for collecting information.

ABAQUS vs ASCII

In case of tetrahedron mesh, it seems that ABAQUS mesh works as it is expected while the same mesh in ASCII format gives no results (0? everywhere).
Examples for such inconsistency are provided in unit_tests.

Allocatable array bgmodel_parameter_names not allocated properly

The allocatable array bgmodel_parameter_names doesn't seem allocated properly before it is used in type_parameter.f90 -> read_parameters. The code compiles fine with gcc and intel, but with intel and the "fast" optimization flags, this seems to cause seg-faults later on. When compiling with intel in debugging mode, the code gives a runtime error:

forrtl: severe (408): fort: (8): Attempt to fetch from allocatable variable BGMODEL_PARAMETER_NAMES when it is not allocated

Segfault in kdtree

With gfortran 4.6.3 (Ubuntu 12.04LTS), I get a segfault in build_kdtree in readfields.f90, around line 963. No idea where it comes from exactly. Not traceable. Valgrind hints at a buffer overflow, but not reliable either.

Code slows down when model parameter is not vp

Especially when comparing vp and lam, which should basically be the same, a run with vp kernels only is several times faster than one with lam. No idea, why

Are the wavefields really strain?

Okay, the plot_wavefield mode finally works (will pull into master in a sec)

so, here is a rather ugly plot of the fw and the bw first strain component (tt-component).

To me, it looks like strain (two-sided pulse). @auerl @kasra-hosseini: What do you think?

Memory requirement for runs with high frequencies

A problem I encountered on SuperMUC for runs with periods of 2s (testing kernel convergence) is the memory requirement.
Supermuc offers 1.6GB per node, which is a usual value, I guess. Since the AxiSEM mesh of a 2s run has 3,8E7 GLL points, this means that storing the 12 mesh parameters for each GLL point alone eats up 1.7 GB or all of the memory.
Now it is not really necessary to waste that much memory on saving the model parameters. They only depend on depth and are set by some 100 parameters (the polynomial coefficients for all layers and all parameters in PREM for example).

So, what is the best solution for this?

A) Copy background_models.f90 from the Solver. I would rather kill myself than doing this.
B) Load the mesh, create a linear interpolation object in terms of depth and save only this object. This seems the cleaner solution to me, however, it was a pain to implement in the mesher.

Any suggestions?

S kernel bugfix

@EoghanTotten reported the following fixes: (@harriet-godwin and @sstaehler, would you please confirm these changes?)

mc_kernel/src/readfields.f90

Lines 1294 to 1296 in 9c534a9

    
           load_fw_points(:,:,ipoint) = rotate_symm_tensor_voigt_src_to_xyz( & 
        
                                          load_fw_points(:,:,ipoint),        & 
        
                                          source_params%lon, this%ndumps    )

should be changed to:

load_fw_points(:,:,ipoint) = rotate_symm_tensor_voigt_src_to_xyz( &
                                          load_fw_points(:,:,ipoint),        &
                                          rotmesh_phi(ipoint), this%ndumps    )

and

mc_kernel/src/readfields.f90

Lines 1456 to 1457 in 9c534a9

    
           load_bw_points(:,:,ipoint) = rotate_symm_tensor_voigt_src_to_xyz(load_bw_points(:,:,ipoint), & 
        
                                          receiver%lon, this%ndumps)

to:

load_bw_points(:,:,ipoint) = rotate_symm_tensor_voigt_src_to_xyz(load_bw_points(:,:,ipoint), &
                                          rotmesh_phi(ipoint), this%ndumps)

Ifort problem with automatic allocation

According to information from the LRZ helpdesk, the ifort option
-assume realloc_lhs

must be considered buggy, at least up to the current stable version 15. This would explain a number of strange untraceable malloc-segfaults I encountered on SuperMUC. If it is indeed the case, it would require changing a large number of lines, where left-hand-side allocation is used to be able to use it on ifort-based machines like SuperMUC.

Results are influenced by compiler/library versions or machine

Upon Simons request I created a new issue concerning the observation that there are large differences to the kernel values, when the code is compiled with different compiler/netcdf (?) versions.

See the figures below. No idea what this could be. The compiler is gfortran in both cases (4.8.1 in lugano and 4.8.2 in zurich), wavefields are the same, version of the code is the same, inparam file is the same, netcdf version is 4.3.1 in lugano and 4.1.3 in zurich.

Maybe an issue related to using different (incompatible) versions of netcdf to write / read the netcdf wavefields. When running the code in Zurich also the projected background model velocities are 6 (!) orders too large. From what I see, the kernel values (if not multiplied with the volume of the grid cells) does not seem to be influenced by the chosen mesh), which is good:

@ALL: I think, what would be extremely helpful would be to have a running version of the Princeton's group (Dahlen) kernel code on our machine in Zurich, to which everybody has access. Would allow us to output values at various locations for similar kernels, and help to get a feeling for how large they need to be. Also, it would be good to see how exactly the test of multiplying kernels with the background model to get absolute traveltimes is implemented.

@kasra-hosseini: Could you install such a test-version for us? I think you are the only one of us who knows the code.

block IDs for multi domain abaqus files

to allow filtering in paraview after generating the xdmf files

k_a in kernel.f90

Alex spotted the following mistake:

mc_kernel/src/kernel.f90

Lines 556 to 558 in 9d2981d

    
           case(4) ! k_a 
        
              conv_field_fd = ( fw_field_fd(:,2,:) + bw_field_fd(:,1,:) ) * & 
        
                              ( fw_field_fd(:,2,:) + bw_field_fd(:,1,:) )

Compared to Fichtner's book, p.168 eq 9.20d, it should be:

conv_field_fd = ( bw_field_fd(:,2,:) + bw_field_fd(:,1,:) ) * &
                          ( fw_field_fd(:,2,:) + fw_field_fd(:,1,:) )

Does this make sense?

Kdtree algorithm crashes when same heterogeneity points are defined repeatedly

Not a big issue, but one may run into this again, when reading .rtpv formatted (spherical coordinates) files where points at the north or south poles are redundant in cartesian coordinates.

Gabor filter 'not vanishing fast enough'

Hi,
Getting following error below. Only modification of my inparams was to change the kernel SPP from 4 to 8 for my AxiSEM wavefield (parameters otherwise unchanged!).

ERROR: Filter Gabor_20 is not vanishing fast enough for
high frequencies.
Numerical noise from frequencies above mesh limit will be propagated
into the kernels. Check the files
filterresponse*
filterresponse_stf_*
Relative value of forward TF at half mesh period : 0.35443443457184026
Relative value of backward TF at half mesh period : 0.40594410662365144
Cutoff period: 3.5677747662280885

Modifying the sigmaIFC from default 0.5 does not help, whether raised or lowered, nor using higher or lower frequency filters. Any ideas? @sstaehler @kasra-hosseini @harriet-godwin

Writing to intermediate files is really slow

Problem occurs mainly on HPC systems and for runs with very large meshes.

Call of sem_data%reopen_files(...) in slave.f90 causes error resulting in incorrect kernel units

I am not sure what exactly happens but traveltime predictions based on multiplying kernels with a 3d model fail (i.e. give predictions which are an mutliple orders of magnitude off) when the call to the routine sem_data%reopen_files(...) is present. A workaround is to just uncomment the corresponding line in slave.f90

Setting inparams correctly for running on HPC

Hello,

I just wanted to check that I am using the optimal inparams when running MC Kernel on many, many processors and using parallel netcdf on High Performance Computing resources e.g MAXIMUM_ITERATIONS, STRAIN_BUFFER_SIZE, INTERMEDIATE_DUMP_TIME (what fraction of the wall time should this be set at or should it be left at 0), ALLOWED_ERROR and ALLOWED_RELATIVE_ERROR.

Any pointers/advice to save wall time/ run time?

Many thanks.

random points in triangle meshes

are sometimes outside the element and not evenly distributed

Kernel values are not the same as Dahlen Kernels

I have taken this part from #22, since it is a new question.

I have calculated the Kernels with (presumably) the same physical parameters as @kasra-hosseini did with the Dahlen method, and they look very similar, alas, they differ by 7 orders of magnitude (left: MC Kernel, right: MC Tony Dahlen).

Also, there are differences in the source region and with a second Fresnel zone above the kernel. On the one hand, this is something we expected, on the other hand, it might affect tomography quite significantly. Exciting times, people!

Branch basekernels_restored crashes when computing v_s kernels on the T or R component

When computing v_s kernels and specifying the transversal ("T") or radial ("R") component as the desired receiver component, the code crashes after "Starting to distribute the work" in readfields.f90
with...

At line 1653 of file readfields.f90
Fortran runtime error: Index '2' of dimension 2 of array '__result.0'
outside of expected range (1:1)
At line 1653 of file readfields.f90
Fortran runtime error: Index '2' of dimension 2 of array '__result.0'
outside of expected range (1:1)

Model times kernel is not travel time.

The most basic test, implemented in master_queue.f90 fails

Model integration does not work

Integration of model parameters gives weird results. Might be mixed up variables somewhere

Kernel values should be independent of period of AxiSEM run

Check kernel calculations on different inversion grids

@sstaehler as we have talked about this, I have created 5 inversion grids that maybe we can start with. Here are short descriptions for each:

bernhard_20_96: it is called bernhard because I originally created for him. Here are some info:

Mantle: 600km (edge length)

in longitude 0 and latitudes 20 to 96 (I do not remember why 96): the edge length is 20km

Suggestion: put event and station on lat, lon: (30, 0) and (-30, 0) to check P kernel, for the others maybe we can check Pdiff kernels.

Example:
event: (30, 0)
station: (-30, 0)
taup for P: 608.28
for 15sec: 5 sec before and 32.7 sec after
window: 603.28 - 640.98

ppdiff_200_660

0<=depth<=900: 200km
900<=depth<=2889: 200 --> 400 (linearly increases to 400)
around stations: 66 km

Suggestion: maybe we can put one station in North america since there are lots of stations

Example:
event: (40, -113)
station: (-80, -113)
taup for Pdiff: 915.52
for 15sec: 5 sec before and 32.7 sec after
window: 910.52 - 948.22

ppdiff_900_300

0<=depth<=900: 200km
900<=depth<=2889: 200 --> 300 (linearly increases to 300)
around stations: 66 km

ppdiff_homo_300

300km homogeneous

staev_200

0<=depth<=2889: 200km
2889<=depth<=4000: 200-->250
around stations: 66 km

all the files (vertices and facets) can be found at:

/home/khosseini/mesh_example (ETH machine)

Best,
KH

submit script available

General idea

It's a little Python program that makes extensive use of the argument parser to make interaction with the input file unnecessary in most daily use cases. The script creates a rundir and copies all necessary files, including the inversion grid files there. It does not copy the AxiSEM wavefields, obviously.

Basic run with default settings

The only necessary option is a job name, which will become the name of the rundir.

python submit.py kerner_run

It will create a job with default settings and 2 slaves using the receiver and filter and stf files in the main directory and AxiSEM wavefields at wavefield/fwd and wavefield/bwd

Basic run with settings from inparam file

That's obvioisly not very useful, if your files are elsewhere, so you can specify these settings in an inparam_basic file somewhere, especially those which do not change (like the wavefield dirs) and invoke it with

python submit.py -i inparam_basic kerner_run

This will use the settings in inparam_basic.

Basic run with settings from inparam file and some specific settings

The new thing is that you can easily override the settings in inparam_basic now, e.g. by

python submit.py -i inparam_basic -rec_file receivers_USARRAY.dat -src_file CMTSOLUTION_VIRGINIA -n 15 kerner_run

which will use the receiver and source files specified above, 15 slaves and otherwise the settings from inparam_basic.

Help

All possible options are available via the help pages

python submit.py -h

Job submission message (description)

Another feature is that you have to add a description of the run. I tried to mimic the functionality of git commit as much as possible. You can give a short description with the -m option

python submit.py testrun -m "Just a test run"

If -m is not added, an EDITOR window opens up during submission, where you can enter a more detailed description of your job. Either way, the description will be written into a file README.run in the rundir.

Things to add:

support for HPC queues (starts a local job with mpirun in the moment)
automatic setting of buffer sizes based on available memory

Please keep me informed about any issues!

Latest version crashes after creating file to store intermediate results (both gcc and intel)

The latest version of the code crashes in master_queue.f90 after "Create file for intermediate results" at the location

call nc_putvar_by_name(ncid = ncid_intermediate, &
varname = 'K_x', &
values = real(K_x, kind=sp) )

With the error message:

ERROR: CPU 0 could not write 2D variable: 'K_x'( 1) in NCID 65536
start ( 1) + count( 7260) is larger than size ( 100)
of dimension K_x_1 (1)

This happens both with triangular meshes, and voxel meshes (I haven't tried tetrahedral ones). Interestingly, the code manages to pass this place, when compiling it with the intel fortran compilers, With intel, however, the code crashes with a

*** Error in `./kerner': double free or corruption (!prev): 0x BB5
forrtl: error (78): process killed (SIGTERM)

after "Write mesh partition and convergence to disk" in master_queue.f90

Does anyone else encounter this issue?

Kernel times homogeneous velocity change should give correct dT

comparing with dT from ray theory or whatever

Speed optimization

There have been complaints about the slow execution of the code, which I share to an extend. Part of the problem is the IO, which we can only get around to a certain extend, especially, since there is a tradeoff with memory consumption, which is a big constrain on HPC environments. Another issue might be that we inhibit further implicit parallelization and vectorization by the compiler with calls to the clock routines (which are not pure) or input parameter checking (which can lead to program exits and is therefore not pure as well).

This might be a specific issue to discuss with HPC experts, even though I am not very certain on how much speed-up we could gain.

division by number of basis functions per element is wrong

mc_kernel/src/master_queue.f90

Line 284 in 9d2981d

valence = 1.d0/real(inv_mesh%nbasisfuncs_per_elem, kind=dp)

This division should be replaced by a multiplication over element volume (Nolet, p.229)

nptperstep or ntraces?: Initialise FFT

mc_kernel/src/slave.f90

Line 143 in 2dcbf02

call fft_data%init(ndumps, sem_data%get_ndim(), nptperstep, sem_data%dt, &

I notice that the third argument when calling the function above is nptperstep. Within fft.f90, however,

https://github.com/seismology/mc_kernel/blob/master/src/fft.f90#L175-L206
suggests this number should be set to ntraces i.e.
"
!< This routines initializes a FFTw plan for 1D DFTs.
!! The time series is supposed to have length ntimes_in, and is stored along the first
!! dimension of a three-dimensional array. The other two dimensions are ndim and ntraces.
!! Internally, this array will always be converted to two-dimensional array of dimensions
!! [ntimes_in, ndim*ntraces]'."

This would make the parameter, ntraces_fft = [ndumps, sem_data%get_ndim()*nptperstep].

Is this okay?

Weights are sometimes <0

Happens in @kasra-hosseini's huge mesh after 24h runtime, so it is a bit difficult to debug. More of a memo to self to look into it.

ERROR: weights is smaller zero! Check whether point is outside of element
    weight: -0.49999999953433383     
    dx    :   40809.243542678654     
    dy    :  -32979.621791672747     
    dz    :  -69273.540243368130

	physical_kernel(it, :) = 2.d0 * bg_model%c_rho * bg_model%c_vsh*2 &
	( 2 * base_kernel(it, :, 1) & ! Lambda
	+ base_kernel(it, :, 2) & ! Mu
	+ base_kernel(it, :, 5) & ! B
	+ 2 * base_kernel(it, :, 6) * (1-bg_model%c_eta)) ! C

	load_fw_points(:,:,ipoint) = rotate_symm_tensor_voigt_src_to_xyz( &
	load_fw_points(:,:,ipoint), &
	source_params%lon, this%ndumps )

	load_bw_points(:,:,ipoint) = rotate_symm_tensor_voigt_src_to_xyz(load_bw_points(:,:,ipoint), &
	receiver%lon, this%ndumps)

	case(4) ! k_a
	conv_field_fd = ( fw_field_fd(:,2,:) + bw_field_fd(:,1,:) ) * &
	( fw_field_fd(:,2,:) + bw_field_fd(:,1,:) )

seismology / mc_kernel Goto Github PK

mc_kernel's People

Contributors

Stargazers

Watchers

Forkers

mc_kernel's Issues