
Comments (5)

mccoys commented on June 26, 2024

Hi. You say there was no informative error message, but something appeared, right? Did it not say what kind of error it was?


tj9726 commented on June 26, 2024

Hi. Sorry I missed some of the output files.
The standard output I get is

                    _            _
  ___           _  | |        _  \ \   Version : 4.8-1-g0293ceb2b-master
 / __|  _ __   (_) | |  ___  (_)  | |   
 \__ \ | '  \   _  | | / -_)  _   | |
 |___/ |_|_|_| |_| |_| \___| |_|  | |  
                                 /_/    
 
 

 Reading the simulation parameters
 --------------------------------------------------------------------------------
 HDF5 version 1.12.2
 Python version 3.10.8
	 Parsing pyinit.py
	 Parsing 4.8-1-g0293ceb2b-master
	 Parsing pyprofiles.py
	 Parsing test.py
	 Parsing pycontrol.py
	 Check for function preprocess()
	 python preprocess function does not exist
	 Calling python _smilei_check
	 Calling python _prepare_checkpoint_dir
	 Calling python _keep_python_running() :
 CAREFUL: Patches distribution: hilbertian
 

 Geometry: 2Dcartesian
 --------------------------------------------------------------------------------
	 Interpolation order : 2
	 Maxwell solver : Yee
	 simulation duration = 100.000000,   total number of iterations = 2000
	 timestep = 0.050000 = 0.707107 x CFL,   time resolution = 20.000000
	 Grid length: 40, 40
	 Cell length: 0.1, 0.1, 0
	 Number of cells: 400, 400
	 Spatial resolution: 10, 10
 

 Electromagnetic boundary conditions
 --------------------------------------------------------------------------------
	 xmin periodic
	 xmax periodic
	 ymin periodic
	 ymax periodic
 

 Vectorization: 
 --------------------------------------------------------------------------------
	 Mode: off
	 Calling python writeInfo
 

 Initializing MPI
 --------------------------------------------------------------------------------
	 applied topology for periodic BCs in x-direction
	 applied topology for periodic BCs in y-direction
	 MPI_THREAD_MULTIPLE not enabled
	 Number of MPI processes: 4
	 Number of threads per MPI process : 12
	 OpenMP task parallelization not activated
 
	 Number of patches: 16 x 16
	 Number of cells in one patch: 25 x 25
	 Dynamic load balancing: never
 

 Initializing the restart environment
 --------------------------------------------------------------------------------
 
 
 

 Initializing species
 --------------------------------------------------------------------------------
	 
	 Creating Species #0: electron
		 > Pusher: boris
		 > Boundary conditions: periodic periodic periodic periodic
		 > Density profile: 2D built-in profile `constant` (value: 1.000000)
	 
	 Creating Species #1: positron
		 > Pusher: boris
		 > Boundary conditions: periodic periodic periodic periodic
		 > Density profile: 2D built-in profile `constant` (value: 1.000000)
 

 Initializing Patches
 --------------------------------------------------------------------------------
	 First patch created
		 Approximately 10% of patches created
		 Approximately 20% of patches created
		 Approximately 30% of patches created
		 Approximately 40% of patches created
		 Approximately 50% of patches created
		 Approximately 60% of patches created
		 Approximately 70% of patches created
		 Approximately 80% of patches created
		 Approximately 90% of patches created
	 All patches created
 

 Creating Diagnostics, antennas, and external fields
 --------------------------------------------------------------------------------
 

 finalize MPI
 --------------------------------------------------------------------------------
	 Done creating diagnostics, antennas, and external fields
 

 Minimum memory consumption (does not include all temporary buffers)
 --------------------------------------------------------------------------------
              Particles: Master 976 MB;   Max 976 MB;   Global 3.81 GB
                 Fields: Master 5 MB;   Max 5 MB;   Global 0.0223 GB
            scalars.txt: Master 0 MB;   Max 0 MB;   Global 0 GB
 

 Initial fields setup
 --------------------------------------------------------------------------------
	 Solving Poisson at time t = 0
 

 Initializing E field through Poisson solver
 --------------------------------------------------------------------------------
	 Poisson solver converged at iteration: 0, relative err is ctrl = 0.000000 x 1e-14
	 Poisson equation solved. Maximum err = 0.000000 at i= -1
 Time in Poisson : 0.009117
	 Applying external fields at time t = 0
	 Applying prescribed fields at time t = 0
	 Applying antennas at time t = 0
 

 Open files & initialize diagnostics
 --------------------------------------------------------------------------------
 

 Running diags at time t = 0
 --------------------------------------------------------------------------------
 

 Species creation summary
 --------------------------------------------------------------------------------
		 Species 0 (electron) created with 40960000 particles
		 Species 1 (positron) created with 40960000 particles
 

 Expected disk usage (approximate)
 --------------------------------------------------------------------------------
	 WARNING: disk usage by non-uniform particles maybe strongly underestimated,
	    especially when particles are created at runtime (ionization, pair generation, etc.)
	 
	 Expected disk usage for diagnostics:
		 File scalars.txt: 8.98 K
	 Total disk usage for diagnostics: 8.98 K
	 
 

 Keeping or closing the python runtime environment
 --------------------------------------------------------------------------------
	 Checking for cleanup() function:
	 python cleanup function does not exist
	 Closing Python
 

 Time-Loop started: number of time-steps n_time = 2000
 --------------------------------------------------------------------------------
 CAREFUL: The following `push time` assumes a global number of 48 cores (hyperthreading is unknown)
    timestep       sim time   cpu time [s]   (    diff [s] )   push time [ns]

This is why I thought the initialization had completed but the code crashed immediately after entering the main loop.
The standard outputs are

Stack trace (most recent call last):
#11   Object "smilei", at 0x46d233, in 
#10   Object "/lib64/libc.so.6", at 0x400006844383, in __libc_start_main
#9    Object "smilei", at 0x93805b, in main
#8    Object "/opt/FJSVxtclanga/tcsds-ssl2-latest/lib64/libfjomp.so", at 0x4000041ab073, in __kmpc_fork_call
#7    Object "/opt/FJSVxtclanga/tcsds-ssl2-latest/lib64/libfjomp.so", at 0x4000041b85cb, in __kmp_fork_call
#6    Object "/opt/FJSVxtclanga/tcsds-ssl2-latest/lib64/libfjomp.so", at 0x4000041b7603, in 
#5    Object "/opt/FJSVxtclanga/tcsds-ssl2-latest/lib64/libfjomp.so", at 0x4000042145ff, in __kmp_invoke_microtask
#4    Object "smilei", at 0x93a343, in 
#3    Object "smilei", at 0x7ed4d7, in VectorPatch::dynamics(Params&, SmileiMPI*, SimWindow*, RadiationTables&, MultiphotonBreitWheelerTables&, double, Timers&, int)
#2    Object "smilei", at 0x7ed877, in VectorPatch::dynamicsWithoutTasks(Params&, SmileiMPI*, SimWindow*, RadiationTables&, MultiphotonBreitWheelerTables&, double, Timers&, int)
#1    Object "smilei", at 0x97876b, in Species::dynamics(double, unsigned int, ElectroMagn*, Params&, bool, PartWalls*, Patch*, SmileiMPI*, RadiationTables&, MultiphotonBreitWheelerTables&)
#0    Object "smilei", at 0x91cca4, in PusherBoris::operator()(Particles&, SmileiMPI*, int, int, int, int)
Segmentation fault (Address not mapped to object [(nil)])

for one process,
and just `Stack trace (most recent call last):` for the others (those were the only ones I had seen before).
And the system output is `[WARN] PLE 0610 plexec The process terminated with the signal.(rank=1)(nid=0x03010004)(sig=11)`.
I have tried multiple times and the timing of the crash was always the same.
And again, there were no problems with Fujitsu trad mode or GCC.


xxirii commented on June 26, 2024

Hello,

I worked on adapting Smilei to Fugaku a few years ago. Unfortunately, I can't access the system anymore.

Is compiling in clang mode really important for you, since you already have two working solutions?

As a third alternative, I can suggest compiling with the armclang compiler instead of the Fujitsu compiler; it used to give us decent performance on A64FX. The Fujitsu compiler was behind at the time, but it may have caught up since then.

Unfortunately, I can't do much more, especially without more error output. Perhaps you can ask the system support team.


tj9726 commented on June 26, 2024

Hi,

I was checking performance with different compilers for future large-scale simulations.
I read your paper and thought clang mode may perform better than the trad mode.

Did you need anything special to compile when you tested a few years ago?
If you did not, then newer versions of the compiler could be the reason.

I will proceed with what is available (or ask Fujitsu developers).

Thank you for your advice.


xxirii commented on June 26, 2024

Sorry for my late reply. Please find below the configuration I used for my tests:

Spack Env


Example of configuration for ssh:
```bash
Host fugaku
   Hostname login.fugaku.r-ccs.riken.jp
   ForwardX11 yes
   ForwardAgent yes
   User <your login>
   ServerAliveInterval 60
   Compression yes
```

II. Environment

The login nodes use Intel processors, so you should either cross-compile or compile directly on a compute node.
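
If you want to compile on a compute node, you can request an interactive job first. A minimal sketch, assuming your project has access to an interactive resource group (shown here as `int`; names and limits may differ on your account):

```bash
# Hedged example: request one compute node interactively for compilation.
# The resource group "int" and the one-hour limit are assumptions; adjust to your account.
pjsub --interact -L "node=1" -L "rscgrp=int" -L "elapse=01:00:00" -x PJM_LLIO_GFSCACHE=/vol0004
```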

On compute nodes

The Fugaku supercomputer relies on Spack for the most advanced libraries and tools.
You first have to source the Spack environment:

```bash
. /vol0004/apps/oss/spack/share/spack/setup-env.sh
```

You can check available libraries by doing:

```bash
spack find -xl <lib name>
```

Regularly check for the latest library versions, because Spack is updated often.
For instance:

```bash
spack find -xl python
spack find -xl hdf5
```

Then, load the required packages and set the environment variables for Smilei:

```bash
. /vol0004/apps/oss/spack/share/spack/setup-env.sh
# Python
spack load /7sz6cn4
# Numpy
spack load /q6rre3p
# HDF5
spack load /l53s4lp

export SMILEICXX=mpiFCCpx
export HDF5_ROOT=/vol0004/apps/oss/spack-v0.16.2/opt/spack/linux-rhel8-a64fx/fj-4.6.1/hdf5-1.10.7-hza6f4rwqjon62z4q7a6vavtrkafvz35/
```

Use `which h5c++` to get the path to the HDF5 library you are using.
Save this configuration in a file that you can source in your job scripts.
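
For reference, here is a minimal sketch of what such a file could look like, assuming it is saved as `~/env/smilei_env` (the name sourced by the job scripts below). The Spack hashes are the ones listed above and will likely differ on your system:

```bash
# ~/env/smilei_env -- environment sourced by the job scripts below (hedged sketch)
# Spack environment
. /vol0004/apps/oss/spack/share/spack/setup-env.sh
# Python, Numpy and HDF5 (hashes from the example above; check yours with `spack find -xl`)
spack load /7sz6cn4
spack load /q6rre3p
spack load /l53s4lp
# Compiler wrapper and HDF5 location used by the Smilei makefile
export SMILEICXX=mpiFCCpx
export HDF5_ROOT=$(dirname $(dirname $(which h5c++)))   # follows the `which h5c++` tip above
```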

III. Compilation

Trad mode

Trad mode uses the traditional Fujitsu flags to compile the code.
In this case, we use the machine file `fugaku_fujitsu_tm`.

```bash
#!/bin/sh -x
#PJM -N  "smilei"
#PJM -L  "node=1"                          # Assign 1 node
#PJM -L  "rscgrp=small"                    # Specify resource group
#PJM -L  "elapse=00:30:00"                 # Elapsed time limit 30 minutes
#PJM -x PJM_LLIO_GFSCACHE=/vol0004
#PJM -s

source ~/env/smilei_env

mpiFCC -show

make -j 48 config="verbose" machine="fugaku_fujitsu_tm"
```

See this page for more information: https://www.fugaku.r-ccs.riken.jp/doc_root/en/user_guides/lang_latest/FujitsuCompiler/C%2B%2B/tradmode.html

Clang mode

In clang mode, the Fujitsu compiler uses the same flags as the Clang compiler.
The flag `-Nclang` has to be provided.
In this case, we use the machine file `fugaku_fujitsu_cm`.

```bash
#!/bin/sh -x
#PJM -N  "smilei"
#PJM -L  "node=1"                          # Assign 1 node
#PJM -L  "rscgrp=small"                    # Specify resource group
#PJM -L  "elapse=00:30:00"                 # Elapsed time limit 30 minutes
#PJM -x PJM_LLIO_GFSCACHE=/vol0004
#PJM -s

source ~/env/smilei_env

mpiFCC -show

make -j 48 config="verbose" machine="fugaku_fujitsu_cm"
```

See this page for more information: https://www.fugaku.r-ccs.riken.jp/doc_root/en/user_guides/lang_latest/FujitsuCompiler/C%2B%2B/clangmode.html
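
If clang mode misbehaves (as in the crash reported above), it can be worth checking first that it compiles and runs a trivial program at all. A minimal sketch to run inside a compute-node job like the ones above; the file name and optimization level are only illustrative:

```bash
# Hedged sanity check for Fujitsu clang mode (-Nclang), run on a compute node.
cat > hello.cpp << 'EOF'
#include <iostream>
int main() { std::cout << "clang mode OK" << std::endl; return 0; }
EOF
mpiFCC -Nclang -O2 -o hello hello.cpp   # same wrapper as above, switched to clang mode
./hello
```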

IV. Execution

Single node execution

```bash
#!/bin/bash
#PJM -L "node=1"                  # 1 node
#PJM -L "rscgrp=small"            # Specify resource group
#PJM -L "elapse=10:00"
#PJM --mpi "max-proc-per-node=4"  # Upper limit on the number of MPI processes created on one node
#PJM -x PJM_LLIO_GFSCACHE=/vol0004
#PJM -s

source ~/env/smilei_env

export PLE_MPI_STD_EMPTYFILE=off # Do not create a file if there is no output to stdout/stderr.
export OMP_NUM_THREADS=12
export OMP_SCHEDULE="static"

rm *.out.*
rm *.err.*

cp ~/smilei/develop-mat/smilei .
cp ../template.py input.py

# execute job
mpiexec -n 4 ./smilei input.py    # Execute with the maximum number of available processes
```
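
For larger runs, the same script extends to several nodes. A hedged sketch (node count, rank count and thread count are illustrative; as above, 4 MPI processes x 12 OpenMP threads fill the 48 cores of one A64FX node):

```bash
#!/bin/bash
#PJM -L "node=4"                  # 4 nodes (illustrative)
#PJM -L "rscgrp=small"            # Specify resource group
#PJM -L "elapse=10:00"
#PJM --mpi "max-proc-per-node=4"  # 4 MPI processes per node
#PJM -x PJM_LLIO_GFSCACHE=/vol0004
#PJM -s

source ~/env/smilei_env

export PLE_MPI_STD_EMPTYFILE=off
export OMP_NUM_THREADS=12
export OMP_SCHEDULE="static"

# 4 nodes x 4 processes per node = 16 MPI processes in total
mpiexec -n 16 ./smilei input.py
```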
