Comments (2)
I will get to that soon. I'll recap what I understand of the issue here for when I get to it.
I do not think `slurmworkflow` will need an update. Its bare-bones design makes it more resilient to these changes. The scenarios step templates in `EpiModelHPC` are still working fine on all HPCs, so there are no issues there for now either.
Next steps:
- Check if the issue is in the allocation of resources by Slurm (`ntasks` vs `cpus-per-task`). That would be weird, as MOX always allocates full nodes.
- Check if the issue is in the discovery of the available CPUs by R. Check `pull_env_vars` and the state of the `SLURM_CPUS_PER_TASK` env var.
Reminder: the calls produced only 1 sim rep per array job instead of 10. Even with a wrong number of CPUs, the 10 should be run sequentially. (Hint: `nsims` vs `ncores` in `netsim`.)
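The two checks above could be sketched as a small diagnostic to run inside the job script, before R starts. This is only a sketch: the function name `report_slurm_alloc` is hypothetical, and the assumption that `NSIMS` is split evenly across `NJOBS` (rounded up) is inferred from the reminder above (30 sims over 3 array jobs should give 10 each), not taken from the EpiModelHPC source.

```shell
#!/bin/bash
# Hypothetical helper: print what Slurm allocated and how many sims each
# array job is expected to run. Not part of EpiModelHPC or slurmworkflow.
report_slurm_alloc() {
  # NJOBS and NSIMS come from the --export list of the sbatch call.
  local njobs="${NJOBS:-1}"
  local nsims="${NSIMS:-1}"
  # These are set by Slurm inside a job; "unset" flags a missing value.
  local cpus="${SLURM_CPUS_PER_TASK:-unset}"
  local ntasks="${SLURM_NTASKS:-unset}"
  # Assumption: sims are divided evenly across array jobs, rounding up.
  local per_job=$(( (nsims + njobs - 1) / njobs ))
  echo "SLURM_CPUS_PER_TASK=${cpus}"
  echo "SLURM_NTASKS=${ntasks}"
  echo "sims_per_job=${per_job}"
}
```

Calling `report_slurm_alloc` in `runsim.9999.sh` just before `Rscript sim.9999.R` would show in the `.out` file whether `SLURM_CPUS_PER_TASK` is actually set to 10, and whether the expected 10 sims per job match what `netsim` ends up running.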
# master.9999.sh
sbatch -p ckpt -A csde-ckpt --array=1-3 --nodes=1 --cpus-per-task=10 --time=10:00:00 --mem=100G --job-name=s9999 --export=ALL,SIMNO=9999,NJOBS=3,NSIMS=30 runsim.9999.sh
# runsim.9999.sh
#!/bin/bash
#SBATCH -o ./out/%x_%a.out
source ~/loadR.sh
Rscript sim.9999.R
# sim.9999.R
library("methods")
library("EpiModelHIV")
library("EpiModelHPC")
pull_env_vars()
# Params
epistats_list <- readRDS("est/epistats_list.rds")
epistats <- epistats_list[[1]]
netstats_list <- readRDS("est/netstats_list.rds")
netstats <- netstats_list[[1]]
# Epidemic model
time.unit <- 7
method <- 1
param <- param_msm(
  netstats = netstats,
  epistats = epistats,
  pr.postprep.condom.nonrebounder = 0.25,
  dur.postprep.condom.nonrebounder = 52,
  cai.cutoff.times = (600 + (1:2)) * 52
)
init <- init_msm()
control <- control_msm(simno = fsimno,
                       nsteps = (600 * 52) + 5,
                       nsims = ncores,
                       ncores = ncores,
                       start = (600 * 52) + 1,
                       initialize.FUN = reinit_msm,
                       verbose = TRUE,
                       tergmLite = TRUE)
netsim_hpc("data/sim.n1206_bcb.rda", param, init, control, compress = FALSE,
           verbose = TRUE, save.min = FALSE, save.max = TRUE)
from epimodelhpc.
It seems that this is not a bug but a case of breaking changes between v2.1.0 and v2.2.0.
Closing unless new information arises.