craig44 / cabm

CABM (C++ Agent Based Model)

fisheries ibm

cabm's Introduction

C++ Agent Based Model (CABM)

This repository contains a generalised agent-based model (ABM) that I developed during my PhD. CABM was primarily developed as an operating model to simulation-test ecological and fishery models. CABM tracks agents, each of which can represent many individuals with identical attributes, over a discretised, spatially explicit, user-defined domain. CABM allows users to specify different life histories, agent dynamics and interactions with fisheries. The current repository is a mixture of ideas that need to be acknowledged. Firstly, the Casal2 team found here, whose work formed the basis of the core code for error handling, parameter structure, configuration syntax and more. I also learnt about ABMs via the model found here, from which some ideas have been brought across. And finally, the SPM team found here, whose work inspired most of the spatial code structure.

This is an open source project, and if anyone is interested in it please get in touch; the more input the better. I am interested in adding features that are being used in recent applications of ABMs, such as energetic functionality and habitat-based movement via currents and active gradient searches, like in this paper. I would also like to make CABM a full life-cycle model so that we can do end-to-end modelling.

ABMs are notorious for being limited by CPU, so a big emphasis is on making CABM as modular and thread-safe as possible. It is currently coded to be used on a desktop (as opposed to HPC), as I believe that is where most users will run it.

CABM has a forced spatial structure: users must define at least one spatial area. I have gone down this road mainly because I am interested in spatial characteristics and I believe they are fundamental to ABMs. This means that for simple spatial models (single-area models) the model/input files could be a bit lousy, but once you have models with high spatial resolution and spatial processes such as fishing, the spatial memory management will bring large efficiency gains.
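
To make the idea of agents over a spatial grid concrete, here is a minimal sketch. It is not CABM's actual class layout: the names Agent, World, scalar and total_individuals are assumptions made up for this illustration. It only shows one agent record standing in for many individuals, and every agent living in exactly one cell of a user-defined grid.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical agent: one record stands in for `scalar` identical individuals.
struct Agent {
  unsigned age;
  double length;
  double scalar;  // number of real individuals this agent represents
};

// Hypothetical spatial domain: a row-major grid of cells, each owning its agents.
class World {
public:
  World(std::size_t rows, std::size_t cols)
      : rows_(rows), cols_(cols), cells_(rows * cols) {
    assert(rows > 0 && cols > 0);  // at least one spatial area must exist
  }

  std::vector<Agent>& cell(std::size_t row, std::size_t col) {
    assert(row < rows_ && col < cols_);
    return cells_[row * cols_ + col];
  }

  // Individuals represented across the whole domain (sum of agent scalars).
  double total_individuals() const {
    double total = 0.0;
    for (const auto& agents : cells_)
      for (const auto& agent : agents) total += agent.scalar;
    return total;
  }

private:
  std::size_t rows_, cols_;
  std::vector<std::vector<Agent>> cells_;
};
```

A single-area model in this sketch is simply World(1, 1), which is why the spatial structure costs little for simple models but pays off when processes are applied cell by cell.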

Current Status of Repo

Currently users can have any spatial resolution they desire. I have an example model, found in this directory, that is a 5 x 5 spatial model with a single fishery. During my PhD I configured a 20 x 50 spatial model to mimic hoki dynamics. CABM can simulate a range of age and length compositional data and biomass indices. The parallelization is not working as well as I had hoped, so if you run the model use max_threads 1.

Compiling the code

If you need CABM to run a management strategy evaluation (MSE) you will need to compile it. Otherwise you should be able to get the latest exe from the release section of the GitHub repo. When compiling CABM for MSE mode there is a dependency on R, which must be in your system path. To check this, open a terminal and type R. As well as R being in the path, compilation requires the following R packages to be installed:

install.packages(c("RInside","Rcpp"));

Once those dependencies have been addressed, open a terminal in the BuildSystem directory. To check that you have the right dependencies on your system, run the following command:

doBuild.bat check

If you have Windows 10 and you are building from PowerShell you may have to put .\ in front of the command. On Linux, run

./doBuild.sh check

This will look for g++, git, R, the g++ version and maybe some others.

If the check worked okay, run the following command to build the third-party libraries bundled in the repository (Python, CMake and Boost):

doBuild.bat thirdparty

Once that is done you should (in theory) be able to compile the code and build an executable using the following command:

doBuild.bat release

This will put the executable in the directory BuildSystem\bin\'system'\release, where 'system' is windows or linux depending on your OS. An alternative is to build it in an IDE such as Eclipse.

For more information on build parameters look at the python scripts or run the command

doBuild.bat help

Tips for compiling on windows

I use the cross gcc compiler found here. When you install it, make sure that the OpenMP and (I think) Fortran tool sets are included in the gcc compiler. If you use the compiler linked above, this involves manually selecting extra settings during the install process.

Avoid white space in the absolute path of the repository; in the past we have found issues when white space is present.

Running the Program

There is a partial user manual in the Documentation folder, but here is a quick-fire run-through. CABM is run from the command line (no GUI), using the executable name and run parameters, e.g.

cabm.exe -r --loglevel finest > output.log 2> err.log

This command will run (-r) CABM, which by default looks for a file named config.ibm that defines the specific model tasks. The main output defined in the configuration files will be printed to the file output.log. I have also added an extra run parameter (--loglevel), which prints additional information at run time into the file err.log. There are four log levels: medium, fine, finest and trace. If anyone is interested, you can look in the code base for statements such as LOG_FINEST(), which print information if the corresponding level is set at run time.

Godspeed

cabm's Issues

Functionality to add to R-Library

Fix R-library extract files to deal with multi-line comments and !include statements.

Equal detection probability of recapturing tagged fish

Jeremy has R code that solves the issue with agent scalars, so that tag-recapture probabilities generate unbiased observations.

ADD: plug in MSE

Is your feature request related to a problem? Please describe.
I want the ABM to run an iteration simulating data that are compatible with a particular generalised stock assessment package (SS, CASAL, MULTIFAN-CL).

Describe the solution you'd like
The things that will need to be done:

Create new reports that output simulated data compatible with a generalized package (SS, MULTIFAN-CL).
Create harvest control rules that take output from a generalized package to set a new catch, which is then applied the next year to simulate data.
Create calls from within the IBM to run estimation in other programs (SS and Casal2).

Describe alternatives you've considered
We need to demonstrate that this IBM is computationally efficient going forward, and maybe get some people on board to help flesh out reports for generalized packages.

Additional context

CHG: underlying partition container from list -> vector

I have hit very poor performance recently, which has forced me to look at handling agents as vectors. Basically there is less memory management: we put a bool on each agent to flag whether it is valid, and we can overwrite invalid agents. This will allow fast random element lookup, which is critical for mortality processes.
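
The following is a minimal sketch of the vector-with-validity-flag idea, not the actual CABM partition. The class name CellPartition and the free_slots_ index list are assumptions added for illustration; the issue itself only mentions the bool flag and overwriting invalid agents.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Agent {
  double length;
  double scalar;
  bool alive;  // false marks a slot that may be overwritten
};

// Hypothetical per-cell container backed by a contiguous vector.
class CellPartition {
public:
  // Reuse a dead slot when one is known, otherwise grow the vector.
  std::size_t add(Agent agent) {
    agent.alive = true;
    if (!free_slots_.empty()) {
      const std::size_t slot = free_slots_.back();
      free_slots_.pop_back();
      agents_[slot] = agent;
      return slot;
    }
    agents_.push_back(agent);
    return agents_.size() - 1;
  }

  // "Removing" an agent only flips the flag; no memory is released.
  void kill(std::size_t slot) {
    assert(slot < agents_.size());
    if (agents_[slot].alive) {
      agents_[slot].alive = false;
      free_slots_.push_back(slot);
    }
  }

  const Agent& at(std::size_t slot) const { return agents_.at(slot); }  // O(1) lookup
  std::size_t slots() const { return agents_.size(); }
  std::size_t alive_count() const { return agents_.size() - free_slots_.size(); }

private:
  std::vector<Agent> agents_;
  std::vector<std::size_t> free_slots_;  // indices of invalid agents (an assumption)
};
```

A mortality process can then pick a random index in [0, slots()), skip the slot if alive is false, and otherwise apply mortality, with no list traversal and no allocation.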

Cache exp() calls.

Is your feature request related to a problem? Please describe.
There can be serious computational savings to be made if we cache exp() calls on the agent class. Currently growth is called n_years * n_time_steps * n_agents times, and more often than not each call evaluates exp() on M and growth parameters. For a model that doesn't time-vary, this could be reduced to just n_agents.
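
As a rough illustration of the saving, the sketch below caches exp(-M) on a hypothetical Agent class and only recomputes it when M changes. The class layout, the exp_calls counter and the survive_steps helper are assumptions for this example and do not mirror the real agent class.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical agent that stores exp(-M) instead of recomputing it each step.
class Agent {
public:
  explicit Agent(double m) { set_m(m); }

  // exp() runs only when M changes, not on every time step.
  void set_m(double m) {
    assert(m >= 0.0);
    m_ = m;
    survival_ = std::exp(-m_);
    ++exp_calls_;
  }

  double survival() const { return survival_; }  // cached exp(-M)
  unsigned exp_calls() const { return exp_calls_; }

private:
  double m_ = 0.0;
  double survival_ = 1.0;
  unsigned exp_calls_ = 0;  // illustration only: counts exp() evaluations
};

// Apply natural mortality over many time steps using the cached value.
inline double survive_steps(const Agent& agent, double scalar, unsigned n_steps) {
  for (unsigned t = 0; t < n_steps; ++t) scalar *= agent.survival();
  return scalar;
}
```

With constant M, ten time steps cost one exp() call per agent instead of ten. A time-varying M would call set_m() only in the years where the value actually changes.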

ADD a check in CMake that OpenMP is available to the compiler

ADD Threading

Add threading to all processes where each cell can be threaded independently. Only movement will need a bit more thought, for when agents are spliced into the same destination cell, but a mutex with a condition variable should be adequate.
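
A minimal sketch of the movement case follows, under stated assumptions. It uses a plain per-cell std::mutex without a condition variable, one std::thread per cell for brevity, and a made-up movement rule (destination = agent id modulo the number of cells). The Cell layout and move_all name are hypothetical; a real implementation would more likely use a fixed pool of threads or OpenMP.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

struct Cell {
  std::vector<int> agents;    // agent ids currently in the cell
  std::vector<int> arrivals;  // agents moved in during this step
  std::mutex arrivals_mutex;  // guards `arrivals` only
};

// Each thread owns one origin cell; only the destination's arrival buffer is shared.
inline void move_all(std::vector<Cell>& cells) {
  assert(!cells.empty());
  std::vector<std::thread> workers;
  workers.reserve(cells.size());
  for (std::size_t c = 0; c < cells.size(); ++c) {
    workers.emplace_back([&cells, c] {
      for (int id : cells[c].agents) {
        // Hypothetical movement rule, for illustration only.
        Cell& dest = cells[static_cast<std::size_t>(id) % cells.size()];
        std::lock_guard<std::mutex> lock(dest.arrivals_mutex);
        dest.arrivals.push_back(id);
      }
      cells[c].agents.clear();  // no other thread touches this cell's `agents`
    });
  }
  for (auto& worker : workers) worker.join();
  // Single-threaded merge once every worker has finished.
  for (auto& cell : cells) {
    cell.agents.insert(cell.agents.end(), cell.arrivals.begin(), cell.arrivals.end());
    cell.arrivals.clear();
  }
}
```

Writing into a separate arrivals buffer means a thread never modifies a container that another thread is iterating over, so the lock is held only for the push_back.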

Implement sex in the IBM

Some scrappy code exists to add Sex, but this needs to be tidied up and tested

Split out MSE functionality into a branch

I have found out that the MSE functionality of the ABM relies on RInside and Rcpp. When you compile the code with these dependencies, they use local paths for dependent libraries. I either need to ship these dependencies (.dll or .so) with the exe or, as I am thinking for now, just create a branch for users that want this functionality. They will need to compile the code in order to use it.

Add notify subscribers based on the CASAL2 code base

Notify subscribers creates a link between objects and gives an efficient way to update cached dependencies.
This will allow us to try threading with the knowledge that the right cache will be rebuilt, e.g. time-varying selectivity or M.
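
The pattern can be sketched as below. This is a generic observer-style illustration and not the Casal2 implementation: the Parameter and MortalityProcess classes and their methods are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical parameter that tells dependants when its value changes.
class Parameter {
public:
  explicit Parameter(double value) : value_(value) {}

  void subscribe(std::function<void()> callback) {
    subscribers_.push_back(std::move(callback));
  }

  void set(double value) {
    assert(std::isfinite(value));
    value_ = value;
    for (auto& notify : subscribers_) notify();  // rebuild dependent caches
  }

  double get() const { return value_; }

private:
  double value_;
  std::vector<std::function<void()>> subscribers_;
};

// Hypothetical process that caches exp(-M) and rebuilds it only on notification.
// It captures `this`, so it must outlive the subscription and must not be copied.
class MortalityProcess {
public:
  explicit MortalityProcess(Parameter& m) : m_(m) {
    rebuild();
    m_.subscribe([this] { rebuild(); });
  }

  double survival() const { return survival_; }
  unsigned rebuilds() const { return rebuilds_; }

private:
  void rebuild() {
    survival_ = std::exp(-m_.get());
    ++rebuilds_;
  }

  Parameter& m_;
  double survival_ = 1.0;
  unsigned rebuilds_ = 0;
};
```

If time-varying parameters are updated on a single thread before the threaded processes start, every cache is already consistent by the time worker threads read it.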

Enhancements

This is just a space where people can add ideas (a kind of wish list). If an idea gains traction we can migrate it to its own issue so as not to clutter this one.

Adding tagging

In Jeremy's and Nokome's ABM they release tags before harvesting, thus making tagging independent of harvesting. I think I will make this a user decision: users can release tags before mortality events or during mortality events.

Currently the inputs for the other ABM are tags released and scanned proportions.

tag release by area
scanned for each year which is a proportion of total catch

ADD Larval dynamics

Currently we do not distinguish any life-history trait in the partition structure, so any process that affects a single life-history stage 'could' (needs to be tested) be slow. For example, say we have our full fishery model, as in the example SpatialModel, and we want to add a larval component, say larval drift from currents for a 6-week period after recruitment.
Options to test:

Leave the partition structure as is, iterate over everything and apply the process only to the age class of interest.
Create a specific larval partition and start building up the concept of "categories" in the program. This could add overhead when moving individuals back into the main partition.

ADD: selectivity to movement processes.

Add a selectivity to the movement processes. This can be done by checking whether an agent is available to move before we do a multinomial draw on where it moves.

This will allow for age or length based movement.
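
A minimal sketch of the two-step draw follows, assuming a logistic selectivity-at-age with parameters a50 and ato95. The function names and signature are hypothetical; a length-based version would pass length instead of age.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Assumed logistic selectivity: a50 is the age at 50% selection and
// ato95 the additional years needed to reach 95%.
inline double logistic_selectivity(double age, double a50, double ato95) {
  return 1.0 / (1.0 + std::pow(19.0, (a50 - age) / ato95));
}

// Returns the destination cell index, or `origin` if the agent does not move.
template <class Rng>
std::size_t move_agent(double age, std::size_t origin,
                       const std::vector<double>& destination_probs,
                       double a50, double ato95, Rng& rng) {
  assert(!destination_probs.empty());
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  // Step 1: is the agent available to move at all?
  if (unif(rng) >= logistic_selectivity(age, a50, ato95)) return origin;
  // Step 2: a single multinomial trial over the destination cells.
  std::discrete_distribution<std::size_t> pick(destination_probs.begin(),
                                               destination_probs.end());
  return pick(rng);
}
```

Doing the availability check first means the multinomial draw is skipped entirely for agents that are not selected, which also saves random numbers.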

Thread safety on processes

Currently I create a private vector of random numbers that each cell/thread builds and accesses, so we only have to control for a race condition when populating the vector. The reason for doing this is that the random number generator is a shared resource, so simultaneous access causes a segmentation fault.

Scott has suggested an alternative, which goes as follows.

Before getting stuck in, iterate over the world and count the number of agents in it.

Populate a single global vector (rand_vec) with that many random numbers
cell_offsets = 0
for (i in rows)
  for (j in cols)
    for (agent_iter in agents of cell (i, j))
      random number = rand_vec[cell_offsets + agent_iter]
    cell_offsets = cell_offsets + number of agents in cell (i, j)

This way we only allocate memory once and, because of the business rules, we will never access the same element from two cells.
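
The offset scheme can be sketched in C++ as follows. This is an illustration under assumptions, not code from the repository: the Grid alias and the helper names cell_offsets, fill_rand_vec and draw_for are invented here, and the uniform(0, 1) draws stand in for whatever distribution a process needs.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// counts[i][j] = number of agents in cell (i, j).
using Grid = std::vector<std::vector<std::size_t>>;

// Row-major start offset of each cell's slice in the global vector.
inline std::vector<std::size_t> cell_offsets(const Grid& counts) {
  std::vector<std::size_t> offsets;
  std::size_t running = 0;
  for (const auto& row : counts)
    for (std::size_t n : row) {
      offsets.push_back(running);
      running += n;
    }
  offsets.push_back(running);  // last entry = total number of agents
  return offsets;
}

// Fill the single global vector once, on one thread, before processes run.
inline std::vector<double> fill_rand_vec(std::size_t n_agents, unsigned seed) {
  std::mt19937 rng(seed);
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::vector<double> rand_vec(n_agents);
  for (double& r : rand_vec) r = unif(rng);
  return rand_vec;
}

// A cell reads only rand_vec[offset + agent_iter], so slices never overlap.
inline double draw_for(const std::vector<double>& rand_vec,
                       const std::vector<std::size_t>& offsets,
                       std::size_t cell_index, std::size_t agent_iter) {
  assert(cell_index + 1 < offsets.size());
  assert(offsets[cell_index] + agent_iter < offsets[cell_index + 1]);
  return rand_vec[offsets[cell_index] + agent_iter];
}
```

Because the generator is only touched while filling rand_vec on one thread, the worker threads perform read-only access to disjoint slices and need no locking.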

Add a new numeric layer for recruitment process

A concept for a new layer specifically related to the recruitment event. In the recruitment process users must define a layer of proportions describing where to seed recruits. If we want a full life-cycle model, then I think age-0 individuals should be seeded in proportion to the SSB over the spatial domain.

Create a new numeric layer that converts population biomass or abundance into a proportion layer over a user-specified set of cells.

This means that recruits will be spatially proportional to the SSB at the point of recruitment, which will also allow the spatial pattern of recruitment to change over time with the SSB.
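
A minimal sketch of such a layer is given below, assuming the biomass per cell is already available as a grid of doubles and the user-specified cells are given as a boolean mask. The Layer alias and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Layer = std::vector<std::vector<double>>;
using Mask = std::vector<std::vector<bool>>;

// Convert cell biomass into proportions over the cells flagged in `mask`.
// Cells outside the mask get 0; if the masked biomass is 0, all cells get 0.
inline Layer biomass_to_proportions(const Layer& biomass, const Mask& mask) {
  assert(biomass.size() == mask.size());
  double total = 0.0;
  for (std::size_t i = 0; i < biomass.size(); ++i) {
    assert(biomass[i].size() == mask[i].size());
    for (std::size_t j = 0; j < biomass[i].size(); ++j)
      if (mask[i][j]) total += biomass[i][j];
  }
  Layer proportions(biomass.size());
  for (std::size_t i = 0; i < biomass.size(); ++i) {
    proportions[i].assign(biomass[i].size(), 0.0);
    if (total <= 0.0) continue;
    for (std::size_t j = 0; j < biomass[i].size(); ++j)
      if (mask[i][j]) proportions[i][j] = biomass[i][j] / total;
  }
  return proportions;
}
```

Recomputing this layer from the current SSB at each recruitment event is what lets the spatial pattern of recruits follow the stock through time.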

Random Number generation with threading

I have discovered that as soon as we go down the threading route we can never reproduce exact model runs (unless we run on a single thread), because we have no control over the order of thread execution, and so the random number chain will be consumed in a different order each time. This is more of a note than anything. I am happy with this; it just means that, in order to validate functionality, you should either run comparisons on a single thread with a fixed seed or set the number of agents to a massive number, so that it would be obvious if the model were behaving differently from how you expect.

Add a flag to constant rate mortality process

Add a subcommand to this process called apply_selectivity. Many models assume constant selectivity; for this type of model we can ask agents to cache the exp(M) call. The process will need to find out how many times it is applied in the annual cycle, as this will define the 'selectivity'. It might also want to cache for each time step.

Update to GCC 9.1

Speed up simulations

Add a flag on the model indicating, when doing a simulation, whether the initialization phase needs to be re-run.

Unless you are doing a -i run with values such as B0, M or growth changing, you shouldn't need to re-run the initialization phase. Skipping it would save quite a bit of time: check whether we are doing a simulation, cache the world, and then iterate over the model years for each simulation.
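
The caching idea could look roughly like the sketch below. The World struct, the run_simulations function and its callback signatures are all assumptions for illustration; the real model state is of course much richer than a vector of doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the full model state after initialization.
struct World {
  std::vector<double> cell_biomass;
};

// Runs initialise() once, then restarts every simulation from a copy of it.
inline std::vector<World> run_simulations(
    std::size_t n_sims,
    const std::function<World()>& initialise,
    const std::function<void(World&, std::size_t)>& run_years) {
  assert(n_sims > 0);
  const World cached = initialise();  // expensive initialization phase, run once
  std::vector<World> results;
  results.reserve(n_sims);
  for (std::size_t sim = 0; sim < n_sims; ++sim) {
    World world = cached;  // copy the cached state instead of re-initializing
    run_years(world, sim);
    results.push_back(std::move(world));
  }
  return results;
}
```

This only pays off when copying the world is cheaper than re-running initialization, and it is only valid when the parameters that shape the initial state (such as B0, M or growth) are the same for every simulation.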