biosimulators / biosimulators_copasi

COPASI biochemical network simulation program via BioSimulators-compliant command-line interface and Docker container

Home Page: https://docs.biosimulators.org/Biosimulators_COPASI

License: MIT License

Dockerfile 1.76% Python 96.48% CSS 1.76%
copasi dynamical-modeling sed-ml combine-archive omex-metadata sbml biochemical-networks systems-biology computational-biology simulation

biosimulators_copasi's Issues

Lowering Cognitive complexity as per SonarCloud

Justification
After several attempts to push type annotations to biosimulators_copasi via a PR, I have repeatedly hit an error from SonarCloud indicating that a given function’s “cognitive complexity” is above the acceptable threshold. According to the documentation, complexity is incremented for a variety of statements and situations, including nesting, recursion, and any statement that includes if/else/while/except/and/or/is/not. Exceptions to this rule are made for try/finally blocks and boolean dictionaries.

Examples:
I was able to devise a solution that satisfied the bot while still returning the desired result for the optional keyword argument “config”, where the default is None:

Originally: (+1 cognitive complexity)
config = config or get_config()

Revised: (+0 cognitive complexity)
config = {True: config, False: get_config()}.get(bool(config), config)

Clearly, this solution is neither scalable nor ‘Pythonic’ in nature. Another source of SonarCloud contention pertains to nested-if statements whose depth of nesting is > 1. For example:

for change in model.changes:
    target_sbml_id = model_change_target_sbml_id_map[change.target]
    copasi_model_obj = get_copasi_model_object_by_sbml_id(copasi_model, target_sbml_id, units)
    if copasi_model_obj is None:
        invalid_changes.append(change.target)
    else:
        model_obj_parent = copasi_model_obj.getObjectParent()

        if isinstance(model_obj_parent, COPASI.CCompartment):
            set_func = model_obj_parent.setInitialValue
            ref = model_obj_parent.getInitialValueReference()

        elif isinstance(model_obj_parent, COPASI.CModelValue):
            set_func = model_obj_parent.setInitialValue
            ref = model_obj_parent.getInitialValueReference()

        elif isinstance(model_obj_parent, COPASI.CMetab):
            if units == Units.discrete:
                set_func = model_obj_parent.setInitialValue
                ref = model_obj_parent.getInitialValueReference()
            else:
                set_func = model_obj_parent.setInitialConcentration
                ref = model_obj_parent.getInitialConcentrationReference()

        model_change_obj_map[change.target] = (set_func, ref)

Possible solutions:
One effective solution could be a guard clause: refactor nested conditional statements whose nesting depth is > 1 so that the exceptional case exits early. The code above is a good example of this kind of cognitive complexity. If setup required Python >= 3.10, match/case statements could replace the if/elif chains and would also satisfy the requirements of this issue.
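As a rough illustration of the guard-clause idea, here is a minimal sketch. It does not use the real COPASI bindings: FakeCompartment, FakeMetab, resolve_setter, and map_model_changes are hypothetical stand-ins that only mirror the method names in the snippet above, and the lookup is simplified to return the parent object directly.

```python
from enum import Enum


class Units(Enum):
    discrete = "discrete"
    continuous = "continuous"


class FakeCompartment:
    """Hypothetical stand-in for COPASI.CCompartment (CModelValue behaves alike)."""

    def setInitialValue(self, value):
        self.initial_value = value

    def getInitialValueReference(self):
        return "initial-value-reference"


class FakeMetab:
    """Hypothetical stand-in for COPASI.CMetab."""

    def setInitialValue(self, value):
        self.initial_value = value

    def getInitialValueReference(self):
        return "initial-value-reference"

    def setInitialConcentration(self, value):
        self.initial_concentration = value

    def getInitialConcentrationReference(self):
        return "initial-concentration-reference"


def resolve_setter(parent, units):
    """Return (setter, reference) for a supported parent object, else None."""
    # Species in non-discrete units are the only case that needs concentrations.
    if isinstance(parent, FakeMetab) and units != Units.discrete:
        return parent.setInitialConcentration, parent.getInitialConcentrationReference()
    if isinstance(parent, (FakeCompartment, FakeMetab)):
        return parent.setInitialValue, parent.getInitialValueReference()
    return None


def map_model_changes(targets, find_parent, units):
    """Build the target -> (setter, reference) map with a flat loop body."""
    change_map, invalid = {}, []
    for target in targets:
        resolved = resolve_setter(find_parent(target), units)
        if resolved is None:  # guard clause: record the failure and move on
            invalid.append(target)
            continue
        change_map[target] = resolved
    return change_map, invalid
```

One design choice in this sketch differs from the original loop: a parent of an unsupported type is reported as invalid instead of leaving set_func and ref unassigned.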

Standardize inputs for VCELL, COPASI and Biosimulation images

  • Add sedml parser inside copasi
  • Inside copasi container
    • Parse timepoints/task
    • Run simulation in same format as VCell
  • Add http call to get request token from auth0 servers
  • Add sedml parser to extract timepoints
    • Modify algorithms map with KISAO

Get values of non-constant parameters

Some simulation experiments (SED documents) need to record non-constant SBML parameters set via assignment rules, event assignments, etc.

@fbergmann @shoops @pmendes What's the syntax to use to get time courses for SBML parameters?

Here's a sketch of the code we're currently using. This only retrieves time courses for SBML species. How does this need to be edited?

import COPASI

filename = 'path/to/model.xml'

data_model = COPASI.CRootContainer.addDatamodel()
data_model.importSBML(filename)

task = data_model.getTask('Time-Course')
task.setMethodType(COPASI.CTaskEnum.Method_deterministic)
task.setScheduled(True)

model = data_model.getModel()
problem = task.getProblem()
model.setInitialTime(0.)
problem.setOutputStartTime(0.)
problem.setDuration(10.)
problem.setStepNumber(10)
problem.setTimeSeriesRequested(True)
problem.setAutomaticStepSize(False)
problem.setOutputEvent(False)

result = task.process(True)

time_series = task.getTimeSeries()

COPASI or wrapper should support KISAO terms for algorithms and algorithm parameters

Currently, COPASI appears to only support KISAO terms for three algorithms. COPASI also appears to have no KISAO support for algorithm parameters.

  • Timecourse
    • Continuous
      • KISAO:0000019: CVODE
    • Discrete
      • KISAO:0000241: Gillespie-like method
  • Steady-state
    • KISAO:0000282: KINSOL

COPASI and/or the wrapper needs to support more of the algorithms supported by COPASI and their parameters.

Add BioContainers-style metadata to Dockerfile

Example

LABEL base_image="ubuntu:18.04"
LABEL version="2.4.1"
LABEL software="tellurium"
LABEL software.version="2.4.1"
LABEL about.summary="Python-based environment for model building, simulation, and analysis that facilitates reproducibility of models in systems and synthetic biology"
LABEL about.home="http://tellurium.analogmachine.org/"
LABEL about.documentation="https://tellurium.readthedocs.io/"
LABEL about.license_file="https://github.com/sys-bio/tellurium/blob/master/LICENSE.txt"
LABEL about.license="SPDX:Apache-2.0"
LABEL about.tags="kinetic modeling,dynamical simulation,systems biology,SBML,SED-ML,COMBINE,OMEX"
LABEL maintainer="Jonathan Karr <[email protected]>"

More info: https://biocontainers-edu.readthedocs.io/en/latest/what_is_biocontainers.html?highlight=label#how-to-request-a-container

Review specifications of COPASI's capabilities (biosimulators.json)

Hi @copasi @pmendes. As you probably know, we're containerizing COPASI to make it available through BioSimulations. The metadata about COPASI is captured in biosimulators.json. This includes the modeling frameworks, formats, and algorithms that COPASI supports and the parameters of those algorithms. Basically, this file will control how BioSimulations presents COPASI to end users. Could you review this, or ask the appropriate person to review this? We also welcome any feedback on the structure of the file.

Incorrectly raised Exception regarding number of time points

The NotImplementedError (Time course must specify an integer number of time points) in core.py#L197 is raised for specific integer numbers of steps in combination with certain time values.

Example (core.py#L191):

# sim.number_of_points = 500
# sim.initial_time = 0.0
# sim.output_start_time = 0.0
# sim.output_end_time = 1.1
step_number = (
    sim.number_of_points
    * (sim.output_end_time - sim.initial_time)
    / (sim.output_end_time - sim.output_start_time)
)

Even though the number of points is an integer, the step_number now has a value of 499.99999999999994, which raises the exception.

One way to fix this would be to add parentheses so that the division is evaluated first (the ratio is then exactly 1.0 in this example):

step_number = (
    sim.number_of_points
    * (
        (sim.output_end_time - sim.initial_time)
        / (sim.output_end_time - sim.output_start_time)
    )
)
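Reordering the arithmetic fixes this example, but other combinations of time values could still produce a result that is off by one unit in the last place. A more defensive option, sketched below, is to accept values that are within a small tolerance of an integer. The helper name to_step_count and the tolerance of 1e-9 are assumptions chosen for illustration, not part of core.py.

```python
import math


def to_step_count(value, rel_tol=1e-9):
    """Convert a float that should be integral into an int, tolerating rounding error.

    `to_step_count` and `rel_tol` are hypothetical; pick a tolerance that suits
    the range of step numbers you expect.
    """
    nearest = round(value)
    if not math.isclose(value, nearest, rel_tol=rel_tol):
        raise NotImplementedError(
            "Time course must specify an integer number of time points"
        )
    return nearest


# The values from the example above, using the original operator order.
number_of_points = 500
initial_time = 0.0
output_start_time = 0.0
output_end_time = 1.1

raw = (
    number_of_points
    * (output_end_time - initial_time)
    / (output_end_time - output_start_time)
)
step_number = to_step_count(raw)
```

With this check, the slightly-too-small float from the example is accepted as 500 steps, while a genuinely fractional value such as 14.3 still raises the exception.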

Handling `listOfOutputs` for version >= 4.28.226

<listOfOutputs>
    <plot2D id="plot_1_task1" name="f3a">
      <listOfCurves>
        <curve id="p1_curve_1_task1" name="[BE]" logX="false" logY="false" xDataReference="time_task1" yDataReference="BE_1_task1"/>
      </listOfCurves>
    </plot2D>
    <report id="simulation_1" name="simulation 1">
      <listOfDataSets>
        <dataSet id="time" label="time" dataReference="time"/>
        <dataSet id="FeDuo" label="FeDuo" dataReference="FeDuo"/>
      </listOfDataSets>
    </report>
</listOfOutputs>

By default, the latest COPASI version generates a TSV file as its result, along with some additional metadata that is not useful for visualisation purposes.
In the SED-ML above, the user can define plot2D, plot3D, or report (dataSet) blocks.
In versions < 4.28.226, the COPASI library raises an error if the SED-ML has no dataSet definition, which lets us create a default one.
In versions >= 4.28.226, however, the COPASI library neither warns nor fails when there is no dataSet definition; instead, it creates a TSV file on its own (along with metadata).

How should we handle this? Should the user be responsible for providing correct SED-ML, or do we need to handle it in the simulator code?

Attached is a zip archive containing a sample OMEX file used to test the scenario, plus old.csv and new.csv, generated by the old and new versions of COPASI respectively.

To reproduce the CSV files, pull the latest and 4.27.214 Docker images and run them locally with the attached OMEX archive. Refer to this.

@moraru @jonrkarr @bilalshaikh42 Thoughts?

Archive.zip

Internalize functionality into COPASI itself (obviating the need for this repository)

@shoops @fbergmann @pmendes I'm creating a separate thread for the discussion that @shoops initiated in #46.

Hello Jonathan,

I think a better approach would be to let COPASI read the SED-ML directly. The problem I see is that if the SED-ML asks for a plot and you are running CopasiSE or any language bindings, the plot will not be generated. I think I have an idea on how to circumvent that. I will talk with @fbergmann about it.

Stefan

I would also greatly prefer to use COPASI directly -- this would be easier to maintain and lead to making COPASI's handling of SED-ML consistent with the SED-ML specs and other tools (in part through using our test suite to check COPASI). In fact, my preference is to internalize this command-line interface and Docker image into the COPASI repository so the COPASI team has control over this and can maintain it, release it, etc. At that point, this repository could be deleted. Ideally all tools would follow this path.

The purpose of this repository was to enable us to prototype handling of SED-ML consistent with the SED-ML specifications more quickly than we could by trying to help build/fix similar functionality in COPASI itself.

Capabilities of the current approach

This library supports the other features of SED-ML listed below. This functionality is imported from another library.

  • Model resolution (e.g., URLs, URI fragments, MIRIAM ids)
  • Model changes
  • Repeated tasks
  • Algorithm substitution utilizing relationships in KiSAO
  • Reports -- including saving metadata in HDF5, multidimensional reports
  • Plots

Rationale for current approach

We haven't used COPASI due to several points of divergence from the SED-ML specifications and various bugs. I didn't investigate every issue in detail. Some of this may only affect SED-ML export, and some of this may have been resolved since I first evaluated this.

  • COPASI supports SED-ML L2
  • COPASI ignores most KiSAO information
  • KiSAO terms didn't even exist yet for several COPASI methods
  • From what I could tell, COPASI has limited support for model changes
  • COPASI appears to ignore namespaces for model changes and variables; as a result COPASI appears to have limited support for XPaths
  • COPASI has bugs in exporting repeated tasks
  • Based on what I've seen from SED-ML exported from COPASI,
    • It seemed that COPASI has issues with ids of SED-ML objects (e.g., not unique)
    • COPASI seems to confuse the ids and names of model variables in XPaths
    • Sometimes duplicate SED-ML objects are exported
    • Sometimes required attributes such as labels of reports are missing
  • Exported time courses sometimes differ from those in COPASI files
  • COPASI's validation appears to ignore many issues with SED-ML files

Similarities of the current approach to COPASI to our approaches to tellurium, VCell, and other tools

For similar reasons, we haven't fully used tellurium's SED-ML handling either. We've posted a number of issues about bugs and missing features. VCell was in a similar situation: multiple points of divergence from SED-ML and other tools, and a lack of support for some features which we felt were key. We've also posted a variety of issues for VCell. We've had to approach all other tools the same way because they have no, minimal, or divergent support for SED-ML.

CCopasiException on invalid archive

CCopasiException is a strange exception in the sense that it cannot be caught with an except clause.
I tried wrapping the calling code in a try block, but every time we feed in an invalid/corrupt OMEX archive it still crashes the application completely. The reason is still unknown.

It crashes with an error something like this:
libc++abi.dylib: terminating with uncaught exception of type CCopasiException [1] 48832 abort copasi -i ../notebooks/omexes/multi_sedml/BIOMD0000000793.omex -o

So, how should we handle it? Should we execute the omex extraction in separate thread/process so that it doesn't crash the main application? @jonrkarr
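To illustrate the separate-process option: an abort raised inside native code terminates the whole process, so a thread would not help, but a child process can die without taking the parent with it. The sketch below is a generic pattern built on the standard library only; run_isolated is a hypothetical helper, and the child code passed to it is a placeholder for whatever call triggers the crash.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run Python source in a child interpreter and report how it ended.

    Returns (ok, returncode, stderr). A crash in native code (for example an
    uncaught C++ exception that calls abort) shows up as a non-zero return
    code instead of killing the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode == 0, proc.returncode, proc.stderr


# Simulate a hard crash in the child; the parent keeps running.
crashed_ok, crashed_code, _ = run_isolated("import os; os.abort()")

# A well-behaved child reports success.
fine_ok, fine_code, _ = run_isolated("print('archive processed')")
```

In a real setup the child would import the simulator and execute the archive, and the parent would turn a non-zero return code into an ordinary Python exception or log entry.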

Biosimulators copasi gives incorrect results that normal copasi doesn't

The attached OMEX files all produce different results on copasi-on-biosimulations than they do for copasi-on-my-desktop (or, for that matter, from tellurium-on-biosimulations).

BIOMD0000000005.omex.zip
BIOMD0000000006.omex.zip
BIOMD0000000007.omex.zip

The phenotype is the same for all: a value that should change in time instead is fixed and doesn't change. For Biomodels 5, the variable in question is YT, aka 'total_cyclin'. You can see the difference in

https://run.biosimulations.org/runs/64406c7ad5954382281dde42#tab=viz (copasi, incorrect)
vs
https://run.biosimulations.org/runs/64669bea9a62a8be53f457d1#tab=viz (tellurium, correct, and matches my copy of copasi, v4.36)

Biomodels 6:
https://run.biosimulations.org/runs/64406c7a44d76b2ce805707f#tab=viz
https://run.biosimulations.org/runs/64669bea49c0d898ffdaa536#tab=viz

Biomodels 7:
https://run.biosimulations.org/runs/64406c7be039ed31faa8dc62#tab=viz
https://run.biosimulations.org/runs/64669beb0dcacb5567885fc0#tab=viz

biosimulators no longer working on macOS

Running the biosimulator freezes whenever the StandardOutputErrorCapturer is used. When I run without a log file it works, but no output is produced. So

        config = bio_copasi.get_config()
        config.LOG=True
        config.DEBUG=True
        config.VERBOSE=True
        results, log = bio_copasi.exec_sedml_docs_in_combine_archive(
            omexfile, 
            outdir, 
            config=config, 
            fix_copasi_generated_combine_archive=False)

freezes the program, and the Python process has to be killed, while

        config = bio_copasi.get_config()
        config.LOG=False
        config.DEBUG=True
        config.VERBOSE=True
        results, log = bio_copasi.exec_sedml_docs_in_combine_archive(
            omexfile, 
            outdir, 
            config=config, 
            fix_copasi_generated_combine_archive=False)

works fine, but it does not produce any output, and there is a warning message:

StandardOutputNotLoggedWarning: Standard output and error could not be logged because capturer is not installed. To install capturer, install BioSimulators utils with the `logging` option (`pip install biosimulators-utils[logging]`).

It would be nice if output could be produced without needing capturer.
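For reference, a capturer-free fallback is possible with the standard library alone, with one important limitation: contextlib.redirect_stdout only intercepts writes made through Python's sys.stdout and sys.stderr, so anything printed directly by native code (such as a C++ library) would not be captured. The helper name run_with_captured_output is hypothetical and is not part of BioSimulators utils.

```python
import contextlib
import io


def run_with_captured_output(func, *args, **kwargs):
    """Call func and return (result, text written to Python-level stdout/stderr).

    Output written by native code that bypasses sys.stdout is NOT captured.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def noisy_double(x):
    """Toy function standing in for a simulation call that prints progress."""
    print("running step")
    return x * 2


value, log_text = run_with_captured_output(noisy_double, 21)
```

Whether this is sufficient depends on how much of the interesting output comes from Python code versus the underlying simulator library.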

Apply new KISAO terms for algorithms and parameters

  • Replace KISAO_0000088 --> KISAO_0000560 new KISAO id for LSODA/LSODAR
  • Replace KISAO_0000483 --> KISAO_0000559 new KISAO id for initial step size
  • Replace KISAO_0000231 --> KISAO_0000561 new KISAO id for hybrid-runge-kutta method
  • Replace KISAO_0000231 --> KISAO_0000562 new KISAO id for hybrid-lsoda method
  • Replace KISAO_0000231 --> KISAO_0000563 new KISAO id for hybrid-rk45 method
  • Replace KISAO_0000036 --> KISAO_0000566 new KISAO id for Stochastic second order Runge-Kutta Method
  • Replace KISAO_0000243 --> KISAO_0000567 new KISAO id for force physical correctness parameter
  • Replace KISAO_0000242 --> KISAO_0000565 new KISAO id for Absolute tolerance for root finding
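The replacements above could be applied with a small lookup, sketched here. The IDs come from the list; the structure is an assumption. Note that KISAO_0000231 maps to three different new terms, so it cannot be renamed automatically without knowing which hybrid method is meant. The function name update_kisao_id and the method_hint labels are hypothetical.

```python
# One-to-one replacements from the list above (old ID -> new ID).
RENAMED = {
    "KISAO_0000088": "KISAO_0000560",  # LSODA/LSODAR
    "KISAO_0000483": "KISAO_0000559",  # initial step size
    "KISAO_0000036": "KISAO_0000566",  # stochastic second order Runge-Kutta method
    "KISAO_0000243": "KISAO_0000567",  # force physical correctness
    "KISAO_0000242": "KISAO_0000565",  # absolute tolerance for root finding
}

# KISAO_0000231 is split into three more specific terms.
SPLIT = {
    "KISAO_0000231": {
        "hybrid-runge-kutta": "KISAO_0000561",
        "hybrid-lsoda": "KISAO_0000562",
        "hybrid-rk45": "KISAO_0000563",
    },
}


def update_kisao_id(old_id, method_hint=None):
    """Return the new KiSAO ID for old_id, or old_id itself if it is unchanged.

    For IDs that were split into several terms, method_hint (a hypothetical
    label such as "hybrid-lsoda") selects the intended replacement.
    """
    if old_id in SPLIT:
        options = SPLIT[old_id]
        if method_hint not in options:
            raise ValueError(
                f"{old_id} is ambiguous; choose one of {sorted(options)}"
            )
        return options[method_hint]
    return RENAMED.get(old_id, old_id)
```

For example, update_kisao_id("KISAO_0000088") gives "KISAO_0000560", while update_kisao_id("KISAO_0000231") without a hint raises an error rather than guessing.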

Support hybrid RK45 method

As discussed in #25, this package cannot support the hybrid RK45 method because python-COPASI doesn't support setting the partitioning parameter. @fbergmann please let us know when python-COPASI is updated so we can extend support to this package, the COPASI Docker image, and all downstream tools such as runBioSimulations.

Improve messages for COPASI errors

@fbergmann how can I get error information from COPASI to pass this along to users, e.g. when an algorithm fails?

I inherited the following from Akhil, but this doesn't work.

error_message = COPASI.CCopasiMessage.getAllMessageText(True)

Enhancements

  • Fix COPASI results file for the latest version
  • Fix nested structure in COPASI simulator (BIOMD0000000297.omex)

Documentation discrepancies between repo and website.

Abstract

I have been using biosimulators.org as the perspective for our “landing page” as it is our own unique little section of the internet. I am also inspired by other major Python packages whose primary source of documentation is their own website (sometimes directly on the home page!).

Evidence

Given this viewpoint, it is clear that the presentation of entrypoints on our website needs some help. To support my assertion, I will reference the following examples:

  1. If you click on “Tutorial” under “Contents” on the “BioSimulators-COPASI documentation” page (screenshot: biosimulators-COPASI_doc_screenshot), only the CLI and Docker entrypoints are referenced. To me, this is confusing for a ‘Tutorials’ or ‘Getting Started’ page, as only some of the entrypoints from the “Overview” section are referenced. Even for a programmer it can be difficult to navigate the documentation, let alone for someone with no coding experience.

  2. Furthermore, when you click on "Tutorial" there (screenshot: tutorial), you are taken to a Binder tutorial which is itself a demonstration of the programmatic capabilities of the Python API.

  3. Finally, there are several references to this page which is on the (for all intents and purposes) very different Biosimulations website.

I can understand why these methods are highlighted as they are the methods that require the least amount of setup. Of course, there may exist the philosophical "best practice" of interacting with this repo, but I fear that there is an overall lack of consistency in regard to the scope of a user's ability to interact with this fantastic tool.

Proposal

  • Create a landing page which references each application entrypoint in an explicit manner
  • Remove or change the reference to the runBiosimulations getting started guide to be a Biosimulators specific reference
  • Create a visualization of the relationship between Biosimulators, Biosimulations, and runBiosimulations. Let this visualization also describe their individual purpose/role in the framework.
  • Create a smart tool for model inference. Leverage the current abilities to identify simulation objects to provide the "best model for the job".
  • Leverage the biosimulators-tutorials repo and add a pre-configured, shareable, editable Google Colab "sandbox" notebook to the repo (see this pull request)

Bare minimum to manage simulation

  1. Place copasi.img on HPC
  2. Get job id, simulation params from environment
  3. If the job succeeds send info to jobhook

Modifications on COPASI Image
1. Make COPASI Singularity Image work with stochastic methods dynamically accepting inputs as env
- Gibson+Bruck(Both VCELL and COPASI)
- Direct Method
- Tau Leap
- LSODA
All methods available from UI are completed

Docker run command returning "is not a file" error on omex path

Without error, I am able to call:
docker pull {copasi image str}

However, when calling:
docker run {copasi image str} -i {path to omex} -o {path to output}

I am greeted with the following error:
{path to omex} is not a file

I have taken care to point to a .omex file and NOT the directory of omex contents.

PROPOSAL:

Refactoring docker abilities to recursively index a desired omex path. PR to come.
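One thing that may be worth ruling out first, assuming the paths passed to -i and -o are host paths: a container only sees host files that are mounted into it, so an unmounted host path will not exist from the container's point of view. The sketch below builds a docker run argument list with bind mounts. The function name build_docker_run_args and the container directories /root/in and /root/out are arbitrary choices for illustration; the image name is whatever was pulled above.

```python
from pathlib import Path


def build_docker_run_args(image, omex_path, out_dir,
                          container_in="/root/in", container_out="/root/out"):
    """Build `docker run` arguments that bind-mount the archive and output dir.

    The -i/-o values are paths INSIDE the container, not host paths.
    """
    omex = Path(omex_path).resolve()
    out = Path(out_dir).resolve()
    return [
        "docker", "run", "--rm",
        # Mount the directory containing the archive read-only.
        "--mount", f"type=bind,source={omex.parent},target={container_in},readonly",
        # Mount the output directory so results survive the container.
        "--mount", f"type=bind,source={out},target={container_out}",
        image,
        "-i", f"{container_in}/{omex.name}",
        "-o", container_out,
    ]


args = build_docker_run_args("example/image", "/tmp/models/model.omex", "/tmp/out")
```

The resulting list can be passed to subprocess.run, or joined into a shell command for manual testing.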

All biomodels Copasi currently fails for

tellurium_failures.xlsx
copasi_failures.csv

I'm attaching two lists: one of all models where Tellurium currently fails, and one of all models where Copasi currently fails. In general, it should in theory be possible for Copasi to succeed at everything that Tellurium succeeds on, and it might succeed at a few more, but I wouldn't count on it.

The list has links to the logs of all 8/25/23 runs of all biomodels on Copasi, which should tell you what the problems were. Many of them share the same problem; for example, in runs like

https://run.biosimulations.org/runs/64e932fc548ceb84f3effcf6#tab=log

the number of points, originally 1000, was read as a real value instead of an int, and so couldn't be used later. Once that's fixed, many of the other runs will also be fixed. Similar things will be true of the other runs; I'm happy to explain how they work in Tellurium if you like.

Expose static and steady-state analyses through SED-ML

Such as

  • Steady-State Analysis
  • Stoichiometric State Analysis
  • MCA
  • Lyapunov Exponents
  • Time Scale Separation
  • Sensitivity Analysis

Todo

  • Map each COPASI method/parameter to a KiSAO term (e.g., build a table of the correspondence between COPASI and KiSAO)
    • Add additional KiSAO terms as needed
  • Add the above algorithms to the curation of COPASI
  • Expand map of KiSAO terms to COPASI methods
  • Curate a few test cases (SED-ML files)
  • Add unit tests involving these test cases
