
biosimspacetutorials's People

Contributors

adelehardie, annamherz, dlukauskis, fjclark, jenkescheen, jmichel80, lohedges, ppxasjsm

biosimspacetutorials's Issues

FEP tutorial comments

Hi @JenkeScheen ,

A few comments/questions after going through your setup notebook:

  • The notebook assumes the user has access to parameterised ligands (in vacuum) whose coordinates match those of the parameterised protein input. We need separate BSS nodes to generate this type of input starting from 'sane' PDB/mol2 files.

  • The inputs haven't been energy minimised or equilibrated before starting the production runs. This may cause SOMD to crash. It would be better to equilibrate the solvated ligand0:complex, and then use a snapshot after equilibration to generate the merged molecule.

  • The slurm submission script will run each lambda value serially. It would be better to run the lambda windows in parallel to decrease time to answer.
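
One way to act on the last point is a Slurm job array with one task per lambda window. The following is only a minimal sketch, not the tutorial's submission script: the `lambda_*` directory naming, the number of windows and the placeholder `command` are all assumptions made for illustration.

```python
def lambda_values(num_lam):
    """Return num_lam evenly spaced lambda values between 0 and 1."""
    if num_lam < 2:
        raise ValueError("need at least two lambda windows")
    return [round(i / (num_lam - 1), 4) for i in range(num_lam)]


def make_array_script(num_lam, command="echo 'run the engine here'"):
    """Build the text of a Slurm array job with one task per lambda window.

    The directory naming (lambda_0.0000, ...) and `command` are hypothetical
    placeholders; replace them with whatever the setup step wrote out.
    """
    lams = " ".join(f"{lam:.4f}" for lam in lambda_values(num_lam))
    lines = [
        "#!/bin/bash",
        f"#SBATCH --array=0-{num_lam - 1}",
        "#SBATCH --job-name=fep_lambda",
        "",
        f"lambdas=({lams})",
        # Each array task picks its own lambda from the bash array.
        "lam=${lambdas[$SLURM_ARRAY_TASK_ID]}",
        'cd "lambda_${lam}" || exit 1',
        command,
    ]
    return "\n".join(lines) + "\n"


script = make_array_script(5)
print(script)
```

Because every array task is independent, the scheduler can run as many windows at once as the allocation allows, rather than looping over them inside a single job.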

Questions and todo's as they arise around the FEP tutorial

I am using this issue to help with merging this and this set of FEP tutorials. I'll keep updating the to-dos and adding questions as they come up.

Questions:

  • Do we keep the ABFE tutorial as part of the 4th FEP tutorial?
  • Are we archiving or keeping Jenke's tutorial on FEP?
  • Do we need a copy of the source of freenrgworkflows in the tutorial? This seems very dodgy from a coding perspective and can lead to diverging repos.
  • Will freenrgworkflows be replaced with arsenic (or whatever the latest name now is)? Has this already happened?
  • Are we offering some link to Azure or Colab to be able to run these directly from the tutorial repository?
  • Should the ccpbiosim repo be moved to the ccpbiosim GitHub?
  • Should the headers be made uniform and the specific reference to ccpbiosim training week removed?
  • Should all tutorial jupyter notebooks have uniform headers and also clearly state the contributions of people writing the tutorials?
  • @jmichel80: Should we add solutions as drop-down menus in all notebooks? This will avoid having to maintain duplicate notebooks with answers.

Questions on FEP tutorial structure:

  • Should 04_FEP contain 3 separate tutorials? 01_Intro_to_alchemy, 02_advanced_RBFE and 03_ABFE?

Overall to do:

  • Create a uniform style guide for all tutorials
  • Identify how to track authors contributions and assign maintainers
  • Merge the two tutorial repositories.
  • 01_intro:
    • Do formatting check
    • Do spell check
    • Do basic functionality check
    • Include answers as dropdowns to avoid having to maintain duplicate repositories
  • 02_rbfe:
    • Do formatting check
    • Do spell check
    • Do basic functionality check
    • Include answers as dropdowns to avoid having to maintain duplicate repositories
  • 01_abfe:
    • Do formatting check
    • Do spell check
    • Do basic functionality check
    • Include answers as dropdowns to avoid having to maintain duplicate repositories

Issues 01_intro to alchemistry

Some suggested fixes:

In section 2 merged systems maybe link to 7.1.1 directly:

  • There are different topologies, and which is used, single or dual, depends largely on the software. For more information, check the further reading : 7.1.1 (Topologies).

ff14SB issue

I got an error during ff14SB parameterisation.
Could you please help me to solve this issue?
Screenshot from 2024-01-11 10-44-19

Another way to minimize and equilibrate the system

Hi guys! I've recently been using BioSimSpace to minimize and equilibrate perturbable systems before running the main simulation, and I ran into some problems. I have some ideas for combining Gromacs equilibration with SOMD free energy simulation:

  1. Save the perturbable system to Gromacs GroTop files, and save the Amber parameter files (i.e. rst7, prm7) using savePerturbableSystem.
  2. Run Gromacs minimization and NVT and NPT equilibration with the GroTop files at lambda 0.
  3. Extract the equilibrated system coordinates and use them to replace the non-equilibrated coordinates saved by savePerturbableSystem.
  4. Construct a new perturbable system with readPerturbableSystem.
  5. Replace the molecule in the Gromacs-equilibrated system with it and create the SOMD free energy protocol.

Here is the main part of my code:

import BioSimSpace as BSS
import argparse
import os
from pathlib import Path
import logging

logging.basicConfig(level=logging.INFO)

def create_system(work_dir, mode):
    # 1. load the equilibrated system
    logging.info('Load the equilibrate system')
    if mode == 'solvated':
        mode_dir = 'Run_Solvated'
        mode_prefix = 'morph'
        somd_dir = 'free'
    elif mode == 'bound':
        mode_dir = 'Run_Bound'
        mode_prefix = 'complex_final'
        somd_dir = 'bound'
    else:
        raise ValueError(f'Unknown mode: {mode!r}')

    cor = f'{work_dir}/0/{mode_dir}/Equilibration_NPT/Equilibration_NPT_0.gro'
    top = f'{work_dir}/{mode_prefix}.top'

    sys_equilibrated = BSS.IO.readMolecules([cor, top])

    # 2. save the equilibrated morph coordinate
    logging.info('Save the equilibrated morph coordinate')
    n_residues = [mol.nResidues() for mol in sys_equilibrated]
    n_atoms = [mol.nAtoms() for mol in sys_equilibrated]

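    for i, (n_resi, n_at) in enumerate(zip(n_residues, n_atoms)):
        # Take the first single-residue molecule with more than 5 atoms
        # (this skips waters and ions).
        if n_resi == 1 and n_at > 5:
            system_ligand_1 = sys_equilibrated.getMolecule(i)
            BSS.IO.saveMolecules(f'{work_dir}/equilibrated_ligand', system_ligand_1, 'rst7')
            break
    else:
        # Without this, system_ligand_1 would be undefined further down.
        raise RuntimeError('No single-residue molecule with more than 5 atoms found')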

    # 3. construct morph use the new coordinate
    logging.info('Construct new morph')
    p0 = str(list(Path(work_dir).glob('*0.prm7'))[0])
    p1 = str(list(Path(work_dir).glob('*1.prm7'))[0])
    morph = BSS.IO.readPerturbableSystem(f'{work_dir}/equilibrated_ligand.rst7', p0, p1)

    # 4. replace the molecule with the new morph
    logging.info('Replace the morph')
    sys_equilibrated.removeMolecules(system_ligand_1)
    sys_equilibrated.addMolecules(morph)

    # 5. create somd free energy protocol
    logging.info('Create SOMD free energy protocol')
    protocol = BSS.Protocol.FreeEnergy(num_lam=32, report_interval=100000, restart_interval=50000)
    BSS.FreeEnergy.Relative(sys_equilibrated, protocol, engine='SOMD',
                            work_dir=f'{work_dir}/{somd_dir}', setup_only=True)

    # 6. then we run 'somd-freenrg' in somd_dir

I've done some tests, and the energies printed by somd-freenrg and those in the Gromacs log now seem to agree to some extent.

###=======================Minimisation========================###
Running minimisation.
Energy before the minimisation: -21275 kcal mol-1
Tolerance for minimisation: 1
Maximum number of minimisation iterations: 1000
Energy after the minimization: -26609.8 kcal mol-1
Energy minimization done.
###===========================================================###
        <======  ###############  ==>
        <====  A V E R A G E S  ====>
        <==  ###############  ======>

        Statistics over 100001 steps using 2001 frames

   Energies (kJ/mol)
           Bond          Angle    Proper Dih.  Improper Dih.          LJ-14
    2.57235e+01    4.37455e+01    5.32598e+01    5.22061e+00    4.89850e+01
     Coulomb-14        LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
   -9.54520e+02    1.35434e+04   -4.36026e+02   -1.03363e+05    2.96770e+02
      Potential    Kinetic En.   Total Energy    Temperature Pres. DC (bar)
   -9.07364e+04    1.60799e+04   -7.46565e+04    2.98222e+02   -1.10754e+02
 Pressure (bar)    dVremain/dl      dVcoul/dl       dVvdw/dl    dVbonded/dl
    3.25410e+00    8.52377e+01    3.48136e+00    5.25007e+01    4.21458e+01
 dVrestraint/dl   Constr. rmsd
    0.00000e+00    0.00000e+00

I'll keep digging into it. Before that, I have some questions and would like to hear from you first. Any comments or thoughts are appreciated!

  1. In the Gromacs pipeline, the system at each lambda is minimized and equilibrated before the production run. Here we only use the system equilibrated at lambda 0 and create somd-freenrg input files for every lambda from it. Would that be a problem?
  2. In the Gromacs pipeline, the last frame of the NPT trajectory, with coordinates and velocities, is read before the production run. Here I only read the coordinates from the gro file. Would that be a problem?

Best wishes!

How/where will data be stored for the workshop

Since we are not running live simulations themselves and will be using prepared data for the analysis parts of the workshop, I was wondering where it will be stored and how much space we have? For my last part, I would illustrate the start of MSM building, but that requires featurising 100s of 100 ns trajectories (~ 1GB per dry trajectory).

Another concern I have is that the featurisation can take a while to run. Would it be better to use fewer trajectories as an example, or just explain the code without running it (the notebook could include saved output for illustration)?

This repository is massive

Are all of the files in this repository strictly necessary for the purposes of the tutorials? I've just done a fresh clone and see the following...

The compressed repository size:

git bundle create tmp.bundle --all 
du -sh tmp.bundle 
522M    tmp.bundle

The unpacked size:

du -h --max-depth=1 
2.0G    ./04_fep
3.8M    ./01_introduction
523M    ./.git
27M     ./03_steered_md
372K    ./LIVECOMS
141M    ./02_funnel_metad
3.2G    .

Files for the FEP section are 2GB in size! Do we really need absolutely everything here? If this repository is intended to be persistent for the purposes of a living journal then it will quickly become unwieldy if we make edits or add additional files. This will also cause issues for the purposes of hosting these tutorials within a minimal Docker image for hosting on a cloud service.
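
To see which individual files account for that size, the working tree can be walked and ranked by file size. This is a generic sketch using only the Python standard library; the `skip_dirs` default and the `top` cut-off are arbitrary choices, and nothing here is specific to this repository.

```python
import os


def largest_files(root, top=10, skip_dirs=(".git",)):
    """Return the `top` largest files under `root` as (size_bytes, path) pairs."""
    sizes = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories we do not want to descend into (e.g. .git).
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks, which may be broken
            sizes.append((os.path.getsize(path), path))
    sizes.sort(reverse=True)
    return sizes[:top]


if __name__ == "__main__":
    for size, path in largest_files("."):
        print(f"{size / 1e6:10.1f} MB  {path}")
```

Run from the repository root, this lists the ten largest tracked or untracked files in the checkout. It does not look at history, so large files that were deleted but still live in `.git` will not show up.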

Will there be any tutorials for different FEP protocols such as non-equilibrium FEP or Hamiltonian Replica Exchange?

Hi,

I just discovered this repo and look forward to having a play around with BioSimSpace. I was hoping to conduct some non-eq FEP calculations such as those detailed in this paper. I wonder whether this will be possible using your API. It's my understanding that during the production stage, there needs to be functionality to change lambda values over the course of the short transition simulations between state A and state B. And then during the analysis stage implementations of the Crooks fluctuation and Jarzynski theorems are required.

I'd also be interested in eventually attempting to implement Hamiltonian Replica Exchange (HREX) as a method for eq-FEP.

Are there any considerations for making these non-canonical FEP protocols easily implemented in BioSimSpace? Or would anyone have any pointers for how to go about implementing such protocols with existing architecture?

Thanks,
Noah
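
For the analysis stage mentioned above, the Jarzynski equality relates the free energy difference to the work done in the forward transitions: ΔF = -kT ln⟨exp(-W/kT)⟩. The sketch below is a standalone illustration of that formula in plain Python. It is not part of the BioSimSpace API, the function name is hypothetical, and it assumes work values in kcal/mol. A Crooks-based estimator, which also needs the reverse transitions, is not shown.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def jarzynski(work_values, temperature=298.15):
    """Jarzynski estimate of dF (kcal/mol) from forward work values (kcal/mol)."""
    if not work_values:
        raise ValueError("need at least one work value")
    kt = K_B * temperature
    # Compute log(mean(exp(-W/kT))) with the maximum factored out,
    # so that large work values do not underflow.
    exponents = [-w / kt for w in work_values]
    shift = max(exponents)
    total = sum(math.exp(e - shift) for e in exponents)
    log_mean = shift + math.log(total / len(exponents))
    return -kt * log_mean
```

The exponential average is dominated by the lowest work values, so the estimate never exceeds the mean work and converges slowly when the work distribution is broad.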

ABFE

I have performed a funnel metadynamics simulation using BioSimSpace, and I am a little confused by the ABFE calculation.
The tutorial notebook "https://github.com/michellab/BioSimSpaceTutorials/blob/main/02_funnel_metad/03_bss-fun-metad-analysis.ipynb"

says:

the last 0.5 nm of the projection CV are between indices -75 to -25

free_nrg_floats = [ i.kj_per_mol().magnitude() for i in free_nrg0[1][-75:-25] ]

I don't understand this line. What are the indices, and how do we determine the index range from our own data?
I am looking forward to hearing from you.
Thanks in advance
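
Negative indices in a Python slice count from the end of the list, so `[-75:-25]` selects 50 grid points that stop 25 points before the last one. If those 50 points span 0.5 nm, as the quoted text says, the grid spacing would be 0.01 nm; that spacing is an inference, not something stated in the question. The sketch below shows how such a range could be found from a grid of projection values. The `proj` grid and the window bounds are made-up numbers chosen to reproduce the same slice.

```python
def window_indices(cv_values, lower, upper):
    """Return (start, stop) so that cv_values[start:stop] lies in [lower, upper).

    Assumes cv_values is sorted in ascending order.
    """
    selected = [i for i, value in enumerate(cv_values) if lower <= value < upper]
    if not selected:
        raise ValueError("no grid points fall inside the requested window")
    return selected[0], selected[-1] + 1


# Hypothetical projection grid: 0.00 to 3.99 nm in 0.01 nm steps.
proj = [round(0.01 * i, 2) for i in range(400)]

start, stop = window_indices(proj, 3.25, 3.75)
# Express the same window relative to the end of the list.
print(start - len(proj), stop - len(proj))  # prints: -75 -25
```

For your own data, replace `proj` with the projection values that accompany the free energies and pick the window over which the profile is flat.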

Funnel Metadynamics tutorial error

This looks like a great repository however I cannot get the funnel metadynamics to run.

I am using version BioSimSpace 2022.2.1, AmberTools20 and Gromacs 2021.3

I get the following error when running the steps in the tutorial, both with the PDB file from the repo and with the file generated by the cells above. The generated PDB file does indeed contain no box information.
I also tried using the solvated system set up above, but this produces a different error (AttributeError: 'Length' object has no attribute 'magnitude').

---------------------------------------------------------------------------
IncompatibleError                         Traceback (most recent call last)
Input In [12], in <cell line: 5>()
      2 system = BSS.IO.readMolecules(["input_files/solvated.pdb"])
      4 # Create the funnel parameters.
----> 5 p0, p1 = BSS.Metadynamics.CollectiveVariable.makeFunnel(system)
      7 # Define the collective variable.
      8 funnel_cv = BSS.Metadynamics.CollectiveVariable.Funnel(p0, p1)

File ~/miniconda3/envs/biosimspace/lib/python3.9/site-packages/BioSimSpace/Metadynamics/CollectiveVariable/_funnel.py:751, in makeFunnel(system, protein, ligand, alpha_carbon_name, property_map)
    749 space_prop = property_map.get("space", "space")
    750 if space_prop not in system._sire_object.propertyKeys():
--> 751     raise _IncompatibleError("The system contains no simulation box property!")
    753 # Store the space.
    754 space = system._sire_object.property("space")

IncompatibleError: The system contains no simulation box property!
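
In the PDB format, periodic box (unit cell) information is stored in a `CRYST1` record. A quick way to confirm whether a given file carries one is to scan for that record. The helper below is a hypothetical, standalone check that does not use BioSimSpace. It only inspects the file, and it does not verify whether a particular BioSimSpace version turns that record into the `space` property that `makeFunnel` looks for.

```python
def pdb_has_box(path):
    """Return True if the PDB file contains a CRYST1 (unit cell) record."""
    with open(path) as handle:
        return any(line.startswith("CRYST1") for line in handle)
```

For example, `pdb_has_box("input_files/solvated.pdb")` would report whether the file used in the traceback above has any box record at all.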

Hysteresis and deviation of TYK2 example

Dear BioSimSpace developers:
Thanks for the detailed tutorials provided here; they offer a good starting point for beginners like me. Following the tutorial, I've run several free energy perturbation calculations, but I frequently encounter two problems:

  1. Hysteresis between 'growing' and 'shrinking' perturbations is large.
  2. Deviation between different runs is large.

I used to believe this was because I did not follow the tutorial precisely (e.g. I used Gromacs to equilibrate the system). But after analyzing the FEP output in this repo, I found that the same problems exist here.

For example, the average hysteresis in somd_run_1_i.csv is 1.23 kcal/mol. For reference, the average hysteresis of TYK2 in this paper is 0.35 kcal/mol (calculated from the fep_results_freenrgworkflow_cresset_valid.csv in the supplementary material).

For deviation between different runs, the bar plot shows that some of the perturbations exceed 1 kcal/mol:
image

Are there some good practices we could follow to reduce the hysteresis and deviation? Thanks! 🤗
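
For readers who want to reproduce this kind of number, hysteresis for a ligand pair is commonly taken as |ΔΔG(A→B) + ΔΔG(B→A)|, which is zero when the two directions agree perfectly. The sketch below assumes that definition and a plain dictionary of results. The column layout of `somd_run_1_i.csv` is not shown here, so reading the file is left out, and the ligand names and values in the example are made up.

```python
def hysteresis(ddg):
    """Map each ligand pair to |ddG(A->B) + ddG(B->A)|.

    `ddg` maps (ligand_a, ligand_b) tuples to relative free energies.
    Pairs that were only run in one direction are skipped.
    """
    result = {}
    for (a, b), forward in ddg.items():
        # Count each pair once, keyed by the direction seen first.
        if (b, a) in ddg and (b, a) not in result:
            result[(a, b)] = abs(forward + ddg[(b, a)])
    return result


def mean_hysteresis(ddg):
    """Average hysteresis over all pairs run in both directions."""
    values = list(hysteresis(ddg).values())
    if not values:
        raise ValueError("no perturbation was run in both directions")
    return sum(values) / len(values)


# Made-up example values in kcal/mol.
example = {("A", "B"): 1.0, ("B", "A"): -0.5, ("A", "C"): 2.0}
print(hysteresis(example))
```

Applying the same function to each repeat run separately also makes it easy to compare hysteresis with the run-to-run spread.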

FEP tutorial for protein-protein binding

Hi guys, thanks for your series of FEP tutorials. BioSimSpace is a powerful tool for assessing protein-small molecule ligand binding affinities.
Recently, I've been trying to model protein-protein binding with free energy calculations in order to optimize antibody potency. After some searching, I found no information on that topic in BioSimSpace's materials. I would like to ask:

  1. Would that be feasible for the current version of BioSimSpace?
  2. If it's not feasible yet, would that capability be added in the future?

Thanks~

General to-dos and questions for finishing the LiveCoMS paper

Posting already, but will edit to fill in more details in a bit.

Text:

General readme:

  • Make sure it contains all essential information including installation etc. It seems outdated right now.

Change log:

  • Do we want to maintain one big changelog or individual ones for each tutorial section?

Section checks:

  • intro
  • ...

Figures:

  • Where are the figures being kept for the tutorial paper? Can the files that made figures be added somewhere to the repo?

Tutorial checks:

  • Tutorial 01: @lohedges

    • Tested that all scripts run
    • uniformised feel of headers of all notebooks
    • clear instruction on how to execute the tutorial in terms of order etc.
    • Checked spelling
  • Tutorial 02:

    • Tested that all scripts run
    • uniformised feel of headers of all notebooks
    • clear instruction on how to execute the tutorial in terms of order etc.
    • Checked spelling
  • Tutorial 03: @AdeleHardie

    • Tested that all scripts run
    • uniformised feel of headers of all notebooks
    • clear instruction on how to execute the tutorial in terms of order etc.
    • Checked spelling
  • Tutorial 04: @ppxasjsm, @annamherz @fjclark

    • Tested that all scripts run
    • uniformised feel of headers of all notebooks
    • clear instruction on how to execute the tutorial in terms of order etc.
    • Checked spelling
