rpreactor's Introduction

rpreactor

A command-line tool and Python package to handle biochemical reaction rules.

rpreactor is designed to use reaction rules from RetroRules, and to be at the core of more complex bioretrosynthesis tools such as RetroPathRL. It relies extensively on RDKit to handle chemicals and reactions.

Please submit your questions or any new issue you may encounter with rpreactor using GitHub's issue system.

If you are interested in rpreactor you may also be interested in:

  • Reactor: ChemAxon's "A high performance virtual synthesis engine"
  • ATLAS: "A repository of all possible biochemical reactions for synthetic biology and metabolic engineering studies" by our friends from the LCSB
  • MINE: "Metabolic In silico Network Expansion (MINE) Database Construction and DB Logic" by our friends from the Tyo Lab

Installation

Important: rpreactor works with Python >=3.6 and was tested with rdkit 2020.03*.

We strongly recommend using the conda package manager and following these steps:

# installation in a new conda environment <myenv>
conda create --name <myenv> -c conda-forge -c brsynth rdkit=2020.03 rpreactor
conda activate <myenv>
# installation in an already existing environment <myenv>
conda activate <myenv>
conda install -c conda-forge -c brsynth rdkit=2020.03 rpreactor

Usage

From command line:

conda activate <myenv>
python -m rpreactor.cli --help
python -m rpreactor.cli --with_hs true inline --inchi "InChI=1/C3H6O3/c1-2(4)3(5)6/h2,4H,1H3,(H,5,6)" --rsmarts "([#8&v2:1](-[#6&v4:2](-[#6&v4:3](-[#8&v2:4]-[#1&v1:5])=[#8&v2:6])(-[#6&v4:7](-[#1&v1:8])(-[#1&v1:9])-[#1&v1:10])-[#1&v1:11])-[#1&v1:12])>>([#15&v5](=[#8&v2])(-[#8&v2]-[#1&v1])(-[#8&v2]-[#1&v1])-[#8&v2:1]-[#6&v4:2](-[#6&v4:3](-[#8&v2:4]-[#1&v1:5])=[#8&v2:6])(-[#6&v4:7](-[#1&v1:8])(-[#1&v1:9])-[#1&v1:10])-[#1&v1:11].[#7&v3](=[#6&v4]1:[#7&v3]:[#6&v4](-[#8&v2]-[#1&v1]):[#6&v4]2:[#7&v3]:[#6&v4](-[#1&v1]):[#7&v3](-[#6&v4]3(-[#1&v1])-[#8&v2]-[#6&v4](-[#6&v4](-[#8&v2]-[#15&v5](=[#8&v2])(-[#8&v2]-[#1&v1])-[#8&v2]-[#15&v5](-[#8&v2]-[#1&v1:12])(=[#8&v2])-[#8&v2]-[#1&v1])(-[#1&v1])-[#1&v1])(-[#1&v1])-[#6&v4](-[#8&v2]-[#1&v1])(-[#1&v1])-[#6&v4]-3(-[#8&v2]-[#1&v1])-[#1&v1]):[#6&v4]:2:[#7&v3]:1-[#1&v1])-[#1&v1])"
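
Because the reaction SMARTS contains parentheses, brackets and ampersands, quoting it by hand is error-prone. The sketch below builds the same `inline` command as an argument list and prints a shell-safe version of it. The helper name `build_cli_args` is hypothetical (not part of rpreactor), and the two input strings are placeholders; only the flags shown in the command above are used.

```python
import shlex
import sys


def build_cli_args(inchi, rsmarts):
    """Hypothetical helper: argv list mirroring the 'inline' command above."""
    return [
        sys.executable, "-m", "rpreactor.cli",
        "--with_hs", "true",
        "inline",
        "--inchi", inchi,
        "--rsmarts", rsmarts,
    ]


# Placeholders: substitute the real InChI and reaction SMARTS.
args = build_cli_args("<InChI string>", "<reaction SMARTS>")

# shlex.quote protects each argument so the line can be pasted into a shell.
print(" ".join(shlex.quote(a) for a in args))

# To actually run it (requires rpreactor to be installed):
# import subprocess; subprocess.run(args, check=True)
```

Passing the list directly to `subprocess.run` avoids shell quoting altogether.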

From within a script:

import json
import rpreactor

inchi = 'InChI=1/C3H6O3/c1-2(4)3(5)6/h2,4H,1H3,(H,5,6)'
rsmarts = '([#8&v2:1](-[#6&v4:2](-[#6&v4:3](-[#8&v2:4]-[#1&v1:5])=[#8&v2:6])(-[#6&v4:7](-[#1&v1:8])(-[#1&v1:9])-[#1&v1:10])-[#1&v1:11])-[#1&v1:12])>>([#15&v5](=[#8&v2])(-[#8&v2]-[#1&v1])(-[#8&v2]-[#1&v1])-[#8&v2:1]-[#6&v4:2](-[#6&v4:3](-[#8&v2:4]-[#1&v1:5])=[#8&v2:6])(-[#6&v4:7](-[#1&v1:8])(-[#1&v1:9])-[#1&v1:10])-[#1&v1:11].[#7&v3](=[#6&v4]1:[#7&v3]:[#6&v4](-[#8&v2]-[#1&v1]):[#6&v4]2:[#7&v3]:[#6&v4](-[#1&v1]):[#7&v3](-[#6&v4]3(-[#1&v1])-[#8&v2]-[#6&v4](-[#6&v4](-[#8&v2]-[#15&v5](=[#8&v2])(-[#8&v2]-[#1&v1])-[#8&v2]-[#15&v5](-[#8&v2]-[#1&v1:12])(=[#8&v2])-[#8&v2]-[#1&v1])(-[#1&v1])-[#1&v1])(-[#1&v1])-[#6&v4](-[#8&v2]-[#1&v1])(-[#1&v1])-[#6&v4]-3(-[#8&v2]-[#1&v1])-[#1&v1]):[#6&v4]:2:[#7&v3]:1-[#1&v1])-[#1&v1])'

o = rpreactor.RuleBurner(rsmarts_list=[rsmarts], inchi_list=[inchi], with_hs=True)
o.compute()
res = json.loads('[' + ', '.join(o._json) + ']')
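
The last line above suggests that `o._json` is a list of JSON-encoded strings, one per entry. Under that assumption, each string can also be parsed on its own, which is slightly easier to read and debug. The sketch below uses made-up stand-in strings instead of a real `RuleBurner` object, and `parse_json_chunks` is a hypothetical helper name.

```python
import json


def parse_json_chunks(chunks):
    """Parse a list of JSON-encoded strings into a list of Python objects."""
    return [json.loads(chunk) for chunk in chunks]


# Stand-in for o._json (made-up content, for illustration only).
json_chunks = ['{"a": 1}', '{"b": [2, 3]}']

res = parse_json_chunks(json_chunks)

# Same result as the join-based one-liner used above.
same = json.loads('[' + ', '.join(json_chunks) + ']')
print(res == same)  # True
```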

For developers

Development installation

After a git clone:

cd <repository>
conda env create -f environment.yml -n <dev_env>
conda activate <dev_env>
conda develop -n <dev_env> .

Warning: if you do not specify an environment name with -n <dev_env>, then 'dev_rpreactor' will be used.

Test your installation with:

conda activate <dev_env>
python -m rpreactor.cli -h

To uninstall:

conda deactivate
conda env remove -n <dev_env>

Test suite

cd <repository>
python -m pytest --doctest-modules 

Build the documentation

To build the Sphinx HTML documentation locally (output: <repository>/docs/_build/html/index.html):

cd <repository>/docs
make html

Build and deployment

The process is automated with GitHub Actions.

If you want to check the build process locally:

CONDA_BLD_PATH=<repository>/conda-bld
mkdir -p ${CONDA_BLD_PATH} 
cd <repository>

conda env create -f recipe/conda_build_env.yaml -n <build_env>
conda activate <build_env>
conda build -c conda-forge --output-folder ${CONDA_BLD_PATH} recipe

conda convert --platform osx-64 --platform linux-64 --platform win-64 --output-dir ${CONDA_BLD_PATH} ${CONDA_BLD_PATH}/*/rpreactor-*

rpreactor's Issues

handle stereochemistry in reaction rules

Stereochemistry is not handled yet: requesting it (e.g. with --with_stereo true) raises an rpreactor.Utils.ChemConversionError with a full traceback.

TODO:

  • nicer "warning" message
  • add a test with stereo
  • implement the stereochemistry

Example:

python -m rpreactor.cli --with_hs true --match_timeout 1 --fire_timeout 1 --with_stereo true  inline --inchi "InChI=1S/C12H22O11/c13-1-3-5(15)6(16)9(19)12(22-3)23-10-4(2-14)21-11(20)8(18)7(10)17/h3-20H,1-2H2/t3-,4-,5+,6+,7-,8-,9-,10-,11?,12+/m1/s1" --rsmarts "([#6@&v4:1](-[#8&v2:2])(-[#1&v1:3])(-[#6&v4:4])-[#6@@&v4:5](-[#8&v2:6]-[#6@@&v4:7]1(-[#1&v1:8])-[#6@@&v4:9](-[#8&v2:10]-[#1&v1:11])(-[#1&v1:12])-[#6@&v4:13](-[#8&v2:14])(-[#1&v1:15])-[#6&v4:16]-[#8&v2:17]-[#6@@&v4:18]-1(-[#6&v4:19](-[#8&v2:20])(-[#1&v1:21])-[#1&v1:22])-[#1&v1:23])(-[#1&v1:24])-[#8&v2:25]-[#6&v4:26])>>([#6@&v4:1](-[#8&v2:2])(-[#1&v1:3])(-[#6&v4:4])-[#6@@&v4:5](-[#8&v2:6]-[#1&v1])(-[#1&v1:24])-[#8&v2:25]-[#6&v4:26].[#6@@&v4:13]1(-[#8&v2:14])(-[#1&v1:15])-[#6@&v4:9](-[#8&v2:10]-[#1&v1:11])(-[#1&v1:12])-[#6@@&v4:7](-[#8&v2]-[#1&v1])(-[#1&v1:8])-[#6@&v4:18](-[#6&v4:19](-[#8&v2:20])(-[#1&v1:21])-[#1&v1:22])(-[#1&v1:23])-[#8&v2:17]-[#6&v4:16]-1)" 
09/09/2020 15:28:33 -- CRITICAL -- Stereo is not handled at the time being.
Traceback (most recent call last):
  File "/workdir/rpreactor/cli.py", line 251, in compute
    rd_mol = standardize_chemical(rd_mol, with_hs=self._with_hs, with_stereo=self._with_stereo, heavy=True)
  File "/workdir/rpreactor/Utils.py", line 77, in standardize_chemical
    raise ChemConversionError("Stereo is not handled at the time being.")
rpreactor.Utils.ChemConversionError: CHEM-CONVERSION-ERROR: Stereo is not handled at the time being.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/dev_rpreactor/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/envs/dev_rpreactor/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/workdir/rpreactor/cli.py", line 448, in <module>
    __cli()
  File "/workdir/rpreactor/cli.py", line 444, in __cli
    args.func(args)
  File "/workdir/rpreactor/cli.py", line 352, in inline_mode
    r.compute()
  File "/workdir/rpreactor/cli.py", line 253, in compute
    raise ChemConversionError(e) from e
rpreactor.Utils.ChemConversionError: CHEM-CONVERSION-ERROR: CHEM-CONVERSION-ERROR: Stereo is not handled at the time being.
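
Until stereochemistry is implemented, a caller could check the input upfront and emit a short warning instead of hitting the traceback above. The sketch below is a heuristic, not rpreactor code: it looks for the InChI stereo layers (`/b`, `/t`, `/m`, `/s`) in the input string. The names `inchi_has_stereo` and `warn_if_stereo` are hypothetical.

```python
import logging

# InChI layers that carry stereo information:
# double-bond (b), tetrahedral (t), and their companions (m, s).
STEREO_LAYER_PREFIXES = ("b", "t", "m", "s")


def inchi_has_stereo(inchi):
    """Heuristic: True if the InChI string contains a stereo layer."""
    layers = inchi.split("/")[2:]  # skip the version prefix and the formula
    return any(layer.startswith(STEREO_LAYER_PREFIXES) for layer in layers)


def warn_if_stereo(inchi):
    """Log a one-line warning instead of raising; returns True if stereo was found."""
    if inchi_has_stereo(inchi):
        logging.warning("Stereo is not handled at the time being: %s", inchi)
        return True
    return False


lactate = "InChI=1/C3H6O3/c1-2(4)3(5)6/h2,4H,1H3,(H,5,6)"
print(inchi_has_stereo(lactate))  # False
```

The same check does not cover stereo encoded in the rule SMARTS itself (the `@` and `@@` marks), which would need separate handling.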

Timeout issue

From the README:

Buggy tales
20190110.01 -- Timeout issue

  • Checking results: the timeout is never reached. Maybe fine, but strange.
  • Checking results: many error messages mention an unknown "pool" variable (e.g. "match_error": "name 'pool' is not defined")
  • Bug: a typo led to the Pool of processes being handled incorrectly
  • Fix: commit ca9ff17 @ rule_fire
  • Implication: impacted jobs must be rerun (DONE, all jobs have been rerun)

@tduigou, it's not clear to me whether this has been fixed and the text above is merely a backlog entry; should we close the issue?

Empty InChI for products

From the README:

Buggy tales
20190110.02 -- Empty InChI for products

  • Checking results: some InChI strings are empty, e.g. rule (radius 0) 8a4666fc9dab applied on MNXM366
    • 8a4666fc9dab: ([#6&v4:1]1:[#6&v4:2]:[#6&v4:3]:[#6&v4:4]:[#7&v3:5]:[#6&v4:6]:1)>>([#6&v4:1](-[#6&v4:2]-[#6&v4:3]:[#6&v4:4]-[#7&v3:5]-[#1&v1])(-[#6&v4:6](-[#7&v3](-[#1&v1])-[#1&v1])-[#1&v1])-[#1&v1])
    • MNXM366: [H][O][c]1[c]([C]([H])([H])[H])[n][c]([H])[c]([C]([H])([H])[O][P](=[O])([O][H])[O][H])[c]1[C]([H])([H])[N]([H])[H]
  • Reason: sanitization issues of products (see below) leading to empty InChI.
[16:06:30] non-ring atom 2 marked aromatic
10/01/2019 16:06:30 -- WARNING -- Partial sanization only
[16:06:30] ERROR: Unrecognized bond type: 0
[16:06:30] non-ring atom 2 marked aromatic
10/01/2019 16:06:30 -- WARNING -- Partial sanization only
[16:06:30] ERROR: Unrecognized bond type: 0
  • Reason: probably due to rule mis-encoding. Below are the rules known to trigger this bug (checked for radius 1, 3, 5, 7, 9, 10):
Count Rule_hash
15 24f4a1100077
57 300bddc687b7
15 88eee4c1d443
57 e9f88e91236e
  • Consequences: probable wrong transformations.
  • Workaround: reject all results from a substrate-rule pair as soon as an "empty" InChI is produced
  • Fix: commit e980692 @ rule_fire
  • Implication: handle this situation in already generated results, or rerun the code to produce new results (DONE: new results generated)
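
The workaround above can be sketched as a small filter. This is an illustration only, not the code from commit e980692: it assumes the products of one substrate-rule pair are available as a list of InChI strings, and `reject_if_any_empty` is a hypothetical helper name.

```python
def reject_if_any_empty(product_inchis):
    """Return the products unchanged, or [] if any InChI is empty or missing."""
    if any(not inchi for inchi in product_inchis):
        return []  # one bad product invalidates the whole substrate-rule pair
    return list(product_inchis)


# Water's InChI next to an empty one: everything is rejected.
print(reject_if_any_empty(["InChI=1S/H2O/h1H2", ""]))  # []
# All products valid: they are kept as-is.
print(reject_if_any_empty(["InChI=1S/H2O/h1H2"]))      # ['InChI=1S/H2O/h1H2']
```

Rejecting the whole pair rather than just the empty product is deliberate: an empty InChI signals a sanitization failure, so the remaining products of that transformation are probably wrong too.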

Cannot convert back InChI to RDKit mol

From the README:

Buggy tales
20190110.03 -- Cannot convert back InChI to RDKit mol

  • Rule (radius 0) 3012ddcc4356 applied on MNXM2183
    • 3012ddcc4356: ([#6&v4:1](-[#7&v3:2]:[#6&v4:3](:[#7&v3:4]):[#6&v4:5])-[#8&v2:6])>>([#6&v4:1]=[#8&v2].[#7&v3:2]:[#6&v4:3](:[#7&v3:4]-[#1&v1])-[#6&v4:5].[#8&v2:6]-[#1&v1])
    • MNXM2183: [H][N]=[c]1[n][c]([O][H])[c]2[n][c]([H])[n]([C]3([H])[O][C]([H])([C]([H])([H])[O][H])[C]([H])([O][P](=[O])([O][H])[O][H])[C]3([H])[O][H])[c]2[n]1[H]
  • Reason: issue with valencies, aromatic cycle and nitrogen.
  • Fix: none at the moment, but should probably be handled at the result parsing step (i.e. after rule firing)

The unkillable Pool

From the README:

Buggy tales
20190112.01 -- The unkillable Pool

  • Execution sometimes hangs when renewing the Pool (done after every timeout)
  • Cause: the lock of the timed-out job cannot be acquired (deadlock)
  • The issue does not happen every time
  • 100% chance of a hang when the timeout == 0. Hypothesis: the job was not fully initialized and cannot acquire a lock that has not been set up
  • But a hang can also happen under more "reasonable" timeout conditions (e.g. 5 seconds). Hypothesis: the job was killed while it was actually ending, and something went wrong.
  • Straightforward fix for all cases: force the release of the job lock each time. Commit 52466a5 @ rule_file
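
The actual fix lives in the commit cited above and is not reproduced here. As a generic illustration of "force the release each time", the sketch below releases a lock unconditionally and tolerates the case where it was not held. It uses `threading.Lock` as a stand-in for the job lock, and `force_release` is a hypothetical helper name.

```python
import threading


def force_release(lock):
    """Release the lock whether or not it is held; report what happened."""
    try:
        lock.release()
    except RuntimeError:
        # threading.Lock raises RuntimeError when released while unlocked.
        return False
    return True


lock = threading.Lock()
lock.acquire()
print(force_release(lock))  # True: the lock was held and is now free
print(force_release(lock))  # False: already free, no exception propagates
```

After such a call the lock is guaranteed to be free, so the next acquisition cannot deadlock on a stale holder.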

stereochemistry assignment from MolToSmiles(sanitize=True) is not the same as MolToSmiles + SanitizeMol

From the README.

TD201904.01 -- stereochemistry assignment from MolToSmiles(sanitize=True) is not the same as MolToSmiles + SanitizeMol

Source (April 2019): rdkit/rdkit#2361

In the second case the stereo assignment is not made:

In [44]: m = AllChem.MolFromSmiles('[O-][n+]1onc2cc(/C=C/c3ccc(Cl)cc3)ccc21', sanitize=True)

In [45]: AllChem.MolToInchiKey(m)
Out[45]: 'AALOGNDNCMFSSI-OWOJBTEDSA-N'

In [46]: m = AllChem.MolFromSmiles('[O-][n+]1onc2cc(/C=C/c3ccc(Cl)cc3)ccc21', sanitize=False)

In [47]: AllChem.SanitizeMol(m, Chem.rdmolops.SanitizeFlags.SANITIZE_ALL)
Out[47]: rdkit.Chem.rdmolops.SanitizeFlags.SANITIZE_NONE

In [48]: AllChem.MolToInchiKey(m)
Out[48]: 'AALOGNDNCMFSSI-UHFFFAOYSA-N'

When calling SanitizeMol after MolFromSmiles, you can
force RDKit to calculate the correct InChI key by calling
AllChem.AssignStereochemistry(m, cleanIt=True, force=True)
before calculating the key.
