
esnuel's Introduction


ESNUEL (EStimating NUcleophilicity and ELectrophilicity) is a fully automated, quantum chemistry (QM)-based workflow. It identifies nucleophilic and electrophilic sites and computes methyl cation affinities (MCAs) and methyl anion affinities (MAAs) to estimate nucleophilicity and electrophilicity, respectively.

TRY ESNUEL: https://www.esnuel.org

Installation

For the installation, we recommend using conda to get all the necessary dependencies:

conda env create -f environment.yml && conda activate esnuel

Then download the binaries of xtb version 6.5.1:

mkdir dep; cd dep; wget https://github.com/grimme-lab/xtb/releases/download/v6.5.1/xtb-6.5.1-linux-x86_64.tar.xz; tar -xvf ./xtb-6.5.1-linux-x86_64.tar.xz; cd ..

Furthermore, ORCA version 5.0.1 must be installed following the instructions found here: https://sites.google.com/site/orcainputlibrary/setting-up-orca

Note:

  1. The path to ORCA must be modified in "src/esnuel/run_orca.py".
  2. The number of available CPUs and memory must be modified to match your hardware.

Usage

An example of usage via CLI command:

# Create predictions for a test molecule (note: only names without "_" are allowed):
python src/esnuel/calculator.py --smiles 'Cn1c(C(C)(C)N)nc(C(=O)NCc2ccc(F)cc2)c(O)c1=O' --name 'testmol' &

The calculations are now saved in a "./calculations" folder along with a graphical output of the results (in .html format). The graphical output highlights the most electrophilic and nucleophilic sites, together with any sites within 3 kcal/mol (≈ 12.6 kJ/mol) of them.
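The 3 kcal/mol window can be illustrated with a short sketch. It assumes that affinities are given in kJ/mol and that a larger MCA or MAA indicates a more reactive site; the function name and the sample values are made up for illustration and are not taken from ESNUEL's code.

```python
KCAL_TO_KJ = 4.184  # 1 kcal/mol = 4.184 kJ/mol (thermochemical calorie)


def sites_within_window(affinities_kj, window_kcal=3.0):
    """Return the atom indices whose affinity lies within `window_kcal`
    of the highest affinity.

    Assumption: a larger affinity (in kJ/mol) means a more reactive site.
    """
    window_kj = window_kcal * KCAL_TO_KJ  # 3 kcal/mol -> 12.552 kJ/mol
    best = max(affinities_kj.values())
    return sorted(i for i, v in affinities_kj.items() if best - v <= window_kj)


# Hypothetical affinities (kJ/mol) keyed by atom index.
example = {1: 300.0, 4: 292.0, 7: 250.0}
print(sites_within_window(example))  # [1, 4]
```

Atom 4 is kept because it lies 8 kJ/mol below the best site, which is inside the 12.552 kJ/mol window; atom 7 is 50 kJ/mol lower and is dropped.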

An example of using ESNUEL in batch mode:

# Create predictions for a small dataset (example/testmols.csv):
python src/esnuel/calculator.py -b example/testmols.csv -n 'testmols'

The calculations are now saved in a "./calculations" folder, and a dataframe containing the results is found in "submitit_results/testmols/*_result.pkl".
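One way to collect the batch results is sketched below. It assumes each "*_result.pkl" file is an ordinary pickle (unpickling a dataframe requires pandas to be installed in the active environment); the helper name `load_results` is hypothetical and not part of ESNUEL.

```python
import glob
import os
import pickle


def load_results(folder):
    """Load every '*_result.pkl' file in `folder`, sorted by file name.

    Only unpickle files that you produced yourself: pickle can execute
    arbitrary code when loading untrusted data.
    """
    results = []
    for path in sorted(glob.glob(os.path.join(folder, "*_result.pkl"))):
        with open(path, "rb") as handle:
            results.append(pickle.load(handle))
    return results
```

Calling `load_results` with the results folder of a batch run returns one object per result file; if these are pandas dataframes, `pandas.concat` can combine them into a single table.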

The SLURM commands can be modified via the following command line arguments:

  • '--partition': The SLURM partition that you submit to, default='kemi1'.
  • '--parallel_calcs': The number of parallel molecule calculations (the total number of CPU cores requested for each SLURM job = parallel_calcs*cpus_per_calc), default=2.
  • '--cpus_per_calc': The number of CPU cores per molecule calculation (the total number of CPU cores requested for each SLURM job = parallel_calcs*cpus_per_calc), default=4.
  • '--mem_gb': The total memory usage in gigabytes (gb) requested for each SLURM job, default=20.
  • '--timeout_min': The total allowed duration in minutes (min) of each SLURM job, default=6000.
  • '--slurm_array_parallelism': The maximum number of parallel SLURM jobs to run simultaneously (taking one molecule at a time in batch mode), default=25.
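The relationship between these arguments and the resources requested per SLURM job can be sketched as follows. The defaults are the ones listed above; the function and the dictionary keys are illustrative and do not come from ESNUEL's source.

```python
def slurm_request(parallel_calcs=2, cpus_per_calc=4, mem_gb=20,
                  timeout_min=6000, slurm_array_parallelism=25,
                  partition="kemi1"):
    """Summarize the resources implied by the command line arguments."""
    cores_per_job = parallel_calcs * cpus_per_calc
    return {
        "partition": partition,
        "cores_per_job": cores_per_job,
        "mem_gb_per_job": mem_gb,
        "timeout_hours": timeout_min / 60,
        # Upper bound on cores in use when every array slot is running.
        "max_cores_in_use": cores_per_job * slurm_array_parallelism,
    }


print(slurm_request())
```

With the defaults, each job requests 8 cores and 20 GB for up to 100 hours, and at most 25 jobs (200 cores) run at once.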

For the QM calculations, the molecular charge is defined by the formal charge of the molecule using RDKit, and the spin is hardcoded to S=0 (multiplicity=1), as we focus on closed-shell molecules. This can be modified in the "calculateEnergy" function in "src/esnuel/calculator.py".
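As a reminder of the arithmetic behind these settings, the sketch below sums per-atom formal charges to get the molecular charge and converts a total spin S into a multiplicity 2S + 1. It is an illustration only: the function names are hypothetical, and in practice RDKit's `Chem.GetFormalCharge` performs the charge sum on a molecule object.

```python
from fractions import Fraction


def molecular_charge(atom_formal_charges):
    """Total charge as the sum of the per-atom formal charges."""
    return sum(atom_formal_charges)


def multiplicity(total_spin):
    """Spin multiplicity 2S + 1 for a total spin S (0, 1/2, 1, ...)."""
    value = 2 * Fraction(total_spin) + 1
    if value.denominator != 1 or value < 1:
        raise ValueError("total spin must be a non-negative multiple of 1/2")
    return int(value)


# Hypothetical three-atom anion with one atom carrying a -1 formal charge.
print(molecular_charge([0, -1, 0]))  # -1
print(multiplicity(0))               # 1 (closed-shell singlet)
```

A radical with S = 1/2 would give a multiplicity of 2, which is the kind of value that would need to be changed by hand for open-shell species.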

Citation

Our work is published open access in Digital Discovery, where more information is available.

@article{ree2024esnuel,
  title = {Automated quantum chemistry for estimating nucleophilicity and electrophilicity with applications to retrosynthesis and covalent inhibitors},
  ISSN = {2635-098X},
  url = {http://dx.doi.org/10.1039/D3DD00224A},
  DOI = {10.1039/d3dd00224a},
  journal = {Digital Discovery},
  publisher = {Royal Society of Chemistry (RSC)},
  author = {Nicolai Ree and Andreas H. G\"{o}ller and Jan H. Jensen},
  year = {2024}
}

esnuel's People

Contributors

nicolairee

esnuel's Issues

Run only with xTB?

Is it possible to calculate the energies without access to ORCA? I see on the esnuel.org website that the energies can be calculated using only xTB. I am trying to reproduce those energies from the command line, but I'm getting nonsensical numbers (just -inf).
I followed the README and was able to get everything set up besides ORCA.

This is the output I get from running the testmol job
Electrophilic sites:
['Amide', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond']
[9, 2, 7, 8, 10, 13, 14, 15, 16, 20, 22, 23]
[-inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf]
Nucleophilic sites:
['Amide', 'Amine', 'Phenol', 'Pyridine_like_nitrogen', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'double_bond', 'atom_with_lone_pair', 'atom_with_lone_pair', 'atom_with_lone_pair', 'atom_with_lone_pair']
[10, 6, 21, 7, 2, 8, 9, 13, 14, 15, 16, 20, 22, 1, 11, 17, 23]
[-inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf, -inf]
submitit INFO (2024-02-13 22:43:29,906) - Job completed successfully

Great work btw!

Prediction Error for handling SMILES for cyclobutane stereoisomers

When inputting this SMILES string: [NH-][C@H]1CC@HC1
The webtool returns this error: Prediction ERROR! Please submit an issue on GitHub with ID: ffbade4b0a82ef44195598db220bf988

The webtool works for the neutral molecule with stereochemistry specified: N[C@H]1CC@HC1
Also works for the anionic molecule with stereochemistry specified with the other nitrogen deprotonated: N[C@H]1CC@HC1
It also works when I don't specify stereochemistry and deprotonate the desired nitrogen. However, when I click to see the 3D structure, it had arbitrarily run the calculation with the cis diastereomer (I want the trans): [NH-]C1CC(NC(OC(C)(C)C)=O)C1

Any help you can provide for fixing this error would be much appreciated! I really like the web tool, thank you!

Running ESNUEL on Google Colab

@NicolaiRee I'm not sure if I'm running into the same problem, but I got the same output.

I was trying to run the code in a Jupyter notebook on Google Colab with the following code:
(I tried both the original and the esnuel-xtb branch, and they gave the same output)
calc_MAA_and_MCA('O=C1CCCC1', 'test')

Output:
Electrophilic sites:
Electrophilic sites:
['Ketone', 'double_bond']
[1, 0]
[-inf, -inf]
Nucleophilic sites:
['Ketone']
[0]
[-inf]
([[1, 0]],
[['Ketone', 'double_bond']],
[['CC1([O-])CCCC1', 'CO[C-]1CCCC1']],
[[-inf, -inf]],
[[[[None, 'test/reac01/conf01/gfn1/test_reac01_conf01_gfn1_opt.sdf'],
[None,
'test/reac01prodE01/conf02/gfn1/test_reac01prodE01_conf02_gfn1_opt.sdf']],
[[None, 'test/reac01/conf01/gfn1/test_reac01_conf01_gfn1_opt.sdf'],
[None,
'test/reac01prodE02/conf04/gfn1/test_reac01prodE02_conf04_gfn1_opt.sdf']]]],
[[0]],
[['Ketone']],
[['CO[C+]1CCCC1']],
[[-inf]],
[[[[None, 'test/reac01/conf01/gfn1/test_reac01_conf01_gfn1_opt.sdf'],
[None,
'test/reac01prodN01/conf05/gfn1/test_reac01prodN01_conf05_gfn1_opt.sdf']]]])

I tried this molecule on the esnuel.org website, and it gave an MAA value of 287.90 kJ/mol when using only xTB.
Was I doing something wrong in my code? Please let me know if you need more information.

Thank you!

Originally posted by @tranJen in #1 (comment)
