molecule_generator's Introduction

Conditional Molecule Generator

This repository contains the source code and data sets for the graph-based molecule generator discussed in the article "Multi-Objective De Novo Drug Design with Conditional Graph Generative Model" (https://arxiv.org/abs/1801.07299).

Briefly speaking, we used conditional graph convolution to structure the generative model. The properties of output molecules can then be controlled using the conditional code.

Requirements

This repo is built using Python 2.7, and utilizes the following packages:

  • MXNet == 1.3.1
  • RDKit == 2018.03.3
  • Numpy
  • Scikit-learn (for the predictive model)

To ease installation, please use the environment defined in the Dockerfile.

Quick start

Project structure

  • train.py: main training script.
  • mx_mg: package for the molecule generative model:
    • data: packages for data processing workflows:
      • conditionals.py: callables used to generate the conditional codes for molecules
      • data_struct.py: defines atom types and bond types
      • dataloaders.py, datasets.py and samplers.py: data loading logic
      • utils.py: utility functions
    • models: library for graph generative models
      • modules.py: defines modules (or blocks) such as graph convolution
      • networks.py: defines networks (MolMP, MolRNN and CMolRNN)
      • functions.py: autograd.Function objects and operations
    • builders.py: utilities for building molecules using generative models
  • rdkit_contrib: functions used to calculate QED and SAscore (for older versions of RDKit)
  • example.ipynb: tutorial

Usage

To train the model, first unpack datasets.tar.gz (download here) to the current directory, and call:

./train.py {molmp|molrnn|scaffold|prop|kinase} path/to/output

Here {molmp|molrnn|scaffold|prop|kinase} selects the model type, and path/to/output is the directory where the model's checkpoint file and log files are saved. The following call:

./train.py {molmp|molrnn|scaffold|prop|kinase} -h

gives help for each model type.
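
The command layout above (a model type followed by an output directory, with separate help per model type) matches what argparse subcommands provide. The following is only a minimal sketch of that layout, not the actual argument handling in train.py; the names `MODEL_TYPES`, `build_parser`, `model_type` and `output_dir` are hypothetical, and no per-model options are included because the README does not list them.

```python
import argparse

# Model types as listed in the usage line of the README.
MODEL_TYPES = ("molmp", "molrnn", "scaffold", "prop", "kinase")


def build_parser():
    """Build a parser with one subcommand per model type (illustrative only)."""
    parser = argparse.ArgumentParser(prog="train.py")
    subparsers = parser.add_subparsers(dest="model_type")
    subparsers.required = True  # a model type must always be given
    for name in MODEL_TYPES:
        sub = subparsers.add_parser(name, help="train the %s model" % name)
        # Positional argument: where checkpoint and log files are written.
        sub.add_argument("output_dir", help="directory for checkpoint and log files")
    return parser


# Equivalent to: ./train.py scaffold path/to/output
args = build_parser().parse_args(["scaffold", "path/to/output"])
print(args.model_type, args.output_dir)
```

With this structure, passing `-h` after a model type prints the help for that subcommand only.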

For any questions | problems | criticisms | ...

Please contact me. Email: [email protected] or [email protected]

molecule_generator's Issues

There are some RDKit errors when I run the MXNet version of the code with pretrained models

When I run the example code as follows:

# loading models
mol_mp = builders.Vanilla_Builder('ckpt/molmp_0/', gpu_id=0)
res = mol_mp.sample(100)

I got the following errors:

[09:24:17] Can't kekulize mol.  Unkekulized atoms: 1 2 3 5 7 8 9 11 12 13 17

RDKit ERROR: [09:24:17] Can't kekulize mol.  Unkekulized atoms: 1 2 3 5 7 8 9 11 12 13 17
RDKit ERROR: 
[09:24:17] Can't kekulize mol.  Unkekulized atoms: 3 5 7

RDKit ERROR: [09:24:17] Can't kekulize mol.  Unkekulized atoms: 3 5 7
RDKit ERROR: 
[09:24:22] Can't kekulize mol.  Unkekulized atoms: 3 4 5 6 8 9 10 11 12 13 27

RDKit ERROR: [09:24:22] Can't kekulize mol.  Unkekulized atoms: 3 4 5 6 8 9 10 11 12 13 27
RDKit ERROR: 
[09:24:22] Explicit valence for atom # 38 N, 4, is greater than permitted
RDKit ERROR: [09:24:22] Explicit valence for atom # 38 N, 4, is greater than permitted
98

Is this a normal phenomenon? Many thanks.

Trying to run examples.ipynb on Colab - serializing/multiprocessing issues

Hi! Thanks for providing the source code. I've been trying to get the MXNet implementation to work on Google Colab from an M1 MacBook Pro, since there were quite a few issues with the MXNet + CUDA integration locally. I've made the necessary changes and adapted the imports in the files, but when I try to run the training code, it gives the following error in Colab:

!python3 train.py scaffold /content/molecule_generator

Traceback (most recent call last):
  File "/content/molecule_generator/train.py", line 535, in <module>
    _engine(**params)
  File "/content/molecule_generator/train.py", line 146, in _engine
    inputs = [next(it_train) for _ in range(len(gpu_ids))]
  File "/content/molecule_generator/train.py", line 146, in <listcomp>
    inputs = [next(it_train) for _ in range(len(gpu_ids))]
  File "/usr/local/lib/python3.10/dist-packages/mxnet/gluon/data/dataloader.py", line 689, in __iter__
    for item in t:
  File "/usr/local/lib/python3.10/dist-packages/mxnet/gluon/data/dataloader.py", line 484, in __next__
    batch = pickle.loads(ret.get(self._timeout))
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 540, in _handle_tasks
    put(task)
  File "/usr/lib/python3.10/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/usr/lib/python3.10/multiprocessing/pool.py", line 643, in __reduce__
    raise NotImplementedError(
NotImplementedError: pool objects cannot be passed between processes or pickled

I was wondering if anyone else has had a similar error, and whether someone has managed to fix it.
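
The final frames of the traceback show that a multiprocessing pool object ended up inside something that was being pickled for a worker process. The snippet below is a minimal standard-library sketch that reproduces only that error message; it does not use MXNet and does not diagnose where train.py or the data loader picks up the pool. The names `pickle_status`, `state_with_pool` and `state_plain` are hypothetical.

```python
import pickle
from multiprocessing.pool import ThreadPool


def pickle_status(obj):
    """Return "ok" if obj can be pickled, otherwise the error message."""
    try:
        pickle.dumps(obj)
        return "ok"
    except NotImplementedError as exc:
        return str(exc)


# ThreadPool inherits Pool's pickling behaviour and avoids spawning processes.
with ThreadPool(1) as pool:
    # Hypothetical loader state that accidentally carries the pool along.
    state_with_pool = {"batch_size": 32, "pool": pool}
    # The same state reduced to plain data.
    state_plain = {k: v for k, v in state_with_pool.items() if k != "pool"}
    status_with_pool = pickle_status(state_with_pool)
    status_plain = pickle_status(state_plain)

print(status_with_pool)
print(status_plain)
```

Anything sent to a worker has to consist of plain, picklable data. Gluon's DataLoader also accepts a `num_workers` argument, and `num_workers=0` loads batches in the main process without a worker pool; whether that is an acceptable workaround for this repository has not been verified here.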

Is [#6,#8,#16][#6](=O)O[#6] an alert?

This SMARTS pattern is in the structural alert list, which means that all esters are treated as unwanted structures. Is that intended, or should the '#6' be removed from the first square bracket? As written, the pattern flags every ester, not only compounds such as carbonates. Thanks!

running errors

I ran example.ipynb, but received this error message:
No module named 'utils'

I found the file utils.py in the /data/ folder. Can you tell me why this happens?
I am using Python 3.5.

Thanks
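
One plausible cause, given that the README states the repository is built with Python 2.7: Python 2 allows a module inside a package to import a sibling module with a bare `import name` (an implicit relative import), while Python 3 requires the explicit form `from . import name`. The sketch below demonstrates that difference with throwaway packages written to a temporary directory; it does not touch the repository's files, and all package and module names in it (`pkg_implicit_demo`, `pkg_explicit_demo`, `mg_utils_demo`, `loader`) are hypothetical.

```python
import importlib
import os
import sys
import tempfile


def write_package(root, pkg_name, import_line):
    """Create pkg_name/ with a helper module and a module that imports it."""
    pkg_dir = os.path.join(root, pkg_name)
    os.makedirs(pkg_dir)
    files = {
        "__init__.py": "",
        "mg_utils_demo.py": "ANSWER = 42\n",
        "loader.py": import_line + "\nVALUE = mg_utils_demo.ANSWER\n",
    }
    for name, text in files.items():
        with open(os.path.join(pkg_dir, name), "w") as handle:
            handle.write(text)


def import_value(module_name):
    """Return loader.VALUE, or the exception class name if the import fails."""
    try:
        return importlib.import_module(module_name).VALUE
    except ImportError as exc:
        return type(exc).__name__


root = tempfile.mkdtemp()
sys.path.insert(0, root)
# Python 2 style: sibling module imported by its bare name.
write_package(root, "pkg_implicit_demo", "import mg_utils_demo")
# Python 3 style: explicit relative import.
write_package(root, "pkg_explicit_demo", "from . import mg_utils_demo")
importlib.invalidate_caches()

implicit_result = import_value("pkg_implicit_demo.loader")
explicit_result = import_value("pkg_explicit_demo.loader")
print(implicit_result)
print(explicit_result)
```

Under Python 3 the bare import fails with a "No module named ..." error even though the file sits next to the importing module, which is the same symptom as reported above.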

Error while running the Python3 version of this code

I converted the entire code to Python 3.7 using the 2to3 tool that ships with Python. While running the converted code, I got the following error:
AttributeError: Can't pickle local object '_engine.<locals>.<lambda>'

Complete error is as follows:

$ ./train.py molmp logs
Traceback (most recent call last):
  File "./train.py", line 551, in <module>
    _engine(**params)
  File "./train.py", line 133, in _engine
    inputs = [next(it_train) for _ in range(len(gpu_ids))]
  File "./train.py", line 133, in <listcomp>
    inputs = [next(it_train) for _ in range(len(gpu_ids))]
  File "/home/iit/anaconda3/envs/kevinid_experiments/lib/python3.7/site-packages/mxnet/gluon/data/dataloader.py", line 505, in __next__
    batch = pickle.loads(ret.get(self._timeout))
  File "/home/iit/anaconda3/envs/kevinid_experiments/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/home/iit/anaconda3/envs/kevinid_experiments/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
    put(task)
  File "/home/iit/anaconda3/envs/kevinid_experiments/lib/python3.7/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/iit/anaconda3/envs/kevinid_experiments/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object '_engine.<locals>.<lambda>'

Can someone please help me resolve this issue?
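
The traceback shows the data loader's worker pool trying to pickle a lambda defined inside `_engine`, which pickle cannot do because a local function has no importable name. The snippet below is a minimal standard-library sketch of that limitation and of one common replacement, `functools.partial` over an importable callable; a function defined at module level works similarly. It is not taken from train.py, and the names `can_pickle`, `make_local_lambda` and `make_partial` are hypothetical.

```python
import functools
import operator
import pickle


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


def make_local_lambda():
    # Defined inside a function, so its qualified name contains "<locals>"
    # and pickle cannot look it up by name.
    return lambda x: x * 2


def make_partial():
    # operator.mul is importable by name, so the partial can be pickled.
    return functools.partial(operator.mul, 2)


# Round-trip the partial the way a worker process would receive it.
doubler = pickle.loads(pickle.dumps(make_partial()))
print(can_pickle(make_local_lambda()))
print(can_pickle(make_partial()))
print(doubler(21))
```

Applying the same idea to the converted code would mean replacing the lambda inside `_engine` with a module-level function or a partial; which lambda is involved is not visible in the traceback.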
