Code Monkey home page Code Monkey logo

Build Status

DISCLAIMER:

This code gives the tensorflow implementation of PaccMann as of our paper in Molecular Pharmaceutics.

PaccMann

paccmann is a package for drug sensitivity prediction and is the core component of the repo.

The package provides a toolbox of learning models for IC50 prediction using drug's chemical properties and tissue-specific cell lines gene expression.

Citation

Please cite us as follows:

@article{oskooei2018paccmann,
  title={PaccMann: Prediction of anticancer compound sensitivity with multi-modal attention-based neural networks},
  author={Oskooei, Ali and Born, Jannis and Manica, Matteo and Subramanian, Vigneshwari and S{\'a}ez-Rodr{\'\i}guez, Julio and Mart{\'\i}nez, Mar{\'\i}a Rodr{\'\i}guez},
  journal={arXiv preprint arXiv:1811.06802},
  year={2018}
}

@article{manica2019paccmann,
author = {Manica, Matteo and Oskooei, Ali and Born, Jannis and Subramanian, Vigneshwari and Saez-Rodriguez, Julio and Rodriguez Martinez, Maria},
title = {Toward Explainable Anticancer Compound Sensitivity Prediction via Multimodal Attention-Based Convolutional Encoders},
journal = {Molecular Pharmaceutics},
year = {2019},
doi = {10.1021/acs.molpharmaceut.9b00520},
note = {PMID: 31618586},
}

Installation

Setup of the virtual environment

We strongly recommend to work inside a virtual environment (venv).

Create the environment:

python3 -m venv venv

Activate it:

source venv/bin/activate

Module installation

The module can be installed either in editable mode:

pip3 install -e .

Or as a normal package:

pip3 install .

Models training

Models can be trained using the script bin/training_paccmann that is installed together with the module. Check the examples for a quick start. For more details see the help of the training command by typing training_paccmann -h:

usage: training_paccmann [-h] [-save_checkpoints_steps 300]
                         [-eval_throttle_secs 60] [-model_suffix]
                         [-train_steps 10000] [-batch_size 64]
                         [-learning_rate 0.001] [-dropout 0.5]
                         [-buffer_size 20000] [-number_of_threads 1]
                         [-prefetch_buffer_size 6]
                         train_filepath eval_filepath model_path
                         model_specification_fn_name params_filepath
                         feature_names

Run training of a `paccmann` model.

positional arguments:
  train_filepath        Path to train data.
  eval_filepath         Path to eval data.
  model_path            Path where the model is stored.
  model_specification_fn_name
                        Model specification function. Pick one of the
                        following: ['dnn', 'rnn', 'scnn', 'sa', 'ca', 'mca'].
  params_filepath       Path to model params. Dictionary with parameters
                        defining the model.
  feature_names         Comma separated feature names. Select from the
                        following: ['smiles_character_tokens',
                        'smiles_atom_tokens', 'fingerprints_256',
                        'fingerprints_512', 'targets_10', 'targets_20',
                        'targets_50', 'selected_genes_10',
                        'selected_genes_20', 'cnv_min', 'cnv_max', 'disrupt',
                        'zigosity', 'ic50', 'ic50_labels'].

optional arguments:
  -h, --help            show this help message and exit
  -save_checkpoints_steps 300, --save-checkpoints-steps 300
                        Steps before saving a checkpoint.
  -eval_throttle_secs 60, --eval-throttle-secs 60
                        Throttle seconds between evaluations.
  -model_suffix , --model-suffix 
                        Suffix for the trained moedel.
  -train_steps 10000, --train-steps 10000
                        Number of training steps.
  -batch_size 64, --batch-size 64
                        Batch size.
  -learning_rate 0.001, --learning-rate 0.001
                        Learning rate.
  -dropout 0.5, --dropout 0.5
                        Dropout to be applied to set and dense layers.
  -buffer_size 20000, --buffer-size 20000
                        Buffer size for data shuffling.
  -number_of_threads 1, --number-of-threads 1
                        Number of threads to be used in data processing.
  -prefetch_buffer_size 6, --prefetch-buffer-size 6
                        Prefetch buffer size to allow pipelining.

PaccMann's Projects

docs icon docs

Documentation for PaccMann service

fdsa icon fdsa

A fully differentiable set autoencoder

moses icon moses

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

paccmann_datasets icon paccmann_datasets

pytoda - PaccMann PyTorch Dataset Classes. Read the docs: https://paccmann.github.io/paccmann_datasets/

paccmann_generator icon paccmann_generator

Generative models for transcriptomic-driven or protein-driven molecular design (PaccMann^RL).

paccmann_gp icon paccmann_gp

PyTorch/skopt based implementation of Bayesian optimization with Gaussian processes - build to optimize latent spaces of VAEs to generate molecules with desired properties

paccmann_kinase_binding_residues icon paccmann_kinase_binding_residues

Comparison of active site and full kinase sequences for drug-target affinity prediction and molecular generation. Full paper: https://pubs.acs.org/doi/10.1021/acs.jcim.1c00889

paccmann_polymer icon paccmann_polymer

Graph-regularized VAE and the impact of topology on learned representations

paccmann_predictor icon paccmann_predictor

PyTorch implementation of bimodal neural networks for drug-cell (pharmarcogenomics) and drug-protein (proteochemometrics) interaction prediction

paccmann_rl icon paccmann_rl

Code pipeline for the PaccMann^RL in iScience: https://www.cell.com/iscience/fulltext/S2589-0042(21)00237-6

paccmann_sarscov2 icon paccmann_sarscov2

Code for paper on automation of discovery and synthesis of targeted molecules: https://iopscience.iop.org/article/10.1088/2632-2153/abe808

tape icon tape

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

titan icon titan

Code for "T Cell Receptor Specificity Prediction with Bimodal Attention Networks" (https://doi.org/10.1093/bioinformatics/btab294, ISMB 2021)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.