This repository contains the code and data for the paper "Evaluation Framework for AI-driven Molecular Design of Multi-target Drugs: Brain Diseases as a Case Study".
The code is written in Python 3.11.6. To install the required packages, run the following command:
$ pip install -r requirements.txt
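Since the code targets Python 3.11.6, it can help to confirm the interpreter version before installing. The snippet below is a minimal sketch, not part of the repository: the helper name `is_supported` is hypothetical, and it assumes that matching the major and minor version (3.11) is sufficient. Whether other Python versions work is not stated here.

```python
import sys

# Major.minor taken from this README (3.11.6); the patch level is
# deliberately not enforced, which is an assumption of this sketch.
REQUIRED = (3, 11)


def is_supported(version=None, required=REQUIRED):
    """Return True if `version` has the required major.minor numbers.

    `version` may be any sequence such as (3, 11, 6); it defaults to the
    running interpreter's sys.version_info.
    """
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) == tuple(required)
```

Calling `is_supported()` with no arguments checks the interpreter that is currently running.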
To download the required data and train the QSAR models for the brain diseases case study, run the following commands:
$ python scripts/external_data_sources.py
$ python scripts/qsar_pipeline.py
The target selection process is described in target-selection/README.md.
- Build the dataset for the de Novo Design experiment:
$ python guacamol/data/get_data.py --holdout holdout_set_gcm_multitarget.smiles --destination guacamol/data/
- (Optional) Compute the top K molecules for the de Novo Design experiment:
$ python scripts/top_k.py
The top K molecules are stored in the guacamol/data/top_k folder and save time when running the de Novo Design models over the benchmarks.
- Run the de Novo Design models over the benchmarks:
$ python scripts/assess_baselines.py
The results of the experiments are saved in the reports folder.
To evaluate the QSAR models on the lead optimization task, run the following command:
$ python scripts/lo_task_benchmark.py
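The commands above can also be chained from a single driver. The following is a minimal sketch, not a script shipped with the repository: the names `build_steps` and `run_pipeline` are hypothetical, and it assumes that the steps are run from the repository root, one after another, in the order they appear in this README. The script paths and arguments are copied verbatim from the commands above.

```python
import subprocess
import sys


def build_steps(include_top_k=True):
    """Return the README commands as argument lists, in README order."""
    steps = [
        ["scripts/external_data_sources.py"],
        ["scripts/qsar_pipeline.py"],
        [
            "guacamol/data/get_data.py",
            "--holdout", "holdout_set_gcm_multitarget.smiles",
            "--destination", "guacamol/data/",
        ],
    ]
    if include_top_k:
        # Optional step: precomputes the top K molecules.
        steps.append(["scripts/top_k.py"])
    steps.append(["scripts/assess_baselines.py"])
    steps.append(["scripts/lo_task_benchmark.py"])
    return steps


def run_pipeline(include_top_k=True, runner=None):
    """Run each step with the current interpreter, stopping on failure.

    `runner` receives the full command list; by default it calls
    subprocess.run with check=True so a failing step raises an error.
    """
    if runner is None:
        def runner(cmd):
            subprocess.run(cmd, check=True)
    for step in build_steps(include_top_k):
        runner([sys.executable, *step])
```

Passing `runner=print` gives a dry run that only lists the commands, and `include_top_k=False` skips the optional top K step.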