igashov / difflinker
DiffLinker: Equivariant 3D-Conditional Diffusion Model for Molecular Linker Design
License: MIT License
delinker_utils/sascorer.py is calling a module of RDKit that was removed in Release_2024.03.1
See change log: https://github.com/rdkit/rdkit/blob/master/ReleaseNotes.md#code-removed-in-this-release-1
I encountered this when trying to run one of your case studies:
$ python ./DiffLinker/generate_with_protein.py \
--fragments 3hz1_modified_fragments_obabel.sdf \
--protein 3hz1_protein.pdb \
--output samples \
--model models/pockets_difflinker_full_given_anchors.ckpt \
--linker_size models/zinc_size_gnn.ckpt \
--anchors 12,22 \
--n_samples 1000 \
--max_batch_size 16 \
--random_seed 1
Traceback (most recent call last):
File "[...]/DiffLinker/generate_with_protein.py", line 14, in <module>
from src.lightning import DDPM
File "[...]/DiffLinker/src/lightning.py", line 7, in <module>
from src import metrics, utils, delinker
File "[...]/DiffLinker/src/delinker.py", line 7, in <module>
from src.delinker_utils import sascorer, calc_SC_RDKit
File "[...]/DiffLinker/src/delinker_utils/sascorer.py", line 22, in <module>
from rdkit.six.moves import cPickle
ModuleNotFoundError: No module named 'rdkit.six'
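A possible workaround for the import error above, sketched under the assumption that `cPickle` is only used in `sascorer.py` as an ordinary pickle module: try the old import first and fall back to the standard-library `pickle`. The round-trip payload below is purely illustrative and is not taken from the repository.

```python
# Compatibility import: rdkit.six was removed in RDKit Release_2024.03.1.
try:
    from rdkit.six.moves import cPickle  # older RDKit releases
except ImportError:  # also covers ModuleNotFoundError
    import pickle as cPickle  # Python 3 standard-library replacement

# Illustrative round trip to confirm the fallback behaves like a pickle module.
payload = {"fragment": "c1ccccc1", "score": 1.5}
restored = cPickle.loads(cPickle.dumps(payload))
```

With this pattern the rest of the module can keep referring to `cPickle` unchanged, whichever RDKit version is installed.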
Hey, is it possible to run in CPU mode? From generate.py it looks like it should be possible, but when I tried I got this error:
python -W ignore /home/softwares/DiffLinker/generate.py --fragments frag.sdf --model models/geom_difflinker.ckpt --linker_size models/geom_size_gnn.ckpt --anchors 9,19
Will generate linkers with sampled numbers of atoms
Sampling...
0%| | 0/1 [00:01<?, ?it/s]
Traceback (most recent call last):
File "/home/softwares/DiffLinker/generate.py", line 187, in <module>
main(
File "/home/softwares/DiffLinker/generate.py", line 156, in main
chain, node_mask = ddpm.sample_chain(data, sample_fn=sample_fn, keep_frames=1)
File "/home/softwares/DiffLinker/src/lightning.py", line 449, in sample_chain
chain = self.edm.sample_chain(
File "/home/softwares/mambaforge3/envs/difflinker/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/home/softwares/DiffLinker/src/edm.py", line 152, in sample_chain
z = self.sample_p_zs_given_zt_only_linker(
File "/home/softwares/DiffLinker/src/edm.py", line 188, in sample_p_zs_given_zt_only_linker
eps_hat = self.dynamics.forward(
File "/home/softwares/DiffLinker/src/egnn.py", line 383, in forward
edges = self.get_edges(n_nodes, bs) # (2, B*N)
File "/home/softwares/DiffLinker/src/egnn.py", line 464, in get_edges
return self.get_edges(n_nodes, batch_size)
File "/home/softwares/DiffLinker/src/egnn.py", line 459, in get_edges
edges = [torch.LongTensor(rows).to(self.device), torch.LongTensor(cols).to(self.device)]
File "/home/softwares/mambaforge3/envs/difflinker/lib/python3.10/site-packages/torch/cuda/__init__.py", line 216, in _lazy_init
torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
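The traceback ends in a `.to(self.device)` call that triggers CUDA initialization, which suggests the device stored on the model still points to CUDA. A minimal sketch of a CPU fallback follows; `resolve_device` is a hypothetical helper, and where the device value is actually set inside DiffLinker is not shown in this issue, so treat this as an assumption rather than the repository's logic.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Return 'cpu' when a CUDA device was requested but none is usable."""
    if requested.startswith("cuda") and not cuda_available:
        return "cpu"
    return requested


# Query PyTorch if it is installed; otherwise assume no GPU.
try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:
    cuda_available = False

device = resolve_device("cuda", cuda_available)
```

The resolved string would then be passed wherever the model's device is configured, so that tensors are never moved to a GPU that does not exist.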
Hi,
When using generate_with_protein.py or generate_with_pocket.py, the --anchors parameter is required.
However, to increase the diversity of the generated compounds, is there an approach that still takes the pocket constraints into account but does not fix the positions of the anchors on the fragments?
Many thanks,
Sh-Y
Hi, you did a great job connecting diffusion models with fragment-based de novo design.
I'm now working on small-molecule design and would like to use this model, but I ran into some problems after running the code.
Hi,
Thank you for providing such an interesting and powerful tool for generating linkers.
While trying to reproduce your model, I ran into an error.
Below is the command line I used to run the model om_difflinker_given_anchors.ckpt:
!python -W ignore generate_with_protein.py --fragments fragments/frags_3hz1.sdf \
--protein proteins/pro_3hz1_protein.pdb --model models/geom_difflinker_given_anchors.ckpt \
--linker_size 3 --anchors 14,11
The anchors were set as you suggested in other issues,
but I got this error:
Will generate linkers with 3 atoms
[15:28:10] Can't kekulize mol. Unkekulized atoms: 0 1 2 3 4 5 6 7 8
Could you please suggest how to fix this and make it work?
Many thanks,
Sh-Y
Hi,
How do you sample with a protein?
Thanks
After using your weights and trying to find the linker of the fragment with the following code:
generate.main(
input_path = "...generator_0.sdf",
model = "models/geom_difflinker.ckpt",
output_dir = "difflinker/output",
n_samples = 2,
n_steps = None,
linker_size = "...difflinker/models/geom_size_gnn.ckpt",
anchors = None)
I got the following error:
File "difflinker/src/egnn.py", line 134, in __init__
self.device = device
File "lib/python3.7/site-packages/torch/nn/modules/module.py", line 1317, in __setattr__
super().__setattr__(name, value)
AttributeError: can't set attribute
This happened when I was trying to load the weights. After commenting out the self.device = device assignments in several files under src, it started working properly. Please let me know if I have misunderstood something or if you have experienced the same. In case it matters, I'm working with CUDA.
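An `AttributeError: can't set attribute` on assignment typically appears when the attribute is defined as a read-only property somewhere in the class hierarchy. Whether that is what happens here depends on the installed library versions, so the following is only an illustration of the mechanism with stand-in classes (`Base`, `Broken`, and `Patched` are hypothetical and not part of DiffLinker or PyTorch).

```python
class Base:
    """Stand-in for a parent class exposing `device` as a read-only property."""

    def __init__(self):
        self._device = "cpu"

    @property
    def device(self):
        return self._device


class Broken(Base):
    def __init__(self, device):
        super().__init__()
        self.device = device  # fails: the property has no setter


class Patched(Base):
    def __init__(self, device):
        super().__init__()
        self._device = device  # write to the backing field instead


try:
    Broken("cuda")
    error = None
except AttributeError as exc:
    error = exc

patched = Patched("cuda")
```

Storing the value under a different name, as `Patched` does, avoids the clash without removing the assignment altogether.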
Hi,
Thanks for your amazing work!
I am particularly interested in the "steric clashes" metric described in your paper.
I am attempting to implement this metric in my project, but I did not find specific code in your repository for calculating this metric. Could you please provide it, or guide me on how to implement it?
Thank you for your time and consideration!
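For readers looking for a starting point, here is a minimal sketch of one common way to count steric clashes. It assumes a clash is a ligand-pocket atom pair closer than the sum of their van der Waals radii; this definition, the `tolerance` parameter, and the function name are assumptions and may differ from what the paper actually computes. The radii are Bondi values, and the coordinates are made up for illustration.

```python
import math

# Bondi van der Waals radii in angstroms.
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def count_clashes(ligand, pocket, tolerance=0.0):
    """Count ligand-pocket atom pairs closer than the sum of their vdW radii.

    Each atom is a tuple (element, (x, y, z)). `tolerance` (an assumed
    parameter, in angstroms) is subtracted from the distance threshold.
    """
    clashes = 0
    for lig_el, lig_xyz in ligand:
        for poc_el, poc_xyz in pocket:
            threshold = VDW_RADII[lig_el] + VDW_RADII[poc_el] - tolerance
            if math.dist(lig_xyz, poc_xyz) < threshold:
                clashes += 1
    return clashes


# Illustrative coordinates, not taken from any dataset.
ligand = [("C", (3.0, 0.0, 0.0)), ("O", (5.0, 0.0, 0.0))]
pocket = [("C", (0.0, 0.0, 0.0))]
n_clashes = count_clashes(ligand, pocket)
```

In this toy input only the carbon-carbon pair (3.0 Å apart, threshold 3.4 Å) counts as a clash.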
I am currently conducting research on structure-based drug design using proteins.
I find fragment linking to be a valuable approach in drug design, and I am particularly impressed by your method's ability to condition on the protein pocket. Thank you for your hard work on it!
I have a few questions regarding your research:
First, I attempted to reproduce the results from your paper using the Sampling section (https://github.com/igashov/DiffLinker#sampling). However, I noticed that the results for the ZINC and GEOM datasets differ significantly from the paper's reported results, especially concerning the SA score. While the paper's SA score is approximately 3.x, my results yielded a score of 6.x. I'm unsure why these results are different. Is there an additional step required to accurately reproduce the paper's findings?
Unfortunately, as mentioned in the readme, there is no pocket linker prediction model available. Therefore, I was unable to conduct experiments with the Pocket dataset. Could you provide some suggestions on how I can reproduce the paper's results without this model? Additionally, I am curious about the linker prediction model used in the Table 5 pocket section.
I came across Figure 2 in the paper, which showcases examples of linkers sampled by DiffLinker conditioned on pocket atoms. I attempted to replicate these results using the same molecule fragments from the MOAD dataset and the awesome Hugging Face. However, I couldn't obtain the results presented in the paper, even when using the same protein anchors. Could you please guide me on how to accurately reproduce the results shown in Figure 2?
I am also interested in reproducing the results shown in Figures 4 and 5 of the paper. However, no index is provided in the paper, which prevents me from running the test. Could you kindly provide the necessary information about the fragments used for Figures 4 and 5? This would be immensely helpful in my efforts to replicate the results accurately.
If you require any specific information or have any additional questions to aid in reproducing my experiments, please let me know, and I will promptly provide the requested details.
Thank you for your time and consideration.
Your work is especially helpful! I found that the two fragments in the SDF file you provided are complete molecules with 3D structures. If I have two fragments of my own, must they also be complete molecules with 3D structures? Also, were the fragments obtained by breaking up ligands from the original PDB protein structure?
My fragment molecules, for example: c1ccc((=O)(O)NC(C)(C)C)cc1”,c1nnn1
Hi DiffLinker Team,
Thank you for your great effort on this.
I was wondering whether the model can sample linkers for larger molecules (for example, connecting two 10-mer peptides).
I tried using the existing model (I know it's not appropriate; I just wanted to check whether the model runs without error), and it raised NanError during sample_p_zs_given_zt_only_linker. Is it because the inputs contain too many atoms?
Thank you for your patience and help!