fabind's Introduction

Qizhi Pei

  • 🔭 I’m Qizhi Pei, a second-year PhD student at the Gaoling School of Artificial Intelligence, RUC.
  • 🌱 I’m currently doing research on AI4Science, especially 3D biomolecular modeling and multi-modal learning on biomolecules.

fabind's People

Contributors

apeterswu, eltociear, kygao, qizhipei

fabind's Issues

Does FABind provide some confidence/affinity score?

Hi Qizhi,

Does FABind provide a confidence or affinity score that I can use to decide whether a protein-ligand pair binds?
I have tried computing a confidence score as follows:

pred_index_true = pocket_cls_pred[i][:j].sigmoid().unsqueeze(-1) # pocket predicted probability 
pred_index_false = 1. - pred_index_true
pred_index_prob = torch.cat([pred_index_false, pred_index_true], dim=-1)

pred_index_log_prob = torch.log(pred_index_prob)
pred_index_one_hot = gumbel_softmax_no_random(pred_index_log_prob,
                                              tau=self.args.gs_tau,
                                              hard=self.args.gs_hard)
pred_index_one_hot_true = pred_index_one_hot[:, 1].unsqueeze(-1)
pred_confidence_gumbel = pred_index_one_hot_true * pred_index_true
pred_pocket_confidence[i] = pred_confidence_gumbel.sum(dim=0) / pred_index_one_hot_true.sum(dim=0)

However, I find that the confidence scores are very high (e.g. > 0.9) for arbitrary protein-ligand pairs.
I would therefore like to ask whether you have a better suggestion.
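
One possible reason for the uniformly high values, judging only from the snippet above: if the hard selection keeps exactly the residues whose pocket probability exceeds 0.5 (an assumption about gumbel_softmax_no_random with hard=True, whose body is not shown here), then the score is an average over residues that were already classified as pocket, so it cannot fall below 0.5 for any input. The plain-Python sketch below reproduces that arithmetic with made-up logits. The names selected_mean_confidence and pocket_fraction are hypothetical and are not FABind functions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def selected_mean_confidence(logits):
    """Mean pocket probability over residues with probability > 0.5.

    Mirrors the arithmetic of the snippet above under the assumption that
    the hard one-hot selection is an argmax over [1 - p, p].
    """
    probs = [sigmoid(x) for x in logits]
    selected = [p for p in probs if p > 0.5]
    if not selected:
        return float("nan")  # the original expression would compute 0 / 0 here
    return sum(selected) / len(selected)


def pocket_fraction(logits):
    """Hypothetical complementary statistic: fraction of residues selected."""
    probs = [sigmoid(x) for x in logits]
    return sum(p > 0.5 for p in probs) / len(probs)


# Made-up logits: one confident residue among many rejected ones.
weak_pocket = [4.0] + [-3.0] * 99
print(selected_mean_confidence(weak_pocket))  # high, although few residues are selected
print(pocket_fraction(weak_pocket))
```

Under that assumption the score measures how confident the selected residues are, not whether the ligand binds, which would be consistent with high values for arbitrary pairs.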

Segmentation fault

When running evaluation, an error occurs when iterating through the "new_dataset" in the test set at index 17300. The error message is as follows:
[Two screenshots of the error message; the text was not preserved.]
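
A segmentation fault kills the interpreter before Python can print a traceback, so a first step is often to narrow down which item triggers it. The sketch below uses the standard-library faulthandler module and prints each index before accessing it. iterate_with_progress is a hypothetical helper, and it assumes only that the dataset supports len() and integer indexing.

```python
import faulthandler
import sys


def enable_crash_traceback():
    """Ask Python to dump its stack on SIGSEGV; returns False if stderr is unusable."""
    try:
        faulthandler.enable(file=sys.stderr, all_threads=True)
        return True
    except (AttributeError, ValueError, RuntimeError, OSError):
        # Some captured or redirected environments lack a real stderr descriptor.
        return False


def iterate_with_progress(dataset, start=0, log=sys.stderr):
    """Touch every item from `start`, printing the index before each access.

    If the process dies with a segmentation fault, the last index printed
    is the item that triggered it.
    """
    enable_crash_traceback()
    visited = 0
    for idx in range(start, len(dataset)):
        print(f"loading index {idx}", file=log, flush=True)
        _ = dataset[idx]
        visited += 1
    return visited


# Example with a stand-in dataset; for the case above one might call
# iterate_with_progress(new_dataset, start=17290) instead.
print(iterate_with_progress(list(range(5)), start=2))
```

Running the script with `python -X faulthandler` enables the same stack dump without code changes.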

No CUDA runtime is found

Hi, while running the inference script after installing all the required packages, I encountered this error. I do have an NVIDIA GPU, and it works fine for other projects, but not here.

======  preprocess molecules  ======
No CUDA runtime is found, using CUDA_HOME='/usr/bin/nvcc'
======  preprocess proteins  ======
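
A message like this usually means PyTorch itself cannot see a GPU in the active environment, even when the driver works elsewhere; a CPU-only PyTorch build is one common cause, though that is only a guess for this case. The diagnostic sketch below collects the relevant facts. cuda_report is a hypothetical helper; it relies only on torch.__version__, torch.version.cuda, and torch.cuda.is_available(), and it degrades gracefully if PyTorch is not installed.

```python
import os
import shutil


def cuda_report():
    """Collect facts that may explain a 'No CUDA runtime is found' message."""
    report = {
        "CUDA_HOME": os.environ.get("CUDA_HOME"),
        "nvcc_on_path": shutil.which("nvcc"),
        "nvidia_smi_on_path": shutil.which("nvidia-smi"),
    }
    try:
        import torch
    except ImportError:
        report["torch_installed"] = False
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # None here means the installed PyTorch wheel is a CPU-only build.
    report["torch_built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If torch_built_with_cuda prints None, reinstalling a CUDA-enabled PyTorch build in the fabind environment is the likely fix.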

FABind+

Hi,

I just saw that fabind_inference.py was updated 2 weeks ago. Was it updated to FABind+, or is it just an update of FABind? Are you going to create a new project for FABind+, or update the existing FABind project?

Thank you,
Christian

KeyError: 'complex' when running inference on the examples

Hi,

When I ran inference on your example PDBs with your code, I got no results. I checked the error message and found the following:

Traceback (most recent call last):
  File "fabind_inference.py", line 372, in <module>
    post_optim_mol(args, accelerator, data, com_coord_pred, com_coord_pred_per_sample_list, com_coord_per_sample_list, compound_batch, LAS_tmp=LAS_tmp, rigid=args.rigid)
  File "fabind_inference.py", line 288, in post_optim_mol
    com_coord_i = data[i]['compound'].rdkit_coords
  File "/home/jiayinjun/miniconda3/envs/fabind/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 177, in __getitem__
    return self.get_example(idx)  # type: ignore
  File "/home/jiayinjun/miniconda3/envs/fabind/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 124, in get_example
    data = separate(
  File "/home/jiayinjun/miniconda3/envs/fabind/lib/python3.8/site-packages/torch_geometric/data/separate.py", line 35, in separate
    attrs = slice_dict[key].keys()
KeyError: 'complex'

Could you please help me with this error? Thank you very much!

Alternatives to SMILES

How can I run inference using a PDB file for the protein and a 'mol2' or 'sdf' file for the ligand?

Binding scores?

Hi,

I was able to run your examples w/o problems. However, I did not find a file that contains the binding scores. Does FABind generate only poses, so that I need a rescoring tool such as BR-Nib to rank them?

Thanks for your help,
Christian

Processing multiple smiles on one target

Hi,

This is not an issue per se, more a question and a suggestion.

In your example, the structure of the input file is:
Cleaned_SMILES,pdb_id
CCC@HC@HC(=O)NC@@HC(=O)NC@@HC(=O)NC@HC(C)C,6efk
CC(C)CCN1c2nc(Nc3cc(F)c(O)c(F)c3)ncc2N(C)C(=O)C1(C)C,6g3c
CC(C)(COP@@(O)OP@(O)OC[C@H]1OC@@HC@H[C@@h]1OP(=O)(O)O)C@@HC(=O)NCCC(=O)NCCO,6n93
O=C(O)c1ccccc1-n1cccc1,6npi

The output is a set of SD files, each named with the pdb_id plus a number. If I were to generate millions of poses on one receptor, as in a virtual-screening setup, it would be more practical to name the files by a compound ID instead of the protein ID. This is just a suggestion for a future update, in case you do not already have a screening mode: if a cmpd_id column were present in the header, the SD file name would use the cmpd_id instead of an appended number.

The generated poses are saved as SDF; can they be saved as mol2 files instead? It is not a big problem, since I can convert them, but converting millions of files takes time, and most rescoring tools accept mol2 files.

Best,
Christian
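
Until such a screening mode exists, a workaround along the lines of the suggestion above is to generate the input CSV for a single receptor and keep a side table from row number to compound ID. The sketch below assumes that the number appended to each output file name follows the input row order, which is not confirmed anywhere in this thread. write_screening_csv and the cmpd_ IDs are hypothetical.

```python
import csv
import io


def write_screening_csv(compounds, pdb_id, handle):
    """Write one row per compound against a single receptor.

    `compounds` is a list of (compound_id, smiles) pairs. Returns a dict
    mapping 0-based row number to compound_id.
    """
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Cleaned_SMILES", "pdb_id"])  # header from the example file
    row_to_compound = {}
    for row, (compound_id, smiles) in enumerate(compounds):
        writer.writerow([smiles, pdb_id])
        row_to_compound[row] = compound_id
    return row_to_compound


compounds = [
    ("cmpd_0001", "O=C(O)c1ccccc1-n1cccc1"),
    ("cmpd_0002", "CC(C)CCN1c2nc(Nc3cc(F)c(O)c(F)c3)ncc2N(C)C(=O)C1(C)C"),
]
buffer = io.StringIO()
mapping = write_screening_csv(compounds, "6npi", buffer)
print(buffer.getvalue())
print(mapping)
```

The returned mapping can then be used to rename the output files after inference, once the actual numbering scheme has been checked on a small run.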

Reproducibility of Results with Inference Mode

Hi,

I'm quite impressed with the concept presented in your work; it has the potential to save considerable time. I attempted to replicate your model in inference mode, as described in the README, but encountered discrepancies in both RMSD scores and visualizations compared to the reported results.

Here are the RMSD scores I obtained:

6g3c rmsd: 13.472411671189839
6npi rmsd: 13.028547491615917
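
For reference, ligand RMSD is the square root of the mean squared distance between matched atoms of the predicted and reference poses. The thread does not say how the values above were computed, so the sketch below is only the plain definition: it assumes both poses list the same atoms in the same order, applies no alignment, and does no symmetry correction. The rmsd function here is a hypothetical helper.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between two lists of (x, y, z) coordinates."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    total = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(pred, ref):
        total += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(total / len(pred))


# Two atoms, each displaced by the vector (3, 4, 0), i.e. by 5 units.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
predicted = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
print(rmsd(predicted, reference))
```

A mismatch in atom ordering between the predicted and reference files inflates this number even for a good pose, so it is worth ruling out before comparing against published values.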

Additionally, I observed differences in the docking visualizations between my replication attempts in inference mode and the results reported in your paper:

PDB ID: 6G3C (Cyan=FABind, Yellow=Ground Truth, Purple=Diffdock)

[Image pair: reproduced with inference | reported in paper]

PDB ID: 6NPI (Cyan=FABind, Yellow=Ground Truth, Purple=Diffdock)

[Image pair: reproduced with inference | reported in paper]

Could you kindly offer any advice or insights on how to replicate your published results accurately?

Best regards,
Ahmet

No Results Written to uid_smiles_sdfname.csv After Running Inference on Custom Complexes

After following the instructions provided in the README to run inference on custom complexes, I noticed that the uid_smiles_sdfname.csv file was created, but no results were written into it.

I suspect the issue may be related to an error encountered during the execution of post_optim_mol at line 371 in fabind_inference.py. To further investigate the problem, I removed the try block surrounding this part of the code. This action led to the generation of the following error message:

Traceback (most recent call last):
  File "fabind_inference.py", line 382, in <module>
    post_optim_mol(args, accelerator, data, com_coord_pred, com_coord_pred_per_sample_list, com_coord_per_sample_list, compound_batch, LAS_tmp=LAS_tmp, rigid=args.rigid)
  File "fabind_inference.py", line 288, in post_optim_mol
    com_coord_i = data[i]['compound'].rdkit_coords
  File "/home/miniconda3/envs/fabind/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 177, in __getitem__
    return self.get_example(idx)  # type: ignore
  File "/home/miniconda3/envs/fabind/lib/python3.8/site-packages/torch_geometric/data/batch.py", line 124, in get_example
    data = separate(
  File "/home/miniconda3/envs/fabind/lib/python3.8/site-packages/torch_geometric/data/separate.py", line 35, in separate
    attrs = slice_dict[key].keys()
KeyError: 'complex'

It seems the error might be preventing the successful writing of results to the uid_smiles_sdfname.csv file. Any assistance in resolving this issue and enabling the proper output to the file would be greatly appreciated.
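
A try block that silently swallows exceptions makes an empty output file hard to diagnose. As a general pattern, and not a description of how fabind_inference.py is written, the sketch below keeps the loop running but records the full traceback for every failed item. run_step is a hypothetical helper, and the lambda stands in for the failing call.

```python
import traceback


def run_step(step, items):
    """Apply `step` to each item, recording failures instead of hiding them."""
    results, failures = [], []
    for index, item in enumerate(items):
        try:
            results.append(step(item))
        except Exception:
            # Keep the full traceback so an empty output file can be explained.
            failures.append((index, traceback.format_exc()))
    return results, failures


# Stand-in for a per-sample step that needs a 'complex' key.
samples = [{"complex": 1}, {}]
results, failures = run_step(lambda sample: sample["complex"], samples)
print(results)
for index, trace in failures:
    print(f"item {index} failed:\n{trace}")
```

Printing or logging the collected failures at the end of the run shows immediately why no rows were written.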
