thunlp-mt / dyMEAN
This repo contains the code for our paper "End-to-End Full-Atom Antibody Design"
Home Page: https://arxiv.org/abs/2302.00203
License: MIT License
Could you please point me to where this file is: cdrh3_design.ckpt?
Hi,
great work!
I was trying to reproduce the dyMEAN model for PDB-ID 1ic7 as shown in Figure 4.
However, the models I get do not seem to be positioned as well as shown in the figure. See the image below (green = ground truth, pink = dyMEAN model).
I was wondering if you would mind sharing the settings to reproduce the model shown in the Figure?
I tried to follow the examples from the README, but possibly I made a mistake there. I used the api/design.py script with the following parameters:
{'model': {'checkpoint': './checkpoints/cdrh3_design.ckpt'},
'Interface': {'pdb': './test/1ic7.pdb',
'receptor': 'Y',
'ligand': 'H',
'k': 48,
'out': './test/epitope_1ic7.json'},
'Design': {'root_pdb_dir': './test',
'pdbs': '1ic7.pdb',
'epitope_defs': './test/epitope_1ic7.json',
'heavy_chain': 'DVQLQESGPSLVKPSQTLSLTCSVTGDSITSAYWSWIRKFPGNRLEYMGYVSYSGSTYYNPSLKSRISITRDTSKNQYYLDLNSVTTEDTATYYCANWAGDYWGQGTLVTVSAA',
'light_chain': 'DIVLTQSPATLSVTPGNSVSLSCRASQSIGNNLHWYQQKSHESPRLLIKYASQSISGIPSRFSGSGSGTDFTLSINSVETEDFGMYFCQQSNSWPYTFGGGTKLEIK',
'out': './test',
'suffix': 'antibody',
'remove_chains': 'HL',
'enable_openmm_relax': True,
'auto_detect_cdrs': True}}
Hello, I would like to test this model but I notice there is no license attached. Please could you add one to clarify what usage is allowed?
Hi, it's very interesting and great work.
I have run into a TypeError.
I changed the parameters of design.py, mainly the input PDB, the toxin chains, the antibody sequences, and so on (multi-CDR design). It works on many tests, but one reports an error. Here is the traceback:
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/data/XXX/software/dyMEAN/api/design.py", line 239, in <module>
design(ckpt=ckpt, # path to the checkpoint of the trained model
File "/data/XXX/software/dyMEAN/api/design.py", line 164, in design
for batch in tqdm(dataloader):
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 681, in __next__
data = self._next_data()
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1376, in _next_data
return self._process_data(data)
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1402, in _process_data
data.reraise()
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/torch/_utils.py", line 461, in reraise
raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/XXX/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/data/XXX/software/dyMEAN/api/design.py", line 91, in __getitem__
hc_residues, hc_smask = self.generate_ab_chain(h_seq)
File "/data/XXX/software/dyMEAN/api/design.py", line 73, in generate_ab_chain
residues.append(Residue(s, fake_coords, pos))
File "/data/XXX/software/dyMEAN/data/pdb_utils.py", line 384, in __init__
self.sidechain = VOCAB.get_sidechain_info(symbol)
File "/data/XXX/software/dyMEAN/data/pdb_utils.py", line 326, in get_sidechain_info
return copy(self.amino_acids[idx].sidechain)
TypeError: list indices must be integers or slices, not NoneType
I think something goes wrong when the side-chain information or its index is looked up, but I cannot figure it out.
Can anyone do me a favor? Thanks a lot.
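The traceback ends in get_sidechain_info receiving idx=None, which suggests a residue symbol that is not in the amino-acid vocabulary. A hypothetical pre-check (not part of dyMEAN, and assuming a plain 20-letter vocabulary) that scans an input sequence for symbols the model would reject:

```python
# Hypothetical helper: find sequence positions whose symbol is not one of the
# 20 canonical one-letter amino-acid codes (non-canonical symbols would make a
# vocabulary lookup return None, matching the TypeError above).
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")

def find_unknown_symbols(seq):
    """Return (position, symbol) pairs a 20-letter vocabulary would reject."""
    return [(i, s) for i, s in enumerate(seq) if s not in CANONICAL]

print(find_unknown_symbols("DVQLQBSGX"))  # [(5, 'B'), (8, 'X')]
```

Running this over the heavy- and light-chain inputs of the failing case should point at the offending residue, if that is indeed the cause.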
Thank you for the code. Can it also work on nanobodies?
Thank you
Hi,
It seems that the SAbDab database is broken and returns a 500 server error:
https://opig.stats.ox.ac.uk/webapps/newsabdab/sabdab/archive/all/
Do you happen to have a downloaded version somewhere that you can share? Thanks!
(dyMEAN) dell@dell-Precision-7920-Tower:/mnt/e/code/dyMEAN$ bash scripts/data_preprocess.sh all_structures/imgt all_data
Locate the project folder at /mnt/e/code/dyMEAN
Processing SAbDab with output directory /mnt/e/code/dyMEAN/all_data
Processing RAbD with output directory /mnt/e/code/dyMEAN/all_data/RAbD
2023-06-15 15:59:18::INFO::Namespace(fout='/mnt/e/code/dyMEAN/all_data/rabd_all.json', n_cpu=4, numbering='imgt', pdb_dir='/mnt/e/code/dyMEAN/all_structures/imgt', pre_numbered=True, summary='/mnt/e/code/dyMEAN/all_data/sabdab_all.json', type='rabd')
2023-06-15 15:59:18::INFO::download rabd from summary file /mnt/e/code/dyMEAN/all_data/sabdab_all.json
2023-06-15 15:59:18::INFO::Extracting summary to json format
Traceback (most recent call last):
File "/mnt/data/anaconda/envs/dyMEAN/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/mnt/data/anaconda/envs/dyMEAN/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/mnt/e/code/dyMEAN/data/download.py", line 376, in <module>
main(parse())
File "/mnt/e/code/dyMEAN/data/download.py", line 360, in main
items = read_rabd(fpath)
File "/mnt/e/code/dyMEAN/data/download.py", line 94, in read_rabd
with open(fpath, 'r') as fin:
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/e/code/dyMEAN/all_data/sabdab_all.json'
Traceback (most recent call last):
File "/mnt/data/anaconda/envs/dyMEAN/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/mnt/data/anaconda/envs/dyMEAN/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/mnt/e/code/dyMEAN/data/split.py", line 249, in <module>
main(parse())
File "/mnt/e/code/dyMEAN/data/split.py", line 72, in main
items = load_file(args.data)
File "/mnt/e/code/dyMEAN/data/split.py", line 37, in load_file
with open(fpath, 'r') as fin:
FileNotFoundError: [Errno 2] No such file or directory: '/mnt/e/code/dyMEAN/all_data/sabdab_all.json'
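Both tracebacks fail on the same missing file: all_data/sabdab_all.json was never produced, presumably because the summary download from the (currently unreachable) SAbDab server failed earlier. A small guard, assuming the relative paths shown in the log:

```shell
# Sketch: stop early if the SAbDab summary was not produced, instead of letting
# the downstream RAbD extraction and split steps crash on the missing file.
summary="all_data/sabdab_all.json"   # path taken from the log above
if [ ! -s "$summary" ]; then
    echo "missing or empty: $summary -- rerun the SAbDab summary download first"
fi
```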
Excellent work! I tried to generate antibodies for the COVID NP protein, so I revised the code as below:
if __name__ == '__main__':
ckpt = './checkpoints/cdrh3_design.ckpt'
root_dir = './demos/data'
pdbs = [os.path.join(root_dir, '6wzo.pdb') for _ in range(4)]
toxin_chains = []
remove_chains = None
receptor_chains = ["A", "B", "C", "D"]
epitope_defs = [os.path.join(root_dir, c + '_epitope.json') for c in receptor_chains]
identifiers = [f'{c}_antibody' for c in receptor_chains]
I manually designed the epitope JSON file as follows:
A_epitope.json
[
["A", [299, ""]],
["A", [300, ""]],
["A", [302, ""]],
["A", [303, ""]],
["A", [305, ""]],
["A", [306, ""]],
["A", [345, ""]],
["A", [347, ""]],
["A", [348, ""]],
["A", [349, ""]]
]
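A definition like the one above can also be generated programmatically; a small sketch (assuming the residue numbers follow chain A's PDB numbering, and that the second element of each inner pair is the insertion code, empty here):

```python
import json

# Build an epitope definition in the [[chain, [residue_number, insertion_code]], ...]
# format shown above. Residue numbers are assumed to match the PDB numbering
# of the target chain.
chain = "A"
residue_ids = [299, 300, 302, 303, 305, 306, 345, 347, 348, 349]
epitope = [[chain, [rid, ""]] for rid in residue_ids]

print(json.dumps(epitope, indent=1))
```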
However, I encounter the following messages:
/home/jhs9301/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
/home/jhs9301/anaconda3/envs/dyMEAN/lib/python3.8/site-packages/numpy/core/_methods.py:184: RuntimeWarning: invalid value encountered in divide
ret = um.true_divide(
/mnt/c/Users/jhs93/Downloads/dyMEAN-main/dyMEAN-main/data/dataset.py:63: RuntimeWarning: invalid value encountered in cast
X[0] = center # set center
I guess invalid values were introduced into the epitope data:
Epitope data: {'X': array([[[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808,
-9223372036854775808]]]), 'S': [22], 'residue_pos': [0], 'xloss_mask': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]}
How can I fix this problem? Please let me know.
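For what it's worth, the sentinel -9223372036854775808 is exactly what NaN becomes when cast to int64, which matches the "invalid value encountered in cast" warning in the log. A minimal reproduction (the assumption here is that the epitope coordinates came out as NaN, e.g. because the listed residues were not matched in the PDB file):

```python
import numpy as np

# Casting NaN coordinates to int64 yields INT64_MIN on common platforms,
# the exact sentinel printed in the epitope data above.
coords = np.full((1, 3), np.nan)     # "missing" coordinates
as_int = coords.astype(np.int64)     # triggers "invalid value encountered in cast"
print(as_int[0, 0])                  # -9223372036854775808 on typical x86-64 setups
```

So the thing to check is whether the residue numbers in the epitope JSON actually exist in chain A of the input PDB.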
The code in dyMEAN/evaluation/pred_ddg.py loads a model from the evaluation/ddg/data folder, which does not exist.
Excellent work, and many thanks for open-sourcing it. Here is my question:
What if we use tools like IgFold or AlphaFold to predict the antibody structure and then use that structure to initialize X, replacing each [MASK] residue with a random residue when running the prediction tool? Have you tried something like this?
Excellent work, and many thanks for open-sourcing it. Here is my question:
When testing api.design, I get this error:
api/design.py", line 43, in get_epitope
with open(epitope_def, 'r') as fin:
FileNotFoundError: [Errno 2] No such file or directory: './demos/data/e_epitope.json'
I have './demos/data/E_epitope.json' but don't have the other three ('e', 'F', 'f').
from api.design import design
ckpt = './checkpoints/cdrh3_design.ckpt'
root_dir = './demos/data'
pdbs = [os.path.join(root_dir, '7l2m.pdb') for _ in range(4)]
toxin_chains = ['E', 'e', 'F', 'f']
remove_chains = [toxin_chains for _ in range(4)]
epitope_defs = [os.path.join(root_dir, c + '_epitope.json') for c in toxin_chains]
identifiers = [f'{c}_antibody' for c in toxin_chains]
frameworks = [
(
('H', 'QVQLKESGPGLLQPSQTLSLTCTVSGISLSDYGVHWVRQAPGKGLEWMGIIGHAGGTDYNSNLKSRVSISRDTSKSQVFLKLNSLQQEDTAMYFC----------WGQGIQVTVSSA'),
('L', 'YTLTQPPLVSVALGQKATITCSGDKLSDVYVHWYQQKAGQAPVLVIYEDNRRPSGIPDHFSGSNSGNMATLTISKAQAGDEADYYCQSWDGTNSAWVFGSGTKVTVLGQ')
)
for _ in pdbs
] # the first item of each tuple is heavy chain, the second is light chain
design(ckpt=ckpt, # path to the checkpoint of the trained model
gpu=0, # the ID of the GPU to use
pdbs=pdbs, # paths to the PDB file of each antigen (here antigen is all TRPV1)
epitope_defs=epitope_defs, # paths to the epitope definitions
frameworks=frameworks, # the given sequences of the framework regions
out_dir=root_dir, # output directory
identifiers=identifiers, # name of each output antibody
remove_chains=remove_chains,# remove the original ligand
enable_openmm_relax=True, # use openmm to relax the generated structure
auto_detect_cdrs=False) # manually use '-' to represent CDR residues
Hi,
The optimization demo produces an error during relaxation.
# python -m api.optimize
0%| | 0/1 [00:00<?, ?it/s]
2023-06-02 08:31:14::INFO::Openmm relaxing...
0%| | 0/1 [00:14<?, ?it/s]
Traceback (most recent call last):
File "/opt/conda/envs/dyMEAN/envs/dyMEAN1/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/envs/dyMEAN/envs/dyMEAN1/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/serbulent_antiverse_io/code/dyMEAN/api/optimize.py", line 223, in <module>
optimize(
File "/home/serbulent_antiverse_io/code/dyMEAN/api/optimize.py", line 194, in optimize
openmm_relax(mod_pdb, mod_pdb,
File "/home/serbulent_antiverse_io/code/dyMEAN/utils/relax.py", line 74, in openmm_relax
modeller.addHydrogens(force_field)
File "/opt/conda/envs/dyMEAN/envs/dyMEAN1/lib/python3.8/site-packages/openmm/app/modeller.py", line 998, in addHydrogens
system = forcefield.createSystem(newTopology, rigidWater=False, nonbondedMethod=CutoffNonPeriodic)
File "/opt/conda/envs/dyMEAN/envs/dyMEAN1/lib/python3.8/site-packages/openmm/app/forcefield.py", line 1218, in createSystem
templateForResidue = self._matchAllResiduesToTemplates(data, topology, residueTemplates, ignoreExternalBonds)
File "/opt/conda/envs/dyMEAN/envs/dyMEAN1/lib/python3.8/site-packages/openmm/app/forcefield.py", line 1433, in _matchAllResiduesToTemplates
raise ValueError('No template found for residue %d (%s). %s For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#template' % (res.index+1, res.name, _findMatchErrors(self, res)))
ValueError: No template found for residue 10 (LEU). The set of atoms matches CLEU, but the bonds are different. For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#template
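The message "The set of atoms matches CLEU, but the bonds are different" usually means OpenMM matched residue 10 against the C-terminal LEU template, i.e. the backbone is broken at that position. A hypothetical pre-check (not part of dyMEAN) that scans a PDB for peptide-bond breaks by measuring consecutive backbone C-N distances:

```python
import math

# Hypothetical helper: flag residue pairs whose backbone C->N distance exceeds
# `cutoff` angstroms, which is one common cause of OpenMM's "No template found"
# error during addHydrogens/createSystem.
def peptide_bond_breaks(pdb_lines, cutoff=1.8):
    """Yield ((chain, resseq), (chain, resseq), distance) for broken bonds."""
    backbone = {}  # (chain, resseq) -> {atom_name: (x, y, z)}
    order = []     # residues in file order
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name not in ("N", "C"):   # only backbone N and carbonyl C
            continue
        key = (line[21], int(line[22:26]))
        if key not in backbone:
            backbone[key] = {}
            order.append(key)
        backbone[key][name] = tuple(float(line[c:c + 8]) for c in (30, 38, 46))
    for prev, cur in zip(order, order[1:]):
        if prev[0] != cur[0]:
            continue  # chain change: no peptide bond expected
        if "C" in backbone[prev] and "N" in backbone[cur]:
            d = math.dist(backbone[prev]["C"], backbone[cur]["N"])
            if d > cutoff:
                yield prev, cur, d
```

If this reports a break around residue 10, repairing the structure (or relaxing without the broken segment) before OpenMM relaxation may be the way forward.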