nvidia / cheminformatics

Facilitates searching, screening, and organizing large chemical databases

Dockerfile 0.01% Python 2.38% Jupyter Notebook 97.40% Shell 0.22%

cheminformatics's Issues

Unable to use megamolbart model

I followed the instructions in megamolbart/README, but they do not work for me:

--(Wed Apr 06|15:25 [master]$)- ./launch.sh dev 2
sourcing environment from ./.env
+ local CONTAINER_OPTION=2
+ local CONT=nvcr.io/nvstaging/clara/cheminformatics_demo:latest
+ [[ 2 -eq 2 ]]
+ DOCKER_CMD='docker run     --rm     --network host     --runtime=nvidia     -p :8888     -p 9001:9001     -p 5000:5000     -v /home/muammar/git/cheminformatics:/workspace     -v /home/muammar/git/cheminformatics/data/data:/data     -u 1000:1000     --shm-size=1g     --ulimit memlock=-1     --ulimit stack=67108864     -e HOME=/workspace     -e TF_CPP_MIN_LOG_LEVEL=3     -w /workspace -v /home/muammar/git/cheminformatics/megamolbart/models:/models/megamolbart/'
+ DOCKER_CMD='docker run     --rm     --network host     --runtime=nvidia     -p :8888     -p 9001:9001     -p 5000:5000     -v /home/muammar/git/cheminformatics:/workspace     -v /home/muammar/git/cheminformatics/data/data:/data     -u 1000:1000     --shm-size=1g     --ulimit memlock=-1     --ulimit stack=67108864     -e HOME=/workspace     -e TF_CPP_MIN_LOG_LEVEL=3     -w /workspace -v /home/muammar/git/cheminformatics/megamolbart/models:/models/megamolbart/ -w /workspace/megamolbart/'
+ CONT=nvcr.io/nvstaging/clara/megamolbart:latest
+ docker run --rm --network host --runtime=nvidia -p :8888 -p 9001:9001 -p 5000:5000 -v /home/muammar/git/cheminformatics:/workspace -v /home/muammar/git/cheminformatics/data/data:/data -u 1000:1000 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -e HOME=/workspace -e TF_CPP_MIN_LOG_LEVEL=3 -w /workspace -v /home/muammar/git/cheminformatics/megamolbart/models:/models/megamolbart/ -w /workspace/megamolbart/ -it nvcr.io/nvstaging/clara/megamolbart:latest bash
WARNING: Published ports are discarded when using host network mode

=============
== PyTorch ==
=============

NVIDIA Release 20.11 (build 17345815)
PyTorch Version 1.8.0a0+17f8c32

Container image Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.

Copyright (c) 2014-2020 Facebook Inc.
Copyright (c) 2011-2014 Idiap Research Institute (Ronan Collobert)
Copyright (c) 2012-2014 Deepmind Technologies    (Koray Kavukcuoglu)
Copyright (c) 2011-2012 NEC Laboratories America (Koray Kavukcuoglu)
Copyright (c) 2011-2013 NYU                      (Clement Farabet)
Copyright (c) 2006-2010 NEC Laboratories America (Ronan Collobert, Leon Bottou, Iain Melvin, Jason Weston)
Copyright (c) 2006      Idiap Research Institute (Samy Bengio)
Copyright (c) 2001-2004 Idiap Research Institute (Ronan Collobert, Samy Bengio, Johnny Mariethoz)
Copyright (c) 2015      Google Inc.
Copyright (c) 2015      Yangqing Jia
Copyright (c) 2013-2016 The Caffe contributors
All rights reserved.

NVIDIA Deep Learning Profiler (dlprof) Copyright (c) 2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying project or file.
ERROR: No supported GPU(s) detected to run this container

NOTE: MOFED driver for multi-node communication was not detected.
      Multi-node communication performance may be reduced.

(base) bash-4.4$

Once inside the container shell, I run:

(base) bash-4.4$ python launch.py &
[1] 54
(base) bash-4.4$ INFO:megamolbart:Maximum decoded sequence length is set to 512
INFO:megamolbart:Triggering model download...
Downloading model megamolbart to /models/megamolbart...
++ wget -q --show-progress --content-disposition https://api.ngc.nvidia.com/v2/models/nvidia/clara/megamolbart/versions/0.1/zip -O /models/megamolbart/megamolbart_0.1.zip
/models/megamolbart/megamolbart_0.1.zip: Permission denied
++ mkdir /models/megamolbart
mkdir: cannot create directory ‘/models/megamolbart’: File exists
++ unzip -q /models/megamolbart/megamolbart_0.1.zip -d /models/megamolbart
unzip:  cannot find or open /models/megamolbart/megamolbart_0.1.zip, /models/megamolbart/megamolbart_0.1.zip.zip or /models/megamolbart/megamolbart_0.1.zip.ZIP.
INFO:megamolbart:Model download result: None
INFO:megamolbart:Model download result: None
Traceback (most recent call last):
  File "launch.py", line 98, in <module>
    main()
  File "launch.py", line 94, in main
    Launcher()
  File "launch.py", line 71, in __init__
    self.download_megamolbart_model()
  File "launch.py", line 92, in download_megamolbart_model
    raise Exception('Error downloading model')
Exception: Error downloading model

The user created in the container does not have permission to write to /models/megamolbart. I want to feed some SMILES strings to MegaMolBART and generate embeddings. How can I achieve that? I would appreciate any help. Thanks.
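
One way to confirm this diagnosis before retrying is to check, from inside the container, whether the current user can actually create files in the mounted model directory. The following is only a sketch: `check_writable` is a hypothetical helper, not part of the repository, and the `/models/megamolbart` path is the one shown in the log above.

```python
import os
import tempfile


def check_writable(path):
    """Return True if the current user can create a file inside `path`."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating a real temporary file is more reliable than os.access()
        # on bind mounts, where the effective permissions can differ.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
        return True
    except OSError:
        return False


if __name__ == "__main__":
    model_dir = "/models/megamolbart"  # path taken from the log above
    print(f"uid={os.getuid()} can write to {model_dir}: {check_writable(model_dir)}")
```

If this prints False, the host directory mounted at that path may be owned by a different user than the `-u 1000:1000` passed to `docker run`. Creating the directory, or changing its ownership, on the host before launching is one likely remedy.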

test_megamolbart.py not working

Hi, I got the MegaMolBART Docker container up and running with the following command:

$ docker run --name megamolbart --gpus all --rm -v $(pwd)/megamolbart_v0.1/:/models/megamolbart -v $(pwd)/shared/:/shared nvcr.io/nvidia/clara/megamolbart:latest &

I cloned this repository into shared/ but cannot find a way to test the model.
In particular, I get the following error when trying to run test_megamolbart.py:

root@5f1951df00f5:/shared# mv cheminformatics/megamolbart/megamolbart/ . && mv cheminformatics/megamolbart/tests/test_megamolbart.py .
root@13b98eed1cbd:/shared# python test_megamolbart.py 
using world size: 1 and model-parallel size: 1 
using torch.float32 for parameters ...
-------------------- arguments --------------------
  adam_beta1 ...................... 0.9
  adam_beta2 ...................... 0.999
  adam_eps ........................ 1e-08
  adlr_autoresume ................. False
  adlr_autoresume_interval ........ 1000
  apply_query_key_layer_scaling ... False
  apply_residual_connection_post_layernorm  False
  attention_dropout ............... 0.1
  attention_softmax_in_fp32 ....... False
  batch_size ...................... None
  bert_load ....................... None
  bias_dropout_fusion ............. False
  bias_gelu_fusion ................ False
  block_data_path ................. None
  checkpoint_activations .......... False
  checkpoint_in_cpu ............... False
  checkpoint_num_layers ........... 1
  clip_grad ....................... 1.0
  contigious_checkpointing ........ False
  cpu_optimizer ................... False
  cpu_torch_adam .................. False
  data_impl ....................... infer
  data_path ....................... None
  dataset_path .................... None
  DDP_impl ........................ local
  deepscale ....................... False
  deepscale_config ................ None
  deepspeed ....................... False
  deepspeed_activation_checkpointing  False
  deepspeed_config ................ None
  deepspeed_mpi ................... False
  distribute_checkpointed_activations  False
  distributed_backend ............. nccl
  dynamic_loss_scale .............. True
  eod_mask_loss ................... False
  eval_interval ................... 1000
  eval_iters ...................... 100
  exit_interval ................... None
  faiss_use_gpu ................... False
  finetune ........................ False
  fp16 ............................ False
  fp16_lm_cross_entropy ........... False
  fp32_allreduce .................. False
  gas ............................. 1
  hidden_dropout .................. 0.1
  hidden_size ..................... 256
  hysteresis ...................... 2
  ict_head_size ................... None
  ict_load ........................ None
  indexer_batch_size .............. 128
  indexer_log_interval ............ 1000
  init_method_std ................. 0.02
  layernorm_epsilon ............... 1e-05
  lazy_mpu_init ................... None
  load ............................ /models/megamolbart/checkpoints
  local_rank ...................... None
  log_interval .................... 100
  loss_scale ...................... None
  loss_scale_window ............... 1000
  lr .............................. None
  lr_decay_iters .................. None
  lr_decay_style .................. linear
  make_vocab_size_divisible_by .... 128
  mask_prob ....................... 0.15
  max_position_embeddings ......... 512
  merge_file ...................... None
  min_lr .......................... 0.0
  min_scale ....................... 1
  mmap_warmup ..................... False
  model_parallel_size ............. 1
  no_load_optim ................... False
  no_load_rng ..................... False
  no_save_optim ................... False
  no_save_rng ..................... False
  num_attention_heads ............. 8
  num_layers ...................... 4
  num_unique_layers ............... None
  num_workers ..................... 2
  onnx_safe ....................... None
  openai_gelu ..................... False
  override_lr_scheduler ........... False
  param_sharing_style ............. grouped
  params_dtype .................... torch.float32
  partition_activations ........... False
  pipe_parallel_size .............. 0
  profile_backward ................ False
  query_in_block_prob ............. 0.1
  rank ............................ 0
  report_topk_accuracies .......... []
  reset_attention_mask ............ False
  reset_position_ids .............. False
  save ............................ None
  save_interval ................... None
  scaled_masked_softmax_fusion .... False
  scaled_upper_triang_masked_softmax_fusion  False
  seed ............................ 1234
  seq_length ...................... None
  short_seq_prob .................. 0.1
  split ........................... 969, 30, 1
  synchronize_each_layer .......... False
  tensorboard_dir ................. None
  titles_data_path ................ None
  tokenizer_type .................. GPT2BPETokenizer
  train_iters ..................... None
  use_checkpoint_lr_scheduler ..... False
  use_cpu_initialization .......... False
  use_one_sent_docs ............... False
  vocab_file ...................... /models/megamolbart/bart_vocab.txt
  warmup .......................... 0.01
  weight_decay .................... 0.01
  world_size ...................... 1
  zero_allgather_bucket_size ...... 0.0
  zero_contigious_gradients ....... False
  zero_reduce_bucket_size ......... 0.0
  zero_reduce_scatter ............. False
  zero_stage ...................... 1.0
---------------- end of arguments ----------------
> initializing torch distributed ...
Traceback (most recent call last):
  File "test_megamolbart.py", line 16, in <module>
    wf = MegaMolBART()
  File "/shared/megamolbart/inference.py", line 71, in __init__
    initialize_megatron(args_defaults=args, ignore_unknown_args=True)
  File "/opt/conda/lib/python3.6/site-packages/megatron/initialize.py", line 77, in initialize_megatron
    finish_mpu_init()
  File "/opt/conda/lib/python3.6/site-packages/megatron/initialize.py", line 59, in finish_mpu_init
    _initialize_distributed()
  File "/opt/conda/lib/python3.6/site-packages/megatron/initialize.py", line 156, in _initialize_distributed
    init_method=init_method)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/distributed_c10d.py", line 448, in init_process_group
    store, rank, world_size = next(rendezvous_iterator)
  File "/opt/conda/lib/python3.6/site-packages/torch/distributed/rendezvous.py", line 133, in _tcp_rendezvous_handler
    store = TCPStore(result.hostname, result.port, world_size, start_daemon, timeout)
RuntimeError: Address already in use

I am interested in getting the embeddings for a set of molecules. Any suggestions?
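
The `Address already in use` error is raised when `TCPStore` cannot bind the rendezvous port, which usually means another process already holds it (possibly a service the container starts by default, though the log does not confirm this). A minimal sketch of one workaround follows. It assumes the Megatron build in the container takes the port from the `MASTER_PORT` environment variable, which is an assumption rather than something shown above. `find_free_port` is a hypothetical helper.

```python
import os
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port on `host`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


# Assumption: the distributed initialisation reads MASTER_PORT.
# Set it before constructing the model wrapper.
os.environ["MASTER_PORT"] = str(find_free_port())
print("MASTER_PORT =", os.environ["MASTER_PORT"])
```

The port is released before it is reused, so another process could in principle take it in between. For a single test run inside one container this is normally acceptable.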

Generating Novel Compounds: returned non-zero exit status 9 Error.

Thank you for this tutorial! I encountered an error in step 5 of the "Generating Novel Compounds" section.
I pressed "Generate" and got this error message box:
"Command '['bash', '-c', 'mkdir -p /data/mounts/cddd && cd /data/mounts/cddd; /tmp/download_default_model.sh']' returned non-zero exit status 9."

The following is the terminal output. I think something is wrong with my path settings, but I can't figure out what. Please give me some advice. Thank you in advance.

=============
cuchemUI_1 | WARNING:cuchemcommon.context:data_mount_path not found, returing default.
cuchemUI_1 | % Total % Received % Xferd Average Speed Time Time Time Current
cuchemUI_1 | Dload Upload Total Spent Left Speed
100 2219 0 2219 0 0 5790 0 --:--:-- --:--:-- --:--:-- 5778
cuchemUI_1 | Archive: default_model.zip
cuchemUI_1 | End-of-central-directory signature not found. Either this file is not
cuchemUI_1 | a zipfile, or it constitutes one disk of a multi-part archive. In the
cuchemUI_1 | latter case the central directory and zipfile comment will be found on
cuchemUI_1 | the last disk(s) of this archive.
cuchemUI_1 | unzip: cannot find zipfile directory in one of default_model.zip or
cuchemUI_1 | default_model.zip.zip, and cannot find default_model.zip.ZIP, period.
cuchemUI_1 | Traceback (most recent call last):
cuchemUI_1 | File "/workspace/cuchem/cuchem/utils/__init__.py", line 41, in func_wrapper
cuchemUI_1 | return func(*args, **kwargs)
cuchemUI_1 | File "/workspace/cuchem/cuchem/interactive/chemvisualize.py", line 271, in handle_generation
cuchemUI_1 | generative_wf = wf_class()
cuchemUI_1 | File "/opt/nvidia/cheminfomatics/common/cuchemcommon/utils/singleton.py", line 25, in __call__
cuchemUI_1 | *args, **kwargs)
cuchemUI_1 | File "/workspace/cuchem/cuchem/wf/generative/cddd.py", line 20, in __init__
cuchemUI_1 | self.default_model_loc = download_cddd_models()
cuchemUI_1 | File "/workspace/cuchem/cuchem/utils/data_peddler.py", line 35, in download_cddd_models
cuchemUI_1 | check=True)
cuchemUI_1 | File "/opt/conda/envs/rapids/lib/python3.7/subprocess.py", line 512, in run
cuchemUI_1 | output=stdout, stderr=stderr)
cuchemUI_1 | subprocess.CalledProcessError: Command '['bash', '-c', 'mkdir -p /data/mounts/cddd && cd /data/mounts/cddd; /tmp/download_default_model.sh']' returned non-zero exit status 9.
cuchemUI_1 | INFO:werkzeug:192.168.0.1 - - [02/Mar/2022 03:19:49] "POST /_dash-update-component HTTP/1.1" 200 -
cuchemUI_1 | INFO:werkzeug:192.168.0.1 - - [02/Mar/2022 03:19:49] "POST /_dash-update-component HTTP/1.1" 200 -
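
The log shows that only 2219 bytes were downloaded and that `unzip` rejected the file, which suggests the download returned something other than the model archive (for example an error page). A small sketch of validating a downloaded file before extraction, using Python's standard `zipfile` module, is shown below. The function name `extract_if_valid` and any paths passed to it are illustrative, not the repository's code.

```python
import zipfile


def extract_if_valid(archive_path, dest_dir):
    """Extract `archive_path` into `dest_dir` only if it is a real zip file."""
    if not zipfile.is_zipfile(archive_path):
        raise ValueError(
            f"{archive_path} is not a zip archive; the download probably failed"
        )
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        # Return the member names so the caller can see what was unpacked.
        return archive.namelist()
```

Opening the rejected default_model.zip in a text editor is another quick way to see what the server actually returned.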

Failed to run cuchemUI container - "ImportError: cannot import name 'get_current_traceback' from 'werkzeug.debug.tbtools' "

When I run launch.sh start, the cuchemui container fails to launch with the following error:

ImportError: cannot import name 'get_current_traceback' from 'werkzeug.debug.tbtools' (/opt/conda/envs/rapids/lib/python3.7/site-packages/werkzeug/debug/tbtools.py)

This seems to be the same as this Issue.

Workaround:
Add the following line to cuchem/requirements.txt and run launch.sh build again.

werkzeug==2.0.0
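
To check whether a given environment is affected before rebuilding, one can probe for the missing name directly. The sketch below uses a generic, hypothetical helper named `module_has_attr`. As far as I know, `get_current_traceback` was dropped from `werkzeug.debug.tbtools` in Werkzeug 2.1, which would explain why pinning an older release avoids the error.

```python
import importlib


def module_has_attr(module_name, attr):
    """Return True if `module_name` imports and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    ok = module_has_attr("werkzeug.debug.tbtools", "get_current_traceback")
    print("get_current_traceback available:", ok)
```

Running this inside the cuchemUI image after the rebuild should report whether the pinned version took effect.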

GPG Key Rotation Breaking First Build

Building the containers fails at the cuchem image with the following error:

W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease' is not signed.

It seems related to this issue about NVIDIA rotating their GPG keys.
One can either remove the sources.list or possibly upgrade the base container.

Line 2 in a86ed80:

ARG SOURCE_CONTAINER=nvcr.io/nvidian/clara-lifesciences/megamolbart_training:latest
