schwallergroup / ai4chem_course
EPFL CH-457 "AI for chemistry"
Home Page: https://schwallergroup.github.io/ai4chem_course/
License: MIT License
When I run notebook 7, "Reaction prediction", these warnings appear while training the model:
[2024-02-14 12:29:19,397 WARNING] The batch will be filled until we reach 1,its size may exceed 32 tokens
Any guidance on what to do? Related: in the hyperparameters you use batch_size: 6144, which is a slightly odd number. Was it tuned to get rid of this warning?
Thank you
This warning is also addressed on the ONMT forum (link), where they suggest setting batch_type to sents, but I believe this applies to version 3 upwards, and the notebook explicitly installs OpenNMT-py==2.2.0. I had a quick test with version 3, but that quickly led to other issues, so I'll stick with 2.2.0 for now.
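A note on where the "32 tokens" in the warning may come from: 32 does not match the training batch_size of 6144, so one plausible reading (an assumption, not checked against the notebook) is that it is the default -valid_batch_size, which would be counted in tokens because -batch_type tokens is set. Under that reading, almost every validation example is longer than the budget. The sketch below is a simplified, hypothetical token-budget batcher, not OpenNMT-py's implementation; batch_by_tokens and the toy SMILES are made up for illustration.

```python
def batch_by_tokens(examples, max_tokens):
    """Greedily group tokenized examples so each batch stays within max_tokens.

    Returns (batches, warnings). An example longer than max_tokens is still
    emitted, and a warning is recorded for it.
    """
    batches, warnings = [], []
    current, current_tokens = [], 0
    for tokens in examples:
        n = len(tokens)
        if n > max_tokens:
            warnings.append(
                f"example of {n} tokens exceeds the budget of {max_tokens} tokens"
            )
        # Start a new batch when adding this example would overflow the budget.
        if current and current_tokens + n > max_tokens:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(tokens)
        current_tokens += n
    if current:
        batches.append(current)
    return batches, warnings


# Toy reaction SMILES, tokenized character by character for illustration only.
examples = [
    list("CC(=O)O.OCC>>CC(=O)OCC"),
    list("c1ccccc1Br.OB(O)c1ccccc1>>c1ccccc1-c1ccccc1"),
]

small_batches, small_warnings = batch_by_tokens(examples, max_tokens=32)
large_batches, large_warnings = batch_by_tokens(examples, max_tokens=6144)
```

With a 32-token budget the longer example cannot fit and triggers a warning, while a 6144-token budget holds both examples in one batch. If the assumption above holds, raising -valid_batch_size to a token-scale value would be the setting to try, rather than changing the training batch_size.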
Hi, I notice that notebook "07 - Reaction Prediction" uses the hyperparameter rnn_size. Could this be a leftover from copy-pasted code? Otherwise I am confused, because we're using a transformer architecture, not an RNN.
!onmt_train -config example_run/run_config.yaml \
-seed 42 -gpu_ranks 0 \
-param_init 0 \
-param_init_glorot -max_generator_batches 32 \
-batch_type tokens -batch_size 6144 \
-normalization tokens -max_grad_norm 0 -accum_count 4 \
-optim adam -adam_beta1 0.9 -adam_beta2 0.998 -decay_method noam \
-warmup_steps 8000 -learning_rate 2 -label_smoothing 0.0 \
-layers 4 -rnn_size 384 -word_vec_size 384 \
-encoder_type transformer -decoder_type transformer \
-dropout 0.1 -position_encoding -share_embeddings \
-global_attention general -global_attention_function softmax \
-self_attn_type scaled-dot -heads 8 -transformer_ff 2048 \
-tensorboard True -tensorboard_log_dir log_dir
Thanks!
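A possible explanation, offered as an assumption to verify against the OpenNMT-py documentation: in the 2.x releases the -rnn_size option appears to double as the hidden (model) dimension for transformer encoders and decoders, and I believe it was renamed hidden_size in version 3. If so, the value is not a leftover. The short check below uses only the numbers from the command above and shows how they fit together for multi-head attention; the variable names are illustrative.

```python
# Values copied from the onmt_train command quoted above.
model_dim = 384   # -rnn_size, matched by -word_vec_size 384
heads = 8         # -heads
ff_dim = 2048     # -transformer_ff

# Multi-head attention splits the model dimension evenly across heads,
# so the model dimension has to be divisible by the number of heads.
if model_dim % heads != 0:
    raise ValueError("model dimension must be divisible by the number of heads")

head_dim = model_dim // heads       # size of each attention head
ff_expansion = ff_dim / model_dim   # how much wider the feed-forward layer is
```

Here each of the 8 heads works on a 48-dimensional slice, which is consistent with 384 being the transformer's model size rather than an RNN setting.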
Hi,
NameError Traceback (most recent call last)
in <cell line: 2>()
1 prod_1 = rxn_example.split('>>')[1]
----> 2 pred_reactants = apply_template(tplt_example, prod_1)
3
4 # This is the result of applying the template.
5 visualize_mols(pred_reactants[0])
/content/utils.py in apply_template(tmplt, reacts)
47
48 def apply_template(tmplt, reacts):
---> 49 rxn = rdChemReactions.ReactionFromSmarts(tmplt)
50 reactants = [Chem.MolFromSmiles(x) for x in reacts.split('.')]
51 products = rxn.RunReactants(reactants)
NameError: name 'rdChemReactions' is not defined
In utils.py, the function is defined as follows:
def apply_template(tmplt, reacts):
    rxn = rdChemReactions.ReactionFromSmarts(tmplt)
    reactants = [Chem.MolFromSmiles(x) for x in reacts.split('.')]
    products = rxn.RunReactants(reactants)
    return products
I think rdChemReactions is properly imported (I also tried importing it manually and separately), but I can only run the function properly if the input is something like "'C:1O.[N:3]>>C:1[N:3]'", not the output of the previous function.
https://www.rdkit.org/docs/source/rdkit.Chem.rdChemReactions.html
The notebook is amazing and I am looking forward to using it.
Thanks
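One likely cause, stated as an assumption since the full utils.py is not shown here: the traceback points into /content/utils.py, and a Python function looks up global names in the module where it was defined, not in the notebook that calls it. Importing rdChemReactions in a notebook cell would therefore not help if utils.py itself lacks the import. The sketch below reproduces that behaviour with the standard library only; fake_utils and root are hypothetical stand-ins for utils.py and apply_template, and math stands in for rdChemReactions.

```python
import math   # imported in the caller's namespace, like an import in a notebook cell
import types

# Build a stand-in module whose function uses a name the module never imports.
fake_utils = types.ModuleType("fake_utils")
exec("def root(x):\n    return math.sqrt(x)\n", fake_utils.__dict__)

# The call fails even though `math` is imported here, in the caller.
try:
    fake_utils.root(9.0)
    failed = False
    message = ""
except NameError as err:
    failed = True
    message = str(err)

# Workaround: bind the missing name inside the module's own namespace.
fake_utils.math = math
fixed_result = fake_utils.root(9.0)
```

If this is indeed what happens in the notebook, the equivalent fixes would be adding from rdkit.Chem import rdChemReactions at the top of utils.py, or, as a temporary workaround from the notebook, assigning utils.rdChemReactions = rdChemReactions after importing both.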
Took me a bit to realize there was something under the .qmd file (I looked at the commit history 6e4ea6c)